R Guru      

______________________________________________ 

         Become an R-Guru, Sign up for free R webinars

______________________________________________

Webinars

What is R and Why Should you Learn R? 


Basics

R Programming Style   Tidy Programming Style

  

                                   RStudio Cheat Sheets                                                                  R Reference Card  

Hardware Configuration: Workbench, Connect, Package Manager

Required Steps: 1. Install R 4.0.4 for Windows, 2. Install RStudio, 3. Install RStudio Learnr Package

Videos: Install R Packages

   

Tutorials Point Quick Guide       The Analysis Factor Tutorials

R Examples  

R Programming Style Types

      

Data Access   and Data Management  

     

Data Processing (Wrangling) book                                                              Data Cleaning book

Data Reporting, Analysis and Plotting  

 Common Operations             R Symbols
 Variable Assignment           <-, <<-, ->, ->>
 Arithmetic                    +, -, *, /
 Comparison (returns logical)  >, <, ==, !=, <=, >=
 Logical                       &, &&, |, ||, !
 Other                         :, %in%

        

R Functions                                                                           Common R Functions

______________________________________________

Advanced

Advanced R Solutions book    Custom Functions

R4Stats - SAS and R code GitHub-Sheets R Project for Statistical Computing  

             

Advanced R book

______________________________________________ 

Compare with SAS

Run SAS in R: R Markdown, Example 1, Example 2

Run R in SAS:  SAS Programming for R Users book, Example 1, Example 2

Guru99 Blog  EDUCBA Blog   SAS and R Examples  Read/Write SAS Datasets

R Package for SAS Programmers

Example 1: Listing


Example 2: Summary Stats Table


Example 3: Figure

        

flextable() vs Proc Report



 SAS Syntax  R Function
proc print data=d1; var _all_; run;  head(d1) , tail(d1)

proc freq data=d1; tables sex*race; run;

table(d1$sex, d1$race)

prop.table()

proc univariate;

summary(d1)

proc sort data=dm2; by sex race; run;

dm2[order(dm2$sex, dm2$race), ]
proc format;

In Vectors: sex_code <- c('M', 'M', 'M', 'F', 'M') # 1. simple vector that stores the data values

sex_decode <- c('M'='Male', 'F'='Female') # 2. named vector of value = 'label' pairs, similar to proc format

sex <- sex_decode[sex_code] # 3. converts values to labels by using the sex_code values to index the sex_decode vector
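
The three steps above can be tried end to end with a small, made-up vector. This is a minimal sketch: the data values are hypothetical, and `unname()` is added here only to drop the names that indexing carries over.

```r
# Hypothetical coded values, as they might arrive in raw data
sex_code <- c("M", "M", "M", "F", "M")

# Named vector acting as a lookup table: name = code, value = label
sex_decode <- c("M" = "Male", "F" = "Female")

# Index the lookup table by the codes; unname() drops the carried-over names
sex <- unname(sex_decode[sex_code])
print(sex)

# A code that is missing from the lookup table comes back as NA
unknown <- unname(sex_decode[c("M", "U")])
print(unknown)
```

Unlike proc format, an unmatched code is not passed through unchanged; it becomes NA, so it is worth checking for NAs after decoding.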


As Functions: age_cat <- Vectorize(function(x) { # x is input value
  if (is.na(x)) { # missing age; checked first because if() fails on an NA comparison
    ret <- "Unknown"
  } else if (x < 18) { # condition
    ret <- "< 18" # return label
  } else if (x >= 18 && x < 24) {
    ret <- "18 to 24"
  } else if (x >= 24 && x < 45) {
    ret <- "24 to 45"
  } else if (x >= 45 && x < 60) {
    ret <- "45 to 60"
  } else { # x >= 60
    ret <- "> 60"
  }
  return(ret) })

df$age_cat <- age_cat(df$age) # apply function to age variable to create age_cat variable
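
As an alternative to the if/else chain above, base R's `cut()` assigns every value to an interval in one vectorized call. The sketch below assumes the same break points and labels as the function above and a hypothetical `ages` vector; `right = FALSE` makes each interval include its lower bound and exclude its upper bound.

```r
# Hypothetical ages, including boundary values and a missing value
ages <- c(5, 18, 30, 45, 59, 60, 72, NA)

# Intervals: [-Inf,18), [18,24), [24,45), [45,60), [60,Inf)
age_cat <- cut(
  ages,
  breaks = c(-Inf, 18, 24, 45, 60, Inf),
  labels = c("< 18", "18 to 24", "24 to 45", "45 to 60", "> 60"),
  right  = FALSE
)

# cut() returns a factor; convert to character and label missing ages
age_cat <- as.character(age_cat)
age_cat[is.na(age_cat)] <- "Unknown"
print(age_cat)
```

This avoids `Vectorize()` entirely, which tends to matter on large data frames because `cut()` processes the whole vector at once.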

 proc means;

summarise(AllPages = sum(Pages),

AvgLength = mean(Pages),

AvgRating = mean(MyRating),

AvgReadTime = mean(read_time),

ShortRT = min(read_time),

LongRT = max(read_time),

TotalAuthors = n_distinct(Author)) 

 proc contents;

library(Hmisc)

contents(dm)

 proc compare;

library(arsenal) # provides comparedf() and n.diffs()

cmp <- comparedf(mockstudy, mockstudy2, by = "case", tol.vars = c("._ ", "case"), int.as.num = TRUE)

n.diffs(cmp) 

 proc report; flextable()

______________________________________________

R and SQL

Libraries: sqldf, dplyr, plyr (ddply())

CRAN package     SQL Server

SQLDF() rdrr.io

        

SQLDF() in R                                                   SQLDF() Tutorial


   

SQLDF() Examples

  

Introduction to dplyr

 SQL Feature                              dplyr Function
 Select variables                         select()
 Create new variables (definition)        mutate(), e.g. chg = ...
 Create new variable (conditional logic)  ifelse()
 Where condition                          filter()
 Group by                                 group_by()
 Sort by                                  arrange()

 Join Type  R Function

left join (dplyr) to keep all dm1, dm1 and dm2 with different key variables
dm3 = left_join(dm1, dm2, by = c('usubjid'='usubjid', 'race'='race1'))

merge left join to keep all dm1, dm1 and dm2 with same key variables
dm3 = merge(x=dm1, y=dm2, by = 'usubjid', all.x=TRUE)

merge right join to keep all dm2, dm1 and dm2 with same key variables
dm3 = merge(x=dm1, y=dm2, by = 'usubjid', all.y=TRUE)

merge inner join for only dm1 and dm2 matching with same key variables
dm3 = merge(x=dm1, y=dm2, by = 'usubjid')

merge outer join for all dm1 and dm2, dm1 and dm2 with same key variables, equivalent to all.x=TRUE, all.y=TRUE
dm3 = merge(x=dm1, y=dm2, by = 'usubjid', all=TRUE)
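
To see how the four `merge()` variants differ, the following sketch runs them on two tiny, made-up data frames. The `dm1` and `dm2` contents are hypothetical; only the `merge()` arguments mirror the calls above.

```r
# Hypothetical demographics: subject 01 is only in dm1, subject 04 only in dm2
dm1 <- data.frame(usubjid = c("01", "02", "03"), age = c(34, 51, 47))
dm2 <- data.frame(usubjid = c("02", "03", "04"), sex = c("F", "M", "F"))

left_j  <- merge(x = dm1, y = dm2, by = "usubjid", all.x = TRUE) # keep all dm1
right_j <- merge(x = dm1, y = dm2, by = "usubjid", all.y = TRUE) # keep all dm2
inner_j <- merge(x = dm1, y = dm2, by = "usubjid")               # matches only
outer_j <- merge(x = dm1, y = dm2, by = "usubjid", all = TRUE)   # keep everything

# Compare the number of rows each join returns
print(sapply(list(left = left_j, right = right_j,
                  inner = inner_j, outer = outer_j), nrow))
```

Rows without a match are filled with NA in the columns that come from the other data frame, much like a SAS merge without an `if a and b` filter.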

   
   

______________________________________________ 

Blogs, Classes and References

Listen Data R blog  Guru Blog   Wikipedia  R-blogger   R-Tutor   Joyti's R-blog

DataCamp    CodeAcademy    DataMentor     R-Journal    Data Table   Study Trials blog

YouTube Videos  R Studio (Webinars)   GitHub-Submission   R-Bootcamp   Kickstarting R

R in Clinical Trial Data Analysis [YouTube]

R Consortium   How to Create a Pie Chart in R using GGPLOT2

R Programming in a Clinical Trial Data Analysis blog

TFL programming in R versus SAS blog

Interesting packages taken from R/Pharma blog

UCLA: Introduction to R, SeminarFAQs

A RISK-BASED APPROACH FOR ASSESSING R PACKAGE ACCURACY WITHIN A VALIDATED INFRASTRUCTURE Blog


Quick-R

            

Statistics Globe (Graph Gallery)



Geeks for Geeks - Task based examples

______________________________________________

Pharma Industry

Clinical Reporting with R   R/Pharma   

GitHub: Sample SDTM/ADaM  Pharma Packages   R4CDISC/R4DSXML

Using R in the Pharmaceutical Industry blog

Using the Statistical Programming Language R in the Pharma Industry blog

R Programming in clinical trial data analysis Blog by Shrishaila Patil

The Rise of R-should SAS programmers get up to speed? blog

FDA: R OK for drug trials blog

OpenFDA: Github, FDA Site, R Package

R: Regulatory Compliance and Validation Issues A Guidance Document for the Use of R in Regulated Clinical Trial Environments

R for Biostatisticians [Presentation]

Clinical Trials Package (R Packages for Clinical Trial Design, Monitoring, and Analysis)

Design and Monitoring, Design and Analysis, Analysis for Specific Designs, Analysis in General, Meta-Analysis
  • Atable - Creates Tables for Reporting Clinical Trials
  • compareODM - Comparison of medical forms in CDISC ODM format
  • CRTSize - Sample size estimation in cluster (group) randomized trials
  • Blockrand - Creates randomizations for block random clinical trials
  • DoseFinding - Supports design and analysis of dose-finding experiments
  • Pact - Predictive Analysis of Clinical Trials
  • SASxport - Read and write 'SAS' 'XPORT' Files
  • ADCT - Adaptive Design in Clinical Trials
  • ClinPK, cpk - Clinical Pharmacokinetics Toolkit
  • randomizeR - Randomization for Clinical Trials
  • Greport - Graphical Reporting for Clinical Trials

         Tools for Clinical Data Management Package

        atable: Create Tables for Reporting Clinical Trials Package [Documentation  Article]

R Programming for SAS Viya

    ______________________________________________

    Data Science

    ANOVA in R

    Tidyverse Cheat Sheet

    ______________________________________________

    R Shiny

    R Shiny Examples  

    RShinyTLF   OpenFDA RShiny   DemoRShinyTLF (PhUSE)

    R Shiny Lessons  Towards Data Science blog

    R Shiny Gallery

    ______________________________________________

    R Markdown

    Introduction   PDF Reports   Cheat Sheet   Guide


    Gallery

    ______________________________________________

    Books 

    R Packages   R for Data Science   Efficient R   R in Action   Data Cleaning with R

    DataQuest   SAS and R    R and Relational Databases    A Little Book of R For Time Series [HTML]

    Quick Start Guide for R   Learn to Use R   Flextable

    Hands-On Programming with R [Debugging]   Cookbook   Mastering Shiny

    Beginner Level

              

     UC of Riverside                                         University of Georgia

    Glossary: An Introduction to R book (PDF)

    ______________________________________________

    Popular R Packages

    DataCamp FAQs   CRAN Glossary   CRAN FAQs

    Functional Data Analysis (FDA)   purrr (Functional Programming)


    Tidyverse  Manifesto  Design Guide   Introduction to Tidyverse   Book

    Getting Started with Tidyverse


    Tidy Definitions and Functions

    Microsoft  Data.Table   RStudio   

    gt Package

    ______________________________________________

    Debugging

    1. Debug – Essential Principles and Functions that you can’t miss! [Data-Flair]

    2. Debugging in R – How to Easily Overcome Errors in Your Code?

    3. Error Handling in R


    Papers


    1. Exploring Use of R for Clinical Trials, Kalpesh Prajapati

    2. Is R language reliable and efficient tool for programming SAS datasets or just art for art’s sake?, Piotr Podlewsk

    3. Clinical Trial Datasets (CDISC - SDTM/ADaM) Using R, Prasanna Murugesan [Compare Code]

    4. A Gentle Introduction to R From A SAS Programmer's Perspective, Saranya Duraisamy [Beginner Level]

    5. Python and R made easy for the SAS® Programmer, Janet Li [Compare Code]

    6. R for SAS programmers: It’s different, but friendly, Friedrich Schuster

    7. SAS® and R - stop choosing, start combining and get benefits!, Diana Bulaienko

    8. SAS and R Playing Nice Together, David Edwards, Bella Feng, Brian Schultheiss

    9. Can clinical trial data sets (CDISC - SDTM/ADaM) be generated using R? Blog

    10. R: Validation Hub [FDA]

    11. CRAN Task View: Clinical Trial Design, Monitoring, and Analysis (packages)

    12. SAS® and R Working Together, Matthew Cohen [Date formats]

    13. Best Practices for Reproducible Package Management in R

    14. Techniques for writing robust R programs, Martin Gregory , Merck Serono

    15. How do I select an R package for my clinical workflow?, Sean Lopp & Phil Bowsher

    16. The Challenges of Validating R [Presentation]

    17. Using R to Drive Agility in Clinical Reporting [GSK Presentation]

    18. CDISC Dataset-XML – A new Dataset Structure for Clinical Trial Data Transport for Future Drug Submissions, Jörg Dillert [R4CDISC, R4DSXML]

    19. End to End Interactive TLF using R Shiny, Rohit Banga

    20. Is there any better option than SAS for TLFs? Yes, there R!, Niccolo Bassani [Presentation - ddply(), Proc REPORT]

    21. A quick introduction to plyr, Sean Anderson [PLYR package splits data]

    22. Using R Programming for Clinical Trial Data Analysis Blog

    23. The SAS® Versus R Debate in Industry and Academia, Chelsea Loomis Lofland, Rebecca Ottesen [Compare SAS and R]

    24. Using R in a Regulatory Environment: some FDA perspectives [Presentation]

    25. Using R: Perspectives of a FDA Statistical Reviewer [Presentation]

    26. R-Pharma Papers

    27. R for Clinical Reporting, Yes - Let's Explore It!, Hao Meng, Yating Gu, Yeshashwini Chenna (SASxport, sas7bdat, dplyr)

    28. Generating ADaM Compliant ADSL Dataset Using R, Vipin Kumpawat

    29. Generating TFLs in R - Challenges and Successes compared to SAS, Amol Waykar, Kevin Kramer, Kalyani Komarasetti, Andrew Miskell

    30. Using the R interface in SAS ® to Call R Functions and Transfer Data, Bruce Gilsen

    31. Expand Your Skills from SAS® to R with No Complications, Andrii Korchak

    32. Simulation in SAS with Comparisons to R, Chelsea Loomis Lofland, Rebecca Ottesen

    33. Normal is Boring, Let’s be Shiny: Managing Projects in Statistical Programming Using the RStudio® Shiny® App, Girish Kankipati, Hao Meng

    34. Building Automations for Generating R and SAS Code Supporting Visualizations Across Multiple Therapeutic Areas, Anastasia Alexeeva, William Martersteck, and Mei Zhao

    35. Effective Exposure-Response Data Visualization and Report by Combining the Power of R and SAS Programming, Shuozhi Zuo, Hong Yan

    36. A Brief Introduction to Performing Statistical Analysis in SAS, R & Python, Erica Goodrich, Daniel Sturgeon

    37. The Power of Data Visualization in R, Babych Oleksandr

    38. Open-NCA – R Scripts for CDISC-based Pharmacokinetic Analysis, Peter Schaefer

    39. Statistical Computing Environments in CDER, Paul Schuette

    40. Use of R Script to Create Trial Summary (ts.xpt) Domains for Nonclinical SEND Studies, Bob Friedman, Xybion; Anthony Fata, William Varady, William Houser, Kevin Snyder

    41. R Package Oriented Software Development Life Cycle in Regulated Clinical Trial Environments, Yalin Zhu, Rinki Jajoo, Clare Bai, Sarad Nepal, Daniel Woodie, Keaven Anderson, Yilong Zhang [Presentation]

    42. R for SDTM and ADaM Data [Presentation]

    43. R syntax for SAS programmers, Max Cherny [Beginner, Tidyverse]

    44. Creating Graphs Simply with SAS® or R, John O’Leary, Jaclyn Scholl

    45. Techniques for writing robust R programs, Martin Gregory , Merck Serono

    46. Seamless R And SAS: For Shiny Visualizations, Pragathi Kotha Venkata [Presentation]

    47. R for the Analysis of Clinical Data, Greg Jones [Presentation]

    48. Numerical validation as a critical aspect in bringing R to the Clinical Research, Adrian Olszewski [R and SAS differences]

    ______________________________________________

    Common FAQs

    1. What is the common syntax for libname, filename and reading datasets?    See R paper.

    sdtm <- "//product/study/analysis/data/sdtm" # assign libname to object named sdtm

    out <- "//product/study/analysis/data/adam"  # assign out filename to path

    library(haven) # required to read SAS datasets

    dm <- read_sas(file.path(sdtm,"dm.sas7bdat")) # read sas file as a data frame

    #'read_sas' function from the haven package (part of the tidyverse)

    taadmin <- read_sas("H:/rproject/project_y_r2/taadmin.sas7bdat")

    2. What is R? R is a programming language and free software developed by Ross Ihaka and Robert Gentleman in 1993. R possesses an extensive catalog of statistical and graphical methods.

    3. Is there a website to run example R programs?  Yes, see site.

    4. What is R? R is a programming language that uses the concept of functions to create objects to be linked together.  There are many rules to understand and follow.

    5. Why should you learn R?  Since R is so vast and can quickly become confusing, I suggest you identify what you plan to use R for, such as data access, data management, data reporting or data analysis.  In the pharma industry, R got a jump start with graphs that were easy to create.  I suggest you have patience while learning R, since it may require more syntax than you expect.

    6. What are useful methods to learn R? Show and Tell is useful to see and run R syntax on a command by command basis.  Taking your own notes helps to retain and build understanding.  Cheat sheets are only helpful if you recognize the syntax and its purpose.  Since R is very technical, try to focus on your task and master those few sets of commands.  How-To step checklists and examples help to remind users how to run R syntax.

    7. What is comparable R syntax for Data step KEEP and WHERE statements?

    # complete cases and select

    taadmin2 <- taadmin %>% filter(complete.cases(taadmin[["DOSECUN"]])) %>% select(INV, PT, DCMDATE, DOSECUN, DOSETL)

    data taadmin2 (keep=inv pt dcmdate dosecun dosetl flag); set taadmin (where=(dosecun is not null)); run;

    8. How can you get first. and last. records in R?

    taadmin <- taadmin[order(taadmin$PT, taadmin$DCMDATE),]

    taadmin$flagf <- as.integer(!duplicated(taadmin$PT)) # 1 on the first record per PT, else 0

    taadmin$flagl <- as.integer(!duplicated(taadmin$PT, fromLast=TRUE)) # 1 on the last record per PT, else 0

    proc sort data = taadmin2; by pt dcmdate; run;

    data taadmin3; set taadmin2; by pt dcmdate; flagf=0; flagl=0; if first.pt then flagf=1; if last.pt then flagl=1; run; 
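
    The first. and last. logic above can be checked on a small, made-up data frame. This is a sketch only: the `visits` data and its values are hypothetical, while the column names PT and DCMDATE follow the example above.

```r
# Hypothetical visit records for two patients, deliberately unsorted
visits <- data.frame(
  PT      = c(2, 1, 1, 2, 1),
  DCMDATE = as.Date(c("2021-02-01", "2021-01-05", "2021-03-10",
                      "2021-01-15", "2021-02-20"))
)

# Equivalent of proc sort; by pt dcmdate;
visits <- visits[order(visits$PT, visits$DCMDATE), ]

# duplicated() marks repeats of PT, so its negation marks the first record;
# fromLast = TRUE scans from the end, so its negation marks the last record
visits$flagf <- as.integer(!duplicated(visits$PT))
visits$flagl <- as.integer(!duplicated(visits$PT, fromLast = TRUE))
print(visits)
```

    Note that the argument must be spelled `fromLast`; R is case sensitive, and a misspelled `fromlast` is silently ignored rather than reported.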

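    9. What is an example of a simple R plot? The syntax below creates a scatter plot with cty on the x-axis and hwy on the y-axis.

    library(ggplot2) # provides ggplot() and the mpg data frame

    g <- ggplot(data = mpg, aes(x = cty, y = hwy)) + geom_point() # geom_point() draws the points; print(g) displays the plot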

    10. How can you read csv file in R?

    data1 <- read.csv("./data/DiastolicBloodPressure_initial.csv")

    11. What are useful R metadata functions to display data frame attributes?

    tg <- ToothGrowth # save sample data frame to tg data frame

    View(tg) # browse tg

    str(tg) # display tg attributes and sample data

    attributes(tg) # display tg attributes

    head(tg) # display tg sample records

    print(tg) # display tg all records

    stats <- summary(tg) # create stats object of continuous vars

    print(stats) # display tg stats object

    freq <- table(tg$supp) # create freq of categorical var supp

    print(freq) # display tg freq object 

    12. What is a simple R function?

    info <- function(d) {
      writeLines("First display the structure of the data frame.")
      str(d)
      writeLines("Then print the first 6 observations to see the variables and values.")
      print(head(d))
    }

    # Call the new ‘info’ function:

    info(data1) 

    13. What are R date formats?

    %d for day of month (01-31), %a for abbreviated weekday name, %A for full weekday name, %m for month as a number (01-12), %b for abbreviated month name, %B for full month name, %y for 2-digit year, %Y for 4-digit year
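
    The format codes are used with `format()` to display a date and with `as.Date()` to read one. The sketch below uses a made-up date and only the numeric codes, because the weekday and month names produced by %a, %A, %b and %B depend on the locale of the R session.

```r
# A hypothetical date stored as an R Date object
d <- as.Date("2021-03-05")

day_txt  <- format(d, "%d")        # day of month, zero padded
us_txt   <- format(d, "%m/%d/%Y")  # month/day/4-digit year
year_txt <- format(d, "%y")        # 2-digit year

# Reading text into a Date: the format describes how the text is laid out
parsed <- as.Date("05/03/21", format = "%d/%m/%y")

print(c(day_txt, us_txt, year_txt))
print(parsed)
```

    When a date string does not match the supplied format, `as.Date()` returns NA instead of raising an error, so it helps to check for NAs after conversion.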

    14. What is the difference between package and library? A package is like a book, a library is like a library; you use library() to check a package out of the library.

    15. What is your working directory?  R is always pointed at a directory on your computer. Often this will be your home directory. When you work within an RStudio project, the working directory will be the top-level directory of that project. You can find out which directory by running: getwd()

    16. What packages are installed? installed.packages()

    17. What are useful R packages and libraries for SAS programmers to install to read SAS datasets, select variables, merge data frames,  export to SAS datasets and read transport files?  

    install.packages("SASxport")

    library(SASxport)

    library(sas7bdat)

    library(dplyr)

    dat1 <- read.sas7bdat("dm.sas7bdat")

    select(dat1, "PROJECT", "SUBJECT", "SITEID", "RACE", "ETHNIC", "SEX", "AGE")

    dm1 <- merge(x=dat2, y=dat3, by="SUBJECT", all.x=TRUE)

    write.xport(dm_f2, file=paste(getwd(), "dm.xpt", sep="/"), autogen.formats=FALSE)

    library(Hmisc)

    adsl <- sasxport.get("adsl.xpt", lowernames=FALSE)

    18. How do you write and run R code?  Writing your first R code. Enter code in the code editor window. Click Run to execute the current line and Source to execute all statements.

    19. What are examples of applying proc freq in R?  

    table(cars$Type) # proc freq data=cars; tables type; run;

    table(cars$Type,cars$Cylinders) # proc freq data=cars; tables type*cylinders; run;
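
    The same calls can be run without any SAS data by using the built-in mtcars data frame. This is a sketch that swaps in mtcars for the cars dataset above; `prop.table()` adds the proportions that proc freq prints alongside the counts.

```r
# One-way frequency: number of cars per cylinder count
one_way <- table(mtcars$cyl)
print(one_way)

# Two-way frequency: cylinders by gears, like tables cyl*gear
two_way <- table(mtcars$cyl, mtcars$gear)
print(two_way)

# Proportions of the one-way table; multiply by 100 for percentages
pct <- prop.table(one_way)
print(round(100 * pct, 1))
```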

    20. What are examples of applying proc means in R?

    ASL_mean <- ASL %>% group_by(ARMCD) %>% summarise(avg_age = mean(AGE), avg_bmi = mean(BMI))

    # %>% is pipe to connect R functions, see magrittr package

    # proc means data=ASL mean; by ARMCD; var age bmi; run;
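
    Where dplyr is not available, base R's `aggregate()` gives a similar by-group mean. The sketch below is an alternative technique, not the dplyr call above, and it uses the built-in ToothGrowth data frame in place of ASL; the column name avg_len is made up for illustration.

```r
# Mean of len for each level of supp, similar to
# proc means mean; class supp; var len;
by_supp <- aggregate(len ~ supp, data = ToothGrowth, FUN = mean)

# Rename the result column to make clear that it holds a mean
names(by_supp)[2] <- "avg_len"
print(by_supp)
```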

    21. What are examples of proc sql in R? See R paper.

    adsl <- dm %>% select(studyid, subjid, age, sex, height, weight, race, scrfl) %>% mutate(bmi = (weight*703)/height^2 ) %>%

    filter(scrfl == "Y") %>% select(-scrfl) %>% arrange(studyid, subjid)

    # apply %>% to connect R functions

    # select 8 dm variables

    # derive bmi variable from weight and height

    # subset for scrfl = 'Y'

    # sort by studyid and subjid 

    22. How do you convert Excel files to an R data frame? library(readxl) # required for read_excel function

    l_cancer <- read_excel("C:/data/l_cancer.xlsx")

    23. How do you display a list of functions once a package/library is opened?  lsf.str("package:dplyr")

    24. What is the syntax for keeping and dropping variables? df = subset(mydata, select = -c(x,z) ) # by variable name

    df <- mydata[ -c(1,3:4) ] # by column index number

    25. How best can you sort data?  newdata <- mtcars[order(mtcars$mpg, mtcars$cyl),] # sort by mpg and cyl
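
    A short sketch of ascending and descending sorts with `order()`, using the built-in mtcars data frame. The minus sign for descending order works for numeric columns only; the object names are made up for illustration.

```r
# Ascending by mpg, ties broken by cyl (proc sort; by mpg cyl;)
sorted_asc <- mtcars[order(mtcars$mpg, mtcars$cyl), ]

# Descending by mpg (proc sort; by descending mpg;)
sorted_desc <- mtcars[order(-mtcars$mpg), ]

print(head(sorted_asc[, c("mpg", "cyl")]))
print(head(sorted_desc[, c("mpg", "cyl")]))
```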

    26. What is the syntax to create R arrays?  Array_name = array(data,dim = c(row_size,column_size,matrices), dimnames = list(row_names,column_names,matrices_names))
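
    As a concrete instance of the syntax above, the sketch below builds a 2 x 3 x 4 array. The dimension names (r1, c1, m1 and so on) are made up for illustration.

```r
# 24 values arranged as 2 rows, 3 columns and 4 matrices;
# values fill down the rows first, then across columns, then matrices
arr <- array(
  1:24,
  dim = c(2, 3, 4),
  dimnames = list(c("r1", "r2"),
                  c("c1", "c2", "c3"),
                  paste0("m", 1:4))
)

print(dim(arr))

# Elements can be selected by name or by position
by_name  <- arr["r1", "c2", "m1"]
by_index <- arr[2, 3, 4]
print(c(by_name, by_index))
```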

    27. What are useful functions to check metadata and missing data?  Does the data frame exist? What is the data frame class? Does the data frame contain values?
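
    One way to answer these three questions is with base R functions, sketched below on a small, made-up data frame. The name `study_df` and its contents are hypothetical.

```r
# Hypothetical data frame with two missing scores
study_df <- data.frame(id = 1:4, score = c(10, NA, 30, NA))

has_df    <- exists("study_df")       # does the object exist?
df_class  <- class(study_df)          # what is its class?
is_df     <- is.data.frame(study_df)  # is it a data frame?
has_rows  <- nrow(study_df) > 0       # does it contain any records?

# Missing data: NA count per column and number of fully populated rows
na_per_col <- colSums(is.na(study_df))
complete_n <- sum(complete.cases(study_df))

print(df_class)
print(na_per_col)
print(complete_n)
```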

    28. How best to create data frames from vectors?

    hospital <- c("New York", "California")

    patients <- c(150, 350)

    df <- data.frame(hospital, patients)

    29. How do you name values in vectors? See the built-in islands named vector as an example:  > str(islands)

    Named num [1:48] 11506 5500 16988 2968 16 ...

    - attr(*, "names")= chr [1:48] "Africa" "Antarctica" "Asia" "Australia" ..

    30. %>% - R Pipe Operator. It passes the left-hand side of the operator as the first argument of the function on the right-hand side. It requires installing the magrittr package: library(magrittr); iris %>% head()


    31. What is R Shiny?  R Shiny is a web application framework for R, and RStudio's Shiny Server makes Shiny applications available over the web.  See R Studio blog on R Shiny.  See how to create a simple R Shiny app.

    32. Can you write macros in R?  With R, you can create custom and complex functions that are similar to macros.  This gives you flexibility, with multiple methods for calling the R function. See Doing Macros in R blog.  See Macros In R blog.

    33. What options are there to export R?  Export to CSV, Excel and SAS. See export to RTF.  See other examples.

    34. What interfaces are there between R and SharePoint? See blog.

    35. How exactly is R an object-oriented programming language?  By using symbols (<-, $, [], etc.), the R language directly accesses data frames, variables and values.  R symbols have built-in meanings that save time in writing code. R functions are concise and take objects as inputs and return objects as outputs. These features help to make R programs 'cryptic' and more 'syntax heavy'.

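    36. Are there any symbols for ending R statements like ';' in SAS?  R does not require a symbol to end a statement; a line break ends it. A ';' is only needed to separate several statements on the same line, so line breaks are generally used for readability of code.  R executes code line by line in the console window or as a whole script in batch mode.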

    37. Is it possible to run R programs within SAS?  Yes, see SAS Programming for R Users book for examples.

    38. How can you create a log file as in SAS?  Load the logr package, which is part of the sassy family of packages.

    39. What is R not?  Easy to Learn, Technically Friendly, Only for Plots, Only for Statistical Modeling, Similar to SAS, Only for Data Science, Case Insensitive, Missing Data Friendly.  R has functions and tools for data access, data management, statistical modeling, analysis datasets, TLGs and data visualization.

    40. Is there an R function similar to SAS's proc compare?  Yes, see comparedf function.

    41. Does FDA use R? Yes, R is installed at FDA.  FDA does not recommend specific software packages.  FDA only requires sponsors to assure software used is validated.  See Using R: Perspectives of a FDA Statistical RevieweR.

    42. What are the options for saving R data? You can use save() function to save multiple objects as RData object or the saveRDS() function to save one object as RDS object. See examples.

    a <- 1

    b <- 2

    c <- 3

    save(a, b, c, file = "stuff.RData")

    load("stuff.RData")

    saveRDS(a, file = "stuff.RDS")

    a <- readRDS("stuff.RDS")
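
    The difference between the two approaches can be seen in a round trip. The sketch below writes to temporary files via `tempfile()` so that nothing is left in the working directory; the object names are made up for illustration.

```r
a <- 1
b <- 2

# Temporary file paths, used here instead of fixed file names
rdata_file <- tempfile(fileext = ".RData")
rds_file   <- tempfile(fileext = ".rds")

# save()/load(): several objects, restored under their original names
save(a, b, file = rdata_file)
rm(a, b)
load(rdata_file)

# saveRDS()/readRDS(): one object, assigned to any name on reading
saveRDS(a, file = rds_file)
a_copy <- readRDS(rds_file)

print(c(a, b, a_copy))
```

    Because `load()` restores objects under their saved names, it can overwrite objects that already exist in the session; `readRDS()` avoids this since the result is assigned explicitly.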

    43. How does R handle missing data?  

    In most cases, adding the argument, na.rm=TRUE, to NA-sensitive functions will tell that function to remove any NAs when performing calculations, such as mean(myvector, na.rm=TRUE).
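
    A brief sketch of this behavior on a made-up vector: without na.rm the result is NA, whereas SAS procedures would skip missing values by default.

```r
# Hypothetical vector with one missing value
myvector <- c(4, NA, 8, 6)

mean_with_na <- mean(myvector)                # NA, because one value is missing
mean_no_na   <- mean(myvector, na.rm = TRUE)  # mean of 4, 8 and 6
n_missing    <- sum(is.na(myvector))          # count of missing values

print(mean_with_na)
print(mean_no_na)
print(n_missing)
```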

    44. What are useful R debugging functions? traceback(), debug(), browser(), trace(), recover()

    45. What are common R errors?

    46. What are possible R error types? syntax, symbols, program logic, parameters (input/output), data (input/output/missing/invalid), object/vector type incompatible, invalid object/data assignments/data type conversions.

    47. What are best practices for debugging R programs? Review data frame metadata, missing/unique/range of data, program logic, display intermediate object values.

    48. What is a useful R package to compare two data frames?  The diffdf package is useful to compare two data frames.

    49. What is a useful R package for project management?  The packrat package is useful to manage projects.

