Sept. 7-11, 2015
RStudio
par()
lattice and ggplot2 packages
R is a system for statistical computation and graphics. It consists of a language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files.
R version 3.2.2 (Fire Safety) was released on 2015-08-14.
R distributions are available for Linux, Mac, Windows and other platforms.
The installation procedure for your favorite platform should be straightforward if you follow the instructions from CRAN.
On Windows, upgrading R versions may be a pain, but see the R Windows FAQ.
On Linux systems, in my experience, the most straightforward way to stay up to date is to register a CRAN repository in your favorite package manager.
R is a lot of things but from a user perspective apart from a language, it is primarily:
First, start R if you have not done so already, and let's try to open a graphical window:
hist(rnorm(1000))
1 + 1
# 124é"famoi'_(ù*$=+" - Use a hash '#' to add comments that are just ignored
a <- "hello" # assigning a value to a symbol/name
A <- 1
# If you simply enter the variable name or expression at the command prompt,
# R will PRINT its value.
a
A
2 <- "error"
?Reserved
B <- 2
A + B
C <- c(A, B)
C
mode(C)
log(x=64, base=4) # reminder about functions and parameters
log(64, 4)
# Missing data
y <- c(1,2,3,NA)
is.na(y) # returns a vector (F F F T)
# Not A Number
sqrt(-9)
# close your session
q()
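As a small follow-up on missing values, the sketch below contrasts NA and NaN. The vector y comes from the session above; the names y_mean and not_a_number are ours and only serve the illustration.

```r
# Missing data (NA) versus Not a Number (NaN)
y <- c(1, 2, 3, NA)

is.na(y)                          # FALSE FALSE FALSE TRUE
mean(y)                           # NA: missing values propagate by default
y_mean <- mean(y, na.rm = TRUE)   # drop missing values before averaging
y_mean                            # 2

# sqrt(-9) returns NaN and emits a warning, silenced here
not_a_number <- suppressWarnings(sqrt(-9))
is.nan(not_a_number)              # TRUE
is.na(not_a_number)               # TRUE as well: is.na() also flags NaN
is.nan(NA)                        # FALSE: NA is not NaN
```

Most summary functions (sum(), mean(), max(), …) accept na.rm = TRUE in the same way.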
Does this sound familiar to all of you?
The standard R distribution comes with an amazing number of tools but one may need to perform specific tasks.
To extend R functionalities you will want to download and install additional packages and use them.
Packages (libraries, modules, … in other languages) are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored is called the library.
The standard distribution comes with several packages (base, compiler, datasets, grDevices, graphics, …)
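To see where the library sits on your own machine and which packages ship with the standard distribution, something along these lines should work. It is only a sketch; the variable name base_pkgs is ours.

```r
# Directories searched for installed packages (the "libraries")
.libPaths()

# Where a given package lives on disk
find.package("stats")

# Names of the packages shipped with R itself (priority "base")
base_pkgs <- rownames(installed.packages(priority = "base"))
base_pkgs
```

The paths printed will differ from one system to another, and there may be more than one library directory.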
Repositories hold collections of R packages and have mechanisms to download and install them on your system (99% of the time, it is really easy).
"Mainstream" repositories are CRAN, the Omega project and Bioconductor.
install.packages("packageName")
# Try and install:
install.packages(c("ape", "devtools"))
Bioconductor is bioinformatics-oriented
source("http://bioconductor.org/biocLite.R") # fetch and execute code of the function:
biocLite("packageName")
GitHub is becoming increasingly popular as a repository for R packages:
First install devtools from CRAN (!…) and then call the devtools::install_github() function.
To remove packages: remove.packages(pkgs="packageName")
To know what packages are installed on your system: installed.packages()
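Combining installed.packages() with install.packages() lets you install only what is missing. The sketch below assumes the two packages suggested earlier; the names wanted, installed and missing_pkgs are ours, and the installation step is guarded so that nothing is downloaded outside an interactive session.

```r
# Packages we would like to have
wanted <- c("ape", "devtools")

# installed.packages() returns a matrix with one row per installed package
installed <- rownames(installed.packages())

# Keep only the packages that are not installed yet
missing_pkgs <- setdiff(wanted, installed)
missing_pkgs

# Install them only when run interactively, to avoid surprise downloads
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```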
To use resources from a package, you generally attach it to your working environment with library() (or require()). This loads the resources from the package into memory and attaches the package to your search() path.
search() # list of attached R packages and objects
## [1] ".GlobalEnv"        "package:stats"     "package:graphics"
## [4] "package:grDevices" "package:utils"     "package:datasets"
## [7] "package:methods"   "Autoloads"         "package:base"
library(ape) # or library("ape")
search()
## [1] ".GlobalEnv"        "package:ape"       "package:stats"
## [4] "package:graphics"  "package:grDevices" "package:utils"
## [7] "package:datasets"  "package:methods"   "Autoloads"
## [10] "package:base"
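library() and require() differ mainly in how they react to a missing package. The following sketch illustrates this with a made-up package name (notARealPackage123, assumed not to be installed anywhere); pkg, ok and res are names chosen for the example.

```r
# Hypothetical package name, assumed NOT to be installed
pkg <- "notARealPackage123"

# require() returns FALSE (with a warning) when the package is missing
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
ok

# library() stops with an error instead; we catch it here to keep going
res <- tryCatch(
  {
    library(pkg, character.only = TRUE)
    "attached"
  },
  error = function(e) "error"
)
res
```

In a script, library() is usually the safer choice: a missing package stops execution immediately instead of causing a confusing failure later.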
session_info() # This function is from devtools but is not loaded
## Error in eval(expr, envir, enclos): could not find function "session_info"
devtools::session_info() # loads devtools if not already done and call session_info()
## Session info --------------------------------------------------------------
##  setting  value
##  version  R version 3.2.2 (2015-08-14)
##  system   x86_64, linux-gnu
##  ui       X11
##  language en_US:en
##  collate  en_US.UTF-8
##  tz       <NA>
## Packages ------------------------------------------------------------------
##  package   * version date       source
##  ape       * 3.3     2015-05-29 CRAN (R 3.2.1)
##  curl        0.9.2   2015-08-08 CRAN (R 3.2.2)
##  devtools    1.8.0   2015-05-09 CRAN (R 3.2.2)
##  digest      0.6.8   2014-12-31 CRAN (R 3.1.2)
##  evaluate    0.7     2015-04-21 CRAN (R 3.2.0)
##  formatR     1.2     2015-04-21 CRAN (R 3.2.0)
##  git2r       0.10.1  2015-05-07 CRAN (R 3.2.0)
##  htmltools   0.2.6   2014-09-08 CRAN (R 3.1.2)
##  knitr       1.10.5  2015-05-06 CRAN (R 3.2.0)
##  lattice     0.20-33 2015-07-14 CRAN (R 3.2.2)
##  magrittr    1.5     2014-11-22 CRAN (R 3.2.0)
##  memoise     0.2.1   2014-04-22 CRAN (R 3.1.1)
##  nlme        3.1-122 2015-08-19 CRAN (R 3.2.2)
##  Rcpp        0.12.0  2015-07-25 CRAN (R 3.2.2)
##  rmarkdown   0.7     2015-06-13 CRAN (R 3.2.1)
##  rversions   1.0.2   2015-07-13 CRAN (R 3.2.2)
##  stringi     0.5-5   2015-06-29 CRAN (R 3.2.1)
##  stringr     1.0.0   2015-04-30 CRAN (R 3.2.0)
##  xml2        0.1.1   2015-06-02 CRAN (R 3.2.1)
##  yaml        2.1.13  2014-06-12 CRAN (R 3.2.0)
# run ?"::" if you want.
detach("package:ape")
Detaching a package does not unload it. To unload it as well, use:
unloadNamespace("ape")
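The difference between detaching and unloading can be checked directly. This sketch uses tools, a package shipped with R, rather than ape so that it runs without any installation; still_attached and still_loaded are names chosen for the example.

```r
library(tools)                       # attach a package shipped with R
"package:tools" %in% search()        # TRUE: it is on the search path

detach("package:tools")
still_attached <- "package:tools" %in% search()
still_loaded   <- "tools" %in% loadedNamespaces()
still_attached   # FALSE: no longer on the search path
still_loaded     # TRUE: the namespace is still in memory
```

unloadNamespace() only succeeds when no other loaded namespace imports the package, which is why it is not called on tools here.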
We are probably going to need the following packages: "ape", "reshape2", "dplyr", "lattice", "ggplot2", "VennDiagram".
install.packages(c("ape", "reshape2", "dplyr", "lattice", "ggplot2", "VennDiagram"))
source("http://bioconductor.org/biocLite.R") # fetch and execute code of the function:
biocLite("Biostrings")
library()
Please, let us know if anything weird happened.
Clearly, one of R's distinctive features is that documentation resources, in broad terms, are extremely abundant:
help("name") or ?name: Beware, help() searches only in loaded packages.
help.start(): starts the HTML version of help()
?summary
help.search("topic") or ??topic
??"\\{"
??DNA
?paste
?iris
example("paste") # Run the code in the "examples" section of a function doc
demo("graphics") # some packages offer a demo of their functionalities!
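When you only remember part of a function name, a few helpers from the utils package complement help() and help.search(). The sketch below is one possible way to use them; the variable name readers is ours.

```r
# List objects whose name matches a regular expression
readers <- apropos("^read\\.")
head(readers)

# Show the arguments of a function without opening its help page
args(paste)

# Find which attached package provides a function
find("read.csv")
```

Like help(), apropos() and find() only look at what is currently on the search path.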
Cheat sheets are very convenient especially when you are not familiar with a language or a package:
https://cran.r-project.org/doc/contrib/Short-refcard.pdf
If you can open it, find functions that may help you find help…
In English:
Hands-On Programming with R http://shop.oreilly.com/product/0636920028574.do
The Art of R Programming https://www.nostarch.com/artofr.htm earlier free version at http://heather.cs.ucdavis.edu/~matloff/132/NSPpart.pdf
Advanced R http://adv-r.had.co.nz/
Data Manipulation with R http://www.springer.com/us/book/9780387747309
An introduction to R http://cran.stat.auckland.ac.nz/doc/manuals/R-intro.pdf
R Bootcamp http://jaredknowles.com/r-bootcamp/
STATS 782 https://www.stat.auckland.ac.nz/~dscott/782/
R and Bioconductor by Thomas Girke http://manuals.bioinformatics.ucr.edu/
In French:
R pour les débutants by E. Paradis https://cran.r-project.org/doc/contrib/Paradis-rdebuts_fr.pdf
Training slides by E. Paradis http://ape-package.ird.fr/ep/teaching.html
Enseignements de Statistique en Biologie http://pbil.univ-lyon1.fr/R/
Teaching materials by S. Déjean http://perso.math.univ-toulouse.fr/dejean/formation/
Courses by Gilles Hunault http://www.info.univ-angers.fr/pub/gh/
R-help. The ‘main’ R mailing list, for discussion about problems and solutions using R, announcements about the development of R and the availability of new code.
sessionInfo() (very useful for diagnostics)
print(sessionInfo(), locale = FALSE)
## R version 3.2.2 (2015-08-14)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.3 LTS
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
## [1] ape_3.3
##
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.0     lattice_0.20-33 digest_0.6.8    grid_3.2.2
##  [5] nlme_3.1-122    git2r_0.10.1    formatR_1.2     magrittr_1.5
##  [9] evaluate_0.7    stringi_0.5-5   curl_0.9.2      rstudioapi_0.3.1
## [13] xml2_0.1.1      rmarkdown_0.7   devtools_1.8.0  tools_3.2.2
## [17] stringr_1.0.0   yaml_2.1.13     rversions_1.0.2 memoise_0.2.1
## [21] htmltools_0.2.6 knitr_1.10.5
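When you post to a mailing list, you sometimes need a single piece of that information rather than the full printout. A minimal sketch, with info and r_version as illustrative names:

```r
info <- sessionInfo()

info$R.version$version.string   # same text as R.version.string
names(info$otherPkgs)           # attached non-base packages, NULL if none
packageVersion("stats")         # version of one specific package

# R version as a plain character string
r_version <- as.character(getRversion())
r_version
```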
Task views are a great place to start when you want to do something but do not know what tool to use, since you get a fairly comprehensive overview of what’s available on a topic. Let's see for example what is available for "High-Performance and Parallel Computing with R".
Whenever available, READ the package vignette, which is a practical and concise guide to your package that illustrates its key functionalities.
Go directly to the package page on CRAN or use the functions browseVignettes("packagename") and vignette(x).
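The vignette() function can also list what a package offers before you open anything. The sketch below uses grid, which ships with R, on the assumption that its vignettes were installed with your R build; grid_vignettes is a name chosen for the example.

```r
# List the vignettes of one package (here "grid", shipped with R)
grid_vignettes <- vignette(package = "grid")
grid_vignettes$results[, c("Item", "Title")]

# In an interactive session, open one of them by name, for example:
# vignette("grid", package = "grid")
```

If the table comes back empty, the documentation was probably not installed with R on that machine; the CRAN package page remains available in that case.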
For people interested in high-throughput genomic data, take a look at the Bioconductor website. It provides workflows which are step-by-step guides to certain types of analysis. Bioconductor vignettes are of excellent quality, the first tutorial to try
Tips for efficiently writing understandable and reusable code
No matter what you do, from a quick t-test on your latest experimental data to a comprehensive analysis of RNA-seq data from a consortium of labs, save your code for later!
R script files are just plain text files with an .R extension, that is it!
Use source() to execute the code from R scripts and load your functions in your session.
If you (or someone else) want to be able to quickly grasp what you wrote a few weeks after you actually wrote it, there are a few tips you should follow:
add comments, # even if you think you will never read it again
use descriptive and judicious variable names (e.g. samplingLocations rather than x), this is a difficult art.
use indentation to reflect code structure (in if constructs, function definitions, …)
use a consistent formatting style (e.g. DeletedObservations or delete_observations but not both)
keep nesting to a strict minimum (f(g(h(x)))): a hard balance between compactness and wordiness
…
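To make the tips above concrete, here is a small sketch that contrasts a nested one-liner with a commented, indented version of the same computation. The data and all names (samplingDepths, meanRootDepth, …) are hypothetical.

```r
# Hypothetical data: sampling depths in metres, with one missing value
samplingDepths <- c(12.5, 30.1, NA, 8.4)

# Compact but hard to read: three nested calls
round(mean(sqrt(samplingDepths), na.rm = TRUE), 2)

# Wordier but easier to follow and to debug
meanRootDepth <- function(depths, digits = 2) {
  # Square-root transform, then average while ignoring missing values
  rootDepths <- sqrt(depths)
  averageRoot <- mean(rootDepths, na.rm = TRUE)
  round(averageRoot, digits)
}

meanRootDepth(samplingDepths)
```

Both forms return the same value; the second one gives each intermediate step a name you can inspect when something goes wrong.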
For more detailed, R-oriented advice, please try to browse one of these guides for tomorrow:
Let's first fetch the zip archive and extract it in a directory named "Rtrainning_201509" in your user home.
Type
history()
Save the commands of your session history into a file with:
savehistory(file = "Rtrainning_20150907.R")
Find out where this file has been created by executing
getwd()
With the file explorer, make sure it is there, move it into the RTrainning folder and open it with a text editor to look at its content.
Run the code contained in the script file sourceMe.R
source(file = "sourceMe.R", echo = FALSE)
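If you want to see what source() does without relying on the course archive, the following self-contained sketch writes a tiny script to a temporary file and runs it. The script content and the names script_path, square and result are invented for the illustration.

```r
# Write a two-line script to a temporary file
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "square <- function(x) x^2",
  "result <- square(4)"
), script_path)

# Execute the script: its objects are now available in the session
source(script_path, echo = FALSE)
result        # 16
square(3)     # 9
```

With echo = TRUE, source() also prints each command before evaluating it, which helps when tracking down the line that fails.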
RStudio
Let's start RStudio on your machines!
For the complete list go there
Description | Windows & Linux | Mac |
---|---|---|
Run current line/selection | Ctrl+Enter | Command+Enter |
Comment/uncomment current line/selection | Ctrl+Shift+C | Command+Shift+C |
Insert assignment operator | Alt+- | Option+- |
Reformat Selection | Ctrl+Shift+A | Command+Shift+A |
Find and Replace | Ctrl+F | Command+F |
Attempt completion | Tab or Ctrl+Space | Tab or Command+Space |
Show help for function at cursor | F1 | F1 |
Show source code for function | F2 | F2 |
Open your "Rtrainning_20150907.R" file.
Select some commands (preferentially not the package installation ones) and run them (use keyboard)
Go to the history tab and run these commands again by clicking 'To console'.
Find help on the function library() (use keyboard)
Other "Files", "Plot", "Packages", "Viewer" tabs
head(iris)
USArrests in the package:datasets environment. What is it?
installed.packages()
ls()
str(iris)
ls.str()
rm(A)
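What RStudio's Environment tab displays can also be inspected from the console. A minimal sketch, assuming a fresh session; the objects A and B and the name a_exists are created only for the illustration.

```r
A <- 1
B <- "hello"

ls()        # names of the objects in the workspace
ls.str()    # same, with a short description of each object

rm(A)       # remove A from the workspace
a_exists <- exists("A", inherits = FALSE)
a_exists    # FALSE: A is gone, B is still there
```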
Source Prof. Daniel Wegmann
Assign 6.7 and −56.3 to variables a and b, respectively
Compute (2*a)/b+a*b and assign the results to variable x
Use help.search() to find out how to compute the square root of variables and compute the square root of a and b
Compute log(x) and assign the result to variable y
a, b and x exist, but not y
75 and 0.1 to the variables u and v, respectively, and to print(u, v).
Copy the line below and paste it in the code editor and execute it.
anticonstitucionalissimamente <- lm(iris)
Write the line below by typing only 1-3 letters (not counting $ signs). Tip: use tab…
anticonstitucionalissimamente$model$Petal.Width
head(iris) in your script and execute the code.
iris dataset in the datasets environment.