As in Visual Studio, you can group several functions and the data they operate on into a project, which is a folder in the filesystem containing several special files. When a project is opened, those special files are loaded automatically, creating an environment close to the one you had when you last closed the project.
RStudio projects are associated with R working directories. You can create an RStudio project:
- In a brand new directory
- In an existing directory where you already have R code and data
- By cloning a version control (Git or Subversion) repository
To create a new project use the Create Project command (available on the Projects menu and on the global toolbar):
When a new project is created RStudio:
- Creates a project file (with the extension .Rproj) in the project directory. This file contains various project options and can also be used as a shortcut for opening the project directly from the filesystem.
- Creates a hidden directory (named .Rproj.user) where project-specific temporary files (e.g. auto-saved source documents, window-state, etc.) are stored. This directory is also automatically added to .Rbuildignore, .gitignore, etc. if required.
- Loads the project into RStudio and displays its name in the Projects toolbar (which is located on the far right side of the main toolbar).
There are several ways to open a project:
- Using the Open Project command (available from both the Projects menu and the Projects toolbar) to browse for and select an existing project file (e.g. MyProject.Rproj).
- Selecting a project from the list of most recently opened projects (also available from both the Projects menu and toolbar).
- Double-clicking on the project file within the system shell (e.g. Windows Explorer, OSX Finder, etc.).
When a project is opened within RStudio the following actions are taken:
- A new R session (process) is started
- The .Rprofile file in the project's main directory (if any) is sourced by R
- The .RData file in the project's main directory is loaded (if project options indicate that it should be loaded).
- The .Rhistory file in the project's main directory is loaded into the RStudio History pane (and used for Console Up/Down arrow command history).
- The current working directory is set to the project directory.
- Previously edited source documents are restored into editor tabs
- Other RStudio settings (e.g. active tabs, splitter positions, etc.) are restored to where they were the last time the project was closed.
To run a script pass a string with its name to the source function.
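As a minimal sketch of the source call mentioned above (the script contents and the variable names are made up for illustration), the following writes a two-line script to a temporary file and runs it:

```r
# Hypothetical script: written to a temporary file so the example is self-contained
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1:5", "total <- sum(x)"), script)

# source() takes the file name as a string and evaluates the expressions in the file
source(script)

total  # the variable created by the script is now available in the session
```

Inside a project the working directory is the project directory, so a script stored there can usually be sourced by its bare file name.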
When you are within a project and choose to either Quit, close the project, or open another project the following actions are taken:
- .RData and/or .Rhistory are written to the project directory (if current options indicate they should be)
- The list of open source documents is saved (so it can be restored next time the project is opened)
- Other RStudio settings (as described above) are saved.
- The R session is terminated.
You can work with more than one RStudio project at a time by simply opening each project in its own instance of RStudio. There are two ways to accomplish this:
- Use the Open Project in New Window command located on the Project menu.
- Opening multiple project files via the system shell (i.e. double-clicking on the project file).
There are several options that can be set on a per-project basis to customize the behavior of RStudio. You can edit these options using the Project Options command on the Project menu:
R command line provides access to help via ?[function] or ??[topic]
Sites and free books
Note: An excellent guide to books and websites related to R is "60+ R resources to improve your data skills" (Computerworld). Please visit it.
The Stack Overflow site contains a large, cookbook-style collection of material on R.
Free books (adapted from The R Programming Language - Free Computer, Programming, Mathematics, Technical Books, Lecture Notes and Tutorials)
- An Introduction to R: A Programming Environment for Data Analysis and Graphics ©2009-2012 (William N Venables, David M Smith) covers version 3.2.0.
- The Art of R Programming
- Advanced R by Hadley Wickham (he also teaches R courses)
- Introduction to Probability and Statistics Using R (G. Jay Kerns)
- Using R for Data Analysis and Graphics (J H Maindonald)
- Using R for Introductory Statistics ©2004-2005 (John Verzani)
- Introduction to Statistical Thinking (With R, Without Calculus)
- Statistics with R
- Learning Statistics with R (Computational Cognitive Science Lab), version 0.5
- The R Inferno ©2011-2012 (Patrick Burns)
R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both on-line in a number of formats and in hard copy.
See also R Bookshelf
Packages and CRAN
The power of R rests heavily on a large set of available packages which extend the core language. R's package structure resembles Perl's. As with Perl's CPAN, there is a main repository, called the Comprehensive R Archive Network (CRAN). It contains thousands of packages. A core set of packages is included with the installation of R. The set of packages loaded on startup by default can be displayed using the command:
> getOption("defaultPackages")
All in all there are around 6K additional packages for R and 120,000 functions (as of June 2014) available at CRAN and other repositories. In other words R, like Perl, is suffering from a package glut. The following discussion would resonate with any long-term Perl user (Does R have too many packages?, R-bloggers):
1 Lack of long term maintenance of packages. This has been a challenge that I have faced when using R packages which I believe will provide the solution to my problem but these packages frequently are not maintained at the same rate as the R base system.
And how could they be? The base system is updated several times a year while there are thousands of packages. To update each of those packages for minor changes in the base system seems foolish and excessive. However, as the current structure of R stands, to fail to update these packages results in packages which previously worked, no longer functioning. This is a problem I have experienced and is frankly very annoying.
One solution might be to limit the number of packages to those which have a sufficient developer base to ensure long term maintenance. However, this would likely stifle the creativity and productivity of the wide R developer base.
Another solution is to limit the number of base system updates in order to limit the likelihood that a package will become outdated and need updating.
A third option, which I believe is the most attractive, is to allow code to specify what version of R it is stable on and for R to act for the commands in that package as though it is running on a previous version of R. This idea is inspired by how Stata handles user written commands. These commands simply specify version number for which the command was written under. No matter what later version of Stata is used, the command should still work.
I understand that such an implementation would require additional work from the R core team for each subsequent update. However, such an investment may be worth it in the long run as it would decrease the maintenance in response to R base updates.
2 The super abundance of R packages. The concern is that there are so many packages that users might find it difficult to wade through them in order to find the right package. I don't really see this as a problem. If someone wanted to learn to use all R packages then of course this task would be nearly impossible. However, with me as I believe with most people, I learn to use new functions within packages to solve specific problems. I don't really care how many packages there are out there. All I care is that when I ask a question on google or StackOverflow about how to do x or y, someone can point me to the package and command combination necessary to accomplish the task.
3 The inconsistent quality of individual packages. It is not always clear if user written packages are really doing what they claim to be doing. I know personally I am constantly on the look out for checks to make sure my code is doing what I think it is doing, yet still I consistently find myself making small errors which only show up through painstaking experimentation and debugging.
CRAN has some automated procedures in which packages are tested to ensure that all of their functions work without errors under normal circumstances. However, as far as I know, there are no automated tests to ensure the commands are not silently giving errors by doing the wrong thing. These kind of error controls are entirely left up to the authors and users. This concern comes to mind because one of my friends recently was running two different Bayesian estimation packages which were supposed to produce identical results yet each returned distinctly different results with one set having significant estimates and the other not. If he had not thought to try two different packages then he would never have thought of the potential errors inherent in the package authorship.
A solution to inconsistent package quality controls may be to have a multitiered package release structure in which packages are first released in "beta form" but require an independent reviewing group to check functionality and write up reports before attaining "full" release status. Such an independent package review structure may be accomplished by developing an open access R-journal specifically geared towards the review, release, and revision of R packages.
4 The lack of hierarchical dependencies. This is a major point mentioned in Kurt Hornik's paper. He looks at package dependencies and found that the majority of packages have no dependencies upon other packages. This indicates that while there are many packages out there, most packages are not building on the work of other packages. This produces the unfortunate situation in which it seems that many package developers are recreating the work of other package developers. I am not really sure if there is anything that can be done about this or if it really is an issue.
It does not bother me that many users recode similar or duplicate code because I think the coding of such code helps the user better understand the R system, the user's problem, and the user's solution. There is however the issue that the more times a problem is coded, the more likely someone will code an error. This brings us back to point 3 in which errors must be rigorously pursued and ruthlessly exterminated through use of an independent error detection system.
5 Insufficient Meta Package Analysis. A point that Kurt Hornik also raises is that there are a lot of R packages out there but not a lot of information about how those packages are being used. In order to further this goal, it might be useful to build into future releases of R the option to report usage statistics on which packages and functions are being used in combination with which other packages. Package developers might find such information useful when evaluating what functions to update.
R packages are installed into libraries, which are directories in the file system containing a subdirectory for each package installed there.
R comes with a single library, R_HOME/library which is the value of the R object .Library containing the standard and recommended packages.
Both sites and users can create others and make use of them (or not) in an R session. At the lowest level .libPaths() can be used to add paths to the collection of libraries or to report the current collection.
R will automatically make use of a site-specific library R_HOME/site-library if this exists (it does not in a vanilla R installation). This location can be overridden by setting .Library.site in R_HOME/etc/Rprofile.site, or (not recommended) by setting the environment variable R_LIBS_SITE. Like .Library, the site libraries are always included into .libPaths().
Users can have one or more libraries, normally specified by the environment variable R_LIBS_USER. This has a default value (to see it, use Sys.getenv("R_LIBS_USER") within an R session), but that is only used if the corresponding directory actually exists (which by default it will not).
Both R_LIBS_USER and R_LIBS_SITE can specify multiple library paths, separated by colons (semicolons on Windows).
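The library lookup described above can be inspected from within a session. The following is a small sketch; the extra library directory is a hypothetical one created under the session's temporary directory purely for illustration:

```r
# Report the libraries R currently searches, in order
.libPaths()

# .Library holds the path of the library shipped with R itself
.Library

# Default location for a per-user library (the directory may not exist yet)
Sys.getenv("R_LIBS_USER")

# Hypothetical extra library: create a directory and put it first in the search path
extra_lib <- file.path(tempdir(), "extra-library")
dir.create(extra_lib, showWarnings = FALSE)
.libPaths(c(extra_lib, .libPaths()))
.libPaths()
```

Only directories that actually exist are kept by .libPaths(), which is why the directory is created first.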
Another strength of R is static graphics, which can be produced using the ggplot2 package. It can produce publication-quality graphs, including mathematical symbols. Dynamic and interactive graphics are available through additional packages.
Packages allow specialized statistical techniques, graphic output (ggplot2), import/export capabilities, reporting tools (knitr, Sweave), etc. Packages are developed primarily in R. Sometimes C, C++, Fortran are used.
Other R package repositories include R-forge and Bioconductor
- R-Forge is the central platform for the collaborative development of R packages, R-related software, and projects. R-Forge hosts many unpublished beta packages, and development versions of CRAN packages.
- Bioconductor project provides R packages for the analysis of genomic data, such as Affymetrix and cDNA microarray object-oriented data-handling and analysis tools, and has started to provide tools for analysis of data from next-generation high-throughput sequencing methods.
There is also a community site for rating and reviewing all CRAN packages called Crantastic.
Language
Adapted from Programming in R Thomas Girke, UC Riverside
R is a C-style language that does not do a good job of enhancing C syntax and avoiding its shortcomings. It is stuck in a 1990s mentality and, in comparison with Perl, does not extend the syntax much.
It is a dynamically typed language with first-class functions, closures, objects and vector operations; parameters are passed by value. It has special values for variables such as NULL and NA. Everything is nullable.
R is an interpreted language; users typically access it through a command-line interpreter. If a user types "2+2" at the R command prompt and presses Enter, the computer replies with "4", as shown below:
> 2+2
[1] 4
Variable names in R can contain the dot character, which plays a role similar to that of the underscore (... is used to indicate a variable number of function arguments). R uses $ in a manner analogous to the way other languages use dot.
R has several one-letter names that are best treated as reserved: c, q, s, t, C, D, F, I, and T. Strictly speaking they are not reserved words; most are built-in functions or constants (for example c(), q(), t(), T and F), so reusing them for your own objects invites confusion.
R's data structures include vectors, lists, matrices, arrays and data frames (lists of vectors; similar to tables in a relational database). There is no scalar type in R: a scalar is represented as a vector of length one.
Vectors are one-dimensional collections, most frequently used to store one sort of data (numbers, text, ...). Indices in R start at 1, not at 0. In this way R resembles FORTRAN. x[1] is the first element of vector x. A vector is an ordered collection of elements with no other structure. The length of a vector is the number of elements. Operations are applied componentwise. For example, given two vectors x and y of equal length, x*y would be the vector whose nth component is the product of the nth components of x and y. Also, log(x) would be the vector whose nth component is the logarithm of the nth component of x. Vectors are created using the c function. For example, p <- c(2,3,5,7) sets p to the vector containing the first four prime numbers.
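The vector behaviour described above can be tried directly at the prompt. This is a short sketch; the second vector x and the value s are made up for illustration:

```r
p <- c(2, 3, 5, 7)   # the first four primes
x <- c(1, 2, 3, 4)   # illustrative second vector of equal length

p[1]        # first element; indexing starts at 1
length(p)   # number of elements
p * x       # componentwise product
log(p)      # componentwise natural logarithm

# A "scalar" is just a vector of length one
s <- 42
length(s)
is.vector(s)
```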
R supports procedural programming with functions. A generic function acts differently depending on the type of arguments passed to it. In other words, the generic function dispatches to the function (method) specific to that type of object. There is also an OO system for R (actually more than one, S3 and S4 among them).
R has a generic print() function that can print almost every type of object in R with a simple
print(objectname)
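Generic dispatch of the kind print() relies on can be sketched with a user-defined generic. The function name describe and its methods are hypothetical and exist only for this illustration; the mechanism shown is S3 dispatch through UseMethod:

```r
# Hypothetical generic: UseMethod() picks a method based on the class of x
describe <- function(x, ...) UseMethod("describe")

# Methods are ordinary functions named generic.class
describe.numeric   <- function(x, ...) "a numeric vector"
describe.character <- function(x, ...) "a character vector"
describe.default   <- function(x, ...) "something else"   # fallback method

describe(1.5)    # dispatches to describe.numeric
describe("abc")  # dispatches to describe.character
describe(TRUE)   # no logical method, so describe.default is used
```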
Overview of syntax.
- Conditional Executions
- Comparison Operators: ==, !=, >, <, >=, <=
- Logical Operators: & , | , !
if(cond1 == TRUE) { cmd1 } else { cmd2 }
If statements operate on length-one logical vectors.
ifelse(test, true_value, false_value)
Ifelse statements operate on vectors of variable length.
x <- 1:10 # Creates sample data
ifelse(x<5 | x>8, x, 0)
[1] 1 2 3 4 0 0 0 0 9 10
- Loops The most commonly used loop structures in R are for, while and apply loops. Less common are repeat loops. The break statement is used to break out of loops, and next halts the processing of the current iteration and advances the looping index.
- For Loop For loops are controlled by a looping vector. In every iteration of the loop one value in the looping vector is assigned to a variable that can be used in the statements of the body of the loop. Usually, the number of loop iterations is defined by the number of values stored in the looping vector and they are processed in the same order as they are stored in the looping vector.
for(variable in looping vector) {
statements
}
Example
mydf <- iris
myve <- NULL # Creates empty storage container
for(i in seq(along=mydf[,1])) {
myve <- c(myve, mean(as.numeric(mydf[i, 1:3]))) # Note: inject approach is much faster than append with 'c'. See below for details.
}
myve
[1] 3.333333 3.100000 3.066667 3.066667 3.333333 3.666667 3.133333 3.300000
[9] 2.900000 3.166667 3.533333 3.266667 3.066667 2.800000 3.666667 3.866667
x <- 1:10
z <- NULL
for(i in seq(along=x)) {
if(x[i] < 5) {
z <- c(z, x[i] - 1)
} else {
z <- c(z, x[i] / x[i])
}
}
z
[1] 0 1 2 3 1 1 1 1 1 1
Example: stop on condition and print error message
x <- 1:10
z <- NULL
for(i in seq(along=x)) {
if (x[i]<5) {
z <- c(z,x[i]-1)
} else {
stop("values need to be <5")
}
}
Error: values need to be <5
z
[1] 0 1 2 3
- While Loop Similar to for loop, but the iterations are controlled by a conditional statement.
while(condition) statements
Example
z <- 0
while(z < 5) {
z <- z + 2
print(z)
}
[1] 2
[1] 4
[1] 6
apply(X, MARGIN, FUN, ARGs)
Apply Loop works for two-dimensional data sets. X: array, matrix or data.frame; MARGIN: 1 for rows, 2 for columns, c(1,2) for both; FUN: one or more functions; ARGs: possible arguments for function
Example
## Example for applying predefined mean function
apply(iris[,1:3], 1, mean)
[1] 3.333333 3.100000 3.066667 3.066667 3.333333 3.666667 3.133333 3.300000
...
## With custom function
x <- 1:10
test <- function(x) { # Defines some custom function
if(x < 5) {
x-1
} else {
x / x
}
}
apply(as.matrix(x), 1, test) # Returns same result as previous for loop
[1] 0 1 2 3 1 1 1 1 1 1
## Same as above but with a single line of code
apply(as.matrix(x), 1, function(x) { if (x<5) { x-1 } else { x/x } })
[1] 0 1 2 3 1 1 1 1 1 1
tapply(vector, factor, FUN)
Applies a function to array categories of variable lengths (ragged array). Grouping is defined by factor. For example:
## Computes mean values of vector aggregates defined by factor (column 4 is Petal.Width)
tapply(as.vector(iris[,4]), factor(iris[,5]), mean)
setosa versicolor virginica
0.246 1.326 2.026
## The aggregate function provides related utilities
aggregate(iris[,1:4], list(iris$Species), mean)
Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
1 setosa 5.006 3.428 1.462 0.246
2 versicolor 5.936 2.770 4.260 1.326
3 virginica 6.588 2.974 5.552 2.026
- lapply(X, FUN) and sapply(X, FUN)
Apply a function to vector or list objects. The function lapply returns a list, while sapply attempts to return the simplest data object, such as vector or matrix instead of list.
## Creates a sample list
mylist <- as.list(iris[1:3,1:3])
mylist
$Sepal.Length
[1] 5.1 4.9 4.7
$Sepal.Width
[1] 3.5 3.0 3.2
$Petal.Length
[1] 1.4 1.4 1.3
## Compute sum of each list component and return result as list
lapply(mylist, sum)
$Sepal.Length
[1] 14.7
$Sepal.Width
[1] 9.7
$Petal.Length
[1] 4.1
## Compute sum of each list component and return result as vector
sapply(mylist, sum)
Sepal.Length Sepal.Width Petal.Length
14.7 9.7 4.1
repeat
Loop is repeated until a break is specified. This means there needs to be a statement in the loop body that breaks out of the loop.
z <- 0
repeat {
z <- z + 1
print(z)
if (z > 100) break
}
- Improving Speed Performance of Loops
Looping over very large data sets can become slow in R. However, this limitation can be overcome by eliminating certain operations in loops or avoiding loops over the data intensive dimension in an object altogether. The latter can be achieved by performing mainly vector-to-vector or matrix-to-matrix computations, which often run over 100 times faster than the corresponding for() or apply() loops in R. For this purpose, one can make use of the existing speed-optimized R functions (e.g.: rowSums, rowMeans, table, tabulate) or one can design custom functions that avoid expensive R loops by using vector- or matrix-based approaches. Alternatively, one can write programs that will perform all time consuming computations in C.
- Speed comparison of for loops with an append versus an inject step:
myMA <- matrix(rnorm(1000000), 100000, 10, dimnames=list(1:100000, paste("C", 1:10, sep="")))
results <- NULL
system.time(for(i in seq(along=myMA[,1])) results <- c(results, mean(myMA[i,])))
user system elapsed
39.156 6.369 45.559
results <- numeric(length(myMA[,1]))
system.time(for(i in seq(along=myMA[,1])) results[i] <- mean(myMA[i,]))
user system elapsed
1.550 0.005 1.556
- Speed comparison of apply loop versus rowMeans for computing the mean for each row in a large matrix:
system.time(myMAmean <- apply(myMA, 1, mean))
user system elapsed
1.452 0.005 1.456
system.time(myMAmean <- rowMeans(myMA))
user system elapsed
0.005 0.001 0.006
- Speed comparison of apply loop versus vectorized approach for computing the standard deviation of each row:
system.time(myMAsd <- apply(myMA, 1, sd))
user system elapsed
3.707 0.014 3.721
myMAsd[1:4]
1 2 3 4
0.8505795 1.3419460 1.3768646 1.3005428
system.time(myMAsd <- sqrt((rowSums((myMA-rowMeans(myMA))^2)) / (length(myMA[1,])-1)))
user system elapsed
0.020 0.009 0.028
myMAsd[1:4]
1 2 3 4
0.8505795 1.3419460 1.3768646 1.3005428
The vector-based approach in the last step is over 100 times faster than the apply loop (3.721 versus 0.028 seconds elapsed in the timings above).
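That the vectorized formula really computes the same row-wise standard deviation as apply with sd can be checked on a small matrix. This is a sketch only; the matrix size and the seed are arbitrary choices made for this illustration:

```r
set.seed(1)                       # arbitrary seed so the run is reproducible
m <- matrix(rnorm(50), nrow = 5)  # small 5 x 10 test matrix

# Loop-style computation: one sd() call per row
sd_apply <- apply(m, 1, sd)

# Vectorized computation: subtract row means, square, sum per row, divide by n - 1
sd_vector <- sqrt(rowSums((m - rowMeans(m))^2) / (ncol(m) - 1))

all.equal(sd_apply, sd_vector)    # compares the two results up to numerical tolerance
```

The subtraction m - rowMeans(m) works because R recycles the vector of row means down the columns of the matrix, so each element has its own row's mean subtracted.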
- Example for computing the mean for any custom selection of columns without compromising the speed performance:
## In the following the columns are named according to their selection in myList
myList <- tapply(colnames(myMA), c(1,1,1,2,2,2,3,3,4,4), list)
myMAmean <- sapply(myList, function(x) rowMeans(myMA[,x]))
colnames(myMAmean) <- sapply(myList, paste, collapse="_")
myMAmean[1:4,]
C1_C2_C3 C4_C5_C6 C7_C8 C9_C10
1 0.0676799 -0.2860392 0.09651984 -0.7898946
2 -0.6120203 -0.7185961 0.91621371 1.1778427
3 0.2960446 -0.2454476 -1.18768621 0.9019590
4 0.9733695 -0.6242547 0.95078869 -0.7245792
## Alternative to achieve the same result with similar performance, but in a much less elegant way
myselect <- c(1,1,1,2,2,2,3,3,4,4) # The columns are named according to the selection stored in myselect
myList <- tapply(seq(along=myMA[1,]), myselect, function(x) paste("myMA[ ,", x, "]", sep=""))
myList <- sapply(myList, function(x) paste("(", paste(x, collapse=" + "),")/", length(x)))
myMAmean <- sapply(myList, function(x) eval(parse(text=x)))
colnames(myMAmean) <- tapply(colnames(myMA), myselect, paste, collapse="_")
myMAmean[1:4,]
C1_C2_C3 C4_C5_C6 C7_C8 C9_C10
1 0.0676799 -0.2860392 0.09651984 -0.7898946
2 -0.6120203 -0.7185961 0.91621371 1.1778427
3 0.2960446 -0.2454476 -1.18768621 0.9019590
4 0.9733695 -0.6242547 0.95078869 -0.7245792
Functions
Most of the R software can be viewed as a series of R functions.
myfct <- function(arg1, arg2, ...) {
function_body
}
The value returned by a function is the value of the function body, which is usually an unassigned final expression, e.g.: return()
myfct(arg1=..., arg2=...)
Functions are defined by (1) assignment with the keyword function, (2) the declaration of arguments/variables (arg1, arg2, ...) and (3) the definition of operations (function_body) that perform computations on the provided arguments. A function name needs to be assigned to call the function (see below).
Function names can include dots. The usage of names of existing functions should be avoided. It is often useful to provide default values for arguments (e.g.: arg1=1:10). This way they don't need to be provided in a function call. The argument list can also be left empty (myfct <- function() { fct_body }) when a function is expected to always return the same value(s). The argument '...' can be used to allow one function to pass on argument settings to another.
Variables created inside a function exist only for the life time of a function. Thus, they are not accessible outside of the function. To force variables in functions to exist globally, one can use this special assignment operator: '<<-'. If a global variable is used in a function, then the global variable will be masked only within the function.
The actual expressions (commands/operations) are defined in the function body which should be enclosed by braces. The individual commands are separated by semicolons or new lines (preferred).
Functions are called by their name followed by parentheses containing possible argument names. Empty parentheses after the function name will result in an error message when a function requires certain arguments to be provided by the user. The function name alone will print the definition of a function.
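The points above about default values, the '...' argument and the '<<-' operator can be illustrated with a short sketch. All function and variable names here (round_mean, counter, bump_local, bump_global) are hypothetical and chosen only for this example:

```r
# Default value for 'digits'; '...' is handed on unchanged to mean()
round_mean <- function(x, digits = 1, ...) {
  round(mean(x, ...), digits)
}
round_mean(c(1, 2, NA), na.rm = TRUE)  # na.rm travels through '...' to mean()

counter <- 0
bump_local <- function() {
  counter <- counter + 1    # creates a local variable; the outer one is untouched
  counter
}
bump_global <- function() {
  counter <<- counter + 1   # assigns to the variable in the enclosing environment
  counter
}

bump_local()    # returns the local value
counter         # still 0
bump_global()
counter         # now 1
```

The '<<-' operator is convenient for small scripts, but returning a value and assigning it at the call site is usually easier to follow.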
Example: Function basics
myfct <- function(x1, x2=5) {
z1 <- x1/x1
z2 <- x2*x2
myvec <- c(z1, z2)
return(myvec)
}
myfct # prints definition of function
myfct(x1=2, x2=5) # applies function to values 2 and 5
[1] 1 25
myfct(2, 5) # the argument names are not necessary, but then the order of the specified values becomes important
myfct(x1=2) # does the same as before, but the default value '5' is used in this case
Example: Function with optional arguments
myfct2 <- function(x1=5, opt_arg) {
if(missing(opt_arg)) { # 'missing()' is used to test whether a value was specified as an argument
z1 <- 1:10
} else {
z1 <- opt_arg
}
cat("my function returns:", "\n")
return(z1/x1)
}
myfct2(x1=5) # performs calculation on default vector (z1) that is defined in the function body
my function returns:
[1] 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0
myfct2(x1=5, opt_arg=30:20) # a custom vector is used instead when the optional argument (opt_arg) is specified
my function returns:
[1] 6.0 5.8 5.6 5.4 5.2 5.0 4.8 4.6 4.4 4.2 4.0
Control statements for functions: return, warning and stop
- Return. The evaluation flow of a function may be terminated at any stage with the return function. This is often used in combination with conditional evaluations.
- Stop. To stop the action of a function and print an error message, one can use the stop function.
- Warning. To print a warning message in unexpected situations without aborting the evaluation flow of a function, one can use the function warning("...").
myfct <- function(x1) {
if (x1>=0) print(x1) else stop("This function did not finish, because x1 < 0")
warning("Value needs to be > 0")
}
myfct(x1=2)
[1] 2
Warning message:
In myfct(x1 = 2) : Value needs to be > 0
myfct(x1=-2)
Error in myfct(x1 = -2) : This function did not finish, because x1 < 0
Debugging
Several debugging utilities are available for R. The most important utilities are: traceback(), browser(), options(error=recover), options(error=NULL) and debug(). The R Debugging page provides more information.
Regular Expressions
R's regular expression utilities work similarly to those in other languages. To learn how to use them in R, one can consult the main help page on this topic with ?regexp. The following gives a few basic examples.
The grep function can be used for finding patterns in strings, here letter A in vector month.name.
month.name[grep("A", month.name)]
[1] "April" "August"
Example for using regular expressions to substitute a pattern by another one using the sub/gsub function with a back reference. Remember: single escapes '\' need to be double escaped '\\' in R.
gsub("(i.*a)", "xxx_\\1", "virginica", perl = TRUE)
[1] "vxxx_irginica"
Example for split and paste functions
x <- gsub("(a)", "\\1_", month.name[1], perl=TRUE) # performs substitution with back reference which inserts in this example a '_' character
x
[1] "Ja_nua_ry"
strsplit(x, "_") # splits string on inserted character from above
[[1]]
[1] "Ja" "nua" "ry"
paste(rev(unlist(strsplit(x, NULL))), collapse="") # reverses character string by splitting first all characters into vector fields and then collapsing them with paste
[1] "yr_aun_aJ"
Example for importing specific lines in a file with a regular expression. The following example demonstrates the retrieval of specific lines from an external file with a regular expression. First, an external file is created with the cat function, all lines of this file are imported into a vector with readLines, the specific elements (lines) are then retrieved with the grep function, and the resulting lines are split into vector fields with strsplit.
cat(month.name, file="zzz.txt", sep="\n")
x <- readLines("zzz.txt")
x <- x[c(grep("^J", as.character(x), perl = TRUE))]
t(as.data.frame(strsplit(x, "u")))
[,1] [,2]
c..Jan....ary.. "Jan" "ary"
c..J....ne.. "J" "ne"
c..J....ly.. "J" "ly"
Interpreting Character String as Expression
Example
mylist <- ls() # generates vector of object names in session
mylist[1] # prints name of 1st entry in vector but does not execute it as expression that returns values of 1st object
get(mylist[1]) # uses 1st entry name in vector and executes it as expression
eval(parse(text=mylist[1])) # alternative approach to obtain similar result
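Because the output of the example above depends on what happens to be in the session, a self-contained variant may be easier to follow. The object radius and the string nm are made up for this sketch:

```r
radius <- 2
nm <- "radius"                        # a character string holding an object name

nm                                    # just the string "radius"
get(nm)                               # looks up the object with that name
eval(parse(text = nm))                # parses the string as code and evaluates it
eval(parse(text = "pi * radius^2"))   # works for whole expressions, not only names
```

get() only resolves object names, whereas eval(parse(text=...)) runs arbitrary code held in a string, so the latter should not be fed untrusted input.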
Time, Date and Sleep
Example
system.time(ls()) # returns CPU (and other) times that an expression used, here ls()
user system elapsed
0 0 0
date() # returns the current system date and time
[1] "Wed Dec 11 15:31:17 2012"
Sys.sleep(1) # pause execution of R expressions for a given number of seconds (e.g. in loop)
Calling External Software with System Command
The system command allows calling any command-line software from within R on Linux, UNIX and OSX systems.
system("...") # provide under '...' command to run external software e.g. Perl, Python, C++ programs
Related utilities on Windows operating systems
x <- shell("dir", intern=TRUE) # lists the current working directory and assigns the output to x
shell.exec("C:/Documents and Settings/Administrator/Desktop/my_file.txt") # opens file with associated program
Batch import and export of many files.
In the following example all file names ending with *.txt in the current directory are first assigned to a vector (the '$' sign is used to anchor the match to the end of a string). Second, the files are imported one-by-one using a for loop where the original names are assigned to the generated data frames with the assign function. Consult help with ?read.table to understand arguments row.names=1 and comment.char = "A". Third, the data frames are exported using their names for file naming and appending the extension *.out.
files <- list.files(pattern="\\.txt$") # the dot is escaped so that it matches a literal '.'
for(i in files) {
x <- read.table(i, header=TRUE, comment.char = "A", sep="\t")
assign(i, x)
print(i)
write.table(x, paste(i, c(".out"), sep=""), quote=FALSE, sep="\t", col.names = NA)
}
Running R Programs
Executing an R script from the R console
source("my_script.R")
Syntax for running R programs from the command-line. Requires in first line of my_script.R the following statement: #!/usr/bin/env Rscript
$ Rscript my_script.R # or just ./my_script.R after making file executable with 'chmod +x my_script.R'
Alternatively, one can use the following syntax to run R programs in BATCH mode from the command-line.
$ R CMD BATCH [options] my_script.R [outfile]
The output file lists the commands from the script file and their outputs. If no outfile is specified, its name is derived from the input file name, with the extension .Rout. To stop all the usual R command line information from being written to the outfile, add this as first line to my_script.R file: options(echo=FALSE). If the command is run like this R CMD BATCH --no-save my_script.R, then nothing will be saved in the .RData file, which can often get very large. More on this can be found on the help pages: $ R CMD BATCH --help or ?BATCH.
Another alternative for running R programs as silently as possible.
$ R --slave < my_infile > my_outfile
Argument --slave makes R run as 'quietly' as possible (R 4.0.0 and later name this option --no-echo).
Passing Command-Line Arguments to R Programs
Create an R script, here named test.R, like this one:
myarg <- commandArgs(trailingOnly=TRUE) # keep only the arguments given after the script name
print(iris[1:as.integer(myarg[1]), ]) # arguments arrive as character strings, so convert before indexing
Then run it from the command-line like this:
$ Rscript test.R 10
In the given example the number 10 is passed on from the command-line as an argument to the R script which is used to return to STDOUT the first 10 rows of the iris sample data. If several arguments are provided, each one becomes a separate element of the character vector returned by commandArgs(trailingOnly=TRUE); convert them with as.integer or as.numeric as needed.
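As a rough sketch of handling several arguments at once, the helper below validates and converts whatever follows the script name. The function name parse_row_counts and the usage message are hypothetical, introduced only for illustration; passing the arguments in directly simulates a call such as Rscript test.R 3 5.

```r
# Hypothetical helper: convert command-line arguments into positive row counts.
# By default 'args' is whatever Rscript passes after the script name.
parse_row_counts <- function(args = commandArgs(trailingOnly = TRUE)) {
  counts <- suppressWarnings(as.integer(args))  # non-numeric strings become NA
  if (length(counts) == 0 || anyNA(counts) || any(counts < 1)) {
    stop("usage: Rscript test.R <positive integer> [<positive integer> ...]")
  }
  counts
}

# Simulate `Rscript test.R 3 5` by handing the arguments over directly.
counts <- parse_row_counts(c("3", "5"))
for (n in counts) {
  print(head(iris, n))  # first n rows of the built-in iris data
}
```

Keeping the parsing in a function makes it possible to exercise the logic from an interactive session without launching a separate Rscript process.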
- For Loop: For loops are controlled by a looping vector. In every iteration of the loop one value in the looping vector is assigned to a variable that can be used in the statements of the body of the loop. Usually, the number of loop iterations is defined by the number of values stored in the looping vector and they are processed in the same order as they are stored in the looping vector.
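A minimal illustration of the looping-vector idea follows; the variable names are made up for this sketch. The first loop iterates over the values themselves, the second over index positions.

```r
# The loop body runs once per element of the looping vector, in stored order.
looping_vector <- c(10, 20, 30)
running_total <- 0
for (value in looping_vector) {
  running_total <- running_total + value
  cat("value =", value, " running total =", running_total, "\n")
}

# Looping over index positions; seq_along() also behaves for empty vectors.
squares <- numeric(length(looping_vector))
for (i in seq_along(looping_vector)) {
  squares[i] <- looping_vector[i]^2
}
print(squares)
```

Using seq_along(x) rather than 1:length(x) avoids the surprise of iterating over c(1, 0) when x has length zero.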
Statistical features
R and its libraries implement a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others. R is easily extensible through functions and extensions, and the R community is noted for its active contributions in terms of packages. There are some important differences between R and S, but much code written for S runs unaltered under R. Many of R's standard functions are written in R itself, which makes it easy for users to follow the algorithmic choices made.
R can link to and call C, C++, and Fortran code at run time. Advanced users can write their own C, C++, Java, .NET or Python code to manipulate R objects directly.
Examples
Readers wishing to get a feel for R can start with the introductory session given in A sample session.
From Wikipedia:
Example 1
The following examples illustrate the basic syntax of the language and use of the command-line interface.
In R, the widely preferred assignment operator is an arrow made from two characters "<-", although "=" can be used instead.[26]
> x <- c(1,2,3,4,5,6)   # Create ordered collection (vector)
> y <- x^2              # Square the elements of x
> print(y)              # print (vector) y
[1]  1  4  9 16 25 36
> mean(y)               # Calculate average (arithmetic mean) of (vector) y; result is scalar
[1] 15.16667
> var(y)                # Calculate sample variance
[1] 178.9667
> lm_1 <- lm(y ~ x)     # Fit a linear regression model "y = f(x)" or "y = B0 + (B1 * x)"
                        # store the results as lm_1
> print(lm_1)           # Print the model from the (linear model object) lm_1

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
     -9.333        7.000

> summary(lm_1)         # Compute and print statistics for the fit
                        # of the (linear model object) lm_1

Call:
lm(formula = y ~ x)

Residuals:
      1       2       3       4       5       6
 3.3333 -0.6667 -2.6667 -2.6667 -0.6667  3.3333

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -9.3333     2.8441  -3.282 0.030453 *
x             7.0000     0.7303   9.585 0.000662 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.055 on 4 degrees of freedom
Multiple R-squared: 0.9583, Adjusted R-squared: 0.9478
F-statistic: 91.88 on 1 and 4 DF, p-value: 0.000662

> par(mfrow=c(2, 2))    # Request 2x2 plot layout
> plot(lm_1)            # Diagnostic plot of regression model

Diagnostic graphs produced by plot.lm() function. Features include mathematical notation in axis labels, as at lower left.
Example 2
Short R code calculating the Mandelbrot set through the first 20 iterations of the equation z = z² + c, plotted for different complex constants c. This example demonstrates:
- use of community-developed external libraries (called packages), in this case the caTools package
- handling of complex numbers
- multidimensional arrays of numbers used as basic data type, see variables C, Z and X.
library(caTools) # external package providing write.gif function
jet.colors <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F",
                                 "yellow", "#FF7F00", "red", "#7F0000"))
m <- 10000 # define size (X below then holds m*m*20 doubles, about 16 GB; a smaller m such as 1000 is far cheaper)
C <- complex( real=rep(seq(-1.8,0.6, length.out=m), each=m ),
              imag=rep(seq(-1.2,1.2, length.out=m), m ) )
C <- matrix(C,m,m) # reshape as square matrix of complex numbers
Z <- 0 # initialize Z to zero
X <- array(0, c(m,m,20)) # initialize output 3D array
for (k in 1:20) { # loop with 20 iterations
  Z <- Z^2+C # the central difference equation
  X[,,k] <- exp(-abs(Z)) # capture results
}
write.gif(X, "Mandelbrot.gif", col=jet.colors, delay=800)
"Mandelbrot.gif": Graphics created in R with 14 lines of code in Example 2
Example 3
The ease of function creation by the user is one of the strengths of using R. Objects remain local to the function, which can be returned as any data type.[27] Below is an example of the structure of a function:
functionname <- function(arg1, arg2, ... ){ # declare name of function and function arguments
  statements                                # declare statements
  return(object)                            # declare object data type
}

sumofsquares <- function(x){ # a user-created function
  return(sum(x^2))           # return the sum of squares of the elements of vector x
}

> sumofsquares(1:3)
[1] 14
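The remark that objects remain local to the function can be checked with a short sketch like the one below. The names counter and increment_local are invented for this illustration.

```r
counter <- 0  # variable in the global environment

increment_local <- function() {
  counter <- counter + 1  # creates a new local 'counter'; the global one is untouched
  counter                 # the value of the last expression is returned
}

result <- increment_local()
print(result)   # value computed inside the function
print(counter)  # global variable, unchanged by the call
```

Ordinary assignment with <- inside a function body never modifies a variable in the calling environment; the only way a result leaves the function is through its return value.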
NEWS CONTENTS
- 20171206 : Install R on RedHat errors on dependencies that don't exist ( Dec 06, 2017 , stackoverflow.com ) [Recommended]
- 20171206 : Download RStudio Server -- RStudio ( Dec 06, 2017 , www.rstudio.com )
- 20171206 : The difficulties of moving from Python to R ( Dec 06, 2017 , blog.danwin.com )
- 20160918 : R Weekly ( Sep 18, 2016 , tm.durusau.net )
- 20160918 : Learning R and Perl - Stack Overflow
- 20160918 : Getting Started with R RStudio Support
- 20150610 : The R Language The Good The Bad And The Ugly - John Cook ( Mar 27, 2013 , YouTube )
- 20150609 : The R Programming Language - Free Computer, Programming, Mathematics, Technical Books, Lecture Notes and Tutorials ( Jun 09, 2015 )
- 20150609 : Brought To You By the Letter R: Microsoft Acquiring Revolution Analytics
Old News ;-)
[Dec 06, 2017] Install R on RedHat errors on dependencies that don't exist
Highly recommended!
Dec 06, 2017 | stackoverflow.com
Jon, Jul 11, 2014 at 23:55
I have installed R before on a machine running RedHat EL6.5, but I recently had a problem installing new packages (i.e. install.packages()). Since I couldn't find a solution to this, I tried reinstalling R using:
sudo yum remove R
and
sudo yum install R
But now I get:
....
---> Package R-core-devel.x86_64 0:3.1.0-5.el6 will be installed
--> Processing Dependency: blas-devel >= 3.0 for package: R-core-devel-3.1.0-5.el6.x86_64
--> Processing Dependency: libicu-devel for package: R-core-devel-3.1.0-5.el6.x86_64
--> Processing Dependency: lapack-devel for package: R-core-devel-3.1.0-5.el6.x86_64
---> Package xz-devel.x86_64 0:4.999.9-0.3.beta.20091007git.el6 will be installed
--> Finished Dependency Resolution
Error: Package: R-core-devel-3.1.0-5.el6.x86_64 (epel)
       Requires: blas-devel >= 3.0
Error: Package: R-core-devel-3.1.0-5.el6.x86_64 (epel)
       Requires: lapack-devel
Error: Package: R-core-devel-3.1.0-5.el6.x86_64 (epel)
       Requires: libicu-devel
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
I already checked, and blas-devel is installed, but the newest version is 0.2.8. Checked using:
yum info openblas-devel.x86_64
Any thoughts as to what is going wrong? Thanks.
Scott Ritchie, Jul 12, 2014 at 0:31
A cursory search of blas-devel in google shows that the latest version is at least version 3.2. You probably used to have an older version of R installed, and the newer version depends on a version of BLAS not available in RedHat?
bdemarest, Jul 12, 2014 at 0:31
Can solve this by sudo yum install lapack-devel, etc.. until the errors stop.
Jon, Jul 14, 2014 at 4:08
sudo yum install lapack-devel does not work. Returns: No package lapack-devel available.
Scott - you are right that blas-devel is not available in yum. What is the best way to fix this?
Owen, Aug 27, 2014 at 18:33
I had the same issue. Not sure why these packages are missing from RHEL's repos, but they are in CentOS 6.5, so the follow solution works, if you want to keep things in the package paradigm:
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/lapack-devel-3.2.1-4.el6.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/blas-devel-3.2.1-4.el6.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/texinfo-tex-4.13a-8.el6.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/libicu-devel-4.2.1-9.1.el6_2.x86_64.rpm
sudo yum localinstall *.rpm
cheers
UPDATE: Leon's answer is better -- see below.
DavidJ, Mar 23, 2015 at 19:50
When installing texinfo-tex-5.1-4.el7.x86_654, it complains about requiring tex(epsd.tex), but I've no idea which package supplies that. This is on RHEL7, obviously (and using CentOS7 packages).
Owen, Mar 24, 2015 at 21:07
Are you trying to install using rpm or yum? yum should attempt to resolve dependencies.
DavidJ, Mar 25, 2015 at 14:18
It was yum complaining. Adding the analogous CentOS repo to /etc/yum.repos.d temporarily and then installing just the missing dependencies, then removing it and installing R fixed the issue. It is apparently a issue/bug with the RHEL package dependencies. I had to be careful to ensure the all other packages came from the RHEL repos, not CentOS, hence not a good idea to install R itself when the CentOS repo is active.
Owen, Mar 26, 2015 at 4:49
Glad you figured it out. When I stumbled on this last year I was also surprised that the Centos repos seemed more complete than RHEL.
Dave X, May 28, 2015 at 19:33
They are in the RHEL optional RPMs. See Leon's answer.
Leon, May 21, 2015 at 18:38
Do the following:
- vim /etc/yum.repos.d/redhat.repo
- Change enabled = 0 in [rhel-6-server-optional-rpms] section of the file to enabled=1
- yum install R
DONE!
I think I should give reference to the site of solution:
https://bluehatrecord.wordpress.com/2014/10/13/installing-r-on-red-hat-enterprise-linux-6-5/
Dave X, May 28, 2015 at 19:31
Works for RHEL7 with [rhel-7-server-optional-rpms] change too.
Jon, Aug 4, 2014 at 4:49
The best solution I could come up with was to install from source. This worked and was not too bad. However, now it isn't in my package manager.
[Dec 06, 2017] Download RStudio Server -- RStudio
Dec 06, 2017 | www.rstudio.com
RStudio Server v0.99 requires RedHat or CentOS version 5.4 (or higher) as well as an installation of R. You can install R for RedHat and CentOS using the instructions on CRAN: https://cran.rstudio.com/bin/linux/redhat/README .
RedHat/CentOS 6 and 7
To download and install RStudio Server open a terminal window and execute the commands corresponding to the 32 or 64-bit version as appropriate.
64bit
Size: 43.5 MB MD5: 1e973cd9532d435d8a980bf84ec85c30 Version: 1.1.383 Released: 2017-10-09
$ wget https://download2.rstudio.org/rstudio-server-rhel-1.1.383-x86_64.rpm
$ sudo yum install --nogpgcheck rstudio-server-rhel-1.1.383-x86_64.rpm
See the Getting Started document for information on configuring and managing the server.
Read the RStudio Server Professional Admin Guide for more detailed instructions.
[Dec 06, 2017] The difficulties of moving from Python to R
Dec 06, 2017 | blog.danwin.com
This post is in response to: Python, Machine Learning, and Language Wars , by Sebastian Raschka
As someone who's switched from Ruby to Python (because the latter is far easier to teach, IMO) and who has also put significant time into learning R just to use ggplot2, I was really surprised at the lack of relevant Google results for "switching from python to r" – or similarly phrased queries. In fact, that particular query will bring up more results for R to Python, e.g. "Python Displacing R as The Programming Language For Data". The use of R is so ubiquitous in academia (and in the wild, ggplot2 tends to wow nearly on the same level as D3) that I had just assumed there were a fair number of Python/Ruby developers who have tried jumping into R. But there aren't; minimaxir's guides are the most and only comprehensive how-to-do-R-as-written-by-an-outsider guides I've seen on the web.
By and far, the most common shift seems to be that of Raschka's – going from R to Python:
Well, I guess it's no big secret that I was an R person once. I even wrote a book about it. So, how can I summarize my feelings about R? I am not exactly sure where this quote comes from – I picked it up from someone somewhere some time ago – but it is great for explaining the difference between R and Python: "R is a programming language developed by statisticians for statisticians; Python was developed by a computer scientist, and it can be used by programmers to apply statistical techniques." Part of the message is that both R and Python are similarly capable for "data science" tasks, however, the Python syntax simply feels more natural to me – it's a personal taste.
That said, one of the things I've appreciated about R is how it "just works": I usually install R through Homebrew, but installing RStudio via point and click is also straightforward. I can see why that's a huge appeal for both beginners and people who want to do computation but not necessarily become developers. Hell, I've been struggling for what feels like months to do just the most rudimentary GIS work in Python 3. But in just a couple weeks of learning R – and leveraging however it manages to package GDAL and all its other geospatial dependencies with rgdal – I've been able to create some decent geospatial visualizations (and queries):
... ... ...
I'm actually enjoying plotting with Matplotlib and seaborn, but it's hard to beat the elegance of ggplot2 – it's worth learning R just to be able to read and better understand Wickham's ggplot2 book and its explanation of the "Grammar of Graphics". And there's nothing else quite like ggmap in other languages.
Also, I used to hate how <- was used for assignment. Now, that's one of the things I miss most about using R. I've grown up with single-equals-sign assignment in every other language I've learned, but after having to teach some programming, the difference between == and = is a common and often hugely stumping error for beginners. Not only that, they have trouble remembering how assignment even works, even for basic variable assignment. I've come to realize that I've programmed so long that I immediately recognize the pattern, but that can't possibly be the case for novices, who if they've taken general math classes, have never seen the equals sign that way. The <- operator makes a lot more sense, though I would have never thought that if I hadn't read Hadley Wickham's style guide.
Speaking of Wickham's style guide, one thing I wish I had done at the very early stages of learning R is to have read Wickham's Advanced R book – which is free online (and contains the style guide). Not only is it just a great read for any programmer, like everything Wickham writes, it is not at all an "advanced" book if you are coming from another language. It goes over the fundamentals of how the language is designed. For example, one major pain point for me was not realizing that R does not have scalars – things that appear to be scalars happen to be vectors of length one. This is something Wickham's book mentions in its Data structures chapter.
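(Illustration added to this digest, not part of the quoted post.) The point about scalars can be seen in a few lines; the variable names are arbitrary.

```r
x <- 42                     # looks like a scalar ...
single_is_vector <- is.vector(x)
single_length <- length(x)  # ... but it is a numeric vector of length one
first_element <- x[1]       # so it can be indexed like any other vector

y <- c(x, 7)                # combining two "scalars" simply gives a longer vector
scaled <- x * c(1, 2, 3)    # the length-one vector is recycled across the longer one

print(single_is_vector)
print(single_length)
print(y)
print(scaled)
```

This is why functions such as length() and indexing with [ ] work on what other languages would treat as a plain number.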
Another vital and easy-to-read chapter: Wickham's explanation of R's non-standard evaluation has totally illuminated to me why a programmer of Wickham's caliber enjoys building in R, but why I would find it infuriating to teach R versus Python to beginners.
(Here's another negative take on non-standard evaluation , by an R-using statistician)
FWIW, Wickham has posted a repo attempting to chart and analyze various trends and metrics about R and Python usage . I won't be that methodical; on Reddit, r/Python seems to be by far the biggest programming subreddit. At the time of writing, it has 122,690 readers . By comparison, r/ruby and r/javascript have 31,200 and 82,825 subscribers, respectively. The R-focused subreddit, r/rstats , currently has 8,500 subscribers.
The Python community is so active on Reddit that it has its own learners subreddit – r/learnpython – with 54,300 subscribers .
From anecdotal observations, I don't think Python shows much sign of diminishing popularity on Hacker News, either. Not just because Python-language specific posts keep making the front page, but because of the general increased interest in artificial intelligence, coinciding with Google's recent release of TensorFlow , which they've even quickly ported to Python 3.x .
[Sep 18, 2016] R Weekly
Sep 18, 2016 | tm.durusau.net
September 12th, 2016
R Weekly
A new weekly publication of R resources that began on 21 May 2016 with Issue 0.
Mostly titles of posts and news articles, which is useful, but not as useful as short summaries, including the author's name.
Learning R and Perl - Stack Overflow
I can recommend Penn University's Introductory Course on R. The ggplot chapter alone is worth reading - I found ggplot very confusing but this is a great explanation.
Getting Started with R RStudio Support
Garrett Grolemund
New to R?
There are hundreds of websites that can help you learn the language. Here's how you can use some of the best to become a productive R programmer.
Start by downloading R and RStudio.
Learn the basics
Visit Try R to learn how to write basic R code. These interactive lessons will get you writing real code in minutes, and they'll tell you immediately when you go wrong.
Broaden your skills
Work through The Beginner's Guide to R by Computerworld Magazine. This 30 page guide will show you how to install R, load data, run analyses, make graphs, and more.
Practice good habits
Read the Google R Style Guide for advice on how to write readable, maintainable code. This is how other R users will expect your code to look when you share it.
Look up help
When you need to learn more about an R function or package, visit Rdocumentation.org, a searchable database of R documentation. You can search for R packages and functions, look at package download statistics, and leave and read comments about R functions.
Ask questions
Seek help at StackOverflow, a searchable forum of questions and answers about computer programming. StackOverflow has answered (and archived) over 40,000 questions related to R programming. You can browse StackOverflow's archives and see which answers have been upvoted by users, or you can ask your own R related questions and wait for a response.
If you have a question that is more about statistical methodology, there are also plenty of R users active on the CrossValidated Q&A community.
Keep tabs on the R community
Read R bloggers, a blog aggregator that reposts R related articles from across the web. A good place to find R tutorials, announcements, and other random happenings.
Deepen your expertise
Once you've gained some familiarity with R, The R Inferno provides an entertaining roadmap to some of the deeper subtleties of the language and how to work with it most effectively.
This blog post by Noam Ross also provides valuable advice for writing fast R code.
Got R down? Give Shiny a try
Now that you know R, work through our Shiny lessons to learn how to make interactive web apps with R.
[Jun 10, 2015] The R Language The Good The Bad And The Ugly - John Cook
"...May be a bit old, but a good talk on the quirks and niceties of R"
Mar 27, 2013 | YouTube
Here's a description of John's talk from GOTO Aarhus 2012: R is a domain-specific language for analyzing data. Why does data analysis need its own DSL? What does R do well and what does it do poorly? How can developers take advantage of R's strengths and mitigate its weaknesses? This talk will give some answers to these questions.
Jesus M. Castagnetto
May be a bit old, but a good talk on the quirks and niceties of R
[Jun 09, 2015] The R Programming Language - Free Computer, Programming, Mathematics, Technical Books, Lecture Notes and Tutorials
- Learning Statistics with R (Daniel Navarro)
This book takes you on a guided tour of software development with R, from basic types and data structures to advanced topics like closures, recursion, and anonymous functions. No statistical knowledge is required.
- An Introduction to Statistical Learning: with Applications in R
It provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years.
- R Succinctly (Barton Poulson)
Begin developing your mastery of the powerful R programming language. Become comfortable with the R environment and learn how to find ways for R to fulfill your data needs.
- The R Inferno (Patrick Burns)
An essential guide to the trouble spots and oddities of R. In spite of the quirks exposed here, R is the best computing environment for most data analysis tasks.
- Statistics with R (Vincent Zoonekynd)
This book provides an elementary-level introduction to R, targeting both non-statistician scientists in various fields and students of statistics.
- The Art of R Programming: A Tour of Statistical Software Design
A guided tour of software development with R, from basic types and data structures to advanced topics like closures, recursion, and anonymous functions. No statistical knowledge is required, and your programming skills can range from hobbyist to pro.
- Advanced R Programming (Hadley Wickham)
The book is designed primarily for R users who want to improve their programming skills and understanding of the language. It should also be useful for programmers coming to R from other languages.
- Introduction to Probability and Statistics Using R (G. Jay Kerns)
This is a textbook for an undergraduate course in probability and statistics. Calculus and some linear algebra knowledge is required.
- Using R for Data Analysis and Graphics (J H Maindonald)
This book guides users through the practical, powerful tools that the R system provides. The emphasis is on hands-on analysis, graphical display, and interpretation of data.
- Introduction to Data Science with Introduction to R (Jeffrey Stanton)
This book provides non-technical readers with a gentle introduction to essential concepts and activities of data science. For more technical readers, the book provides explanations and code for a range of interesting applications using the open source R language for statistical computing and graphics.
- An Introduction to R: A Programming Environment for Data Analysis
This tutorial manual provides a comprehensive introduction to R, an open source software package for statistical computing and graphics
- Introduction to Statistical Thinking, with R, without Calculus
This is an introduction to statistics, with R, without calculus, for students who are required to learn statistics, students with little background in mathematics and often no motivation to learn more.
- Applied Spatial Data Analysis with R (Roger S. Bivand, et al)
This book will be of interest to researchers who intend to use R to handle, visualise, and analyse spatial data.
- A Handbook of Statistical Analyses Using R (Brian S. Everitt, et al)
This book is the perfect guide for newcomers as well as seasoned users of R who want concrete, step-by-step guidance on how to use the software easily and effectively for nearly any statistical analysis.
- Using R for Introductory Statistics (John Verzani)
This book lays the foundation for further study and development in statistics using R - an ideal text for integrating the study of statistics with a powerful computational tool.
Brought To You By the Letter R: Microsoft Acquiring Revolution Analytics
Posted by timothy from the interesting-choice-of-letter dept.
theodp writes: Maybe Bill Gates' Summer Reading this year will include The Art of R Programming. Pushing further into Big Data, Microsoft on Friday announced it's buying Revolution Analytics, the top commercial provider of software and services for the open-source R programming language for statistical computing and predictive analytics. "By leveraging Revolution Analytics technology and services," blogged Microsoft's Joseph Sirosh, "we will empower enterprises, R developers and data scientists to more easily and cost effectively build applications and analytics solutions at scale." Revolution Analytics' David Smith added, "Now, Microsoft might seem like a strange bedfellow for an open-source company [RedHat:Linux as Revolution Analytics:R], but the company continues to make great strides in the open-source arena recently." Now that it has Microsoft's blessing, is it finally time for AP Statistics to switch its computational vehicle to R?
Recommended Links
Softpanorama Recommended
Top articles
[Dec 06, 2017] Install R on RedHat errors on dependencies that don't exist Published on Dec 06, 2017 | stackoverflow.com
Sites
Documentation
Google has a well known R style guide reflecting internal use of the language.
Youtube videos
R Learning Links (Rutgers University)
- Guides and Tutorials
- Starter Kit UCLA Statistical Computing has compiled excellent overviews of not only R, but also, SAS, SPSS, and Stata. Class notes, learning modules, and downloadable books.
- Cookbook for R - wiki style information
- R Programming Wikibook - a practical guide to the R programming language.
- R Tutorial Series - short tutorials on common topics.
- R Videos from Texas A&M, covers all the basics.
- Manuals, FAQs, Listservs and more are available from http://www.r-project.org/. The R Reference Card is a quite handy summary.
- Quick-R is a guide for experienced users of other stats packages like SAS, SPSS, or Stata.
- Springer Use R! series (available from Springerlink) has many useful guides to using R in different fields
- Searching for R on the Internet
- Don't type "R" on Google. "R project", "R statistics", or "R stats" are OK
- Better, use www.rseek.org
- RSiteSearch("your string") is built into R
- http://finzi.psych.upenn.edu/search.html is a custom search site
- http://addictedtor.free.fr/rsitesearch/ offers an R search plug-in for Firefox
- More Information
- Crantastic tags and reviews for the most popular R packages
- Task Views excellent outlines of the packages that are relevant to different disciplines
- Essential R Vocabulary List (Hadley Wickham)
- R Inferno common problems with R
- R Graph Gallery - examples and code of graphics techniques from various R packages
- R-bloggers combining posts from several R blogs
- #rstats on Twitter
Web resources
- FlowingData Modern data visualization
- One R Tip A Day Code examples for graphics and analysis
- Probability and statistics blog Monte Carlo simulations in R
- R Bloggers Daily news and tutorials about R, contributed by R bloggers worldwide.
- R Project group on analyticbridge.com Community and discussion forum
- Statistical Modeling, Causal Inference, and Social Science Andrew Gelman's statistics blog
- The Dataists Innovative and practical data analysis methodology.
- The R & BioConductor manual provides a general introduction to the usage of the R environment and its basic command syntax.
- R Programming for Bioinformatics, by Robert Gentleman
- Advanced R, by Hadley Wickham
- S Programming, by W. N. Venables and B. D. Ripley
- Programming with Data, by John M. Chambers
- R Help & R Coding Conventions, Henrik Bengtsson, Lund University
- Programming in R (Vincent Zoonekynd)
- Peter's R Programming Pages, University of Warwick
- Rtips, Paul Johnson, University of Kansas
- R for Programmers, Norm Matloff, UC Davis
- The R language, for programmers John D. Cook
- High-Performance R, Dirk Eddelbuettel tutorial presented at useR-2008
- C/C++ level programming for R, Gopi Goswami
Problems with R
- Radford Neal's series on design flaws in R. Part I, II, III.
- The R Inferno
- The R Language The Good The Bad And The Ugly - John Cook - YouTube
- Statistical computation consulting
>
Etc
Society
Groupthink : Two Party System as Polyarchy : Corruption of Regulators : Bureaucracies : Understanding Micromanagers and Control Freaks : Toxic Managers : Harvard Mafia : Diplomatic Communication : Surviving a Bad Performance Review : Insufficient Retirement Funds as Immanent Problem of Neoliberal Regime : PseudoScience : Who Rules America : Neoliberalism : The Iron Law of Oligarchy : Libertarian Philosophy
Quotes
War and Peace : Skeptical Finance : John Kenneth Galbraith :Talleyrand : Oscar Wilde : Otto Von Bismarck : Keynes : George Carlin : Skeptics : Propaganda : SE quotes : Language Design and Programming Quotes : Random IT-related quotes : Somerset Maugham : Marcus Aurelius : Kurt Vonnegut : Eric Hoffer : Winston Churchill : Napoleon Bonaparte : Ambrose Bierce : Bernard Shaw : Mark Twain Quotes
Bulletin:
Vol 25, No.12 (December, 2013) Rational Fools vs. Efficient Crooks The efficient markets hypothesis : Political Skeptic Bulletin, 2013 : Unemployment Bulletin, 2010 : Vol 23, No.10 (October, 2011) An observation about corporate security departments : Slightly Skeptical Euromaydan Chronicles, June 2014 : Greenspan legacy bulletin, 2008 : Vol 25, No.10 (October, 2013) Cryptolocker Trojan (Win32/Crilock.A) : Vol 25, No.08 (August, 2013) Cloud providers as intelligence collection hubs : Financial Humor Bulletin, 2010 : Inequality Bulletin, 2009 : Financial Humor Bulletin, 2008 : Copyleft Problems Bulletin, 2004 : Financial Humor Bulletin, 2011 : Energy Bulletin, 2010 : Malware Protection Bulletin, 2010 : Vol 26, No.1 (January, 2013) Object-Oriented Cult : Political Skeptic Bulletin, 2011 : Vol 23, No.11 (November, 2011) Softpanorama classification of sysadmin horror stories : Vol 25, No.05 (May, 2013) Corporate bullshit as a communication method : Vol 25, No.06 (June, 2013) A Note on the Relationship of Brooks Law and Conway Law
History:
Fifty glorious years (1950-2000): the triumph of the US computer engineering : Donald Knuth : TAoCP and its Influence of Computer Science : Richard Stallman : Linus Torvalds : Larry Wall : John K. Ousterhout : CTSS : Multix OS Unix History : Unix shell history : VI editor : History of pipes concept : Solaris : MS DOS : Programming Languages History : PL/1 : Simula 67 : C : History of GCC development : Scripting Languages : Perl history : OS History : Mail : DNS : SSH : CPU Instruction Sets : SPARC systems 1987-2006 : Norton Commander : Norton Utilities : Norton Ghost : Frontpage history : Malware Defense History : GNU Screen : OSS early history
Classic books:
The Peter Principle : Parkinson Law : 1984 : The Mythical Man-Month : How to Solve It by George Polya : The Art of Computer Programming : The Elements of Programming Style : The Unix Haters Handbook : The Jargon file : The True Believer : Programming Pearls : The Good Soldier Svejk : The Power Elite
Most popular humor pages:
Manifest of the Softpanorama IT Slacker Society : Ten Commandments of the IT Slackers Society : Computer Humor Collection : BSD Logo Story : The Cuckoo's Egg : IT Slang : C++ Humor : ARE YOU A BBS ADDICT? : The Perl Purity Test : Object oriented programmers of all nations : Financial Humor : Financial Humor Bulletin, 2008 : Financial Humor Bulletin, 2010 : The Most Comprehensive Collection of Editor-related Humor : Programming Language Humor : Goldman Sachs related humor : Greenspan humor : C Humor : Scripting Humor : Real Programmers Humor : Web Humor : GPL-related Humor : OFM Humor : Politically Incorrect Humor : IDS Humor : "Linux Sucks" Humor : Russian Musical Humor : Best Russian Programmer Humor : Microsoft plans to buy Catholic Church : Richard Stallman Related Humor : Admin Humor : Perl-related Humor : Linus Torvalds Related humor : PseudoScience Related Humor : Networking Humor : Shell Humor : Financial Humor Bulletin, 2011 : Financial Humor Bulletin, 2012 : Financial Humor Bulletin, 2013 : Java Humor : Software Engineering Humor : Sun Solaris Related Humor : Education Humor : IBM Humor : Assembler-related Humor : VIM Humor : Computer Viruses Humor : Bright tomorrow is rescheduled to a day after tomorrow : Classic Computer Humor
Copyright © 1996-2018 by Dr. Nikolai Bezroukov. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) in the author's free time and without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Copyright of original materials belongs to their respective owners. Quotes are made for educational purposes only, in compliance with the fair use doctrine.
FAIR USE NOTICE: This site contains copyrighted material, the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law, according to which such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links, as it develops like a living tree...
Disclaimer:
The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
The site uses AdSense, so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google, please disable JavaScript for this site. This site is perfectly usable without JavaScript.
Last modified: January, 02, 2018