First published on MSDN on Jul 25, 2017
Guest post by Slaviana Pavlovich, Microsoft Student Partner
My name is Slaviana Pavlovich. I am an IT and Management student at University College London with a passion for data science. I recently completed the Microsoft Professional Program for Data Science, where I developed core skills to work with data. If you are also interested in this career but are not sure where to start, I strongly encourage you to check it out. I also have a wide range of interests including 3D bioprinting, public speaking, and politics. Additionally, I enjoy swimming and photography to balance out my studies. I became a Microsoft Student Partner at the end of my first year and I absolutely enjoy being part of such a vibrant community. If you have any questions, feel free to ask!
Introduction
In today’s article, I am going to continue talking about R. In the second part of this two-part introduction to R (the first part is available here), we are going to consider:
• Matrices
• Data Frames
• Functions
• Conditionals
• Loops
Matrices
A matrix is another data type that we are going to look at. It is a two-dimensional data set whose elements all have the same type. A matrix is created using the function matrix():
> # creating a matrix
> example <- matrix(c(99,45,4,47,2,5), nrow = 3, ncol = 2, byrow = TRUE)
> example
[,1] [,2]
[1,] 99 45
[2,] 4 47
[3,] 2 5
As you can see in the example above, nrow and ncol set the number of rows and columns. Also, byrow = TRUE means that the matrix is filled row by row, while byrow = FALSE (the default) fills it column by column. Let’s look at the following example:
> # creating a 2x3 matrix that contains the numbers from 1 to 6 and filled by columns
> example.2 <- matrix(1:6, nrow = 2, ncol = 3, byrow = FALSE)
> example.2
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
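To see the effect of byrow side by side, here is a minimal sketch that fills the same six numbers both ways. The names values, by_row and by_col are only illustrative and do not come from the examples above.

```r
# The same data, filled two different ways
values <- 1:6

by_row <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)
by_col <- matrix(values, nrow = 2, ncol = 3)  # byrow defaults to FALSE

print(by_row[1, ])  # first row when filled by rows: 1 2 3
print(by_col[1, ])  # first row when filled by columns: 1 3 5

# dim() reports the number of rows and columns
print(dim(by_row))  # 2 3
```

Both matrices have the same dimensions; only the order in which the values are placed differs.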
To change the names of the rows and columns of a matrix, use dimnames():
> # creating a matrix
> A <- matrix(1:6, nrow = 3, byrow = TRUE)
> A
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
> # setting row and column names
> dimnames(A) <- list(c("1row", "2row", "3row"), c("1col", "2col"))
> A
1col 2col
1row 1 2
2row 3 4
3row 5 6
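Once a matrix has names, they can also be used for indexing. The sketch below uses rownames() and colnames(), which are base R alternatives to dimnames() for setting one dimension at a time; the row and column labels are the same ones as in the example above.

```r
A <- matrix(1:6, nrow = 3, byrow = TRUE)

# Set row and column names separately
rownames(A) <- c("1row", "2row", "3row")
colnames(A) <- c("1col", "2col")

# Select by name instead of by position
print(A["2row", "1col"])  # 3
print(A[, "2col"])        # the whole second column, as a named vector
```

Selecting by name can make code easier to read than selecting by position, especially when the matrix has many columns.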
There are certain operations you can do with matrices. You can transpose a matrix using the function t():
> M <- matrix(c(14,2,4,3,2,5), nrow = 2, ncol = 3, byrow = TRUE)
> M
[,1] [,2] [,3]
[1,] 14 2 4
[2,] 3 2 5
> t(M)
[,1] [,2]
[1,] 14 3
[2,] 2 2
[3,] 4 5
Furthermore, use the solve() function to find the inverse of a square matrix:
> X <- matrix(1:4, nrow = 2, byrow = TRUE)
> X
[,1] [,2]
[1,] 1 2
[2,] 3 4
> solve(X)
[,1] [,2]
[1,] -2.0 1.0
[2,] 1.5 -0.5
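A quick way to check an inverse is to multiply it by the original matrix and compare the result with the identity matrix. The sketch below does this for the same X; the names X_inv and product are only illustrative. Because of floating-point rounding, the comparison uses all.equal() rather than an exact equality test.

```r
X <- matrix(1:4, nrow = 2, byrow = TRUE)
X_inv <- solve(X)

# X multiplied by its inverse should give the 2x2 identity matrix
product <- X %*% X_inv
print(isTRUE(all.equal(product, diag(2))))  # TRUE

# A matrix has an inverse only if its determinant is non-zero
print(det(X))  # approximately -2
```

If the determinant were zero, solve() would stop with an error because the matrix is singular.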
Arithmetic operations, such as adding a number to a matrix or dividing a matrix by a number, are done element-wise:
> A <- matrix(1:6, nrow = 3, byrow = TRUE)
> B <- matrix(1:6, nrow = 2, byrow = TRUE)
> A
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
> B
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
> A + 2
[,1] [,2]
[1,] 3 4
[2,] 5 6
[3,] 7 8
> B / 2
[,1] [,2] [,3]
[1,] 0.5 1.0 1.5
[2,] 2.0 2.5 3.0
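For matrix multiplication, use the %*% operator (the number of columns of the first matrix must equal the number of rows of the second):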
> A %*% B
[,1] [,2] [,3]
[1,] 9 12 15
[2,] 19 26 33
[3,] 29 40 51
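It is easy to mix up the * and %*% operators, so the following sketch contrasts the two using the same A and B. The names elementwise and product are only illustrative.

```r
A <- matrix(1:6, nrow = 3, byrow = TRUE)  # 3 x 2
B <- matrix(1:6, nrow = 2, byrow = TRUE)  # 2 x 3

# * multiplies matching entries, so both matrices must have the same shape
elementwise <- A * A
print(elementwise[3, 2])  # 6 * 6 = 36

# %*% is the matrix product: (3 x 2) times (2 x 3) gives a 3 x 3 result
product <- A %*% B
print(dim(product))       # 3 3
print(product[1, 1])      # 1*1 + 2*4 = 9

# The inner dimensions must agree
print(ncol(A) == nrow(B)) # TRUE
```

Trying A * B here would fail, because a 3 x 2 matrix and a 2 x 3 matrix do not have the same shape.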
To select elements of a matrix, use square brackets with the row and column indices:
> K <- matrix(4:7, nrow = 2, byrow = TRUE)
> K
[,1] [,2]
[1,] 4 5
[2,] 6 7
> K[1,2] # element at 1st row and 2nd column
[1] 5
> K[1,] # first row
[1] 4 5
> K[,2] # second column
[1] 5 7
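Note that selecting a single row or column returns a plain vector, not a matrix. The sketch below shows the drop = FALSE argument, which keeps the matrix shape, and selection with a logical condition. The names row_vec and row_mat are only illustrative.

```r
K <- matrix(4:7, nrow = 2, byrow = TRUE)

row_vec <- K[1, ]                # plain vector
row_mat <- K[1, , drop = FALSE]  # still a matrix with 1 row and 2 columns

print(is.matrix(row_vec))  # FALSE
print(dim(row_mat))        # 1 2

# A logical condition selects every element that satisfies it
print(K[K > 5])            # 6 7
```

Keeping the matrix shape is useful when the result is passed on to code that expects two dimensions.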
Finally, we can always modify a matrix:
> V <- matrix(1:9, ncol = 3)
> V
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
> V[1,3] <- 0; V # modify a single element at 1st row and 3rd column to 0
[,1] [,2] [,3]
[1,] 1 4 0
[2,] 2 5 8
[3,] 3 6 9
> V[V>2] <- 1; V # change all elements greater than 2 to 1
[,1] [,2] [,3]
[1,] 1 1 0
[2,] 2 1 1
[3,] 1 1 1
Data Frames
After looking at matrices, I suggest learning about data frames. A data frame is a special case of a list (another data object in R, covered in the first part of the article). Data frames are used for storing tables. Unlike in a matrix, the columns of a data frame can hold different types of data (logical, numeric, character, complex, etc.); each column is a vector, so the values within one column share a type. The function data.frame() is used to create a data frame:
> # creating a data frame
> table <- data.frame(name=c("Jack", "Karan", "Thomas", "Vito", "Kristine"),
+ age=c(19, 20, 19, 19, 19),
+ sex=c("M", "M", "M", "M", "F"),
+ colour=c("yellow", "red", "green", "blue", "pink"))
> table
name age sex colour
1 Jack 19 M yellow
2 Karan 20 M red
3 Thomas 19 M green
4 Vito 19 M blue
5 Kristine 19 F pink
> typeof(table)
[1] "list"
> class(table)
[1] "data.frame"
> # useful functions for a data frame
> names(table)
[1] "name" "age" "sex" "colour"
> nrow(table)
[1] 5
> ncol(table)
[1] 4
There are several ways of accessing the columns of a data frame:
> table[2:4] # columns starting from 2nd to 4th of data frame
age sex colour
1 19 M yellow
2 20 M red
3 19 M green
4 19 M blue
5 19 F pink
> table[c("colour","age")] # columns with the titles colour and age from data frame
colour age
1 yellow 19
2 red 20
3 green 19
4 blue 19
5 pink 19
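Besides single square brackets, a column can be pulled out as a vector with $ or [[ ]], and rows can be filtered with a condition. The sketch below is a small illustration; the data frame is called people here (a name I chose for the sketch) so that it does not mask the base R function table(), and stringsAsFactors = FALSE is set explicitly so the text columns stay as character vectors in older versions of R as well.

```r
people <- data.frame(name = c("Jack", "Karan", "Thomas", "Vito", "Kristine"),
                     age = c(19, 20, 19, 19, 19),
                     sex = c("M", "M", "M", "M", "F"),
                     stringsAsFactors = FALSE)

# $ and [[ ]] return the column itself as a vector
print(people$age)
print(people[["name"]][5])  # "Kristine"

# [row, column] selects a single cell
print(people[2, "name"])    # "Karan"

# A logical condition in the row position filters rows
older <- people[people$age > 19, ]
print(older)
```

Note the comma in the last selection: the condition goes before it, and leaving the column position empty keeps all columns.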
In a similar way to matrices, it is possible to change the values of the elements:
> table[3,"age"] <- 20; table # modify the element at 3rd row and column age to 20
name age sex colour
1 Jack 19 M yellow
2 Karan 20 M red
3 Thomas 20 M green
4 Vito 19 M blue
5 Kristine 19 F pink
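A data frame can also grow. As a minimal sketch, the example below adds a new column with $ and a new row with rbind(). The data frame scores and the column adult are hypothetical names made up for this illustration.

```r
scores <- data.frame(name = c("Jack", "Karan"),
                     age = c(19, 20),
                     stringsAsFactors = FALSE)

# Assigning to a column that does not exist yet creates it
scores$adult <- scores$age >= 18

# rbind() appends rows; the new row must have the same column names
new_row <- data.frame(name = "Thomas", age = 20, adult = TRUE,
                      stringsAsFactors = FALSE)
scores <- rbind(scores, new_row)

print(scores)
print(dim(scores))  # 3 3
```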
Functions
There is a straightforward way of creating your own functions in R. Let’s consider an example where our function is going to find the difference between two numbers:
> example <- function (a, b) {
+ c <- a - b
+ c
+ }
> example(15, 1)
[1] 14
As shown above, the keyword function is used to declare a function in R, and the value of the last expression in the body is returned. Now we are going to create a function that prints the type and class of its argument:
> example <- function(X) {
+ print(typeof(X))
+ print(class(X))
+ print(paste("The type is", typeof(X), "and class is", class(X)))
+ }
> Y<-c("Vito")
> example(Y)
[1] "character"
[1] "character"
[1] "The type is character and class is character"
> Z<-c(11)
> example(Z)
[1] "double"
[1] "numeric"
[1] "The type is double and class is numeric"
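Functions can also have default values for their arguments, and arguments can be passed by name. The sketch below is a variation of the difference example; the function name difference and the default b = 0 are my own choices for illustration.

```r
# b has a default value, so it can be left out when calling the function
difference <- function(a, b = 0) {
  return(a - b)  # return() makes the returned value explicit
}

print(difference(15, 1))          # 14
print(difference(15))             # 15, because b falls back to 0
print(difference(b = 1, a = 15))  # 14, named arguments can come in any order
```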
If you want to take input from the user, use the function readline():
> read.example <- function()
+ {
+ str <- readline(prompt="Your name: ")
+ return(as.character(str))
+ }
> print(paste("Nice to meet you,", read.example(), "!"))
Your name: Dre
[1] "Nice to meet you, Dre !"
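Two details are worth knowing here. First, readline() only waits for input in an interactive session; when a script is run non-interactively it returns an empty string straight away. Second, paste() puts a space between its arguments, which is why there is a space before the exclamation mark above. The sketch below handles both; the function greet and the fallback name are hypothetical and only serve the illustration.

```r
# paste0() joins its arguments without adding spaces
greet <- function(name) {
  paste0("Nice to meet you, ", name, "!")
}

if (interactive()) {
  name <- readline(prompt = "Your name: ")
} else {
  name <- "Dre"  # fallback value used when nobody can type an answer
}

print(greet(name))
```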
Conditionals
To use conditional execution in R, we are going to use the if…else statement:
> x <- 4
> if (x < 0) {
+ print("It is a negative number!")
+ } else if (x > 0) {
+ print("It is a positive number!")
+ } else
+ print("Zero!")
[1] "It is a positive number!"
> x <- -10
> if (x < 0) {
+ print("It is a negative number!")
+ } else if (x > 0) {
+ print("It is a positive number!")
+ } else
+ print("Zero!")
[1] "It is a negative number!"
> x <- 0
> if (x < 0) {
+ print("It is a negative number!")
+ } else if (x > 0) {
+ print("It is a positive number!")
+ } else
+ print("Zero!")
[1] "Zero!"
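The if…else statement checks a single condition at a time. To apply the same test to every element of a vector, base R offers the vectorised function ifelse(). The sketch below classifies the three values used above in one go; the variable name labels is only illustrative.

```r
x <- c(4, -10, 0)

# ifelse(test, yes, no) is evaluated for each element of x
labels <- ifelse(x < 0, "negative",
                 ifelse(x > 0, "positive", "zero"))

print(labels)  # "positive" "negative" "zero"
```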
Loops
Now we are going to consider the control statements in R, such as for{}, repeat{} and while{}.
· A for{} loop in the example below is going to print the first three numbers in the vector Y:
> Y <- c(17, 25, 19, 33, 11, 51, 55)
> for(i in 1:3) {
+ print(Y[i])
+ }
[1] 17
[1] 25
[1] 19
· In this example, a repeat{} loop is going to print “task” and after 3 loops it is going to break:
> task <- c("R is great!")
> i <- 3
> repeat {
+ i <- i + 1
+ print(task)
+ if(i > 5) {
+ break
+ }
+ }
[1] "R is great!"
[1] "R is great!"
[1] "R is great!"
· A while{} loop is going to follow the commands as long as the condition is true:
> i <- 1
> while(i < 5) {
+ print(i)
+ i <- i + 1
+ }
[1] 1
[1] 2
[1] 3
[1] 4
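As a small extension of the loops above, the sketch below shows two things: looping over all positions of a vector with seq_along(), and controlling a while{} loop with next (skip to the next iteration) and break (leave the loop). The names big and printed are only illustrative.

```r
Y <- c(17, 25, 19, 33, 11, 51, 55)

# seq_along(Y) gives 1, 2, ..., length(Y), so no length is hard-coded
big <- integer(0)
for (i in seq_along(Y)) {
  if (Y[i] > 30) {
    big <- c(big, i)  # remember the positions of values above 30
  }
}
print(big)  # 4 6 7

# next skips the rest of the current iteration, break ends the loop
i <- 0
printed <- c()
while (i < 10) {
  i <- i + 1
  if (i %% 2 == 0) next  # skip even numbers
  if (i > 7) break       # stop once i goes past 7
  printed <- c(printed, i)
}
print(printed)  # 1 3 5 7
```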
Resources
There are many interesting resources online that can help you further with R, and I strongly recommend checking them out. In the next article, I am going to cover data visualisation, so stay tuned!
https://academy.microsoft.com/en-us/professional-program/
Microsoft professional programmes, Big Data, Data Science
https://imagine.microsoft.com/en-us/Catalog
R Server Download for Students & Academics via Imagine Access
https://docs.microsoft.com/en-us/r-server/
R Server and R Documentation
https://www.microsoft.com/en-gb/cloud-platform/r-server
Microsoft R Server