
# Matrices, Data Frames, Functions, Conditionals, Loops with R

First published on MSDN on Jul 25, 2017

Guest post by Slaviana Pavlovich, Microsoft Student Partner

My name is Slaviana Pavlovich. I am an IT and Management student at University College London with a passion for data science. I recently completed the Microsoft Professional Program for Data Science, where I developed core skills to work with data. If you are also interested in this career, but not sure where to start - I strongly encourage you to check it out. I also have a wide range of interests including 3D bioprinting, public speaking, and politics. Additionally, I enjoy swimming and photography to balance out my studies. I became a Microsoft Student Partner at the end of my first year and I absolutely enjoy being part of such a vibrant community. If you have any questions, feel free to ask!

### Introduction

In today’s article, I am going to continue talking about R. In this second part of a two-part introduction to R (the first part is available here), we are going to consider:

• Matrices

• Data Frames

• Functions

• Conditionals

• Loops

### Matrices

A matrix is another data type that we are going to look at. It is a two-dimensional data set in which all elements have the same type. A matrix is created using the function matrix():

> # creating a matrix

> example <- matrix(c(99,45,4,47,2,5), nrow = 3, ncol = 2, byrow = TRUE)

> example

[,1] [,2]

[1,] 99 45

[2,] 4 47

[3,] 2 5

As you can see in the example above, nrow and ncol set the number of rows and columns. Also, byrow = TRUE means that the matrix is filled row by row, while byrow = FALSE (the default) fills it column by column. Let’s look at the following example:

> # creating a 2x3 matrix that contains the numbers from 1 to 6 and filled by columns

> example.2 <- matrix(1:6, nrow = 2, ncol = 3, byrow = FALSE)

> example.2

[,1] [,2] [,3]

[1,] 1 3 5

[2,] 2 4 6
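To make the effect of byrow easier to see, here is a minimal sketch that fills the same six numbers both ways. The variable names values, by_row and by_col are my own and only serve this illustration.

```r
# Same data, two fill orders (variable names are illustrative)
values <- 1:6

by_row <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)
by_col <- matrix(values, nrow = 2, ncol = 3, byrow = FALSE)

print(by_row)   # the first row holds 1, 2, 3
print(by_col)   # the first column holds 1, 2

# dim() returns the number of rows and columns of a matrix
print(dim(by_row))
```

Both matrices have the same shape; only the order in which the cells are filled differs.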

To change the names of the rows and columns of a matrix, use dimnames():

> # creating a matrix

> A <- matrix(1:6, nrow = 3, byrow = TRUE)

> A

[,1] [,2]

[1,] 1 2

[2,] 3 4

[3,] 5 6

> # setting row and column names

> dimnames(A) <- list(c("1row", "2row", "3row"), c("1col", "2col"))

> A

1col 2col

1row 1 2

2row 3 4

3row 5 6
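As a small complement, the sketch below sets the same names with rownames() and colnames(), which handle one dimension at a time, and then uses the names for indexing. It rebuilds the matrix A from the example above so that it runs on its own.

```r
A <- matrix(1:6, nrow = 3, byrow = TRUE)

# rownames() and colnames() set the names of one dimension at a time
rownames(A) <- c("1row", "2row", "3row")
colnames(A) <- c("1col", "2col")

# Names can then be used in place of numeric indices
print(A["2row", "2col"])   # element in the second row and second column
print(A[, "1col"])         # the whole first column, as a named vector
```

Indexing by name tends to be easier to read than indexing by position, especially once a matrix has many columns.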

There are certain operations you can do with matrices. For example, you can transpose a matrix using the function t():

> M <- matrix(c(14,2,4,3,2,5), nrow = 2, ncol = 3, byrow = TRUE)

> M

[,1] [,2] [,3]

[1,] 14 2 4

[2,] 3 2 5

> t(M)

[,1] [,2]

[1,] 14 3

[2,] 2 2

[3,] 4 5

Furthermore, use the solve() function to find the inverse of a square matrix (the matrix has to be invertible):

> X <- matrix(1:4, nrow = 2, byrow = TRUE)

> X

[,1] [,2]

[1,] 1 2

[2,] 3 4

> solve(X)

[,1] [,2]

[1,] -2.0 1.0

[2,] 1.5 -0.5
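One way to check the result is to multiply the matrix by its inverse, which should give the identity matrix. The sketch below does this with %*%, R's matrix multiplication operator; the names X_inv and product are my own.

```r
X <- matrix(1:4, nrow = 2, byrow = TRUE)
X_inv <- solve(X)

# A matrix multiplied by its inverse should give the identity matrix
product <- X %*% X_inv
print(round(product, 10))

# all.equal() compares with a small numerical tolerance, unlike identical()
print(isTRUE(all.equal(product, diag(2))))

# det() is zero for a singular matrix, and solve() fails with an error in that case
print(det(X))
```

The comparison uses all.equal() because floating-point arithmetic can leave tiny rounding differences in the product.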

Arithmetic operations such as +, -, * and / are done element-wise:

> A <- matrix(1:6, nrow = 3, byrow = TRUE)

> B <- matrix(1:6, nrow = 2, byrow = TRUE)

> A

[,1] [,2]

[1,] 1 2

[2,] 3 4

[3,] 5 6

> B

[,1] [,2] [,3]

[1,] 1 2 3

[2,] 4 5 6

> A + 2

[,1] [,2]

[1,] 3 4

[2,] 5 6

[3,] 7 8

> B / 2

[,1] [,2] [,3]

[1,] 0.5 1.0 1.5

[2,] 2.0 2.5 3.0

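For matrix multiplication, use the %*% operator. The number of columns of the first matrix has to equal the number of rows of the second: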

> A %*% B

[,1] [,2] [,3]

[1,] 9 12 15

[2,] 19 26 33

[3,] 29 40 51
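It is easy to mix up * and %*%, so here is a short sketch that puts them side by side on two small square matrices. The matrices P and Q are made up for this illustration.

```r
P <- matrix(1:4, nrow = 2, byrow = TRUE)
Q <- matrix(5:8, nrow = 2, byrow = TRUE)

elementwise <- P * Q     # multiplies the elements at matching positions
matrix_prod <- P %*% Q   # rows of P combined with columns of Q

print(elementwise)
print(matrix_prod)

# %*% requires ncol() of the left matrix to equal nrow() of the right one
print(ncol(P) == nrow(Q))
```

The two results differ: * needs matrices of the same shape, while %*% follows the usual rules of linear algebra.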

To select elements of a matrix, use square brackets with the row and column indices:

> K <- matrix(4:7, nrow = 2, byrow = TRUE)

> K

[,1] [,2]

[1,] 4 5

[2,] 6 7

> K[1,2] # element at 1st row and 2nd column

[1] 5

> K[1,] # first row

[1] 4 5

> K[,2] # second column

[1] 5 7

Finally, we can always modify a matrix:

> V <- matrix(1:9, ncol = 3)

> V

[,1] [,2] [,3]

[1,] 1 4 7

[2,] 2 5 8

[3,] 3 6 9

> V[1,3] <- 0; V # modify a single element at 1st row and 3rd column to 0

[,1] [,2] [,3]

[1,] 1 4 0

[2,] 2 5 8

[3,] 3 6 9

> V[V>2] <- 1; V # change all elements greater than 2 to 1

[,1] [,2] [,3]

[1,] 1 1 0

[2,] 2 1 1

[3,] 1 1 1
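The expression V > 2 used above is itself a matrix of TRUE and FALSE values. The following sketch, with a threshold I picked only for illustration, shows that intermediate step explicitly.

```r
V <- matrix(1:9, ncol = 3)

# A comparison on a matrix returns a logical matrix of the same shape
mask <- V > 6
print(mask)

# which() lists the positions where the condition holds (counted column by column)
print(which(mask))

# The mask selects the matching elements; here they are all set to 0
V[mask] <- 0
print(V)
```

Storing the condition in a variable such as mask can help when the same selection is needed more than once.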

### Data Frames

After looking at matrices, I suggest learning about data frames. A data frame is a special case of a list (another data object in R, which was covered in the first part of the article). Data frames are used for storing tables. Unlike in a matrix, each column is a vector that can store a different type of data (logical, numeric, character, complex, etc.), although all values within one column share the same type. The function data.frame() is used to create a data frame:

> # creating a data frame

> table <- data.frame(name=c("Jack", "Karan", "Thomas", "Vito", "Kristine"),

+ age=c(19, 20, 19, 19, 19),

+ sex=c("M", "M", "M", "M", "F"),

+ colour=c("yellow", "red", "green", "blue", "pink"))

> table

name age sex colour

1 Jack 19 M yellow

2 Karan 20 M red

3 Thomas 19 M green

4 Vito 19 M blue

5 Kristine 19 F pink

> typeof(table)

[1] "list"

> class(table)

[1] "data.frame"

> # useful functions for a data frame

> names(table)

[1] "name" "age" "sex" "colour"

> nrow(table)

[1] 5

> ncol(table)

[1] 4

There are several ways of accessing the columns of a data frame:

> table[2:4] # columns starting from 2nd to 4th of data frame

age sex colour

1 19 M yellow

2 20 M red

3 19 M green

4 19 M blue

5 19 F pink

> table[c("colour","age")] # columns with the titles colour and age from data frame

colour age

1 yellow 19

2 red 20

3 green 19

4 blue 19

5 pink 19

In a similar way to matrices, it is possible to change the values of the elements:

> table[3,"age"] <- 20; table # modify the element at 3rd row and column age to 20

name age sex colour

1 Jack 19 M yellow

2 Karan 20 M red

3 Thomas 20 M green

4 Vito 19 M blue

5 Kristine 19 F pink
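A few more access patterns are worth knowing. The sketch below rebuilds the original data under the name people (my own choice, since table() is also a base R function) and shows the $ operator, double brackets and row filtering.

```r
people <- data.frame(name = c("Jack", "Karan", "Thomas", "Vito", "Kristine"),
                     age = c(19, 20, 19, 19, 19),
                     sex = c("M", "M", "M", "M", "F"),
                     colour = c("yellow", "red", "green", "blue", "pink"))

# $ extracts a single column as a vector
print(people$age)
print(mean(people$age))

# [[ ]] extracts a column by name in the same way
print(people[["name"]])

# A logical condition in the row position keeps only the matching rows
older <- people[people$age > 19, ]
print(older)

# str() summarises the type of every column
str(people)
```

Note the comma in people[people$age > 19, ]: the condition goes in the row position, and the empty column position means "all columns".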

### Functions

There is a straightforward way of creating your own functions in R. Let’s consider an example where our function finds the difference between two numbers:

> example <- function (a, b) {

+ c <- a - b

+ c

+ }

> example(15, 1)

[1] 14

As shown above, the keyword function is used to declare a function in R, and the value of the last evaluated expression (here, c) is returned. Now we are going to create a function that prints the type and class of its argument:

> example <- function(X) {

+ print(typeof(X))

+ print(class(X))

+ print(paste("The type is", typeof(X), "and class is", class(X)))

+ }

> Y<-c("Vito")

> example(Y)

[1] "character"

[1] "character"

[1] "The type is character and class is character"

> Z<-c(11)

> example(Z)

[1] "double"

[1] "numeric"

[1] "The type is double and class is numeric"
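Functions can also have default argument values and an explicit return(). Here is a minimal sketch; the function name difference and the default of 0 are my own choices for illustration.

```r
# Hypothetical helper: a difference function with a default second argument
difference <- function(a, b = 0) {
  return(a - b)
}

print(difference(15, 1))          # both arguments given by position
print(difference(15))             # b falls back to its default value of 0
print(difference(b = 1, a = 15))  # named arguments can come in any order
```

Using return() is optional in this case, but it can make the intent clearer in longer functions.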

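If you want to take input from the user, use the function readline() in R:

> read.example <- function()

+ {

+ # read a line typed by the user (the prompt text is illustrative)

+ str <- readline(prompt = "Enter your name: ")

+ return(as.character(str))

+ }

> print(paste("Nice to meet you,", read.example(), "!"))

Enter your name: Dre

[1] "Nice to meet you, Dre !"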
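One thing to keep in mind is that readline() only waits for input in an interactive session; in a script run non-interactively it returns an empty string straight away. The sketch below is one possible way to handle that. The function name read_name and the fallback value "stranger" are assumptions of mine, not part of the original example.

```r
# Hypothetical helper: ask for a name, with a fallback when no input is available
read_name <- function(default = "stranger") {
  if (interactive()) {
    answer <- readline(prompt = "Enter your name: ")
  } else {
    answer <- ""   # readline() cannot wait for input in a non-interactive run
  }
  if (nchar(answer) == 0) default else answer
}

# paste0() joins strings without a separator
greeting <- paste0("Nice to meet you, ", read_name(), "!")
print(greeting)
```

paste() separates its arguments with a space by default, which is why the output above shows a space before the exclamation mark; paste0() avoids that.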

### Conditionals

To use conditional execution in R, we are going to use the if…else statement:

> x <- 4

> if (x < 0) {

+ print("It is a negative number!")

+ } else if (x > 0) {

+ print("It is a positive number!")

+ } else

+ print("Zero!")

[1] "It is a positive number!"

> x <- -10

> if (x < 0) {

+ print("It is a negative number!")

+ } else if (x > 0) {

+ print("It is a positive number!")

+ } else

+ print("Zero!")

[1] "It is a negative number!"

> x <- 0

> if (x < 0) {

+ print("It is a negative number!")

+ } else if (x > 0) {

+ print("It is a positive number!")

+ } else

+ print("Zero!")

[1] "Zero!"
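The if…else statement checks a single value at a time. To classify a whole vector in one go, the vectorised function ifelse() can be used, as in this short sketch (the variable name labels is my own):

```r
x <- c(4, -10, 0)

# ifelse() applies the test to every element of the vector
labels <- ifelse(x < 0, "negative",
                 ifelse(x > 0, "positive", "zero"))
print(labels)
```

This covers the same three cases as the examples above without repeating the if…else block for each value.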

### Loops

Now we are going to consider the control statements in R, such as for{}, repeat{} and while{}.

· A for{} loop in the example below is going to print the first three numbers in the vector Y:

> Y <- c(17, 25, 19, 33, 11, 51, 55)

> for(i in 1:3) {

+ print(Y[i])

+ }

[1] 17

[1] 25

[1] 19
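A for loop can also run over every position of a vector, or over its values directly. The sketch below assumes the same vector Y and adds up its elements; the names total and value are illustrative.

```r
Y <- c(17, 25, 19, 33, 11, 51, 55)

# seq_along(Y) yields 1, 2, ..., length(Y) and is safe even for an empty vector
total <- 0
for (i in seq_along(Y)) {
  total <- total + Y[i]
}
print(total)

# Looping over the values themselves instead of their positions
for (value in Y[1:3]) {
  print(value)
}

# The built-in sum() gives the same total without an explicit loop
print(sum(Y))
```

In practice, vectorised functions such as sum() are usually preferred over explicit loops when they exist.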

· In this example, a repeat{} loop prints the contents of task and breaks after 3 iterations:

> task <- c("R is great!")

> i <- 3

> repeat {

+ print(task)

+ i <- i + 1

+ if(i > 5) {

+ break

+ }

+ }

[1] "R is great!"

[1] "R is great!"

[1] "R is great!"

· A while{} loop is going to follow the commands as long as the condition is true:

> i <- 1

> while(i < 5) {

+ print(i)

+ i <- i + 1

+ }

[1] 1

[1] 2

[1] 3

[1] 4
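Inside any of these loops, break leaves the loop and next skips to the following iteration. Here is a small sketch combining both in a while loop; the limits and the vector name printed are chosen only for illustration.

```r
i <- 0
printed <- c()

while (i < 10) {
  i <- i + 1
  if (i %% 2 == 0) next   # skip even numbers
  if (i > 7) break        # leave the loop once i exceeds 7
  printed <- c(printed, i)
}

print(printed)
```

Only the odd numbers up to 7 are collected: even values are skipped by next, and the loop stops at 9 because of break.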

### Resources

There are many interesting resources online that can help you further with R, and I strongly recommend checking them out. In the next article, I am going to cover data visualisation, so stay tuned!

https://academy.microsoft.com/en-us/professional-program/