Data used for statistical analysis are usually organized as vectors or matrices. Here, we will learn how to create simple vectors (or arrays) and matrices, and how to perform some basic operations with them.
Here are a couple of simple examples for creating and storing vectors in a variable. The first vector \(x\) is created by typing in a set of values and the second vector \(y\) is created using the seq() function.
# To create a vector of elements
x<- c(1, 3, 5.6, 7, 8.9, 10, 2, 1.1, 8, 7)
y <- seq(from = 1, to = 20, length.out = 10) # creates a vector of 10 equally spaced values from 1 to 20
print(x)
## [1] 1.0 3.0 5.6 7.0 8.9 10.0 2.0 1.1 8.0 7.0
print(y)
## [1] 1.000000 3.111111 5.222222 7.333333 9.444444 11.555556 13.666667
## [8] 15.777778 17.888889 20.000000
Note: The seq() function is one of the many built-in functions available in R. This function creates a vector of equally spaced values of a given length between two endpoints. Help on any function in R can be accessed using the help() command, as illustrated below.
help(seq)
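As a small illustrative sketch (not part of the original examples), a few other built-in ways of building regular vectors are shown here. The variable names a, b and r are placeholders chosen only for this illustration.

```r
# Other built-in ways to build regular vectors
a <- 1:5                               # consecutive integers from 1 to 5
b <- seq(from = 0, to = 1, by = 0.25)  # fix the step size instead of the length
r <- rep(c(1, 2), times = 3)           # repeat the pattern (1, 2) three times

print(a)
print(b)
print(r)

# length() reports how many elements a vector holds
print(length(b))  # 5
```

Using by fixes the spacing and lets R work out the length, whereas length.out fixes the length and lets R work out the spacing.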
Some common functions applicable to vectors are illustrated below.
#sum
sum(x)
## [1] 53.6
#simple mean or average
mean(x)
## [1] 5.36
# standard deviation
sd(x)
## [1] 3.34139
#minimum
min(x)
## [1] 1
#maximum
max(x)
## [1] 10
# Percentile or quantile: loosely speaking, the 80th percentile is a value q such that
# 80 percent of the values in the vector are less than or equal to q. If no element
# matches exactly, the quantile is computed by interpolation. Use help(quantile) for more details.
quantile(x, probs = 0.8)
## 80%
## 8.18
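To make the interpolation step concrete, here is a minimal sketch that reproduces the value by hand. It assumes R's default quantile method (listed as type 7 in help(quantile)); the helper names xs, h, lo, hi and q_manual are introduced only for this illustration.

```r
x <- c(1, 3, 5.6, 7, 8.9, 10, 2, 1.1, 8, 7)
p <- 0.8

xs <- sort(x)          # values in increasing order
n  <- length(xs)
h  <- (n - 1) * p + 1  # fractional position of the quantile (8.2 here)
lo <- floor(h)         # position just below h
hi <- ceiling(h)       # position just above h

# linear interpolation between the two neighbouring sorted values
q_manual <- xs[lo] + (h - lo) * (xs[hi] - xs[lo])

print(q_manual)
print(unname(quantile(x, probs = p)))  # should agree with q_manual
```

Here the 8th and 9th sorted values are 8 and 8.9, so the result is 8 + 0.2 * 0.9 = 8.18.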
Performing logical operations on vectors can be very useful for many computations. Here are a few illustrations.
# To check a particular condition on every element of a vector
print(x)
## [1] 1.0 3.0 5.6 7.0 8.9 10.0 2.0 1.1 8.0 7.0
print(y)
## [1] 1.000000 3.111111 5.222222 7.333333 9.444444 11.555556 13.666667
## [8] 15.777778 17.888889 20.000000
z1 <- (x>2.5)
print(z1)
## [1] FALSE TRUE TRUE TRUE TRUE TRUE FALSE FALSE TRUE TRUE
z2<- (x>2.5)*1
print(z2)
## [1] 0 1 1 1 1 1 0 0 1 1
# How many elements of x are less than 5?
sum(x<5)
## [1] 4
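Logical vectors can also be used to pick out elements. The following sketch shows a few common idioms under the same definition of x; the names big, pos, frac and both are illustrative only.

```r
x <- c(1, 3, 5.6, 7, 8.9, 10, 2, 1.1, 8, 7)

big  <- x[x > 2.5]           # keep only the values greater than 2.5
pos  <- which(x > 2.5)       # positions of those values: 2 3 4 5 6 9 10
frac <- mean(x < 5)          # proportion of values less than 5: 0.4
both <- sum(x > 2.5 & x < 8) # count of values satisfying both conditions: 4

print(big)
print(pos)
print(frac)
print(both)
```

This works because TRUE and FALSE are treated as 1 and 0 in arithmetic, so sum() counts matches and mean() gives their proportion.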
If we have two vectors of the same length, then we can perform element-wise addition, subtraction, multiplication, division, exponentiation and also logical operations as shown below.
#element-wise addition of two vectors
print(x+y)
## [1] 2.000000 6.111111 10.822222 14.333333 18.344444 21.555556 15.666667
## [8] 16.877778 25.888889 27.000000
#element-wise subtraction of two vectors
print(x-y)
## [1] 0.0000000 -0.1111111 0.3777778 -0.3333333 -0.5444444 -1.5555556
## [7] -11.6666667 -14.6777778 -9.8888889 -13.0000000
#element-wise multiplication of two vectors
print(x * y)
## [1] 1.000000 9.333333 29.244444 51.333333 84.055556 115.555556
## [7] 27.333333 17.355556 143.111111 140.000000
#element-wise division of two vectors
print(x/y)
## [1] 1.00000000 0.96428571 1.07234043 0.95454545 0.94235294 0.86538462
## [7] 0.14634146 0.06971831 0.44720497 0.35000000
#element-wise exponentiation
print(x^y)
## [1] 1.000000e+00 3.050544e+01 8.076146e+03 1.575381e+06 9.256806e+08
## [6] 3.593814e+11 1.300399e+04 4.498675e+00 1.429804e+16 7.979227e+16
# To check a particular condition element-wise between two vectors
z3<- (x>=y)*1
print(z3)
## [1] 1 0 1 0 0 0 0 0 0 0
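The same element-wise rules apply when one operand is a single number: R reuses that number for every element. As a brief sketch, this can be used to centre and standardize a vector; the names centered and z are illustrative only.

```r
x <- c(1, 3, 5.6, 7, 8.9, 10, 2, 1.1, 8, 7)

print(x * 2)                 # every element doubled

centered <- x - mean(x)      # subtract the mean from every element
z <- centered / sd(x)        # divide every element by the standard deviation

print(round(mean(z), 10))    # standardized values have mean 0
print(sd(z))                 # and standard deviation 1
```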
Matrices can be created by binding vectors together as rows or as columns. They can also be created by first creating a placeholder array of the appropriate dimensions and then specifying the value of each entry. For matrices of the same dimensions (i.e. the same number of rows and the same number of columns), element-wise addition, subtraction, multiplication, division and logical operations can be performed just as for vectors. Matrix multiplication must be done using the %*% operator.
#creating matrix by binding columns
A<- cbind( c(1,2,3), c(4,5,6))
#creating matrix by binding rows
B<- rbind( c(2,3), c(5,6), c(9,10))
print(A)
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
print(B)
## [,1] [,2]
## [1,] 2 3
## [2,] 5 6
## [3,] 9 10
# Accessing an element of a matrix, e.g. the element in the 3rd row and 2nd column of A
A[3,2]
## [1] 6
#Accessing the 2nd column of A
v<- A[,2]
print(v)
## [1] 4 5 6
# Accessing 3rd row of A
w<- A[3,]
print(w)
## [1] 3 6
# Accessing all rows except the 2nd row of A
print(A[-2,])
## [,1] [,2]
## [1,] 1 4
## [2,] 3 6
#elementwise addition
print(A+B)
## [,1] [,2]
## [1,] 3 7
## [2,] 7 11
## [3,] 12 16
#elementwise multiplication
print(A*B)
## [,1] [,2]
## [1,] 2 12
## [2,] 10 30
## [3,] 27 60
#elementwise division
print(A/B)
## [,1] [,2]
## [1,] 0.5000000 1.3333333
## [2,] 0.4000000 0.8333333
## [3,] 0.3333333 0.6000000
#elementwise exponentiation
print(A^B)
## [,1] [,2]
## [1,] 1 64
## [2,] 32 15625
## [3,] 19683 60466176
# element wise logical operations
print((A>B))
## [,1] [,2]
## [1,] FALSE TRUE
## [2,] FALSE FALSE
## [3,] FALSE FALSE
#creating matrix using array() command
C<- array( c(1,2,3,4,5,6), dim=c(2,3) )
print(C)
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
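The approach of creating a placeholder array first and filling in the entries afterwards can be sketched as follows. The matrix M and the rule used to fill it are hypothetical and serve only to illustrate the indexing.

```r
# Placeholder 2 x 3 array filled with zeros
M <- array(0, dim = c(2, 3))

# Fill every entry using its row index i and column index j
for (i in 1:nrow(M)) {
  for (j in 1:ncol(M)) {
    M[i, j] <- i + 10 * j   # illustrative rule, e.g. M[2, 3] becomes 32
  }
}
print(M)

# Individual entries can be overwritten afterwards
M[1, 3] <- -1
print(M)
```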
# Matrix multiplication
#dimension of A
print(dim(A))
## [1] 3 2
#dimension of C
print(dim(C))
## [1] 2 3
print(A%*%C)
## [,1] [,2] [,3]
## [1,] 9 19 29
## [2,] 12 26 40
## [3,] 15 33 51
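As a short sketch of what %*% computes, each entry of the product is the sum of products of a row of the first matrix with a column of the second. The names P and entry are illustrative only.

```r
A <- cbind(c(1, 2, 3), c(4, 5, 6))
C <- array(c(1, 2, 3, 4, 5, 6), dim = c(2, 3))

# The product is defined only when the columns of A match the rows of C
print(ncol(A) == nrow(C))

P <- A %*% C

# Entry (2, 3) of the product: row 2 of A times column 3 of C
entry <- sum(A[2, ] * C[, 3])   # 2*5 + 5*6 = 40
print(entry == P[2, 3])

# Reversing the order gives a different matrix, here of dimension 2 x 2
print(dim(C %*% A))
```

Note that A * C would fail here, because element-wise multiplication requires both matrices to have the same dimensions.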
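Summary operations such as mean(), sum(), sd() and prod() can be applied to all elements of a matrix at once. The same operations can also be applied row by row or column by column using apply(), where MARGIN=1 refers to rows and MARGIN=2 refers to columns, as illustrated below.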
D<- cbind(c(1,2,3), c(2,5,6), c(3,6,10))
print(D)
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 2 5 6
## [3,] 3 6 10
#sum of all entries
print(sum(D))
## [1] 38
#product of all entries
print(prod(D))
## [1] 64800
# sum by columns
apply(D,MARGIN=2,FUN=sum)
## [1] 6 13 19
# product of rows
apply(D, MARGIN=1, FUN=prod)
## [1] 6 60 180
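For sums and means, R also provides dedicated built-in shortcuts that give the same results as the corresponding apply() calls. A brief sketch, using the same matrix D:

```r
D <- cbind(c(1, 2, 3), c(2, 5, 6), c(3, 6, 10))

print(colSums(D))    # same as apply(D, MARGIN = 2, FUN = sum): 6 13 19
print(rowSums(D))    # same as apply(D, MARGIN = 1, FUN = sum)
print(colMeans(D))   # same as apply(D, MARGIN = 2, FUN = mean)

# apply() remains the general tool, e.g. the maximum of each row
print(apply(D, MARGIN = 1, FUN = max))   # 3 6 10
```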
Standard matrix functions such as the transpose, inverse, determinant and eigenvalue decomposition can be computed as follows. The transpose is defined for any matrix; the other operations require a square matrix.
# transpose
tr_A <- t(A)
#inverse of a square matrix
D_inv<- solve(D)
print(D)
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 2 5 6
## [3,] 3 6 10
#check that product of D and D inverse gives the identity matrix
print(round(D%*%D_inv,1))
## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 0 1 0
## [3,] 0 0 1
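The solve() function can also be given a right-hand side, in which case it solves the linear system directly. The sketch below uses a hypothetical vector b, chosen as the row sums of D so that the solution is known in advance to be a vector of ones.

```r
D <- cbind(c(1, 2, 3), c(2, 5, 6), c(3, 6, 10))
b <- c(6, 13, 19)          # hypothetical right-hand side (row sums of D)

# Solve D %*% s = b for s without forming the inverse explicitly
s <- solve(D, b)
print(round(s, 6))         # 1 1 1

# Same answer via the inverse
print(round(as.vector(solve(D) %*% b), 6))
```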
# diagonal elements of a square matrix
diag(D)
## [1] 1 5 10
# Trace of a square matrix: i.e. sum of diagonal elements
sum(diag(D))
## [1] 16
# determinant of a square matrix
print(det(D))
## [1] 1
# eigenvalue decomposition
out <- eigen(D)
# eigenvalues
out$values
## [1] 14.93303437 1.00000000 0.06696563
# eigenvectors (one per column, in the same order as the eigenvalues)
round(out$vectors, 4)
## [,1] [,2] [,3]
## [1,] -0.2505 0.0000 0.9681
## [2,] -0.5370 -0.8321 -0.1390
## [3,] -0.8055 0.5547 -0.2084
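A few sanity checks on the decomposition can be sketched as follows. They rely on D being symmetric, in which case the eigenvectors returned by eigen() are orthonormal; the names V, lambda and D_rebuilt are illustrative only.

```r
D <- cbind(c(1, 2, 3), c(2, 5, 6), c(3, 6, 10))
out <- eigen(D)
V <- out$vectors
lambda <- out$values

# Each eigenpair satisfies D v = lambda v (checked here for the first pair)
print(all.equal(as.vector(D %*% V[, 1]), lambda[1] * V[, 1]))

# For a symmetric matrix, D can be rebuilt as V diag(lambda) V^T
D_rebuilt <- V %*% diag(lambda) %*% t(V)
print(all.equal(D, D_rebuilt))

# Eigenvalues sum to the trace and multiply to the determinant
print(all.equal(sum(lambda), sum(diag(D))))
print(all.equal(prod(lambda), det(D)))
```

all.equal() is used instead of == because the results agree only up to floating-point rounding.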