R List

WHAT IS A LIST
A list is an object that can store objects of any type and any dimension.
A list can hold a vector, matrix, array, data frame, or even another list. There is no fixed limit on its size. Accessing a list with single square brackets, [ ], returns a list. Accessing it with double square brackets, [[ ]], returns the element itself, in the form in which it was inserted (for example, a vector or a matrix).
METHOD OF CREATION
## A list can be created with the list() function.
l1<-list(name="Amy",num=1001,marks=c(70,75,80,68,79),mat=matrix(1:10,2,5))
l2<-list(name="Sam",num=1001,marks=c(56,78,76,69,89),mat=matrix(1:10,2,5))
l3<-list(name="Dan",num=1001,marks=c(69,86,75,87,65),mat=matrix(1:10,2,5))
## We have created three lists, each with four elements: name, num, marks and mat. name is a character string, num is a number, marks is a vector and mat is a matrix. A list works like a structure that stores different data types in one place. Now we can make a composite list consisting of the three lists above.
l<-list(Amy=l1,Sam=l2,Dan=l3)
ACCESSING LIST
l1[1] ## return first element of list l1 in the form of list.
$name
[1] "Amy"

l1[2] ## return second element of list l1 in the form of list.
$num
[1] 1001

l1[3] ## return third element of list l1 in the form of list.
$marks
[1] 70 75 80 68 79

## Suppose we want the first element of marks (marks is a vector). First we use [[ ]]: l1[[3]] returns the third element of the list as a plain vector. Adding another pair of square brackets with index 1 then returns the first element of that vector.
l1[[3]]   ## return third element of list in the form of vector.
[1] 70 75 80 68 79
l1[[3]][1]  ## return first element of third element of list in the form of vector.
[1] 70
l1[[4]]  ## return fourth element of list in the form of matrix.
       [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
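
The elements of a list can also be reached by name instead of by position. The following is a minimal sketch that rebuilds two of the lists from above so that it runs on its own. The $ operator and [["name"]] are standard R; the chained access into the composite list l is only one possible way to reach nested elements.

```r
# Rebuild two of the lists from above so the example is self-contained
l1 <- list(name = "Amy", num = 1001, marks = c(70, 75, 80, 68, 79), mat = matrix(1:10, 2, 5))
l2 <- list(name = "Sam", num = 1001, marks = c(56, 78, 76, 69, 89), mat = matrix(1:10, 2, 5))
l  <- list(Amy = l1, Sam = l2)

names(l1)             # names of the elements: "name" "num" "marks" "mat"
l1$marks              # same as l1[[3]]: the marks vector
l1[["marks"]][1]      # first mark: 70
class(l1["marks"])    # "list"    (single brackets keep the list wrapper)
class(l1[["marks"]])  # "numeric" (double brackets return the element itself)

# Nested access into the composite list
l$Sam$name            # "Sam"
l[["Amy"]]$mat[2, 3]  # row 2, column 3 of Amy's matrix: 6
```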


 
 

R timeseries data

WHAT IS TIME SERIES DATA
Any data recorded against time is time series data. For example, the sales figures for the 12 months of a year form a time series.
Time  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Data   45   67   89   34   12   56   78   89   91   92   68   72
TIME SERIES DATA CREATION
R has a special data structure for storing time series data. It is created with the ts() function.
ts(data, start, end, frequency)
data      : the observations, passed in the form of a vector
start     : time of the first observation
end       : time of the last observation (optional)
frequency : number of observations per unit of time (here, per year)
          : 1 for yearly data, i.e. 1 observation per year
          : 2 for half-yearly data, i.e. 2 observations per year
          : 4 for quarterly data, i.e. 4 observations per year
          : 12 for monthly data, i.e. 12 observations per year
          : 52 for weekly data, i.e. 52 observations per year
          : 24 for data recorded every 15 days, i.e. 24 observations per year
          : 365 for daily data, i.e. 365 observations per year
EXAMPLES
1) ts(1:10,start=2000, frequency=1) ## yearly data
Time Series:
Start = 2000
End = 2009
Frequency = 1
 [1]  1  2  3  4  5  6  7  8  9 10

2) ts(1:12,start=2000, frequency=4) ## quarterly data
     Qtr1 Qtr2 Qtr3 Qtr4
2000    1    2    3    4
2001    5    6    7    8
2002    9   10   11   12

3) ts(1:12,start=2000, frequency=12 ) ## monthly data
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2000   1   2   3   4   5   6   7   8   9  10  11  12

4) ts(1:24,start=2000, frequency=24) ## data recorded every 15 days (24 per year)
Time Series:
Start = c(2000, 1)
End = c(2000, 24)
Frequency = 24
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

5) ts(1:52,start=2000, frequency=52) ## weekly data
Time Series:
Start = c(2000, 1)
End = c(2000, 52)
Frequency = 52
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
[38] 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52
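
The monthly sales table from the start of this section can be stored the same way. This is a minimal sketch: the year 2020 is an assumed placeholder, since the table does not say which year the sales belong to. start(), end(), frequency() and window() are base R functions for inspecting and slicing a ts object.

```r
# Monthly sales figures from the table above
sales <- c(45, 67, 89, 34, 12, 56, 78, 89, 91, 92, 68, 72)

# The year 2020 is an assumption; the source table gives no year
sales_ts <- ts(sales, start = c(2020, 1), frequency = 12)

start(sales_ts)      # 2020 1  (year, period within the year)
end(sales_ts)        # 2020 12
frequency(sales_ts)  # 12

# window() extracts a sub-period, here April to June
q2 <- window(sales_ts, start = c(2020, 4), end = c(2020, 6))
q2                   # 34 12 56
```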




R Data Frame

WHAT IS A DATA FRAME
A data frame is a two-dimensional data structure in which each column can hold a different type of data: numeric, integer, character, logical or complex. Functions that load a data file into R, such as read.table(), return a data frame. A data frame gives R a table-like structure similar to a table in a relational database, but R does not enforce any key constraints.
DATA FRAME CREATION METHOD
## Method 1: By function data.frame
child<-c("Joe","Amy","John")
age<-c(8,9,10)
class<-c(4,5,6)
childdata<-data.frame(child,age,class,stringsAsFactors=FALSE)
childdata
  child age class
1   Joe   8     4
2   Amy   9     5
3  John  10     6
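
Once a data frame exists, a few base R functions help to inspect and extend it. The sketch below rebuilds childdata so that it runs on its own; the column age_next_year is a hypothetical name introduced only for illustration.

```r
child <- c("Joe", "Amy", "John")
age   <- c(8, 9, 10)
class <- c(4, 5, 6)
childdata <- data.frame(child, age, class, stringsAsFactors = FALSE)

str(childdata)                 # compact summary: column names, types and first values
nrow(childdata)                # 3 rows
ncol(childdata)                # 3 columns
is.character(childdata$child)  # TRUE, because stringsAsFactors = FALSE

# Add a derived column computed from an existing one (hypothetical column name)
childdata$age_next_year <- childdata$age + 1

# Rows can be filtered with a logical condition on a column
older <- childdata[childdata$age > 8, "child"]
older                          # "Amy" "John"
```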


## Method 2: By loading a data file.
## Download ozone.csv from the following link and save it in location C:/R with the name ozone.csv. Load this data into R using the following code.
airquality<-read.table("C:/R/ozone.csv",header=TRUE, sep=",")
## We will get a data frame with the name airquality.
ACCESSING DATA FRAME
A data frame is indexed like a matrix, with two dimensions. The only difference is that a matrix can store a single type of data, while each column of a data frame can store a different type.
airquality[1,1]  ## returns the 1st element of the 1st row of the data frame.
airquality[1, ]   ## returns the 1st row of the data frame.
airquality[ ,1]   ## returns the 1st column of the data frame.
airquality[1:2,1:4] ## returns the first two rows and first four columns of the data frame.
COLUMN NAMES OF DATA FRAME
When loading data into R, set the flag header=TRUE if the first row of the file contains the column names. Each column of the data frame can then be accessed by placing $ between the data frame name and the column name.
airquality$Ozone  ## returns the Ozone column of data
airquality$Solar.R  ## returns the Solar.R column of data
airquality$Temp     ## returns the Temp column of data
QUESTIONS-1
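1) Extract the first two rows of the data frame.
airquality[1:2,]
2) How many observations are in this data frame?
nrow(airquality)   ## dim(airquality) returns both the number of rows and the number of columns
3) What is the value of Ozone in the 47th row?
airquality$Ozone[47]
4) Extract the rows where the Ozone value is above 31 and the Temp value is above 90.
airquality[airquality$Temp>90 & airquality$Ozone>31,1:6]   ## use the element-wise &, not &&
5) Take the mean of Solar.R, using the function mean.
mean(airquality$Solar.R)   ## add na.rm=TRUE if the column contains missing values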
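
Question 4 needs the element-wise operator &, and columns with missing values need extra care in questions 4 and 5. The sketch below assumes that ozone.csv has the same columns as the airquality data set that ships with R. This is an assumption, since the file itself is not shown here. The built-in copy is stored under the hypothetical name aq so that it does not overwrite the data frame loaded above.

```r
# Built-in data set, assumed to have the same columns as ozone.csv
aq <- datasets::airquality

nrow(aq)   # number of observations

# & compares element by element and returns one TRUE/FALSE/NA per row.
# && expects single values; recent R versions raise an error for longer vectors.
keep <- aq$Temp > 90 & aq$Ozone > 31

# which() drops the NA entries, so rows with a missing Ozone value are not returned
hot <- aq[which(keep), ]
hot

# mean() returns NA if any value is missing, unless na.rm = TRUE is set
mean(aq$Solar.R)                # NA
mean(aq$Solar.R, na.rm = TRUE)  # mean of the non-missing values
```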
QUESTIONS-2
Download another file, crime.csv, from the link.
This file has robbery and murder data for the 50 states of the U.S. for the year 2005.
crime<-read.table("C:/R/crime.csv",header=TRUE, sep=",")
## R is case sensitive, so the data frame must be referred to as crime, not Crime.
1) Extract the rows where population>5000000.
crime[crime$Population>5000000,]
2) Extract the names of the states where murder>6.
crime[crime$Murder>6,1]
3) Extract the names of the states where the number of murders is between 3 and 6.
crime[crime$Murder>3 & crime$Murder<6,1]
4) Extract the average murder rate of all states where population>5000000.
mean(crime[crime$Population>5000000,"Murder"])
5) The name of the state with the maximum number of murders.
crime[crime$Murder==max(crime$Murder),1]
6) The name of the state with the maximum number of robberies.
crime[crime$Robbery==max(crime$Robbery),1]
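
The same queries can be tried without the file. The following is only a sketch on a made-up data frame: the state labels and all numbers are invented for illustration and are not taken from crime.csv, and the column name State is an assumption. It also shows subset() and which.max() as alternatives to the bracket expressions above.

```r
# Toy data, invented for illustration only (not the real crime.csv values)
crime_toy <- data.frame(
  State      = c("A", "B", "C", "D"),
  Population = c(6000000, 4000000, 9000000, 1000000),
  Murder     = c(5.0, 7.2, 3.1, 6.5),
  Robbery    = c(120, 95, 150, 60),
  stringsAsFactors = FALSE
)

# Rows with population above 5 million
big <- subset(crime_toy, Population > 5000000)
big$State                    # "A" "C"

# Average murder value for those rows
mean(big$Murder)             # (5.0 + 3.1) / 2 = 4.05

# States with murder between 3 and 6
mid <- crime_toy$State[crime_toy$Murder > 3 & crime_toy$Murder < 6]
mid                          # "A" "C"

# which.max() returns the position of the (first) maximum
top_robbery <- crime_toy$State[which.max(crime_toy$Robbery)]
top_robbery                  # "C"
```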


R Array

An array is an object that stores multi-dimensional data in R. An array with two dimensions is equivalent to a matrix, and an array with one dimension behaves like a vector. In a three-dimensional array, the first dimension is the number of rows, the second is the number of columns and the third is the number of matrices. So we can think of a three-dimensional array as a collection of matrices with the same number of rows and columns.

ARRAY CREATION METHOD

 ## Method-1: Use the function array(). It needs two inputs: the data, and the dimensions of the array passed in the form of a vector. dimnames is optional and is used for passing dimension names.
a<-array(1:40, c(2,2,10), dimnames=list(c("A","B"),c("Science","Maths"),as.character(2001:2010)))
a<-array(c(1,2,3,4,5,6,7,8), c(2,2,2))
 ## Method-2: Pass a matrix as the data.
m<-matrix(1:8,4,2)
a<-array(m,c(2,2,2))
 ## Method-3: Change the dimensions of a vector.
v<-c(1,2,3,4,5,6,7,8,9,10,11,12)
dim(v)<-c(2,2,3)
 ## Method-4: Create a blank array (every element is NA).
v<-array(NA,c(2,2,2))
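
Elements of an array are accessed with one index per dimension, in the order row, column, matrix. A minimal sketch on a small array follows; leaving an index empty selects everything along that dimension.

```r
a <- array(1:12, c(2, 2, 3))   # 2 rows, 2 columns, 3 matrices

a[1, 2, 3]      # row 1, column 2 of the third matrix: 11
a[, , 1]        # the whole first matrix (2 x 2)
a[2, , ]        # row 2 of every matrix, returned as a 2 x 3 matrix
dim(a[, , 1])   # 2 2
```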

ARRAY ATTRIBUTES


a<-array(1:32,c(2,4,4))
dim(a)                                   ## return the dimensions of array
dim(a)<-c(2,2,8)                   ## update the dimensions of array (the total number of elements must stay the same)
rownames(a)<-c("Amy","Ben") ## update the rownames of an array
colnames(a)<-c("A","B")           ## update the colnames of an array
dimnames(a)<-list(c("A","B"),c("1","2"),2001:2008) ## update the names of all dimensions, i.e. rownames, colnames, matrixnames

ARRAY OPERATIONS

a<-array(1:8,c(2,2,2))
b<-array(9:16,c(2,2,2))
a+b    ## element-wise addition of two arrays
a-b    ## element-wise subtraction of two arrays
a*b    ## element-wise multiplication of two arrays
a/b    ## element-wise division of two arrays
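
These operators work element by element, so both arrays should have the same dimensions. The sketch below checks one element by hand and adds apply(), a base R function not used above, as one possible way to summarise an array along chosen dimensions.

```r
a <- array(1:8, c(2, 2, 2))
b <- array(9:16, c(2, 2, 2))

s <- a + b     # each element of a is added to the element of b at the same position
s[1, 1, 1]     # 1 + 9 = 10
dim(s)         # 2 2 2, the same shape as a and b

a * 2          # a single number is applied to every element

# apply() summarises along the chosen dimensions
apply(a, 3, sum)         # sum of each matrix: 10 26
apply(a, c(1, 2), sum)   # sum across matrices for each row/column position
```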


R Matrix

What are Matrices?

Matrices are another type of R object that arranges data in a two-dimensional layout. They are like a mathematical matrix with a defined number of rows and columns. A matrix can store numeric, character, logical, integer or complex data, but all of its elements must be of the same type.

 MATRIX CREATION METHOD

## Method 1. Use the function matrix(). This function needs three inputs. The first is the data, which is passed in the form of a vector. The second is the number of rows and the third is the number of columns.
m<-matrix(c(1,2,3,4),nrow=2, ncol=2)
m<-matrix(1:4 ,nrow=2, ncol=2)
m<-matrix(seq(1,4,by=1),nrow=2, ncol=2)
v<-c(1,2,3,4)
m<-matrix(v ,nrow=2, ncol=2)

     [,1] [,2]
[1,]    1    3
[2,]    2    4
Each of the calls above creates a matrix of 2 rows and 2 columns.
## Method 2. Change the dimensions of a vector. A vector is a one-dimensional set of data. If we change its dimensions, it takes the form of a matrix.
v<-c(1,2,3,4,5,6)
dim(v)<-c(2,3)
## Method 3. Use the cbind() or rbind() function.
v1<-c(1,2,3,4)
v2<-c(5,6,7,8)
Now these two vectors can be bound column-wise or row-wise to form a matrix.
cbind(v1,v2)
     v1 v2
[1,]  1  5
[2,]  2  6
[3,]  3  7
[4,]  4  8
rbind(v1,v2)
   [,1] [,2] [,3] [,4]
v1    1    2    3    4
v2    5    6    7    8
## Method 4. Create a blank matrix and update it when needed.
m<-matrix(,nrow=2,ncol=2)
m
     [,1] [,2]
[1,]   NA   NA
[2,]   NA   NA
Now we can update the matrix.
m[1,1]=1; m[1,2]=2
m[2,1]=3; m[2,2]=4

MATRIX ATTRIBUTES

## The dim function gives the dimensions of a matrix
m<-matrix(1:16,4,4)
dim(m)                                                  ## return dimensions of matrix
[1]  4 4
dim(m)<-c(8,2)                                     ## update dimensions of matrix to 8 rows and 2 columns
dim(m)<-c(4,4)                                     ## change back to 4 rows and 4 columns
colnames(m)<-c("Q1","Q2","Q3","Q4") ## update column names of matrix m
colnames(m)                                         ## return column names of matrix m
rownames(m)<-c(1,2,3,4)                     ## update row names of matrix m
rownames(m)                                        ## return row names of matrix m


ACCESSING AND MODIFYING MATRIX

## An element of a matrix can be accessed or modified by its row number and column number
m[1,1]   ## return the first element of the first row of matrix m
m[1,]    ## return all elements of the first row of the matrix
m[,1]    ## return all elements of the first column of the matrix
m[2,3:4] ## return the third and fourth elements of the second row
m[2:3,1]  ## return the second and third elements of the first column
m[,]         ## return all the elements of the matrix

 MATRIX OPERATIONS  

## Arithmetic on matrices
m1<-matrix(1:4,2,2)
m2<-matrix(5:8,2,2)
m1+m2                   ## element-wise addition of matrices
m2-m1                    ## element-wise subtraction of matrices
m1*m2                   ## element-wise product of matrices
m2/m1                    ## element-wise division of matrices
m1%*%m2             ## matrix multiplication
t(m1)                       ## transpose of matrix
diag(m1)                  ## diagonal of matrix
eigen(m1)                ## eigenvalues and eigenvectors of matrix
det(m1)                    ## determinant of matrix
sum(diag(m1))         ## trace of matrix (base R has no tr() function)
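
The difference between * and %*% is easy to miss, so here is a small worked sketch with 2 x 2 matrices. The variable names elementwise and matprod are only illustrative.

```r
m1 <- matrix(1:4, 2, 2)
m2 <- matrix(5:8, 2, 2)

elementwise <- m1 * m2     # multiplies matching positions
matprod     <- m1 %*% m2   # rows of m1 times columns of m2

elementwise
#      [,1] [,2]
# [1,]    5   21
# [2,]   12   32

matprod
#      [,1] [,2]
# [1,]   23   31
# [2,]   34   46

sum(diag(m1))   # trace: 1 + 4 = 5
det(m1)         # 1*4 - 3*2 = -2
```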

SOLVING EQUATIONS BY MATRIX 

x+2y=7
3x+y=11
##create a matrix of coefficients of x,y
a=matrix(c(1,3,2,1),2,2) 
b=matrix(c(7,11),2,1)
solve(a,b)
When b is not passed, solve(a) returns the inverse of a.
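
A quick way to check the result is to multiply the coefficient matrix by the solution and compare it with the right-hand side. The sketch below does this for the two equations above; the name sol is only illustrative.

```r
a <- matrix(c(1, 3, 2, 1), 2, 2)   # coefficients, filled column by column
b <- matrix(c(7, 11), 2, 1)        # right-hand sides

sol <- solve(a, b)
sol               # x = 3, y = 2

a %*% sol         # reproduces b: 7 and 11
solve(a) %*% b    # same solution, computed through the inverse of a
```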

QUESTIONS

## Q1 Give the general expression to create a matrix in R.
The general expression to create a matrix in R is - matrix(data, nrow, ncol, byrow, dimnames)
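
The byrow and dimnames arguments are not used elsewhere in this post, so here is a minimal sketch of what they do. The row and column names are made up for illustration.

```r
# Default: values fill the matrix column by column
m_col <- matrix(1:6, nrow = 2, ncol = 3)
m_col
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# byrow = TRUE fills row by row; dimnames gives row and column names
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
                dimnames = list(c("r1", "r2"), c("a", "b", "c")))
m_row
#    a b c
# r1 1 2 3
# r2 4 5 6

m_row["r2", "b"]   # access by name: 5
```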

## Q2 How do you access the element in the 2nd column and 4th row of a matrix named M?
The expression M[4,2] gives the element at 4th row and 2nd column.
## Q3 The sales percentages of two branches for 4 weeks are as follows (each week starts on Monday and ends on Sunday).
1)40,45,34,67,56,87,45,23,45,27,37,87,98,45,25,35,54,56,76,84,65,35,56,45,67,67,77,87
2)34,37,39,41,45,49,51,46,45,49,52,55,58,60,67,55,54,58,65,69,70,74,75,65,64,68,69,74
3.1 Find the average, max and min sales of both stores.
3.2 Which day was the best and which the worst for each store?
3.3 Find the weekly average, weekly min and weekly max sales of both stores.
3.4 Give the average sales of both stores in the form of a matrix.
3.5 Which store performed better on each day? The answer should be a matrix with value 1 or 2.
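
One possible way to approach Q3 is sketched below. It is not the only reading of the questions: it assumes that each store's 28 values are stored as a matrix with days as rows and weeks as columns, that "best day" means the day with the highest average over the four weeks, and that ties in 3.5 go to store 2. All variable names are illustrative.

```r
store1 <- c(40,45,34,67,56,87,45,23,45,27,37,87,98,45,25,35,54,56,76,84,65,35,56,45,67,67,77,87)
store2 <- c(34,37,39,41,45,49,51,46,45,49,52,55,58,60,67,55,54,58,65,69,70,74,75,65,64,68,69,74)

days  <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
weeks <- paste0("Week", 1:4)

# Rows are days, columns are weeks (values fill column by column)
s1 <- matrix(store1, nrow = 7, ncol = 4, dimnames = list(days, weeks))
s2 <- matrix(store2, nrow = 7, ncol = 4, dimnames = list(days, weeks))

# 3.1 overall average, max and min of each store
c(mean(s1), max(s1), min(s1))
c(mean(s2), max(s2), min(s2))

# 3.2 best and worst day, taken here as highest and lowest average over the weeks
days[which.max(rowMeans(s1))]; days[which.min(rowMeans(s1))]
days[which.max(rowMeans(s2))]; days[which.min(rowMeans(s2))]

# 3.3 weekly average, min and max (shown for store 1; store 2 is analogous)
colMeans(s1); apply(s1, 2, min); apply(s1, 2, max)

# 3.4 weekly averages of both stores as a 2 x 4 matrix
avg <- rbind(store1 = colMeans(s1), store2 = colMeans(s2))

# 3.5 better store per day: 1 if store 1 is higher, otherwise 2 (ties go to 2)
better <- ifelse(s1 > s2, 1, 2)
better
```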



Monte Carlo Simulation with R

Stochastic Modeling A stochastic model is a tool for modeling data where uncertainty is present with the input. When input has cert...