Connect R with Google Analytics

Analysts use Google Analytics (GA) for various purposes, such as finding out who is accessing their websites and at what time of day, or which keywords are most often entered in a site's search box. It is very helpful for analysts and other professionals if they can import data from GA directly into R for further analysis.
Method 1

install.packages("RGoogleAnalytics")
library(RGoogleAnalytics)

## Authentication need not be repeated in every session: the token is created once and saved in the working directory of R on your computer

token <- Auth(client.id="client Id",client.secret="Client Secret")
save(token,file="token_file")
## In future sessions the token can be loaded as follows
load("./token_file")
ValidateToken(token)
## Dates are written as YYYY-MM-DD, with two-digit month and day
query.list <- Init(start.date = "2017-05-30",
                   end.date = "2017-05-31",
                   dimensions = "ga:date,ga:hour",
                   metrics = "ga:sessions,ga:pageviews",
                   max.results = 100000,
                   sort = "-ga:date",
                   table.id = "ga:table.id")
## The table ID is in the URL of your Google Analytics page. It is everything after the "p" in the URL, and it is passed to table.id with the "ga:" prefix. Example:
## https://www.google.com/analytics/web/?hl=en#management/Setting/a48963421w80588688pTABLE_ID_NUMBER

ga.query <- QueryBuilder(query.list)
ga.data <- GetReportData(ga.query, token, split_daywise = T, delay = 5)

The data is saved in the data frame ga.data.
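Once the data frame is available it can be summarised like any other. The sketch below does not call Google Analytics. It uses a small made-up data frame (ga.mock, with illustrative values only) and assumes the returned columns are named after the requested dimensions and metrics without the "ga:" prefix.

```r
# Made-up stand-in for ga.data; column names are an assumption, values are illustrative
ga.mock <- data.frame(
  date      = c("20170530", "20170530", "20170531", "20170531"),
  hour      = c("09", "10", "09", "10"),
  sessions  = c(120, 150, 100, 170),
  pageviews = c(300, 420, 260, 480)
)

# Total sessions per hour of day across both dates
sessions.by.hour <- aggregate(sessions ~ hour, data = ga.mock, FUN = sum)

# Hour of day with the most sessions
busiest.hour <- sessions.by.hour$hour[which.max(sessions.by.hour$sessions)]
print(sessions.by.hour)
print(busiest.hour)
```

The same aggregate() call should work on ga.data if its columns carry these names.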

 
 

R connectivity with Oracle

R can be connected to different databases such as Oracle, Teradata and Netezza.
Here I explain how to connect to Oracle.

How to connect R with Oracle


##Step 1: Install the RJDBC package in R

install.packages('RJDBC')
library(RJDBC)

##Step 2: Download the Oracle JDBC driver.
##Go to http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html.
##Download the ojdbc6.jar file and place it in a permanent directory.

##Step 3: Create a Driver Object in R. 

jdbcDriver =JDBC("oracle.jdbc.OracleDriver",classPath="/directory/ojdbc6.jar")

##Step 4: Create a connection to the Oracle database.
##Use "@//host:port/service_name" to connect by service name, or "@host:port:sid" to connect by SID.
con <- dbConnect(jdbcDriver, "jdbc:oracle:thin:@//database.hostname.com:port/service_name", "username", "password")

##Step 5: Run Oracle SQL Query.
##dbReadTable: read a table into a data frame

df1=dbReadTable(con,'PC_ITEM')

# dbGetQuery: read the result from a SQL statement to a data frame

df2=dbGetQuery(con,'select * from tabl where to_number(colname)<10')

# dbWriteTable: write a data frame to the schema. It is typically very slow with large tables.

 dbWriteTable(con,'TableName',dataframe)
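The connection string in Step 4 takes one of two forms, depending on whether the database is addressed by service name or by SID. As a minimal sketch, the two helper functions below (hypothetical, not part of RJDBC) assemble each form. The port 1521 is only an assumption here; use the port of your own listener.

```r
# Hypothetical helpers (not part of RJDBC) that build the two
# Oracle thin-driver URL forms

# Connect by service name: jdbc:oracle:thin:@//host:port/service_name
oracle_url_service <- function(host, port, service) {
  sprintf("jdbc:oracle:thin:@//%s:%d/%s", host, as.integer(port), service)
}

# Connect by SID: jdbc:oracle:thin:@host:port:sid
oracle_url_sid <- function(host, port, sid) {
  sprintf("jdbc:oracle:thin:@%s:%d:%s", host, as.integer(port), sid)
}

# 1521 is an assumed port for illustration
print(oracle_url_service("database.hostname.com", 1521, "service_name"))
print(oracle_url_sid("database.hostname.com", 1521, "sid"))
```

Either string can be passed as the second argument of dbConnect(). When the work is finished, dbDisconnect(con) closes the connection.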

Functions used in R

R has a huge number of functions. Below are some that are used very commonly in day-to-day work in R.
We will use the following objects to test them.
m<-matrix(1:12,6,2)
a<-array(1:8,c(2,2,2))
d<-data.frame("Amy",1001,c(78,45,89,78,67))

##1. dim function 

It is used to check the dimensions of an object such as a matrix, array or data frame. It does not apply to plain vectors: dim() returns NULL for a vector, so use length() instead.

dim(m)
[1] 6 2

dim(a)
[1] 2 2 2

dim(d)
[1] 5 3
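A short sketch of how dim() behaves on a vector, and of the related helpers nrow() and ncol(). The object names v and x are chosen for this example only.

```r
v <- c(10, 20, 30)

dim(v)      # NULL: a plain vector has no dim attribute
length(v)   # 3

# nrow() and ncol() return the individual dimensions of a matrix
x <- matrix(1:12, 6, 2)
nrow(x)     # 6
ncol(x)     # 2

# Assigning to dim() reshapes the same data without copying values around
dim(x) <- c(3, 4)
dim(x)      # 3 4
```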

##2. head(obj,n) function 

It is used to print the first n rows of an object such as a matrix, array or data frame. By default n is 6, so head(m) shows the first six rows of the matrix.
head(m, 2)
     [,1] [,2]
[1,]    1    7
[2,]    2    8

##3. tail(obj,n) function

It is used to print the last n rows of an object such as a matrix, array or data frame. By default n is 6, so tail(m) shows the last six rows of the matrix.
tail(m,2)
     [,1] [,2]
[5,]    5   11
[6,]    6   12
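The default of six rows and the effect of a negative n can be checked on a matrix with more than six rows. The matrix big below is made up for this illustration.

```r
big <- matrix(1:20, 10, 2)

nrow(head(big))   # 6, the default n
nrow(tail(big))   # 6

# A negative n drops rows instead of keeping them
head(big, -8)     # everything except the last 8 rows, i.e. rows 1 and 2
tail(big, -8)     # everything except the first 8 rows, i.e. rows 9 and 10
```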

##4. str(object)

It is used to check the structure of any object. For m it reports that m is an integer matrix with 6 rows and 2 columns. It also displays the first few values stored in the object.
str(m)
int [1:6, 1:2] 1 2 3 4 5 6 7 8 9 10 ...

##5. sort(object, decreasing=FALSE/TRUE)

The sort function sorts the data of an object in ascending order (the default) or in descending order (decreasing=TRUE).
v<-c(9,1, 3, -4,0,-9)
sort(v)
[1] -9 -4  0  1  3  9

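##6. order(object,decreasing=FALSE)

The order function returns the index positions of the elements in the order that would sort the object, ascending or descending. In the example below the smallest value, 1, is at position 4, so the result starts with 4.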
order(c(4,2,7,1,3,9,10,16,13))
[1] 4 2 5 1 3 6 7 9 8
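The relationship between sort() and order() can be sketched as follows. The data frame scores is made up for the example.

```r
v <- c(9, 1, 3, -4, 0, -9)

sort(v, decreasing = TRUE)        # 9 3 1 0 -4 -9

# Indexing with order() gives the same result as sort()
identical(sort(v), v[order(v)])   # TRUE

# order() is the usual way to sort a data frame by one of its columns
scores <- data.frame(name = c("Amy", "Sam", "Dan"), marks = c(78, 45, 89))
sorted.scores <- scores[order(scores$marks, decreasing = TRUE), ]
print(sorted.scores)
```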

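##7. split(x,f)

The split function divides the data in x into the groups defined by f. The example uses the energy data set from the ISwR package, which has the columns expend and stature.
library(ISwR)
data(energy)
expend stature
9.21   obese
7.53   lean
7.48   lean
8.08   lean
8.09   lean
10.15  obese

split(energy$expend, energy$stature)
$lean
7.53 7.48 .....
$obese
9.21 10.15 ....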
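To try split() without installing anything, the sketch below rebuilds only the six rows listed above in a data frame called energy.mock (a stand-in, not the full data set) and then applies a function to each group.

```r
# Stand-in built from the six rows listed above, not the full data set
energy.mock <- data.frame(
  expend  = c(9.21, 7.53, 7.48, 8.08, 8.09, 10.15),
  stature = c("obese", "lean", "lean", "lean", "lean", "obese")
)

groups <- split(energy.mock$expend, energy.mock$stature)
names(groups)    # "lean" "obese"

# Apply a function to every group, here the group mean
group.means <- sapply(groups, mean)
print(group.means)
```

The result of split() is a list, so each group can also be accessed on its own, for example groups$lean.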

##8. unique(object)

The unique function returns the distinct values inside an object.
unique(c(1,1,1,2,2,3,3,3,4,4,4))
[1] 1 2 3 4
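Two related base R functions are often used alongside unique(): duplicated() flags repeated values and table() counts them. A small sketch, with a made-up data frame dup.df:

```r
x <- c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4)

length(unique(x))   # 4 distinct values
duplicated(x)       # TRUE wherever a value has already appeared
table(x)            # how often each value occurs

# On a data frame, unique() drops repeated rows
dup.df <- data.frame(id = c(1, 1, 2), grp = c("a", "a", "b"))
nrow(unique(dup.df))   # 2
```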

##9. paste(vector1, vector2, sep= , collapse=)

Paste concatenates two vectors element by element. The first element of vector1 is joined to the first element of vector2 with the value of sep placed between them, and so on for the remaining elements. If collapse is given, the resulting elements are then joined together with the value of collapse placed between them, so the output is a one-element vector containing everything concatenated.
part1<-c("M","na","i", "Te")
part2<-c("y","me","s","st")
paste(part1,part2,sep="" ,collapse=" ")

[1] "My name is Test"
paste(part1,part2,sep="." ,collapse="-")
[1] "M.y-na.me-i.s-Te.st"

part1<-c(1,3,5,7)
part2<-c(2,4,6,8)
paste(part1,part2,sep="" ,collapse="")
[1] "12345678"
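The two stages, sep first and collapse second, are easier to see when each argument is used on its own. A brief sketch:

```r
part1 <- c("M", "na", "i", "Te")
part2 <- c("y", "me", "s", "st")

# sep only: one output element per pair
paste(part1, part2, sep = "")             # "My" "name" "is" "Test"

# collapse only: joins the elements of a single vector
paste(c("a", "b", "c"), collapse = "+")   # "a+b+c"

# paste0() is shorthand for sep = ""
paste0(part1, part2, collapse = " ")      # "My name is Test"

# Shorter vectors are recycled to the length of the longest one
paste("item", 1:3, sep = "_")             # "item_1" "item_2" "item_3"
```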



Longitudinal Data Analysis

What is longitudinal data
Longitudinal data is a collection of a few observations over time from each of many sources, such as blood pressure measurements taken during a marathon (1 hour) for many people. It differs from time series data in duration and source: time series data is a collection of many observations from a single source.

Case Study
install.packages("nlme")
library(nlme)
## We will do the analysis on the Orthodont data. It is a study of 27 children (16 boys and 11 girls). The measurement is the distance from the centre of the pituitary gland to the pterygomaxillary fissure. There are four measurements per child, at ages 8, 10, 12 and 14.
head(Orthodont,4)
  distance age Subject  Sex
1       26   8     M01 Male
2       25  10     M01 Male
3       29  12     M01 Male
4       31  14     M01 Male

## Questions to answer:
1)  Are distances over time larger for boys than for girls?
2)  Is the rate of change of distance over time similar for boys and girls?

##Step 1: Plot the grouped data, one panel per child
plot(Orthodont)

##Step 2: Create a scatter plot
plot(distance~age, data=Orthodont,
     ylab="distance",
     xlab="age")

##Step 3: Create a scatter plot with a smoother (x is age, y is distance)
with(Orthodont, scatter.smooth(age, distance, col="blue",
     ylab="distance", xlab="age", lpars=list(col="red",lwd=3)))

##Step 4: Fit a separate linear regression of distance on age for each child
fm1<-lmList(distance ~ age | Subject, Orthodont)

##Step 5: Plot the confidence intervals of the per-child coefficients
plot(intervals(fm1))

##Step 6: Create a box plot for each sex
library(lattice)
bwplot(distance~as.factor(age)|Sex, data=Orthodont,
       ylab="Distance",
       xlab="Age in years: 8, 10, 12, 14")


Analysis:
1) The trajectory of distance is approximately a linear function of age.
2) The trajectories vary between children.
3) The distance measurement increases with age.
4) The distance trajectories for boys are higher on average than those for girls.
5) There is a population trend as well as subject-specific variation in the data.
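The plots suggest these conclusions but do not test them. One possible way to check the two questions numerically is a linear mixed-effects model from the same nlme package. The model below, with a random intercept per child, is a choice made for this sketch and is not part of the original case study; the name fm2 is also chosen here.

```r
library(nlme)

# Random intercept for each child; fixed effects for age, Sex and their interaction
fm2 <- lme(distance ~ age * Sex, data = Orthodont, random = ~ 1 | Subject)

# Fixed-effect estimates. With Male as the reference level:
#   "SexFemale"     is the girls' offset from the boys at age 0
#   "age:SexFemale" is the difference in slope (rate of change) between girls and boys
print(fixef(fm2))

# Tests for each fixed-effect term
print(anova(fm2))
```

The interaction term addresses question 2: if it is clearly different from zero, the rate of change differs between boys and girls.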






R List

WHAT IS A LIST
A list is an object which can store objects of any type and any dimension.
A list can store a vector, matrix, array, data frame or another list. There is no limit on size. Accessing a list with single square brackets, [ ], returns a list. Accessing it with double square brackets, [[ ]], returns the element itself, in the form in which it was inserted into the list (for example a vector or a matrix).
METHOD OF CREATION
##A list can be created with the list() function
l1<-list(name="Amy",num=1001,marks=c(70,75,80,68,79),mat=matrix(1:10,2,5))
l2<-list(name="Sam",num=1001,marks=c(56,78,76,69,89),mat=matrix(1:10,2,5))
l3<-list(name="Dan",num=1001,marks=c(69,86,75,87,65),mat=matrix(1:10,2,5))
## We have created three lists, each with four elements: name, num, marks and mat. name is a character string, num is a number, marks is a vector and mat is a matrix. A list is like a structure that stores different data types in one place. Now we can make a composite list consisting of the three lists above.
l<-list(Amy=l1,Sam=l2,Dan=l3)
ACCESSING LIST
l1[1] ## returns the first element of list l1 in the form of a list.
$name
[1] "Amy"

l1[2] ## returns the second element of list l1 in the form of a list.
$num
[1] 1001

l1[3] ## returns the third element of list l1 in the form of a list.
$marks
[1] 70 75 80 68 79

## To access the first element of marks (marks is a vector), we first have to use [[ ]]. With [[3]] we get the vector itself rather than a list. Adding another set of square brackets with element number 1 then gives the first element of the third element of the list.
l1[[3]]   ## returns the third element of the list as a vector.
[1] 70 75 80 68 79
l1[[3]][1]  ## returns the first element of the third element of the list.
[1] 70
l1[[4]]  ## returns the fourth element of the list as a matrix.
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
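Because the elements were given names, they can also be reached by name, and the same accessors can be chained on the composite list l. The sketch below rebuilds the lists from above so that it runs on its own.

```r
l1 <- list(name = "Amy", num = 1001, marks = c(70, 75, 80, 68, 79), mat = matrix(1:10, 2, 5))
l2 <- list(name = "Sam", num = 1001, marks = c(56, 78, 76, 69, 89), mat = matrix(1:10, 2, 5))
l3 <- list(name = "Dan", num = 1001, marks = c(69, 86, 75, 87, 65), mat = matrix(1:10, 2, 5))
l  <- list(Amy = l1, Sam = l2, Dan = l3)

class(l1[3])            # "list"
class(l1[[3]])          # "numeric"

# Access by name instead of by position
l1$marks[1]             # 70
l1[["marks"]][1]        # 70

# Chain the accessors on the composite list
l$Sam$marks[2]          # 78
l[["Dan"]][["name"]]    # "Dan"

# Mean marks for every student in the composite list
mean.marks <- sapply(l, function(s) mean(s$marks))
print(mean.marks)
```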


 
 


Monte Carlo Simulation with R

Stochastic Modeling A stochastic model is a tool for modeling data where uncertainty is present with the input. When input has cert...