Sankey

February 15, 2018 by admin - Comments Disabled

Very interesting.

http://sankeymatic.com/build/

Tagged with: sankey

Some R tips

by admin - Comments Disabled

Splitting a column to many columns / Text-to-Columns
Package tibble.
mtcars %>% tibble::rownames_to_column('Car') %>% slice(1:3)

mtcars %>% tibble::rownames_to_column('Car') %>% tidyr::separate('Car',c('Brand','Model'), remove = F) %>% slice(1:5)

Source: datascience: An online community for showcasing R & Python tutorials. It operates as a networking platform for data scientists to promote their talent and get hired. Our mission is to empower data scientists by bridging the gap between talent and opportunity.

EU, EU Research

Postaus tutkijakammiosta

February 7, 2018 by admin - Comments Disabled

Hei,

joskus on hyvä hengähtää ja lukea vaikka EU-tutkimuksen ajankohtaisista asioista. EU-research -julkaisusta löytyy tuoreimmat tutkimuksen aikaansaannokset eri tieteenaloilta.

Tagged with: eu research

artificial intelligence

AI järjestöjä meillä ja muualla

January 26, 2018 by admin - Comments Disabled

Pohdin tänään, että miten järjestäytynyt AI aihepiiri on Suomen järjestökentässä. Ts. montako yhdistystä on AI aiheen tiimoilta perustettu tai perusteilla. Entä moniko opinahjo harjoittaa tutkimusta aiheen tiimoilta. Tässä muutama linkki toimijoihin:

Yhdistykset:

Contact info

Yliopistojen osastot ja yksiköt:

https://fcai.squarespace.com/partners/

Tagged with: ai yhdistys

GitHub

R tips part1

December 27, 2017 by admin - Comments Disabled

GitHub repository (source: https://rstudio.github.io/shiny/tutorial/#deployment-local)
If your project is stored in a git repository on GitHub, then others can download and run your app directly. An example repository is at https://github.com/rstudio/shiny_example. The following command will download and run the application:

shiny::runGitHub('shiny_example', 'rstudio')
In this example, the GitHub account is ‘rstudio’ and the repository is ‘shiny_example’; you will need to replace them with your account and repository name.

Pros
Source code is easily visible by recipient (if desired)
Easy to run (for R users)
Very easy to update if you already use GitHub for your project
Git-savvy users can clone and fork your repository
Cons
Developer must know how to use git and GitHub
Code is hosted by a third-party server

Shiny

December 26, 2017 by admin - Comments Disabled

Pääsin kokeilemaan r-ohjelman shiny-ohjelmaa. Ensin asennettaan paketti:

install.packages(“rsconnect”)
install.packages(“RCurl”)

install.packages(“shiny”)

Installing package into ‘/home/marko/R/i686-pc-linux-gnu-library/3.4’
(as ‘lib’ is unspecified)
also installing the dependencies ‘httpuv’, ‘xtable’, ‘R6’, ‘sourcetools’

Sitten ajetaan ohjelma:
library(shiny)

runExample(“01_hello”)
Esimerkki löytyy täältä shiny sivuiltani.

Tässä koodia:

> #initialize
> library(datasets)
> library(ggplot2)
>
> #helper function (convert vector to named list)
> namel<-function (vec){
+ tmp<-as.list(vec)
+ names(tmp)<-as.character(unlist(vec)) + tmp + } > # shiny server side code for each call
> shinyServer(function(input, output, session){
+ #update variable and group based on dataset
+ output$variable <- renderUI({
+ obj<-switch(input$dataset,
+ “iris” = iris,
+ “mtcars” = mtcars)
+ var.opts<-namel(colnames(obj))
+ selectInput(“variable”,”Variable:”, var.opts) # uddate UI
+ })
+
+ output$group <- renderUI({
+ obj<-switch(input$dataset,
+ “iris” = iris,
+ “mtcars” = mtcars)
+ var.opts<-namel(colnames(obj))
+ selectInput(“group”,”Groups:”, var.opts) # uddate UI
+ })
+
+ output$caption<-renderText({
+ switch(input$plot.type,
+ “boxplot” = “Boxplot”,
+ “histogram” = “Histogram”,
+ “density” = “Density plot”,
+ “bar” = “Bar graph”)
+ })
+
+
+ output$plot <- renderUI({
+ plotOutput(“p”)
+ })
+
+ #plotting function using ggplot2
+ output$p <- renderPlot({
+
+ plot.obj<<-list() # not sure why input$X can not be used directly?
+ plot.obj$data<<-get(input$dataset)
+ plot.obj$variable<<-with(plot.obj$data,get(input$variable))
+ plot.obj$group<<-with(plot.obj$data,get(input$group))
+
+ #dynamic plotting options
+ plot.type<-switch(input$plot.type,
+ “boxplot” = geom_boxplot(),
+ “histogram” = geom_histogram(alpha=0.5,position=”identity”),
+ “density” = geom_density(alpha=.75),
+ “bar” = geom_bar(position=”dodge”)
+ )
+
+ require(ggplot2)
+ #plotting theme
+ .theme<- theme(
+ axis.line = element_line(colour = ‘gray’, size = .75),
+ panel.background = element_blank(),
+ plot.background = element_blank()
+ )
+ if(input$plot.type==”boxplot”) { #control for 1D or 2D graphs
+ p<-ggplot(plot.obj$data,
+ aes(
+ x = plot.obj$group,
+ y = plot.obj$variable,
+ fill = as.factor(plot.obj$group)
+ )
+ ) + plot.type
+
+ if(input$show.points==TRUE)
+ {
+ p<-p+ geom_point(color=’black’,alpha=0.5, position = ‘jitter’)
+ }
+
+ } else {
+
+ p<-ggplot(plot.obj$data,
+ aes(
+ x = plot.obj$variable,
+ fill = as.factor(plot.obj$group),
+ group = as.factor(plot.obj$group),
+ #color = as.factor(plot.obj$group)
+ )
+ ) + plot.type
+ }
+
+ p<-p+labs(
+ fill = input$group,
+ x = “”,
+ y = input$variable
+ ) +
+ .theme
+ print(p)
+ })
+ })

artificial intelligence, data mining, datascience, IoT, LUT, SciBlog

My -SciBlog- start today!

December 14, 2017 by admin - No comments

Welcome to MySci -page. Here I introduce some innovative and most interesting things about my trip into doctoral carreer. Now Im doctoral student in LUt and I wonder every day how amazing things you could do nowadays with your laptop. Data mining, AI and IoT will be in some role in my SciBlog. So… if you dare, just follow these pages!

Tagged with: ai data machine learning mining

Uncategorized

Here I will update new post very soon

July 17, 2017 by admin - Comments Disabled

Hi, its long time. Now I start again to post new articles. Cant wait….

First, I push some new tips related R-studio etc.

Kind Regards,
Marko

R and maps

Finding Centroid point from shape file and labelling map

July 31, 2012 by admin - No comments

Finding Centroid point from shape file and labelling map

Finding Centroid point from shape file and map data labelling

In this test I try to find centroid point in shp file. In additionally I will show how to use this centroid point information when to labelling area name into map.
R Library PBSmapping offers wide range of functions to do that job.

Needed Library

require(rgdal)

## Loading required package: rgdal

## Loading required package: sp

## Warning: package 'sp' was built under R version 2.15.1

## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.9.0, released 2011/12/29 Path to GDAL shared
## files: C:/Program Files/R/R-2.14.2/library/rgdal/gdal Loaded PROJ.4
## runtime: Rel. 4.7.1, 23 September 2009, [PJ_VERSION: 470] Path to PROJ.4
## shared files: C:/Program Files/R/R-2.14.2/library/rgdal/proj

require(PBSmapping)

## Loading required package: PBSmapping

## ----------------------------------------------------------- PBS Mapping
## 2.62.34 -- Copyright (C) 2003-2012 Fisheries and Oceans Canada
## 
## PBS Mapping comes with ABSOLUTELY NO WARRANTY; for details see the file
## COPYING. This is free software, and you are welcome to redistribute it
## under certain conditions, as outlined in the above file.
## 
## A complete user guide 'PBSmapping-UG.pdf' is located at C:/Program
## Files/R/R-2.14.2/library/PBSmapping/doc/PBSmapping-UG.pdf
## 
## To see demos, type '.PBSfigs()'.
## 
## Packaged on 2012-03-01 Pacific Biological Station, Nanaimo
## -----------------------------------------------------------

require(maptools)

## Loading required package: maptools

## Loading required package: foreign

## Loading required package: lattice

## Checking rgeos availability: TRUE

Importing Shapefile

FIN_adm2.pb <- importShapefile("FIN_adm2")

ImportShapefile reads the .prj file if it exists, but it
does not adopt the proj4 format

proj.abbr <- attr(FIN_adm2.pb, "projection")
# abbreviated projection info

print(proj.abbr)

Generate map using PBSmapping plotting functions

plotPolys(FIN_adm2.pb, projection = proj.abbr, 
    border = "gray", xlab = "Longitude", ylab = "Latitude")

plot of chunk unnamed-chunk-4

Calculate the centroids of adm2 area

FIN.centroids2 <- calcCentroid(FIN_adm2.pb, rollup = 1)

Check results

str(FIN.centroids2)

## Classes 'PolyData' and 'data.frame': 21 obs. of  3 variables:
##  $ PID: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ X  : num  29.9 27.5 26 27.9 26.3 ...
##  $ Y  : num  62.9 63.2 61.4 61.9 67.7 ...
##  - attr(*, "projection")= chr "LL"

Plot the centroids against the source polygons

In this purpose we use the PBSMapping package graphics routines. If you want you can plot these centroid point in to the map

plotPolys(FIN_adm2.pb, projection = proj.abbr, 
    border = "gray", xlab = "Longitude", ylab = "Latitude")
addPoints(FIN.centroids2, col = "blue", pch = 19)

plot of chunk unnamed-chunk-7

check the variables from imported shape file

str(FIN_adm2.pb)

## Classes 'PolySet' and 'data.frame':  136271 obs. of  5 variables:
##  $ PID: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ SID: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ POS: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ X  : num  28.5 28.5 28.5 28.5 28.6 ...
##  $ Y  : num  63.9 63.9 63.9 63.9 63.9 ...
##  - attr(*, "PolyData")=Classes 'PolyData' and 'data.frame':  21 obs. of  12 variables:
##   ..$ PID      : int  1 2 3 4 5 6 7 8 9 10 ...
##   ..$ ID_0     : int  78 78 78 78 78 78 78 78 78 78 ...
##   ..$ ISO      : Factor w/ 1 level "FIN": 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ NAME_0   : Factor w/ 1 level "Finland": 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ID_1     : int  1 1 1 1 2 3 3 4 4 4 ...
##   ..$ NAME_1   : Factor w/ 5 levels "Eastern Finland",..: 1 1 1 1 2 3 3 4 4 4 ...
##   ..$ ID_2     : int  1 2 3 4 5 6 7 8 9 10 ...
##   ..$ NAME_2   : Factor w/ 19 levels "Central Finland",..: 8 9 13 17 7 5 10 3 6 13 ...
##   ..$ NL_NAME_2: Factor w/ 0 levels: NA NA NA NA NA NA NA NA NA NA ...
##   ..$ VARNAME_2: Factor w/ 19 levels "Etelä-Karjala|Södra Karelen",..: 12 14 15 3 10 5 13 4 9 15 ...
##   ..$ TYPE_2   : Factor w/ 1 level "Maakunta|landskap": 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ ENGTYPE_2: Factor w/ 1 level "Region": 1 1 1 1 1 1 1 1 1 1 ...
##  - attr(*, "parent.child")= num  1 1 1 1 0 1 0 0 0 0 ...
##  - attr(*, "shpType")= int 5
##  - attr(*, "prj")= __truncated__
##  - attr(*, "projection")= chr "LL"

To read dbf file we need foreign library

library(foreign)

reading shape file data from dbf file (ESRI files include normally: shp, dbf, prj, csv and shx data files)

labelData <- read.dbf("FIN_adm2.dbf", as.is = TRUE)

Data checking

head(labelData)

##   ID_0 ISO  NAME_0 ID_1          NAME_1 ID_2            NAME_2 NL_NAME_2
## 1   78 FIN Finland    1 Eastern Finland    1     North Karelia      <NA>
## 2   78 FIN Finland    1 Eastern Finland    2     North Savonia      <NA>
## 3   78 FIN Finland    1 Eastern Finland    3 Päijänne Tavastia      <NA>
## 4   78 FIN Finland    1 Eastern Finland    4  Southern Savonia      <NA>
## 5   78 FIN Finland    2         Lapland    5           Lapland      <NA>
## 6   78 FIN Finland    3            Oulu    6            Kainuu      <NA>
##                         VARNAME_2            TYPE_2 ENGTYPE_2
## 1   Pohjois-Karjala|Norra Karelen Maakunta|landskap    Region
## 2      Pohjois-Savo|Norra Savolax Maakunta|landskap    Region
## 3 Päijät-Häme|Päijänne Tavastland Maakunta|landskap    Region
## 4        Etelä-Savo|Södra Savolax Maakunta|landskap    Region
## 5                  Lappi|Lappland Maakunta|landskap    Region
## 6               Kainuu|Kajanaland Maakunta|landskap    Region

merging shape files area name and centroid points (x and y coordinates) labelData as data frame

labelData <- as.data.frame(labelData)

merging labelData and centroid data table fields

label <- data.frame(labelData$ID_2, FIN.centroids2$X, 
    FIN.centroids2$Y, labelData$NAME_2)

Event data function needs right column names

names(label) <- c("EID", "X", "Y", "label")

Note: Above routine used in eventdata function

#######names(label)<- c(“PID”, “X”, “Y”, “label”) –> USED IN POLYDATA FUNCTION

do the right data format into labelling

data <- as.EventData(label, projection = NULL, 
    zone = NULL)

Labelling into map

#col = 1 –> font color 1=black, 2=….
#font = 3 –> font size
#cex = 0.5 –>expansion factor

plotPolys(FIN_adm2.pb, projection = proj.abbr, 
    border = "gray", xlab = "Longitude", ylab = "Latitude")

addLabels(data, placement = "DATA", polys = FIN_adm2.pb, 
    col = 1, font = 3, cex = 0.5)

## Warning: The data's 'projection' attribute (NULL) differs from the
## projection of the plot region (LL).

plot of chunk unnamed-chunk-16

Have fun with mapping and RStudio

Marko

ps. This post is produced by using RStudio and knitr. Thanks to Yihui Xie for amazing work with this issue.

Tagged with: centroid Knitting to HTML maptools PBSmapping rgdal RStudio

DEA, Forecasting, Malmquist indices, R and data visualization, R sample code with data

Linear Programming for the Malmquist Productivity Growth Index

July 25, 2012 by admin - No comments

Data Envelopment Analysis

Within couple of days I have been testing DEA modelling (Data Envelopment Analysis) with different R-package. Finally,…. I have found such a comprehensive way to calculate Malmquist indices.

Productivity 1991_2009_Forecast Cubic_Smoothing_Spline10y2 — Chart 1

My primary purpose is to show how to use nonparametric methods for measuring efficiency and productivity by using R-programs nonparaeff -package.

At this same time I will show you how to present Malmquist indices within googleVis -world map and finally I will introduse how to make forecast chart for productivity (Chart 1), effectiveness and technical effectiveness indices. In this work I used R-program forecast -library.

Used R-library documentation:
nonparaeff
googleVis
forecast

First of all I would like to thank Author Dong-hyun Oh for nice work with this nonparaeff-package.

In this working paper I will use example of faremalm2. Like nonpraeff -package documentation we calculate Malmquist productivity growth index of OECD countries
(productivity, technical efficiency and efficiency). As data source we have used Penn World Table (like original sources) with following version:
OECD Timeseries 1980-1990 version pwt5.6
OECD Timeseries 1990-2009 version pwt7.0
In pwt7.0 I cannot find capital stock data, so I downloaded it from here (http://www.ifw-kiel.de/forschung/datenbanken/netcap)

Productivity calculation 1980-1990 As output variables is used Total GDP of a country. This variables is calculated using GDP per capita (rgdpl)
and amount of total population (pop). As input variables is used Total labor force (gdp/rgdpwok) and Total capital stock (kapw * labor)

In second productivity computing calculation 1990-2009 I only use one input and one output variables (labor and gdp). This is because there is no
capital stock data available in pwt7.0 (please correct this If I missed something). I will re-calculate this period asap when I get capital data from all countries.

In third productivity computing 1992-2002 I use pwt7.0 and get capital stock data from different OECD datasource
(http://www.ifw-kiel.de/forschung/datenbanken/netcap). As input I used labor and capital variables and as output variables gdp.

When use faremalm2 function you will get the following outputs:
pc.c –> Productivity growth for each country
pc.y –> A trend of productivity growth of SELECTED countries (in this work we use OECD)

tc.c –> Technical efficiency for each country
tc.y –> A trend of technical efficiency growth of SELECTED countries

ec.c –> Efficiency change for each country
ec.y –> A trend of efficiency change of SELECTED countries

Useage of faremalm2

faremalm2(dat = NULL, noutput = 1, id = “id”, year = “year”)
dat –> The format of this data frame is data.frame(id, year, outputs, inputs).
noutput –> number of outputs
id –> column name for DMU:s
year –> column name for the time index

## Malmquist productivity growth index of OECD countries #First install data from OECD install.packages("pwt") # (installing a package can take a couple of minutes)

#Setting the library into use library(pwt) ## Use Penn World Table library(nonparaeff) # DEA modelling

More information about pwt data you can find there (http://pwt.econ.upenn.edu/php_site/pwt_index.php)

#my.dat <- pwt5.6 #used in 1980-1990 #my.dat <- pwt6.3 # my.dat <- pwt7.0 #used in 1990-2009 and 1992-2002 #head(my.dat) summary(my.dat) #my.dat$country

#capital stock data missing so we get it from there # http://www.ifw-kiel.de/forschung/datenbanken/netcap #picking up OECD countries my.oecd.ctry <- c("AUS", "AUT", "BEL", "CAN", "CHE", "DNK", "ESP", "FIN", "FRA", "GBR", "GER", "GRC", "IRL", "ISL", "ITA", "JPN", "KOR", "LUX", "MEX", "NLD", "NOR", "NZL", "PRT", "SWE", "TUR", "USA", "DEU")
#NOTE WPCODE USED IN OLDER DATASET #adding country code #my.dat <- my.dat[my.dat$wbcode %in% my.oecd.ctry,] #use this with pwt5.6 my.dat <- my.dat[my.dat$isocode %in% my.oecd.ctry,] #use this with pwt7.0 summary(my.dat)

#selecting appropriate years #my.dat <- my.dat[my.dat$year %in% 1980:1990,] #my.dat <- my.dat[my.dat$year %in% 1950:1992,] my.dat <- my.dat[my.dat$year %in% 1990:2009,] #my.dat <- my.dat[my.dat$year %in% 1992:2002,] summary(my.dat) my.dat

Note! I encounter problem (NA or other reason) thats why I do this out of box. my.dat is transformed into data.frame and after
that modified with Excel. This is only used in timeseries 1992-2002 productivity calculation.

#uploading my.dat #my.dat_df <- as.data.frame(my.dat) #out #write.table(my.dat_df, file = "c:/data/my.dat_df.txt", sep = "\t", col.names = NA, qmethod = "double")

Capital stock data inserted in this phase. source: http://www.ifw-kiel.de/forschung/datenbanken/netcap

#...and in both source
 my.dat  <- read.table(".../mydat1990_2009.csv",  header=TRUE, 
sep=";", na.strings="NA", dec=",", strip.white=TRUE)
my.dat <- read.table("http://ekqvist.goeuropeinfo.com/rbloggerqvist/
data/oecd/mydat1990_2009.csv", header=TRUE, sep=";", na.strings="NA", 
dec=",", strip.white=TRUE)

#INSERTING some variables to my.dat (ie. INPUT and OUTPUT -variables calculation) my.dat$rgdpl <- as.numeric(my.dat$rgdpl) ## GDP per capita my.dat$pop <- as.numeric(my.dat$pop) ## total population (1000) my.dat$rgdpwok <- as.numeric(my.dat$rgdpwok) ## GDP per labor

my.dat$gdp <- my.dat$rgdpl * my.dat$pop ## Total GDP of a country my.dat$labor <- with(my.dat, gdp/rgdpwok) ## Total labor force

my.dat$capital <- with(my.dat, kapw * labor) ## Total capital stock           MISSING pwt7 note: this is used in pwt5.6
 #my.dat$capital <- as.numeric(my.dat$capital)  
## Total capital stock  ADDED FROM DIFF. SOURCE now used in time 
series 1993-2002

my.dat$kapw <- as.numeric(my.dat$kapw) ## Capital stock per labor         MISSING pwt7 note: this is used in pwt5.6
 #my.dat$kapw <- with(my.dat, my.dat$capital/my.dat$labor) ## 
Capital stock per labor now used in time series 1993-2002

#variable used in 1990-2009 Malmquist productivity calculation

#1980-1990
#1990-2009
#1992-2002

#these variables used in timeseries 1980-1990 and 1992-2002 #oecd <- my.dat[, c("country", "wbcode", "year", "gdp", "labor", "capital")] #used in time series 1980-1990 oecd <- my.dat[, c("country", "isocode", "year", "gdp", "labor")] # 1990-2009 #oecd <- my.dat[, c("country", "isocode", "year", "gdp", "labor", "capital")] #used in time series 1992-2002

summary(oecd) head(oecd)
#removing NA-rows oecd <- oecd[!is.na(oecd$capital),] summary(oecd)

#Now calcuatel productivity (pc), effiency (ec) and technical efficiency (tc) #huom. tämä on tehty eri aikasarjoille ja hyvä niin koska mahdollistaa eri aikajaksojen tarkastelun library(nonparaeff) # DEA modelling re.oecd <- faremalm2(dat = oecd, noutput = 1, id = "isocode", year = "year") #re.oecd <- faremalm2(dat = oecd, noutput = 1, id = "wbcode", year = "year")

summary(re.oecd)

###################################################### #ISOCODE USED IN #1990-2009 and 1992-2002 #note:

## productivity growth for each country pc.c <- tapply(re.oecd$pc, re.oecd$isocode, geometric.mean) #pc.c <- tapply(re.oecd$pc, re.oecd$wbcode, geometric.mean) ## a trend of productivity growth of SELECTED countries pc.y <- tapply(re.oecd$pc, re.oecd$year, geometric.mean)

## technical efficiency for each country
tc.c <- tapply(re.oecd$tc, re.oecd$isocode, geometric.mean)
#tc.c <- tapply(re.oecd$tc, re.oecd$wbcode, geometric.mean)

## a trend of technical efficiency growth of SELECTED countries
tc.y <- tapply(re.oecd$tc, re.oecd$year, geometric.mean)

## efficiency change for each country
ec.c <- tapply(re.oecd$ec, re.oecd$isocode, geometric.mean)
#ec.c <- tapply(re.oecd$ec, re.oecd$wbcode, geometric.mean)
## a trend of efficiency change of SELECTED countries
ec.y <- tapply(re.oecd$ec, re.oecd$year, geometric.mean)
######################################################
pc.c

#1980-1990, 1990-2009 #country -variable(we use it in google vis plotting) ## productivity growth for each country pc.c <- tapply(re.oecd$pc, re.oecd$country, geometric.mean)

## technical efficiency for each country
tc.c <- tapply(re.oecd$tc, re.oecd$country, geometric.mean)

## efficiency change for each country
ec.c <- tapply(re.oecd$ec, re.oecd$country, geometric.mean)

#missing time series 1992-2002 TURKEY, KOREA, LUXEMBOURG AND MEKSIKO capital stock data #VUOSITTAINEN startyear-endyear KESKIARVO TEHOKKUUSKERTOIMESSA (MALMQUIST - FARREL) #ec: efficiency change #TEHOKKUUSMUUTOS ec.c_df <- as.data.frame(ec.c) summary(ec.c_df) ec.c_df #tc: technical change #TEKNISEN TEHOKKUUDEN MUUTOS tc.c_df <- as.data.frame(tc.c)

#pc: productivity change #TUOTTAVUUDEN MUUTOS
pc.c_df <- as.data.frame(pc.c)

#LITTLE TRICK #ec: efficiency change #TEHOKKUUSMUUTOS write.table(ec.c_df, file = "c:/data/ec.c_df.txt", sep = "\t", col.names = NA, qmethod = "double") ec.c_df2 <- read.table("c:/data/ec.c_df.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE) names(ec.c_df2)<- c("Country", "Effiency")

#tc: technical change #TEKNISEN TEHOKKUUDEN MUUTOS write.table(tc.c_df, file = "c:/data/tc.c_df.txt", sep = "\t", col.names = NA, qmethod = "double") tc.c_df2 <- read.table("c:/data/tc.c_df.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE) names(tc.c_df2)<- c("Country", "Effiency")

#pc: productivity change #TUOTTAVUUDEN MUUTOS
write.table(pc.c_df, file = “c:/data/pc.c_df.txt”, sep = “\t”, col.names = NA, qmethod = “double”)
pc.c_df2 <- read.table(“c:/data/pc.c_df.txt”, header=TRUE, sep=”\t”, na.strings=”NA”, dec=”.”, strip.white=TRUE)
names(pc.c_df2)<- c(“Country”, “Effiency”)

#REMOVING NA-VALUES ec.c_df2 <- ec.c_df2[!is.na(ec.c_df2$Effiency),] tc.c_df2 <- tc.c_df2[!is.na(tc.c_df2$Effiency),] pc.c_df2 <- pc.c_df2[!is.na(pc.c_df2$Effiency),] summary(ec.c_df2)

#just cecking country names ec.c_df2$Country

#replace some country name with new one…
ec.c_df3 <- gsub(“United States of America”, “United States”, ec.c_df2$Country)
ec.c_df3 <- gsub(“Germany, West”, “Germany”, ec.c_df3)
ec.c_df3 <- gsub(“Korea, Republic”, “South Korea”, ec.c_df3)

tc.c_df3 <- gsub(“United States of America”, “United States”, tc.c_df2$Country)
tc.c_df3 <- gsub(“Germany, West”, “Germany”, tc.c_df3)
tc.c_df3 <- gsub(“Korea, Republic”, “South Korea”, tc.c_df3)

pc.c_df3 <- gsub(“United States of America”, “United States”, pc.c_df2$Country)
pc.c_df3 <- gsub(“Germany, West”, “Germany”, pc.c_df3)
pc.c_df3 <- gsub(“Korea, Republic”, “South Korea”, pc.c_df3)

#as a dataframe MUUNNETAAN DATA FRAMEKSI
ec.c_df3 <- as.data.frame(ec.c_df3)
tc.c_df3 <- as.data.frame(tc.c_df3)
pc.c_df3 <- as.data.frame(pc.c_df3)

#check total number of rows KATSOTAAN RIVINUMEROINNIT MERGELLE
summary(ec.c_df2)
nrow(ec.c_df2)
nrow(ec.c_df3)

summary(tc.c_df2)
nrow(tc.c_df2)
nrow(tc.c_df3)

summary(pc.c_df2)
nrow(pc.c_df2)
nrow(pc.c_df3)

#used in time series #1980-1990 #1990-2009 #giving row id ...NUMEROIDDAAN ID MERGEÄ VARTEN ec.c_df2$id1=c(1:26) ec.c_df3$id1=c(1:26) tc.c_df2$id1=c(1:26) tc.c_df3$id1=c(1:26) pc.c_df2$id1=c(1:26) pc.c_df3$id1=c(1:26)
#...and for time series #1991-2002 #giving row id ...NUMEROIDDAAN ID MERGEÄ VARTEN ec.c_df2$id1=c(1:22) ec.c_df3$id1=c(1:22) tc.c_df2$id1=c(1:22) tc.c_df3$id1=c(1:22) pc.c_df2$id1=c(1:22) pc.c_df3$id1=c(1:22)
#merge data table and renamed country by id ..... ec.c_df4 <-merge(ec.c_df2, ec.c_df3, by.x="id1", by.y="id1", all = TRUE) tc.c_df4 <-merge(tc.c_df2, tc.c_df3, by.x="id1", by.y="id1", all = TRUE) pc.c_df4 <-merge(pc.c_df2, pc.c_df3, by.x="id1", by.y="id1", all = TRUE)

#check the data head(ec.c_df4) head(tc.c_df4) head(pc.c_df4) summary(ec.c_df4)

#selecting columns to the gvisGeoMapping ..... ec.c_df5 <- data.frame(ec.c_df4$ec.c_df3, ec.c_df4$Effiency ) tc.c_df5 <- data.frame(tc.c_df4$tc.c_df3, tc.c_df4$Effiency ) pc.c_df5 <- data.frame(pc.c_df4$pc.c_df3, pc.c_df4$Effiency )

#check the data head(ec.c_df5) head(tc.c_df5) head(pc.c_df5)

summary(ec.c_df5)
summary(tc.c_df5)
summary(pc.c_df5)

#naming trick names(ec.c_df5)<- c("Country", "Effiency change in OECD 1980-1990") names(tc.c_df5)<- c("Country", "Technical Effiency change in OECD 1980-1990") names(pc.c_df5)<- c("Country", "Productivity change in OECD 1980-1990")

#naming time series 1991-2009 names(ec.c_df5)<- c("Country", "Effiency change in OECD 1991-2009") names(tc.c_df5)<- c("Country", "Technical Effiency change in OECD 1991-2009") names(pc.c_df5)<- c("Country", "Productivity change in OECD 1991-2009")

#1991-2002 names(ec.c_df5)<- c("Country", "Effiency change in OECD 1991-2002") names(tc.c_df5)<- c("Country", "Technical Effiency change in OECD 1991-2002") names(pc.c_df5)<- c("Country", "Productivity change in OECD 1991-2002")

head(ec.c_df5) head(tc.c_df5) head(pc.c_df5)

#set library
library(googleVis)

#1980-1990 #me/230712 #1 #KMEAN VALUE OF MALMQUIST INDICES 1980-1990 Efficency change Geo1=gvisGeoMap(ec.c_df5, locationvar="Country", numvar="Effiency change in OECD 1980-1990", options=list (height=450, width=800, dataMode='regions')) #plot(Geo1) cat(Geo1$html$chart, file=".../oecd_mean_eff_1980_1990_malmquist2.html")

#2 #KESKIARVO MEAN VALUE OF MALMQUIST INDICES 1980-1990 Efficency change Geo2=gvisGeoMap(tc.c_df5, locationvar="Country", numvar="Technical Effiency change in OECD 1980-1990", options=list(height=450, width=800, dataMode='regions')) #plot(Geo2) cat(Geo2$html$chart, file=".../oecd_mean_tech_eff_1980_1990_malmquist2.html")

#3 #KESKIARVO MEAN VALUE OF MALMQUIST INDICES 1980-1990 Efficency change Geo3=gvisGeoMap(pc.c_df5, locationvar="Country", numvar="Productivity change in OECD 1980-1990", options=list(height=450, width=800, dataMode='regions')) #plot(Geo3) cat(Geo3$html$chart, file=".../oecd_mean_prod_1980_1990_malmquist2.html")

#1991-2009 (capital var dropped) #me/240712 #1 #KESKIARVO MEAN VALUE OF MALMQUIST INDICES Geo1=gvisGeoMap(ec.c_df5, locationvar="Country", numvar="Effiency change in OECD 1991-2009", options=list (height=450, width=800, dataMode='regions')) plot(Geo1) cat(Geo1$html$chart, file=".../oecd_mean_eff_1991_2009_malmquist2.html")

#2 #VALUE OF MALMQUIST INDICES Geo2=gvisGeoMap(tc.c_df5, locationvar="Country", numvar="Technical Effiency change in OECD 1991-2009", options=list(height=450, width=800, dataMode='regions')) plot(Geo2) cat(Geo2$html$chart, file=".../oecd_mean_tech_eff_1991_2009_malmquist2.html")

#3 #KESKIARVO MEAN VALUE OF MALMQUIST INDICES Geo3=gvisGeoMap(pc.c_df5, locationvar="Country", numvar="Productivity change in OECD 1991-2009", options=list(height=450, width=800, dataMode='regions')) plot(Geo3) cat(Geo3$html$chart, file=".../oecd_mean_prod_1991_2009_malmquist2.html")

#1991-2002 (capital as input) #me/240712 #1 #KESKIARVO MEAN VALUE OF MALMQUIST INDICES Geo1=gvisGeoMap(ec.c_df5, locationvar="Country", numvar="Effiency change in OECD 1991-2002", options=list (height=450, width=800, dataMode='regions')) plot(Geo1) cat(Geo1$html$chart, file=".../oecd_mean_eff_1991_2002_malmquist2.html")

#2 #KESKIARVO MEAN VALUE OF MALMQUIST INDICES Geo2=gvisGeoMap(tc.c_df5, locationvar="Country", numvar="Technical Effiency change in OECD 1991-2002", options=list(height=450, width=800, dataMode='regions')) plot(Geo2) cat(Geo2$html$chart, file=".../oecd_mean_tech_eff_1991_2002_malmquist2.html")

#3 #KESKIARVO MEAN VALUE OF MALMQUIST INDICES Geo3=gvisGeoMap(pc.c_df5, locationvar="Country", numvar="Productivity change in OECD 1991-2002", options=list(height=450, width=800, dataMode='regions')) plot(Geo3) cat(Geo3$html$chart, file=".../oecd_mean_prod_1991_2002_malmquist2.html")

World Map (producitivity, efficiency)

The cubic smoothing spline model

Now we plot productivity, effectiveness and technical effectiveness mean values historical trend and forecast by using
The cubic smoothing spline model. It is equivalent to an ARIMA(0,2,2) model.

############################################################### library(forecast) #library(ggplot2) #1980-1990 #1990-2009 #1991-2002 ##################################################################### #1 #1980-1990 ##################################################################### #PRODUCTIVITY pc.y fcast_pc <- splinef(pc.y,h=10, fan=T) #NOTE confidence leve 50, 99 plot(fcast_pc, main="Productivity, forecast from Cubic Smoothing Spline", ylab="Malmquist indices - Productivity", xlab="Year(0=1980, 10=1990, 20=2000)")

# example – output graph to jpeg file
jpeg(“…/Productivity 1980_1990_Forecast Cubic_Smoothing_Spline10y2.jpg”)
plot(fcast_pc, main=”Productivity, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Productivity”, xlab=”Year(0=1980, 10=1990, 20=2000)”)
dev.off()

#TECHNICAL EFFICIENCY
fcast_tc <- splinef(tc.y,h=10, fan=T)
plot(fcast_tc, main=”Technical Efficiency, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Technical Efficiency”, xlab=”Year(0=1980, 10=1990, 20=2000)”)

# example – output graph to jpeg file
jpeg(“…/Technical efficiency 1980_1990_Forecast Cubic_Smoothing_Spline10y2.jpg”)
plot(fcast_tc, main=”Technical Efficiency, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Technical Efficiency”, xlab=”Year(0=1980, 10=1990, 20=2000)”)
dev.off()

#EFFICIENCY
fcast_ec <- splinef(ec.y,h=10, fan=T)
plot(fcast_ec, main=”Efficiency, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Efficiency”, xlab=”Year(0=1980, 10=1990, 20=2000)”)

# example – output graph to jpeg file
jpeg(“G:/data/home/2012/marko/blogi_rbloggerqvist/tekstiaihiot/40/Efficiency 1980_1990_Forecast Cubic_Smoothing_Spline10y2.jpg”)
plot(fcast_ec, main=”Efficiency, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Efficiency”, xlab=”Year(0=1980, 10=1990, 20=2000)”)
dev.off()

summary(fcast_pc)
summary(fcast_tc)
summary(fcast_ec)

#####################################################################
#2
#1991-2009
#####################################################################
#PRODUCTIVITY
pc.y
fcast_pc <- splinef(pc.y,h=10, fan=T) #NOTE confidence leve 50, 99
plot(fcast_pc, main=”Productivity, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Productivity”, xlab=”Year(0=1990, 20=2010)”)

# example – output graph to jpeg file
jpeg(“…/Productivity 1991_2009_Forecast Cubic_Smoothing_Spline10y2.jpg”)
plot(fcast_pc, main=”Productivity, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Productivity”, xlab=”Year(0=1990, 20=2010)”)
dev.off()

# example – output graph to jpeg file
jpeg(“G:/data/home/2012/marko/blogi_rbloggerqvist/tekstiaihiot/40/Technical efficiency 1991_2009_Forecast Cubic_Smoothing_Spline10y2.jpg”)
plot(fcast_tc, main=”Technical Efficiency, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Technical Efficiency”, xlab=”Year(0=1990, 20=2010)”)
dev.off()

# example – output graph to jpeg file
jpeg(“…/Efficiency 1991_2009_Forecast Cubic_Smoothing_Spline10y2.jpg”)
plot(fcast_ec, main=”Efficiency, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Efficiency”, xlab=”Year(0=1990, 20=2010)”)
dev.off()

summary(fcast_pc)
summary(fcast_tc)
summary(fcast_ec)
######################################################################3
#1991-2002 (capital included)
#
#####################################################################
#PRODUCTIVITY
fcast_pc <- splinef(pc.y,h=10, fan=T) #NOTE confidence leve 50, 99
plot(fcast_pc, main=”Productivity, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Productivity”, xlab=”Year(1=1991, 10=2000, 20=2012)”)

# example – output graph to jpeg file
jpeg(“…/Productivity 1991_2002_Forecast Cubic_Smoothing_Spline10y2.jpg”)
plot(fcast_pc, main=”Productivity, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Productivity”, xlab=”Year(1=1991, 10=2000, 20=2012)”)
dev.off()

# example – output graph to jpeg file
jpeg(“…/Technical efficiency 1991_2002_Forecast Cubic_Smoothing_Spline10y2.jpg”)
plot(fcast_tc, main=”Technical Efficiency, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Technical Efficiency”, xlab=”Year(1=1991, 10=2000, 20=2012)”)
dev.off()

# example – output graph to jpeg file
jpeg(“…/Efficiency 1991_2002_Forecast Cubic_Smoothing_Spline10y2.jpg”)
plot(fcast_ec, main=”Efficiency, forecast from Cubic Smoothing Spline”, ylab=”Malmquist indices – Efficiency”, xlab=”Year(1=1991, 10=2000, 20=2012)”)
dev.off()

summary(fcast_pc)
summary(fcast_tc)
summary(fcast_ec)
#####################################################################

Ok, that’s it,
Have fun with Linear Programming for the Malmquist Productivity Growth Index and its wide range of applications…

Marko

Tagged with: DEA Malmquist R