R and Eurostat bulk data (Heatmap example)

In this exercise I test the Eurostat bulk data source and plot the data as a heatmap. Let’s try it with this dataset:
“Harmonised unemployment rates (%) – monthly data (ei_lmhr_m)”

# You can find the Eurostat bulk download source here:
http://epp.eurostat.ec.europa.eu/portal/page/portal/statistics/bulk_download


# I plan to automate the download and extraction fully, but for now this is semi-automatic
# create the download directory and remember its path
.exdir = 'c:/data/tmp2' # put your own data folder here
dir.create(.exdir, recursive = TRUE, showWarnings = FALSE)
.file = file.path(.exdir, 'ei_lmhr_m.tsv.gz') # change this

# download the file (mode = 'wb' keeps the gzip file intact on Windows)
url = 'http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&downfile=data%2Fei_lmhr_m.tsv.gz'
download.file(url, .file, mode = 'wb')

# decompress it. The download is a gzip-compressed TSV file, not a tar archive, which is why
# untar() fails with: Error in getOct(block, 100, 8) : invalid octal digit
.tsv = sub('\\.gz$', '', .file)
con = gzfile(.file)
writeLines(readLines(con), .tsv)
close(con)

# The raw file still cannot be read as it is, so I have to edit the downloaded data by hand. First I remove the commas
# from the very first variables, and so on. I always use NoteTab Light as a text editor for this kind of task.

# Read the file into R. Point this at your own data folder.
input <- read.table("c:/data/tmp2/ei_lmhr_m.tsv", header=TRUE, sep="\t", na.strings=":", dec=".", strip.white=TRUE)
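The manual cleanup described above could probably be done in R itself. Here is a minimal sketch, not the method used in this post. It assumes that the raw file packs the key variables into the first column as comma-separated values under a header such as `s_adj,indic,geo\time`, that missing values are written as `:`, and that some values carry a trailing flag letter. These are assumptions about the raw layout. `read_eurostat_tsv` is a hypothetical helper name, and the sample rows are made up.

```r
# Hypothetical helper: read a raw Eurostat-style TSV without hand editing.
# Assumed layout: first column = comma-separated keys, ":" = missing value,
# values may be followed by a flag letter (e.g. "4.3 p").
read_eurostat_tsv <- function(path) {
  raw <- read.table(path, header = TRUE, sep = "\t", colClasses = "character",
                    check.names = FALSE, strip.white = TRUE, quote = "")
  names(raw) <- trimws(names(raw))

  # Key names come from the first header cell, up to the backslash.
  key_names <- strsplit(sub("\\\\.*$", "", names(raw)[1]), ",", fixed = TRUE)[[1]]

  # Split the packed first column into one column per key variable.
  keys <- as.data.frame(do.call(rbind, strsplit(raw[[1]], ",", fixed = TRUE)),
                        stringsAsFactors = FALSE)
  names(keys) <- key_names

  # Keep the leading numeric part of each value; ":" ends up as NA.
  raw[-1] <- lapply(raw[-1], function(x) as.numeric(sub("[^0-9.-].*$", "", x)))
  cbind(keys, raw[-1])
}

# Usage on a tiny made-up sample written to a temporary file.
tmp <- tempfile(fileext = ".tsv")
writeLines(c("s_adj,indic,geo\\time\t2012M05 \t2012M04 ",
             "NSA,LM-UN-T-TOT,AT\t4.1 \t4.3 p",
             "NSA,LM-UN-T-TOT,ES\t: \t24.0 "), tmp)
out <- read_eurostat_tsv(tmp)
print(out)
```

Note that in this sketch the country column is called `geo`, not `geo.time` as in the hand-edited file, so the later steps would need that name adjusted.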

#just checking
head(input)


# LM-UN-T-TOT = Unemployment rate according to ILO definition - Total rate
# NSA = not seasonally adjusted
input <- input[which(input$indic=="LM-UN-T-TOT"),]
input <- input[which(input$s_adj=="NSA"),]
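The two filtering steps can also be written as one. A small sketch on made-up toy data that reuses the column names from this post (the values and the `OTHER` code are invented for illustration):

```r
# Toy data with the same column names as in the post (values are made up).
toy <- data.frame(
  s_adj    = c("NSA", "SA", "NSA", "NSA"),
  indic    = c("LM-UN-T-TOT", "LM-UN-T-TOT", "OTHER", "LM-UN-T-TOT"),
  geo.time = c("AT", "AT", "AT", "BE"),
  X2012M05 = c(4.1, 4.0, 4.2, 7.0),
  stringsAsFactors = FALSE
)

# Both conditions in one step. subset() drops rows where the condition is NA,
# which is also what which() does in the two-step version.
filtered <- subset(toy, indic == "LM-UN-T-TOT" & s_adj == "NSA")
print(filtered)
```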

# Give the heatmap meaningful row labels (without this manoeuvre the rows would show only the row ids)
row.names(input) <- input$geo.time

#just checking
head(input)

# Column selection. We keep the data for the period 05/2008 - 05/2012 (columns 5 to 53)
input2 <- input[,5:53]

# The data frame must be converted into a numeric matrix to produce a heatmap.
input_matrix <- data.matrix(input2)
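To see what `data.matrix()` does, here is a small sketch on made-up toy data. It also shows why it helps to select the numeric columns first: a character column is not kept as text but turned into integer codes.

```r
# Toy data frame (made-up values) with country codes as row names.
toy <- data.frame(
  geo.time = c("AT", "BE"),
  X2012M05 = c(4.1, 7.0),
  X2012M04 = c(4.3, 7.2),
  stringsAsFactors = FALSE
)
row.names(toy) <- toy$geo.time

# Numeric columns only: row names carry over and become the heatmap labels.
m <- data.matrix(toy[, 2:3])
str(m)

# With the character column included, "AT" and "BE" become integer codes.
m_all <- data.matrix(toy)
print(m_all)
```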

#heatmap is almost here
input_heatmap <- heatmap(input_matrix, Rowv=NA, Colv=NA, col = heat.colors(256), scale="column", margins=c(5,10), xlab = "Harmonised unemployment rates (%) - monthly data", ylab= "Country or Area")
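A note on `scale="column"`: as far as I understand the `heatmap()` documentation, each column is centred and scaled before the colours are assigned. Since the columns here are months, the colours compare countries within a month and do not show the overall level changing over time. The sketch below uses base R's `scale()` on a made-up matrix to illustrate the effect. It is my own illustration and not the internal code of `heatmap()`.

```r
# Made-up matrix: three countries (rows), two months (columns).
m <- matrix(c(4, 8, 12,
              20, 22, 24),
            nrow = 3,
            dimnames = list(c("AT", "BE", "ES"), c("2012M04", "2012M05")))

# Column-wise standardisation: subtract the column mean, divide by the column sd.
z <- scale(m)
print(z)
```

Both columns end up with the same standardised values even though the second month has much higher rates, so both would get the same colours. Using `scale="row"` would standardise each country over time instead.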


#saving heatmap into folder
jpeg("G:/data/home/2012/marko/blogi_rbloggerqvist/data/eurostat/Harmonised unemployment rates percent monthly data.jpg")
input_heatmap <- heatmap(input_matrix, Rowv=NA, Colv=NA, col = heat.colors(256), scale="column", margins=c(5,10), xlab = "Harmonised unemployment rates (%) - monthly data", ylab= "Country or Area")
dev.off()
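The open device, plot, close device pattern can be wrapped in a small function so that the device is closed even if the plot fails. A minimal sketch follows. `save_jpeg` is a hypothetical helper name, the matrix is made up, and the file goes to a temporary folder. JPEG support depends on how R was built, so the call is guarded with `capabilities("jpeg")`.

```r
# Hypothetical helper: open a JPEG device, draw, and always close the device.
save_jpeg <- function(path, draw, ...) {
  jpeg(path, ...)
  on.exit(dev.off())   # runs even if draw() stops with an error
  draw()
  invisible(path)
}

# Made-up matrix standing in for input_matrix.
m <- matrix(c(4, 8, 12, 20, 22, 24), nrow = 3,
            dimnames = list(c("AT", "BE", "ES"), c("2012M04", "2012M05")))

out_file <- file.path(tempdir(), "heatmap_example.jpg")
if (capabilities("jpeg")) {
  save_jpeg(out_file, function() {
    heatmap(m, Rowv = NA, Colv = NA, col = heat.colors(256),
            scale = "column", margins = c(5, 10))
  }, width = 800, height = 600)
}
```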

Have fun,
Marko
