The Rio Package in Detail: Step Eleven in Learning R Programming for Free

I hope you are enjoying the “Learning R Programming for Free” series; here are links to the previous segments (Step One, Step Two, Step Three, Step Four, Step Five, Step Six, Step Seven, Step Eight, Step Nine, Step Ten) to provide some helpful background.


In the previous installment, we set up R on a Red Hat Linux 6 server, added the rio package, and briefly used it to convert a file. Installing rio also showed us how to add packages manually when we cannot connect directly to an R mirror.


In this installment, we will discuss the rio package in more detail, since it is such a powerful tool to have on hand. Its main job is converting data from one file format to another.


First, to check that the rio installation on Red Hat Linux went well, we created a small Excel file like this (your file can contain anything that interests you):


ID  FullName       Hometown        Interest
1   Mouse, Mickey  Tugboat         Minnie
2   Duck, Donald   Swim Pond       Bugs
3   Mouse, Minnie  Disney Studios  Mickey
4   Bunny, Bugs    Rabbit Hole     Carrots

Save it as an XLSX file and move it to the Linux server using a utility such as sftp. I saved my file as: myfile.xlsx


We then used Rscript (command-line R) and a call to rio to convert the XLSX file to CSV (comma-separated values):


Rscript -e "rio::convert('myfile.xlsx','myfile.csv')"

[riotest]$ ls -1

myfile.csv 

myfile.xlsx

Let’s look at our CSV file:

[riotest]$ cat *.csv


ID,FullName,Hometown,Interest

1,"Mouse, Mickey",Tugboat,Minnie

2,"Duck, Donald",Swim Pond,Bugs

3,"Mouse, Minnie",Disney Studios,Mickey

4,"Bunny, Bugs",Rabbit Hole,Carrots
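
The same round trip can also be run from inside an R session, which makes it easy to check the result without leaving R. The following is a minimal sketch rather than part of the original walkthrough: it rebuilds the sample table as a data frame (the variable name chars and the temporary directory are assumptions made for illustration), writes it to XLSX, converts it to CSV with convert(), and reads the CSV back with import().

```r
library(rio)

# Rebuild the sample table from this article as a data frame.
# The name `chars` is hypothetical, chosen only for this sketch.
chars <- data.frame(
  ID = 1:4,
  FullName = c("Mouse, Mickey", "Duck, Donald", "Mouse, Minnie", "Bunny, Bugs"),
  Hometown = c("Tugboat", "Swim Pond", "Disney Studios", "Rabbit Hole"),
  Interest = c("Minnie", "Bugs", "Mickey", "Carrots"),
  stringsAsFactors = FALSE
)

# Work in a temporary directory (an assumption) to keep the working directory clean.
xlsx_path <- file.path(tempdir(), "myfile.xlsx")
csv_path  <- file.path(tempdir(), "myfile.csv")

export(chars, xlsx_path)       # write the Excel file
convert(xlsx_path, csv_path)   # same call as the Rscript one-liner above
back <- import(csv_path)       # read the CSV back into a data frame

print(back)
```

Because the FullName values contain commas, they are quoted in the CSV file, which is why they survive the trip back into a single column.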


Let’s move on to a discussion of the usefulness of the rio package.  The most interesting functions in the rio package are “export,” “import,” and “import_list.”


Using export()

Exporting data is handled with one function, export():

library("rio")

export(mtcars, "mtcars.csv") # comma-separated values

export(mtcars, "mtcars.rds") # R serialized

export(mtcars, "mtcars.sav") # SPSS
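
A quick way to check an export is to read the file straight back with import(). The sketch below is an illustration, not part of the original walkthrough; it assumes a temporary directory as the output location and uses the .rds format, which stores the R object itself.

```r
library(rio)

# Write to a temporary directory (an assumption for this sketch;
# any writable path works the same way).
rds_path <- file.path(tempdir(), "mtcars.rds")

export(mtcars, rds_path)   # serialize the data frame to disk
back <- import(rds_path)   # read it back into R

# Compare the shape and column names of the original and the copy.
dim(back)
names(back)
```

The same pattern works for the other formats; only the file extension changes, since rio picks the format from it.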


A particularly useful feature of rio is the ability to import from and export to compressed (e.g., zip) directories, saving users the extra step of compressing a large exported file:


export(mtcars, "mtcars.tsv.zip")
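
To see what ended up inside the archive, base R's unzip() can list the contents without extracting them, and import() can read the data back directly from the zip file. This is a hedged sketch: it assumes a zip utility is available on the system (R's zip support typically relies on one) and that the archive holds a single data file.

```r
library(rio)

# Write a tab-separated file and compress it in one step.
export(mtcars, "mtcars.tsv.zip")

# List the archive contents without extracting (utils::unzip, base R).
unzip("mtcars.tsv.zip", list = TRUE)

# Read the data back directly from the compressed file.
back <- import("mtcars.tsv.zip")
dim(back)
```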


As of rio v0.5.0, “export()” can also write multiple data frames to respective sheets of an Excel workbook or an HTML file:


export(list(mtcars = mtcars, iris = iris), file = "mtcars.xlsx")


Open “mtcars.xlsx” and we find multiple tabs!
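
The tabs can also be checked without opening Excel. As a minimal sketch (the temporary output path and the variable name sheets are assumptions for illustration), import_list() reads every sheet of the workbook into a named list of data frames:

```r
library(rio)

# Write both data frames to one workbook in a temporary directory
# (an assumption for this sketch).
xlsx_path <- file.path(tempdir(), "mtcars.xlsx")
export(list(mtcars = mtcars, iris = iris), file = xlsx_path)

# Read every sheet back; each list element is one sheet.
sheets <- import_list(xlsx_path)

names(sheets)         # sheet names, taken from the list names used in export()
sapply(sheets, nrow)  # number of rows read from each sheet
```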


Using import()

Importing data is handled with one function, import():