I hope you are enjoying the “Learning R Programming for Free” series; here are links to the previous segments (Step One, Step Two, Step Three, Step Four, Step Five, Step Six, Step Seven, Step Eight, Step Nine, Step Ten) to provide some helpful background.
In the previous installment, we installed R on a RedHat Linux 6 server, added the rio package, and learned a little about converting files with it. Installing rio also showed us how to add packages manually when we cannot connect directly to an R mirror.
In this installment, we will look at the rio package in more detail, since it is such a powerful tool to have on hand. It makes converting data from one format to another straightforward.
First, to confirm that the rio installation on RedHat Linux went well, we created a small Excel file like this (your file can contain anything that interests you):
ID  FullName       Hometown        Interest
1   Mouse, Mickey  Tugboat         Minnie
2   Duck, Donald   Swim Pond       Bugs
3   Mouse, Minnie  Disney Studios  Mickey
4   Bunny, Bugs    Rabbit Hole     Carrots
Save it as an Excel (XLSX) file and move it to the Linux server using a utility such as sftp. I saved my file as myfile.xlsx.
We then used Rscript (command-line R) and a call to rio to convert the XLSX file to CSV (comma-separated values):
Rscript -e "rio::convert('myfile.xlsx', 'myfile.csv')"
[riotest]$ ls -1
Let’s look at our CSV file:
[riotest]$ cat *.csv
2,"Duck, Donald",Swim Pond,Bugs
3,"Mouse, Minnie",Disney Studios,Mickey
4,"Bunny, Bugs",Rabbit Hole,Carrots
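If you would rather check the conversion from inside an R session, the same round trip can be sketched as follows. This is only an illustration under a few assumptions: the data frame `toons` and the temporary folder are made up for the example (they recreate the sample table above rather than using your real myfile.xlsx), and rio's Excel support is assumed to be installed along with the package.

```r
library(rio)

# Work in a scratch folder so nothing in your project is overwritten
# (the folder and file paths here are illustrative).
tmp <- tempfile("riotest")
dir.create(tmp)
xlsx_path <- file.path(tmp, "myfile.xlsx")
csv_path  <- file.path(tmp, "myfile.csv")

# Recreate the sample table from this article as a data frame.
toons <- data.frame(
  ID       = 1:4,
  FullName = c("Mouse, Mickey", "Duck, Donald", "Mouse, Minnie", "Bunny, Bugs"),
  Hometown = c("Tugboat", "Swim Pond", "Disney Studios", "Rabbit Hole"),
  Interest = c("Minnie", "Bugs", "Mickey", "Carrots"),
  stringsAsFactors = FALSE
)

export(toons, xlsx_path)       # write the Excel file
convert(xlsx_path, csv_path)   # same call as the Rscript one-liner above
from_csv <- import(csv_path)   # read the CSV back into a data frame

print(from_csv)
```

Because the names contain commas, the CSV writer quotes them, and import() should hand them back as single values in the FullName column.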
Let’s move on to a discussion of the usefulness of the rio package. The most interesting functions in the rio package are “export,” “import,” and “import_list.”
Exporting data is handled with one function, export():
library(rio)
export(mtcars, "mtcars.csv") # comma-separated values
export(mtcars, "mtcars.rds") # R serialized
export(mtcars, "mtcars.sav") # SPSS
A particularly useful feature of rio is the ability to import from and export to compressed (e.g., zip) directories, saving users the extra step of compressing a large exported file.
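As a small sketch of that feature, the example below writes mtcars into a zip archive and reads it straight back. It assumes that a zip program is available on the system (R relies on one to create the archive), and the scratch working directory is just a choice made for this illustration.

```r
library(rio)

# Switch to a scratch directory for the demo and remember where we were.
old_wd <- setwd(tempdir())

# The ".csv.zip" ending asks rio to write a CSV and place it in a zip archive.
export(mtcars, "mtcars.csv.zip")

# import() can read the data directly from the archive.
cars_back <- import("mtcars.csv.zip")
print(dim(cars_back))

# Return to the original working directory.
setwd(old_wd)
```

The file name does the work here: rio looks at the extensions to decide both the data format and the compression, so no separate compression step is needed.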
As of rio v0.5.0, “export()” can also write multiple data frames to respective sheets of an Excel workbook or an HTML file:
export(list(mtcars = mtcars, iris = iris), file = "mtcars.xlsx")
Open “mtcars.xlsx” and we find multiple tabs!
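To go the other way, import_list() (one of the three functions mentioned earlier) can read every sheet of a workbook in one call. The sketch below is a minimal illustration; the `workbook` path in a temporary directory is an assumption made so the example does not touch your own files.

```r
library(rio)

# Write the two-sheet workbook to a scratch location (illustrative path).
workbook <- file.path(tempdir(), "mtcars.xlsx")
export(list(mtcars = mtcars, iris = iris), file = workbook)

# Read all sheets back: the result is a named list, one data frame per sheet.
sheets <- import_list(workbook)
print(names(sheets))
print(head(sheets$iris))

# To read a single sheet, import() accepts the sheet name or number via `which`.
iris_only <- import(workbook, which = "iris")
```

The list names come from the sheet names, so each tab can be picked out with `sheets$mtcars` or `sheets$iris`.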
Importing data is handled with one function, import():