This tutorial shows how to sort (order) a data frame in R.
1. Preparation
Let’s assume that we have a small data frame about books as follows, which will be used for the examples in sections 2 and 3:
title <- c('Data Smart','Orientalism','False Impressions','The Age of Wrath','Making Software')
author <- c('Foreman, John','Said, Edward','Archer, Jeffery','Eraly, Abraham','Oram, Andy')
height <- c('235','197','177','238','235')
year <- c('2010','2011','2012','1999','1998')
bookDF <- data.frame(title, author, height, year)
Printing bookDF shows the five books in the order they were entered, with row names 1 to 5.
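As a minimal sketch, the data frame can be built and inspected as follows. The `stringsAsFactors = FALSE` argument is an addition that is not in the original code; it is assumed here so that the text columns stay character vectors on R versions older than 4.0, where the default was to convert them to factors.

```r
title  <- c('Data Smart', 'Orientalism', 'False Impressions', 'The Age of Wrath', 'Making Software')
author <- c('Foreman, John', 'Said, Edward', 'Archer, Jeffery', 'Eraly, Abraham', 'Oram, Andy')
height <- c('235', '197', '177', '238', '235')
year   <- c('2010', '2011', '2012', '1999', '1998')

# stringsAsFactors = FALSE is an assumption added for this sketch
bookDF <- data.frame(title, author, height, year, stringsAsFactors = FALSE)

print(bookDF)  # show all five rows
str(bookDF)    # show the type of each column
```

Because the values of height and year are quoted, str() reports them as character columns rather than numbers.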
2. Sort Or Order A Data Frame In R Using The Order Function
To order a data frame in R, we can use the order function of the base package.
2.1. Order A Data Frame By Column Name
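To sort a data frame by a column, we pass that column to the order function. It returns the row positions in sorted order, and we use those positions to index the rows of the data frame. For example, let’s order the above data frame by title (here title is the standalone vector created in the preparation step, which holds the same values as the column):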
sortedBookDF <- bookDF[order(title),]
Let’s see the sorted data frame:
> head(sortedBookDF)
              title          author height year
1        Data Smart   Foreman, John    235 2010
3 False Impressions Archer, Jeffery    177 2012
5   Making Software      Oram, Andy    235 1998
2       Orientalism    Said, Edward    197 2011
4  The Age of Wrath  Eraly, Abraham    238 1999
We can see that the rows are now sorted in ascending order by the title column.
An alternative syntax refers to the column through the data frame itself, so it does not depend on a separate title vector:
sortedBookDF <- bookDF[order(bookDF$title),]
Printing sortedBookDF gives the same sorted result.
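The sketch below shows two small variations on the calls above, using only base R. It assumes a reduced copy of bookDF with just the title and year columns. with() evaluates order(title) against the columns of the data frame, and setting the row names to NULL renumbers the rows after sorting.

```r
# Reduced copy of the example data (two columns only, for brevity)
bookDF <- data.frame(
  title = c('Data Smart', 'Orientalism', 'False Impressions', 'The Age of Wrath', 'Making Software'),
  year  = c('2010', '2011', '2012', '1999', '1998'),
  stringsAsFactors = FALSE
)

# with() looks up 'title' among the columns of bookDF,
# so no separate 'title' vector is needed in the workspace.
sortedBookDF <- with(bookDF, bookDF[order(title), ])

# Row subsetting keeps the original row names (1, 3, 5, 2, 4 here).
# Setting them to NULL renumbers the rows from 1.
rownames(sortedBookDF) <- NULL
print(sortedBookDF)
```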
2.2. Order A Data Frame By Column Index
To sort a data frame by column index, we select the column by its position and pass it to the order function. For example, to sort by the 4th column (year):
sortedBookDF <- bookDF[order(bookDF[, 4]),]
The rows are now ordered by year, from 1998 (“Making Software”) to 2012 (“False Impressions”).
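Because year and height were created as quoted strings, order compares them as text. That happens to work for four-digit years, but it can surprise when values have different numbers of digits. The sketch below uses a hypothetical `pages` vector (not part of the tutorial data) to show the difference and the as.numeric conversion that avoids it.

```r
# Hypothetical values, stored as text like 'height' and 'year' in bookDF
pages <- c('90', '1200', '350')

textOrder   <- order(pages)              # compared as text: '1200', '350', '90'
numberOrder <- order(as.numeric(pages))  # compared as numbers: 90, 350, 1200

print(pages[textOrder])
print(pages[numberOrder])
```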
2.3. Order By Ascending and Descending
By default, the order function sorts in ascending order. To sort in descending order, we can pass decreasing = TRUE to the order function. For example, to sort by the title column in descending order:
sortedBookDF <- bookDF[order(title, decreasing = TRUE),]
The rows now appear in reverse alphabetical order of title, from “The Age of Wrath” to “Data Smart”.
To specify the sort order with the column index:
sortedBookDF <- bookDF[order(bookDF[, 1], decreasing = TRUE),]
2.4. Order A Data Frame By Multiple Columns
Ordering a data frame with multiple columns is similar to ordering it by one column. The order function accepts multiple arguments, so we can give it multiple sort keys.
Continuing the above example, we can sort the data frame by the column “title” and then by the column “year” by passing both columns to the order function:
sortedBookDF <- bookDF[order(title, year),]
or with a variant syntax:
sortedBookDF <- bookDF[order(bookDF$title, bookDF$year),]
or to sort a data frame in R on multiple column indexes:
sortedBookDF <- bookDF[order(bookDF[,1], bookDF[,4]),]
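A single decreasing = TRUE applies to every sort key. When the keys need different directions, base R offers at least two options, sketched below: negating a numeric key, or passing one decreasing flag per key together with method = "radix". The sketch assumes a reduced copy of bookDF in which height is stored as a number rather than as a quoted string.

```r
# Reduced copy of the example data; height is numeric here (an assumption of this sketch)
bookDF <- data.frame(
  title  = c('Data Smart', 'Orientalism', 'False Impressions', 'The Age of Wrath', 'Making Software'),
  height = c(235, 197, 177, 238, 235),
  stringsAsFactors = FALSE
)

# Option 1: negate a numeric key to sort it in descending order,
# then break ties by title in ascending order.
byNegation <- bookDF[order(-bookDF$height, bookDF$title), ]

# Option 2: one decreasing flag per key; a vector of flags requires method = "radix".
byRadix <- bookDF[order(bookDF$height, bookDF$title,
                        decreasing = c(TRUE, FALSE), method = "radix"), ]

print(byNegation)
print(byRadix)
```

Both options put the two books with height 235 next to each other, with “Data Smart” before “Making Software”. Note that the radix method compares character keys in the C locale, which can differ from the default ordering for mixed-case or accented text.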
3. Order A Data Frame In R Using The plyr Package
To order a data frame in R, besides the built-in order function of R, we can use the arrange function of the plyr package. The arrange function is much easier to use but does require the external package to be installed.
Firstly, let’s install the plyr package by issuing the following command:
install.packages("plyr")
Secondly, let’s load the package:
library(plyr)
Next, let’s get to how to sort a data frame in R using the plyr package.
3.1. Order By Column Name
To order a data frame using the arrange function of the plyr package, we pass the data frame and the column to sort by as arguments. For example, to sort bookDF by the column “title”:
orderedBookDF <- arrange(bookDF, title)
3.2. Order In Descending
The arrange function sorts in ascending order by default. To reverse the order, we can combine arrange with the desc function of the plyr package. For example, to sort bookDF by the column “title” in descending order:
orderedBookDF <- arrange(bookDF, desc(title))
3.3. Order A Data Frame By Multiple Columns
Ordering by multiple columns with the arrange function is similar to ordering by a single column: we pass the data frame and all desired columns as arguments. For example, to sort bookDF by the column “title” and then by the column “year”:
orderedBookDF <- arrange(bookDF, title, year)
4. Order A Data Frame In R Using The dplyr Package
Next, let’s see how to order a data frame in R using the arrange function of the dplyr package. First, let’s take a look at the data frame used for all examples in this section:
id <- c('S1002','S1003','S1005','S1008','S1011','S1015')
age <- c(58, 67, 64, 34, 30, 37)
gender <- c('female','male','male','male','male','male')
height <- c(61, 67, 68, 71, 69, 59)
weight <- c(256, 119, 183, 190, 191, 170)
# Create a data frame (height is defined above but not included)
subjectDfrm <- data.frame(id, age, gender, weight)
Here is how it is printed out:
> subjectDfrm
     id age gender weight
1 S1002  58 female    256
2 S1003  67   male    119
3 S1005  64   male    183
4 S1008  34   male    190
5 S1011  30   male    191
6 S1015  37   male    170
Secondly, let’s load the dplyr package:
library(dplyr)
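Both plyr and dplyr export functions named arrange and desc, so when both packages are attached, the one loaded later masks the other. One way to stay explicit is the package::function form, sketched below. The sketch assumes both packages are installed and uses a reduced copy of the bookDF data from section 1.

```r
# Reduced copy of the book data from section 1
bookDF <- data.frame(
  title = c('Data Smart', 'Orientalism', 'False Impressions', 'The Age of Wrath', 'Making Software'),
  year  = c('2010', '2011', '2012', '1999', '1998'),
  stringsAsFactors = FALSE
)

# Naming the package explicitly means it does not matter
# which library() call came last.
byPlyr  <- plyr::arrange(bookDF, title)
byDplyr <- dplyr::arrange(bookDF, title)

# Both calls sort by title in ascending order
print(byPlyr$title)
print(byDplyr$title)
```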
4.1. Order By Column Name
To sort a data frame by column name, we pass the data frame and the desired column to the arrange function of the dplyr package, which re-orders rows. For example:
sortedSubjectDfrm <- arrange(subjectDfrm, id)
We have sorted the subjectDfrm data frame by the id column. Let’s see the result:
> sortedSubjectDfrm <- arrange(subjectDfrm, id)
> sortedSubjectDfrm
     id age gender weight
1 S1002  58 female    256
2 S1003  67   male    119
3 S1005  64   male    183
4 S1008  34   male    190
5 S1011  30   male    191
6 S1015  37   male    170
Because dplyr supports the pipe operator (%>%), we can also use it for sorting:
subjectDfrm %>% arrange(id) %>% head
We get the same result as with the function call above:
> subjectDfrm %>% arrange(id) %>% head
     id age gender weight
1 S1002  58 female    256
2 S1003  67   male    119
3 S1005  64   male    183
4 S1008  34   male    190
5 S1011  30   male    191
6 S1015  37   male    170
4.2. Order In Descending
Similar to the desc() function of the plyr package, the desc() function of the dplyr package sorts a column in descending order. For example:
sortedSubjectDfrm <- arrange(subjectDfrm, desc(age))
or sorting through the pipe operator:
subjectDfrm %>% arrange(desc(age)) %>% head
The sorted result now is:
> subjectDfrm %>% arrange(desc(age)) %>% head
     id age gender weight
1 S1003  67   male    119
2 S1005  64   male    183
3 S1002  58 female    256
4 S1015  37   male    170
5 S1008  34   male    190
6 S1011  30   male    191
We can see that the column “age” is sorted in descending order.
4.3. Order A Data Frame By Multiple Columns
To sort a data frame in R by multiple columns with the dplyr package, we pass the desired columns to the arrange function. For example, to sort the above data frame by the columns “id”, “age” and “weight”:
sortedSubjectDfrm <- arrange(subjectDfrm, id, age, weight)
Or with the pipe operator:
subjectDfrm %>% arrange(id, age, weight) %>% head
The sorted result now is:
> subjectDfrm %>% arrange(id, age, weight) %>% head
     id age gender weight
1 S1002  58 female    256
2 S1003  67   male    119
3 S1005  64   male    183
4 S1008  34   male    190
5 S1011  30   male    191
6 S1015  37   male    170
To sort by some of the columns in descending order, wrap each of them in desc():
subjectDfrm %>% arrange(id, desc(age), desc(weight)) %>% head
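In the examples above, every id is unique, so the first key alone fixes the row order and the later keys never come into play. The sketch below, which assumes dplyr is installed, sorts by gender first instead. Since gender has repeated values, the second key desc(age) decides the order within each gender.

```r
library(dplyr)

subjectDfrm <- data.frame(
  id     = c('S1002', 'S1003', 'S1005', 'S1008', 'S1011', 'S1015'),
  age    = c(58, 67, 64, 34, 30, 37),
  gender = c('female', 'male', 'male', 'male', 'male', 'male'),
  weight = c(256, 119, 183, 190, 191, 170)
)

# gender ascending, then age descending within each gender
byGenderAge <- subjectDfrm %>% arrange(gender, desc(age))
print(byGenderAge)
```

The single female subject comes first, followed by the male subjects from oldest to youngest.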
5. Order A Data Frame In R Using The doBy Package
Next, in this section, we’re going to use the orderBy function of the doBy package to order a data frame in R. First, let’s install and load the package:
install.packages("doBy")
library(doBy)
Secondly, let’s take a look at the syntax of the orderBy function:
orderBy(formula, data)
Where:
- formula
The right hand side of a formula
- data
A data frame
The sign of each term in the formula determines whether that column is sorted in ascending (+) or descending (-) order.
Thirdly, let’s create a data frame for all examples in this section:
id <- c('S1022','S1023','S1024','S1025','S1026','S1027')
glyhb <- c(4.64, 4.63, 7.72, 4.81, 4.84, 3.94)
chol <- c(132, 228, 228, 181, 249, 248)
frame <- c('large','large','medium','small','small','medium')
# Create a data frame
testDfrm <- data.frame(id, glyhb, chol, frame)
And print out its content:
> testDfrm
     id glyhb chol  frame
1 S1022  4.64  132  large
2 S1023  4.63  228  large
3 S1024  7.72  228 medium
4 S1025  4.81  181  small
5 S1026  4.84  249  small
6 S1027  3.94  248 medium
5.1. Order A Data Frame By Column Name
The following example sorts the data frame by the column “chol”:
orderBy(~chol, data=testDfrm)
The sorted result is:
> orderBy(~chol, data=testDfrm)
     id glyhb chol  frame
1 S1022  4.64  132  large
4 S1025  4.81  181  small
2 S1023  4.63  228  large
3 S1024  7.72  228 medium
6 S1027  3.94  248 medium
5 S1026  4.84  249  small
5.2. Order In Descending
To sort a data frame in descending order with the orderBy function of the doBy package, we put a minus sign (-) in front of the column in the formula passed to the function. For example, let’s sort by the column “chol” in descending order:
orderBy(~-chol, data=testDfrm)
Let’s see the sorted result:
> orderBy(~-chol, data=testDfrm)
     id glyhb chol  frame
5 S1026  4.84  249  small
6 S1027  3.94  248 medium
2 S1023  4.63  228  large
3 S1024  7.72  228 medium
4 S1025  4.81  181  small
1 S1022  4.64  132  large
We can see that the column “chol” is sorted in the reverse order compared to the previous example.
5.3. Order A Data Frame By Multiple Columns
To order a data frame by multiple columns using the orderBy function of the doBy package, we create a formula that combines the desired columns with their sort directions. For example, let’s sort the data frame by the columns “chol” and “id”, where “chol” is in descending order and “id” is in ascending order:
orderBy(~-chol+id, data=testDfrm)
And see the sorted result:
> orderBy(~-chol+id, data=testDfrm)
     id glyhb chol  frame
5 S1026  4.84  249  small
6 S1027  3.94  248 medium
2 S1023  4.63  228  large
3 S1024  7.72  228 medium
4 S1025  4.81  181  small
1 S1022  4.64  132  large
Let’s reverse the order of the column “id” to descending:
orderBy(~-chol-id, data=testDfrm)
And see the sorted result:
> orderBy(~-chol-id, data=testDfrm)
     id glyhb chol  frame
5 S1026  4.84  249  small
6 S1027  3.94  248 medium
3 S1024  7.72  228 medium
2 S1023  4.63  228  large
4 S1025  4.81  181  small
1 S1022  4.64  132  large
Observe that the records S1023 and S1024, which share the same chol value of 228, are swapped in the second example.
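For comparison, the two formulas above can be read as the base R calls sketched below. This is only a sketch of calls that produce the same row order for this data, not a description of how doBy implements orderBy.

```r
testDfrm <- data.frame(
  id    = c('S1022', 'S1023', 'S1024', 'S1025', 'S1026', 'S1027'),
  glyhb = c(4.64, 4.63, 7.72, 4.81, 4.84, 3.94),
  chol  = c(132, 228, 228, 181, 249, 248),
  frame = c('large', 'large', 'medium', 'small', 'small', 'medium')
)

# Same row order as ~ -chol + id: chol descending (negated), id ascending
cholDescIdAsc <- testDfrm[order(-testDfrm$chol, testDfrm$id), ]

# Same row order as ~ -chol - id: both keys descending
cholDescIdDesc <- testDfrm[order(testDfrm$chol, testDfrm$id, decreasing = TRUE), ]

print(cholDescIdAsc)
print(cholDescIdDesc)
```

The row names of the two results match the orderBy outputs shown above, including the swap of rows 2 and 3 in the second one.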
6. Conclusion
This tutorial has shown how to sort or order a data frame in R using the built-in order function, the arrange function of the plyr and dplyr packages, and the orderBy function of the doBy package. The order function offers flexible ways to sort a data frame in R, while the arrange and orderBy functions are easier to use.