This tutorial covers how to create a data frame in R. We will go through how to:
- Create data frames from vectors.
- Create data frames from existing data frames.
- Create data frames from a matrix.
- Create an empty data frame.
1. Create A Data Frame In R From Vectors
This section describes how to create data frames from R vectors, which is the most common and “natural” way.
1.1. Quick Syntax
Here is the quick syntax for creating a data frame in R from a list of vectors using the data.frame function:
    df <- data.frame(v1, v2, v3, ...)
Here v1, v2, v3, ... are vectors in R. For example, let’s assume we have the following vectors representing variables of books:
    title <- c('Data Smart', 'Orientalism', 'False Impressions', 'Making Software')
    author <- c('Foreman, John', 'Said, Edward', 'Archer, Jeffery', 'Oram, Andy')
    year <- c('2010', '2011', '2012', '1998')
We can create a data frame by using the following command:
    df <- data.frame(title, author, year)
As mentioned in the definition of the data frame in R, a data frame is just a “list of variables of the same number of rows”, so we simply pass the vectors/variables to the data.frame function. Now, let’s print the data frame to the console to see what it looks like:
    > df
                  title          author year
    1        Data Smart   Foreman, John 2010
    2       Orientalism    Said, Edward 2011
    3 False Impressions Archer, Jeffery 2012
    4   Making Software      Oram, Andy 1998
We can see that the data frame has 3 columns, each corresponding to one of the vectors/variables above.
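The result can also be inspected programmatically. The following is a minimal sketch using the same three vectors; converting `year` with `as.integer` is an illustrative assumption, not part of the original example, included only to show that a column can be named and converted when the data frame is created.

```r
title  <- c("Data Smart", "Orientalism", "False Impressions", "Making Software")
author <- c("Foreman, John", "Said, Edward", "Archer, Jeffery", "Oram, Andy")
year   <- c("2010", "2011", "2012", "1998")

# Column names default to the names of the vectors passed in
df <- data.frame(title, author, year)
names(df)   # "title" "author" "year"
nrow(df)    # 4
ncol(df)    # 3

# Assumption for illustration: store the year as an integer column instead of text
df2 <- data.frame(title, author, year = as.integer(year))
class(df2$year)   # "integer"
```

All vectors passed this way should have the same length (or a length that recycles evenly), since every column of a data frame has the same number of rows.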
1.2. Full Syntax
Below is the full syntax to create a data frame in R:
    data.frame(..., row.names = NULL, check.rows = FALSE,
               check.names = TRUE, fix.empty.names = TRUE,
               stringsAsFactors = default.stringsAsFactors())
Here are the details of the syntax:
Argument | Description |
... | Values, given either as value or as tag = value, that become the columns of the data frame. |
row.names | NULL, or a single integer or character string specifying a column to be used as row names, or a character or integer vector giving the row names for the data frame. |
check.rows | Logical. If TRUE, the rows are checked for consistency of length and names. |
check.names | Logical. If TRUE, the column names are checked to ensure that they are syntactically valid variable names and are not duplicated; if necessary they are adjusted (by make.names). |
fix.empty.names | Logical, indicating if arguments which are “unnamed” (in the sense of not being formally called as someName = arg) get an automatically constructed name or rather name “”. Needs to be set to FALSE even when check.names is false if “” names should be kept. |
stringsAsFactors | Logical: should character vectors be converted to factors? The default is FALSE since R 4.0.0. |
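The arguments above can be tried out in a short sketch. The vectors `ids` and `scores` are hypothetical data chosen only to exercise `row.names`, `check.names`, and `stringsAsFactors`.

```r
# Hypothetical data, chosen only to exercise the arguments
ids    <- c("a", "b", "c")
scores <- c(90, 85, 77)

# row.names given as a character vector of row labels
df1 <- data.frame(score = scores, row.names = ids)
rownames(df1)          # "a" "b" "c"

# check.names = TRUE (the default) rewrites invalid names via make.names()
df2 <- data.frame(`my score` = scores)
names(df2)             # "my.score"

# check.names = FALSE keeps the name exactly as given
df3 <- data.frame(`my score` = scores, check.names = FALSE)
names(df3)             # "my score"

# stringsAsFactors = TRUE converts character vectors to factors
df4 <- data.frame(id = ids, stringsAsFactors = TRUE)
class(df4$id)          # "factor"
```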
2. Creating Data Frames From Existing Data Frames
In some cases, we want to create a new data frame from an existing one, keeping only the columns we are interested in or subsetting based on some conditions.
Let’s say we have a data frame df as follows:
    > df <- data.frame(x = 1:3, y = 3:1, z = letters[1:3])
    > df
      x y z
    1 1 3 a
    2 2 2 b
    3 3 1 c
Now suppose we want to create a new data frame from it, or simply copy it into a new data frame. We can do that by passing it to the data.frame function:
    dfNew <- data.frame(df)
Let’s see the new data frame dfNew:
    > dfNew
      x y z
    1 1 3 a
    2 2 2 b
    3 3 1 c
In some situations, we want only a few columns of the existing data frame in the new one. We can do that by selecting the columns we are interested in, as follows:
    dfNew <- df[, c("x", "y")]
In the above example, we have created a new data frame dfNew from the 2 columns x and y of the data frame df. Now, let’s see its content:
    > dfNew
      x y
    1 1 3
    2 2 2
    3 3 1
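Subsetting based on a condition, mentioned at the start of this section, can be sketched in the same way. The condition `x > 1` and the choice of columns are assumptions made purely for illustration; `subset` is a base R function.

```r
df <- data.frame(x = 1:3, y = 3:1, z = letters[1:3])

# Keep rows where x is greater than 1, and only columns x and z
dfSub <- df[df$x > 1, c("x", "z")]

# Equivalent result using base R's subset()
dfSub2 <- subset(df, x > 1, select = c(x, z))

# Selecting a single column drops to a vector unless drop = FALSE is given
is.data.frame(df[, "x"])                 # FALSE
is.data.frame(df[, "x", drop = FALSE])   # TRUE
```

The `drop = FALSE` argument matters when the new object must remain a data frame even though only one column is selected.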
3. Creating Data Frames From A Matrix
To create data frames from a Matrix in R, we can use the function as.data.frame which has the syntax as follows:
    as.data.frame(x, row.names = NULL, optional = FALSE,
                  make.names = TRUE, ...,
                  stringsAsFactors = default.stringsAsFactors())
Let’s create a matrix with 2 rows and 3 columns as below:
    mt <- matrix(c(1, 2, 3, 4, 5, 6), # the data elements
                 nrow = 2,            # number of rows
                 ncol = 3,            # number of columns
                 byrow = TRUE)        # fill matrix by rows
And print out its content:
    > mt
         [,1] [,2] [,3]
    [1,]    1    2    3
    [2,]    4    5    6
Now let’s create a new data frame from the above matrix:
    df <- as.data.frame(mt)
And print out the content of the created data frame:
    > df
      V1 V2 V3
    1  1  2  3
    2  4  5  6
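The default column names V1, V2, V3 appear because the matrix has no column names. A minimal sketch of one way to get meaningful names, assuming the hypothetical names "a", "b", and "c", is to name the matrix columns before converting:

```r
mt <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

# Default: columns are named V1, V2, V3
names(as.data.frame(mt))   # "V1" "V2" "V3"

# Name the columns on the matrix first; as.data.frame() keeps them
colnames(mt) <- c("a", "b", "c")
df <- as.data.frame(mt)
names(df)                  # "a" "b" "c"
```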
4. Create An Empty Data Frame In R
Sometimes, we just want to create an empty data frame. This section covers several approaches to doing so.
4.1. By Initializing A List Of Empty Vectors
The easiest way to create an empty data frame is to pass a list of empty vectors to the data.frame function. In the example below, we create an empty data frame with 3 columns:
    df <- data.frame(col1 = character(),
                     col2 = numeric(),
                     col3 = factor(),
                     stringsAsFactors = FALSE)
Let’s print the structure of the above data frame:
    > str(df)
    'data.frame':   0 obs. of  3 variables:
     $ col1: chr
     $ col2: num
     $ col3: Factor w/ 0 levels:
It has 3 columns and 0 rows.
4.2. By Initializing A Matrix with Zero Rows
Another way to create an empty data frame is to pass the data.frame function a matrix with zero rows. Let’s see an example below:
    # Create a data frame
    df <- data.frame(matrix(ncol = 3, nrow = 0))
    # Change column names of the data frame
    x <- c("col1", "col2", "col3")
    colnames(df) <- x
In the first command, we create a data frame by passing data.frame a matrix with 3 columns and 0 rows. In the following commands, we change the column names of the data frame.
Refer to R – Rename Column of Data Frame for how to change or rename columns of a data frame in R.
Or we can do it in one shot as follows:
    df <- setNames(data.frame(matrix(ncol = 3, nrow = 0)), c("col1", "col2", "col3"))
Let’s see the structure of the data frame:
    > str(df)
    'data.frame':   0 obs. of  3 variables:
     $ col1: logi
     $ col2: logi
     $ col3: logi
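One practical difference between the two approaches is the column types: the matrix-based data frame has only logical columns, while the vector-based one has the types that were declared. The sketch below compares them; the names `dfTyped` and `dfMatrix` are hypothetical and used only for this illustration.

```r
# Approach 4.1: typed empty vectors
dfTyped <- data.frame(col1 = character(),
                      col2 = numeric(),
                      stringsAsFactors = FALSE)

# Approach 4.2: zero-row matrix, then set the names
dfMatrix <- setNames(data.frame(matrix(ncol = 2, nrow = 0)),
                     c("col1", "col2"))

sapply(dfTyped, class)    # col1 "character", col2 "numeric"
sapply(dfMatrix, class)   # col1 "logical",   col2 "logical"
```

If the column types matter for later steps, the typed-vector approach is likely the safer starting point.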
5. Conclusion
We have covered how to create a data frame from vectors, from existing data frames, and from a matrix, as well as how to create an empty data frame. The data frame is a fundamental data structure in R; how to manipulate it is covered at the following links:
Add New Column To A Data Frame in R
Remove Or Delete A Column Of A Data Frame In R
How To Order A Data Frame In R