In this article, I’d like to introduce the **R data frame** and some of its basic operations. I hope it serves as a useful reference when you work with data frames in R.

## 1. Introduction to R data frame

In R, the data frame is one of the most basic data structures. Generally speaking, a data frame is a matrix-like structure with rows and columns. You can picture it as a CSV or spreadsheet/Excel file represented in R. If we have 4 vectors with the same length n, we can combine them into a data frame that has 4 columns and n rows.

For example:

Let’s say we have a vector of names which contains names of some people:

```r
> names <- c("John", "Mary", "Daisy", "David")
```

We also have a vector of ages (stored here as birth years):

```r
> ages <- c(1984, 1985, 1986, 1987)
```

And a vector of cities:

```r
> cities <- c("New York", "London", "Sydney", "Toronto")
```

Now we can combine the vectors above into a single data frame.

```r
> df <- data.frame(names, ages, cities)
```

We will try to print out the content of the data frame df:

```r
> df
  names ages   cities
1  John 1984 New York
2  Mary 1985   London
3 Daisy 1986   Sydney
4 David 1987  Toronto
```

What we can see from the example above:

- A data frame is a list of vectors (we can call them variables) that all have the same length.
- Those vectors don’t need to have the same data type. The names and cities vectors are character vectors, while the ages vector is numeric.
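Both observations can be checked directly in R. The following is a minimal sketch that rebuilds df from the same vectors. Note that `stringsAsFactors = FALSE` is an addition here (it is not in the original call); it is passed explicitly so that the character columns stay character columns regardless of the R version.

```r
# Rebuild the example data frame from the article's vectors.
names  <- c("John", "Mary", "Daisy", "David")
ages   <- c(1984, 1985, 1986, 1987)
cities <- c("New York", "London", "Sydney", "Toronto")

# stringsAsFactors = FALSE is set explicitly (an assumption for this sketch)
# so character vectors are not converted to factors on older R versions.
df <- data.frame(names, ages, cities, stringsAsFactors = FALSE)

# A data frame is a list under the hood ...
is_list <- is.list(df)

# ... whose elements (the columns) all have the same length ...
column_lengths <- sapply(df, length)

# ... but may have different types.
column_classes <- sapply(df, class)

print(is_list)
print(column_lengths)
print(column_classes)
```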

Next, we will get to know some basic functions of an R data frame.

## 2. Some basic functions of an R data frame

In this section, we will go through some very basic functions of an R data frame that are often used when we manipulate data.

### 2.1. To create a data frame

Assume that we have several vectors of the same length: V1, V2, V3, V4…

The command below will create a data frame from those vectors.

```r
my_data_frame <- data.frame(V1, V2, V3, V4)
```
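As a small illustration of the command above, the sketch below uses three hypothetical vectors standing in for V1, V2 and V3, gives the columns explicit names (the names `id`, `score` and `passed` are made up for this example), and shows what happens when the lengths do not match.

```r
# Hypothetical vectors standing in for V1, V2, V3 in the text.
V1 <- c("a", "b", "c")
V2 <- c(10, 20, 30)
V3 <- c(TRUE, FALSE, TRUE)

# Column names can be given explicitly as argument names.
my_data_frame <- data.frame(id = V1, score = V2, passed = V3)
print(my_data_frame)

# Vectors whose lengths cannot be recycled to a common length are rejected
# (here 3 values against 2), so data.frame() raises an error.
too_short <- c(1, 2)
result <- tryCatch(
  data.frame(V1, too_short),
  error = function(e) "error"
)
print(result)
```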

### 2.2. To check the class of a data frame

```r
> class(df)
[1] "data.frame"
```
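Besides class, base R offers a couple of other ways to test whether an object is a data frame. The sketch below uses a small throwaway data frame (its columns `x` and `y` are made up for the example).

```r
# A small throwaway data frame for the check.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# is.data.frame() and inherits() both answer "is this a data frame?".
is_df          <- is.data.frame(df)
is_df_inherits <- inherits(df, "data.frame")

# A data frame is matrix-like, but it is not a matrix.
is_mat <- is.matrix(df)

print(is_df)
print(is_df_inherits)
print(is_mat)
```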

### 2.3. Gets all column names of a data frame

The function below returns a character vector of column names.

```r
> names(df)
[1] "names"  "ages"   "cities"
```
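For a data frame, colnames returns the same vector as names, and any of those names can be used to pull out a single column. A minimal sketch, rebuilding the example data frame so that it runs on its own:

```r
df <- data.frame(
  names  = c("John", "Mary", "Daisy", "David"),
  ages   = c(1984, 1985, 1986, 1987),
  cities = c("New York", "London", "Sydney", "Toronto")
)

# names() and colnames() agree for a data frame.
column_names     <- names(df)
same_as_colnames <- identical(column_names, colnames(df))

# A column name selects the underlying vector.
ages_column <- df[["ages"]]

print(column_names)
print(same_as_colnames)
print(ages_column)
```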

### 2.4. To count the number of rows and number of columns of an R data frame (dimensions)

```r
> dim(df)
[1] 4 3
```

The function above returns a vector indicating that the data frame has 4 rows and 3 columns.

### 2.5. To count only the number of rows (observations)

```r
> nrow(df)
[1] 4
```

### 2.6. To count only the number of columns (variables)

```r
> ncol(df)
[1] 3
```

The *length* function gives the same result:

```r
> length(df)
[1] 3
```

This follows from the definition: a data frame is a list of vectors. So the length of that list is the number of vectors (columns, or variables).
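The relationship between dim, nrow, ncol and length can be verified in a few lines. This is only a sketch that rebuilds the example data frame so it can be run on its own:

```r
df <- data.frame(
  names  = c("John", "Mary", "Daisy", "David"),
  ages   = c(1984, 1985, 1986, 1987),
  cities = c("New York", "London", "Sydney", "Toronto")
)

# dim() returns c(rows, columns).
d <- dim(df)

# The first element matches nrow(), the second matches ncol().
rows_match <- d[1] == nrow(df)
cols_match <- d[2] == ncol(df)

# Because a data frame is a list of columns, length() also counts columns.
length_matches <- length(df) == ncol(df)

print(d)
print(c(rows_match, cols_match, length_matches))
```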

### 2.7. To preview top rows of a data frame

```r
> head(df)
  names ages   cities
1  John 1984 New York
2  Mary 1985   London
3 Daisy 1986   Sydney
4 David 1987  Toronto
```

### 2.8. To preview top n rows of a data frame

```r
> head(df, 3)
  names ages   cities
1  John 1984 New York
2  Mary 1985   London
3 Daisy 1986   Sydney
```

### 2.9. To preview last rows of a data frame

```r
> tail(df)
  names ages   cities
1  John 1984 New York
2  Mary 1985   London
3 Daisy 1986   Sydney
4 David 1987  Toronto
```

### 2.10. To preview last n rows of a data frame

```r
> tail(df, 2)
  names ages  cities
3 Daisy 1986  Sydney
4 David 1987 Toronto
```
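Both head and tail default to n = 6, which is why calling them without n on a 4-row data frame shows every row. They also accept a negative n, which drops rows instead of keeping them. A short sketch, again rebuilding the example data frame so it runs on its own:

```r
df <- data.frame(
  names  = c("John", "Mary", "Daisy", "David"),
  ages   = c(1984, 1985, 1986, 1987),
  cities = c("New York", "London", "Sydney", "Toronto")
)

# A negative n drops rows instead of keeping them.
all_but_last  <- head(df, -1)   # rows 1 to 3
all_but_first <- tail(df, -1)   # rows 2 to 4

# tail() keeps the original row names, so the last two rows
# are still labelled 3 and 4.
last_two <- tail(df, 2)

print(all_but_last)
print(all_but_first)
print(rownames(last_two))
```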

### 2.11. To summarize an R data frame

The summary includes how each variable (column) is distributed, how much of the dataset is missing (has NA values), and so on.

```r
> summary(df)
   names        ages           cities
 Daisy:1   Min.   :1984   London  :1
 David:1   1st Qu.:1985   New York:1
 John :1   Median :1986   Sydney  :1
 Mary :1   Mean   :1986   Toronto :1
           3rd Qu.:1986
           Max.   :1987
```
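The example data has no missing values, so the output above does not show how they are reported. The sketch below is a variation on the example in which one age is deliberately replaced with NA (a change made only for illustration); it counts the missing values per column and shows that the numeric summary gains an `NA's` entry.

```r
# Same example data, but with one age deliberately set to NA
# (an assumption made only for this illustration).
df <- data.frame(
  names  = c("John", "Mary", "Daisy", "David"),
  ages   = c(1984, NA, 1986, 1987),
  cities = c("New York", "London", "Sydney", "Toronto"),
  stringsAsFactors = FALSE
)

# Count missing values in each column.
na_per_column <- colSums(is.na(df))

# The summary of a numeric column reports the number of NAs.
ages_summary <- summary(df$ages)

print(na_per_column)
print(ages_summary)
```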

### 2.12. To display the internal structure of a data frame (object in R)

```r
> str(df)
'data.frame':	4 obs. of  3 variables:
 $ names : Factor w/ 4 levels "Daisy","David",..: 3 4 1 2
 $ ages  : num  1984 1985 1986 1987
 $ cities: Factor w/ 4 levels "London","New York",..: 2 1 3 4
```

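The *str* function has many parameters and options. You can look them up on its help page by running ?str in R.

Note that the Factor columns in the output above reflect R versions before 4.0.0, where data.frame() converted character vectors to factors by default. In R 4.0.0 and later, the same columns appear as chr unless stringsAsFactors = TRUE is passed.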
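As one example of those options, the `vec.len` argument of str controls how many leading values of each column are displayed. The sketch below is only an illustration. It also uses `capture.output` to collect the printed lines, which is handy when the structure needs to be logged rather than printed.

```r
df <- data.frame(
  names  = c("John", "Mary", "Daisy", "David"),
  ages   = c(1984, 1985, 1986, 1987),
  cities = c("New York", "London", "Sydney", "Toronto")
)

# vec.len limits how many leading values of each column are shown.
str(df, vec.len = 1)

# str() prints its result; capture.output() collects the printed lines
# as a character vector (one header line plus one line per column).
compact_lines <- capture.output(str(df, vec.len = 1))
print(length(compact_lines))
```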

## 3. Summary

We have just learned about the **R data frame** and some basic functions that we often need when working with it. The data frame is popular, efficient, and widely used to manipulate data in R.

Below are other articles related to the R data frame that you can refer to if you’re interested:

- R – Rename Column of Data Frame
- R – Add New Column To A Data Frame