Introduction
So far, we have learned about vectors in R. A vector is similar to a one-dimensional array and holds elements of only one data type, i.e., all elements of a vector are of the same type.
Now, we will learn about the Data Frame in R. Let's get started.
Data Frame in R
A Data Frame in R is a two-dimensional data structure made up of rows and columns. Different columns can hold different data types, such as numeric, character, or logical. The one thing to remember when creating it is that, within any single column, all the elements must be of the same type, i.e., all numeric, all character, all logical, and so on.
In general, we can say the data frame is a more general form of the matrix: a matrix holds a single data type throughout, while a data frame is a collection of equal-length vectors, one per column, each of which may have its own type.
Let's look at the example Data Frame in the image below. As you can see, there are seven columns containing elements of different data types, but within any single column all elements are of the same type, which matches the definition of a Data Frame given above.
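As a rough illustration of the definition, the sketch below builds a tiny data frame and a matrix from similar values and compares their types. The object names (df, m) and the values are made up for this example, and stringsAsFactors = FALSE is passed only so that the text column stays character on R versions older than 4.0.

```r
# A small, made-up data frame: each column keeps its own type
df <- data.frame(
  id     = 1:3,                     # integer column
  name   = c("a", "b", "c"),        # character column
  active = c(TRUE, FALSE, TRUE),    # logical column
  stringsAsFactors = FALSE          # keep strings as character on R < 4.0
)

# One class per column
col_classes <- sapply(df, class)
print(col_classes)

# Internally, a data frame is a list of equal-length vectors
print(is.list(df))

# A matrix, by contrast, coerces every value to a single type
m <- cbind(id = 1:3, name = c("a", "b", "c"))
print(typeof(m))
```

Here the matrix turns the numbers into text, while the data frame keeps each column's type separate.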
Creating Data Frame in R Studio
To create a Data Frame in R, we use the function data.frame(). The steps below show how to create one.
First, we create the vectors Rank, Country, 2019 Population, 2018 Population, and Growth Rate, and then we use these vectors to create the Data Frame shown above.
We can create a vector with the c() function or, for a sequence of consecutive integers, with the colon operator (:). We will use both methods here, as shown below.
- Rank <- 1:10
- Country <- c("China", "India", "United States", "Indonesia", "Pakistan", "Brazil",
-              "Nigeria", "Bangladesh", "Russia", "Mexico")
- Population.2019 <- c(1433783686, 1366417754, 329064917, 270625568, 216565318,
-                      211049527, 200963599, 163046161, 145872256, 127575529)
- Population.2018 <- c(1427647786, 1352642280, 327096265, 267670543, 212228286,
-                      209469323, 195874683, 161376708, 145734038, 126190788)
- Growth.Rate <- c("0.43%", "1.02%", "0.60%", "1.10%", "2.04%", "0.75%", "2.60%",
-                  "1.03%", "0.09%", "1.10%")
- DataFrame.WorldPopulation <- data.frame(Rank, Country, Population.2019, Population.2018,
-                                         Growth.Rate)
- DataFrame.WorldPopulation
The image below shows these steps in RStudio.
The next image shows the output of the newly created Data Frame in RStudio. It matches the first image, which is the Data Frame we set out to create.
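One possible way to check the result without relying on screenshots is to inspect the structure of the data frame from the console. The sketch below uses an abbreviated copy of the data (first three rows only) under the hypothetical name wp. The extra numeric column Growth.Rate.Num is an assumption added for illustration and is not part of the original example.

```r
# Abbreviated copy of the data frame above (first three rows only)
wp <- data.frame(
  Rank            = 1:3,
  Country         = c("China", "India", "United States"),
  Population.2019 = c(1433783686, 1366417754, 329064917),
  Population.2018 = c(1427647786, 1352642280, 327096265),
  Growth.Rate     = c("0.43%", "1.02%", "0.60%"),
  stringsAsFactors = FALSE          # keep strings as character on R < 4.0
)

print(dim(wp))     # number of rows, then number of columns
print(names(wp))   # column names
str(wp)            # type and first values of every column

# Growth.Rate was entered as text, so a numeric copy is needed
# before comparisons such as "greater than 1" make sense
wp$Growth.Rate.Num <- as.numeric(sub("%", "", wp$Growth.Rate, fixed = TRUE))
print(wp$Growth.Rate.Num)
```

str() is usually the quickest of the three, because it shows the type of every column at a glance.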
Filtering/Selecting Elements of R Data Frame
So far, we have created a Data Frame in R. Now we shall learn how to filter its elements. We can filter the elements of a Data Frame in the same way as we select the elements of a matrix in R.
Filtering Element By Index
To select all the countries, we pass the index 2 in square brackets [], as shown below, because Country is the second column. Indexing in an R Data Frame starts at 1, not 0.
- DataFrame.WorldPopulation[2]
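It may help to know that the shape of the result depends on which brackets are used. The sketch below compares three base R forms on a shortened, hypothetical copy of the data named wp.

```r
# Shortened, illustrative copy of the data frame
wp <- data.frame(
  Rank    = 1:3,
  Country = c("China", "India", "United States"),
  stringsAsFactors = FALSE          # keep strings as character on R < 4.0
)

as_df    <- wp[2]     # single brackets: still a data frame, with one column
as_vec   <- wp[[2]]   # double brackets: the column itself, as a vector
also_vec <- wp[, 2]   # matrix-style index: a single column drops to a vector

print(class(as_df))
print(class(as_vec))
print(identical(as_vec, also_vec))
```

Single brackets are the safer choice when the result will be passed to code that expects a data frame.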
Filtering Element By Column Name
To filter Data Frame elements by column name, we simply pass the column name, in quotes, in square brackets.
For example, to select all the elements of Growth Rate, we use the following code.
- DataFrame.WorldPopulation["Growth.Rate"]
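Column names are case-sensitive, and the two common ways of naming a column react differently to a misspelling. The sketch below, on a shortened hypothetical copy named wp, shows the behaviour in base R; the misspelled name growth.rate is deliberate.

```r
# Shortened, illustrative copy of the data frame
wp <- data.frame(
  Country     = c("China", "India", "United States"),
  Growth.Rate = c("0.43%", "1.02%", "0.60%"),
  stringsAsFactors = FALSE          # keep strings as character on R < 4.0
)

one_col_df <- wp["Growth.Rate"]   # a data frame with one column
one_col    <- wp$Growth.Rate      # the same column as a plain vector

# A misspelled name inside [] stops with an error, captured here as text
bad_bracket <- tryCatch(
  wp["growth.rate"],
  error = function(e) conditionMessage(e)
)

# With $, a name that matches no column quietly gives NULL instead
bad_dollar <- wp$growth.rate

print(bad_bracket)
print(is.null(bad_dollar))
```

Because $ fails silently, an unexpected NULL is often a sign of a typo in the column name.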
Filtering Elements Of Multiple Columns By Column Name
To select multiple columns, we pass a vector of column names in the square brackets. The code below selects Country, 2019 Population, and Growth Rate.
- DataFrame.WorldPopulation[c("Country", "Population.2019", "Growth.Rate")]
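The same selection can also be written with column positions, or by excluding the columns that are not wanted. The sketch below is only an illustration on a shortened, hypothetical copy named wp.

```r
# Shortened, illustrative copy of the data frame (first three rows only)
wp <- data.frame(
  Rank            = 1:3,
  Country         = c("China", "India", "United States"),
  Population.2019 = c(1433783686, 1366417754, 329064917),
  Population.2018 = c(1427647786, 1352642280, 327096265),
  Growth.Rate     = c("0.43%", "1.02%", "0.60%"),
  stringsAsFactors = FALSE          # keep strings as character on R < 4.0
)

by_name      <- wp[c("Country", "Population.2019", "Growth.Rate")]
by_position  <- wp[c(2, 3, 5)]    # the same columns, chosen by position
by_exclusion <- wp[-c(1, 4)]      # negative positions drop Rank and Population.2018

print(identical(by_name, by_position))
print(identical(by_name, by_exclusion))
```

Names tend to be more robust than positions, since positions shift whenever a column is added or removed.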
Filtering Elements Of Particular Row By Passing Criteria
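To select all the elements, i.e., all columns, of the row whose Rank is 2, we use the following condition. The comma after the condition, with nothing following it, means that every column is kept.
- DataFrame.WorldPopulation[DataFrame.WorldPopulation$Rank == 2, ]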
The output is as shown below.
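To see why this works, it can help to look at the condition on its own: it produces one TRUE or FALSE per row, and the brackets keep the rows marked TRUE. The sketch below uses a shortened, hypothetical copy named wp, and also shows base R's subset() as one alternative spelling.

```r
# Shortened, illustrative copy of the data frame
wp <- data.frame(
  Rank    = 1:3,
  Country = c("China", "India", "United States"),
  stringsAsFactors = FALSE          # keep strings as character on R < 4.0
)

is_rank_2 <- wp$Rank == 2           # logical vector, one value per row
print(is_rank_2)

by_bracket <- wp[is_rank_2, ]       # rows where the condition is TRUE, all columns
by_subset  <- subset(wp, Rank == 2) # base R helper that reads column names directly

print(by_bracket)
print(by_subset)
```

Both forms return a data frame with a single row, the one for India.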
Filtering Data Frame Elements By And (&) Condition
We use a single ampersand (&) to combine conditions when a selection must satisfy several criteria at once. The code below selects all the rows of the Data Frame whose growth rate is equal to 1.10% and whose rank is greater than 5.
- DataFrame.WorldPopulation[DataFrame.WorldPopulation$Growth.Rate == "1.10%" & DataFrame.WorldPopulation$Rank > 5, ]
You can see the output as shown in the below image.
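The single & compares the two conditions row by row, which is what row filtering needs; the double form && is meant for single TRUE/FALSE values, such as in an if statement. The sketch below splits the combined condition into its parts on three rows taken from the data (Indonesia, Pakistan, Mexico). The name wp and the use of | for an "or" condition are additions for illustration.

```r
# Three rows taken from the data above, under an illustrative name
wp <- data.frame(
  Rank        = c(4, 5, 10),
  Country     = c("Indonesia", "Pakistan", "Mexico"),
  Growth.Rate = c("1.10%", "2.04%", "1.10%"),
  stringsAsFactors = FALSE          # keep strings as character on R < 4.0
)

rate_match <- wp$Growth.Rate == "1.10%"   # text comparison, one value per row
rank_match <- wp$Rank > 5                 # numeric comparison, one value per row

both   <- wp[rate_match & rank_match, ]   # rows meeting both conditions
either <- wp[rate_match | rank_match, ]   # rows meeting at least one condition

print(both)
print(either)
```

Indonesia also has a growth rate of 1.10%, but its rank of 4 fails the second condition, so only Mexico satisfies both.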
Remember that, to create a Data Frame in R, the number of elements in each column should be the same; otherwise R reports an error about a differing number of rows. When I created Rank using 1:11 and used all the other vectors to create the Data Frame, R showed the error in the image below.
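A minimal way to reproduce this error without stopping a script is to wrap the call in tryCatch() and print the message. The sketch below uses made-up placeholder values. It also shows one exception worth knowing: data.frame() recycles a shorter vector when its length divides the longer one evenly.

```r
# 11 ranks but only 10 countries: the lengths do not fit together
msg <- tryCatch(
  {
    data.frame(Rank = 1:11, Country = rep("x", 10))
    "no error"
  },
  error = function(e) conditionMessage(e)   # capture the error text
)
print(msg)

# Lengths 4 and 2 do fit: the shorter vector is repeated to fill 4 rows
recycled <- data.frame(Rank = 1:4, Flag = c(TRUE, FALSE))
print(recycled)
```

Because of this recycling, a mismatch in lengths does not always produce an error, so checking nrow() after creating a data frame can be a useful habit.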
Summary
In this article, we have learned about the Data Frame in R. We have seen what a Data Frame is, how to create one, and how to filter its elements.
I hope you learned and enjoyed it. I look forward to seeing your feedback.