How To Access Data Of Predefined Datasets In R

Introduction

Data can come from many sources. A standard R installation ships with several built-in packages, and many of them include predefined datasets that can be used for analysis. R can also read data from a broad range of external sources and in a large number of formats.

In this article, I will discuss how to access a single predefined dataset as well as how to list the datasets belonging to different packages in R. I will cover the syntax for accessing a dataset from a single package and for listing all the datasets across the installed packages.

Accessing predefined datasets in R

R ships with a number of predefined datasets, which are bundled in a package called datasets. This variety means that suitable sample data is available for many kinds of projects and analysis techniques.

A wide variety of datasets is also available in other R packages. Calling the data() function with no arguments lists the datasets available in the packages that are currently loaded.

In a fresh session, the following call lists the contents of the datasets package.

data()

The above call gives the following output.

Data sets in package ‘datasets’:

AirPassengers            Monthly Airline Passenger Numbers 1949-1960
BJsales                  Sales Data with Leading Indicator
BJsales.lead (BJsales)   Sales Data with Leading Indicator
BOD                      Biochemical Oxygen Demand
CO2                      Carbon Dioxide Uptake in Grass Plants
ChickWeight              Weight versus age of chicks on different diets
DNase                    Elisa assay of DNase
EuStockMarkets           Daily Closing Prices of Major European Stock
                         Indices, 1991-1998
Formaldehyde             Determination of Formaldehyde
HairEyeColor             Hair and Eye Color of Statistics Students
Harman23.cor             Harman Example 2.3
Harman74.cor             Harman Example 7.4
Indometh                 Pharmacokinetics of Indomethacin
InsectSprays             Effectiveness of Insect Sprays
JohnsonJohnson           Quarterly Earnings per Johnson & Johnson Share
LakeHuron                Level of Lake Huron 1875-1972
LifeCycleSavings         Intercountry Life-Cycle Savings Data
Loblolly                 Growth of Loblolly pine trees
Nile                     Flow of the River Nile
Orange                   Growth of Orange Trees
OrchardSprays            Potency of Orchard Sprays
PlantGrowth              Results from an Experiment on Plant Growth
Puromycin                Reaction Velocity of an Enzymatic Reaction
Seatbelts                Road Casualties in Great Britain 1969-84
Theoph                   Pharmacokinetics of Theophylline
Titanic                  Survival of passengers on the Titanic
ToothGrowth              The Effect of Vitamin C on Tooth Growth in
                         Guinea Pigs
UCBAdmissions            Student Admissions at UC Berkeley
UKDriverDeaths           Road Casualties in Great Britain 1969-84
UKgas                    UK Quarterly Gas Consumption
USAccDeaths              Accidental Deaths in the US 1973-1978
USArrests                Violent Crime Rates by US State
USJudgeRatings           Lawyers' Ratings of State Judges in the US
                         Superior Court
USPersonalExpenditure    Personal Expenditure Data
UScitiesD                Distances Between European Cities and Between
                         US Cities
VADeaths                 Death Rates in Virginia (1940)
WWWusage                 Internet Usage per Minute
WorldPhones              The World's Telephones
ability.cov              Ability and Intelligence Tests
airmiles                 Passenger Miles on Commercial US Airlines,
                         1937-1960
airquality               New York Air Quality Measurements
anscombe                 Anscombe's Quartet of 'Identical' Simple Linear
                         Regressions
attenu                   The Joyner-Boore Attenuation Data
attitude                 The Chatterjee-Price Attitude Data
austres                  Quarterly Time Series of the Number of
                         Australian Residents
beaver1 (beavers)        Body Temperature Series of Two Beavers
beaver2 (beavers)        Body Temperature Series of Two Beavers
cars                     Speed and Stopping Distances of Cars
chickwts                 Chicken Weights by Feed Type
co2                      Mauna Loa Atmospheric CO2 Concentration
crimtab                  Student's 3000 Criminals Data
discoveries              Yearly Numbers of Important Discoveries
esoph                    Smoking, Alcohol and (O)esophageal Cancer
euro                     Conversion Rates of Euro Currencies
euro.cross (euro)        Conversion Rates of Euro Currencies
eurodist                 Distances Between European Cities and Between
                         US Cities
faithful                 Old Faithful Geyser Data
fdeaths (UKLungDeaths)   Monthly Deaths from Lung Diseases in the UK
freeny                   Freeny's Revenue Data
freeny.x (freeny)        Freeny's Revenue Data
freeny.y (freeny)        Freeny's Revenue Data
infert                   Infertility after Spontaneous and Induced
                         Abortion
iris                     Edgar Anderson's Iris Data
iris3                    Edgar Anderson's Iris Data
islands                  Areas of the World's Major Landmasses
ldeaths (UKLungDeaths)   Monthly Deaths from Lung Diseases in the UK
lh                       Luteinizing Hormone in Blood Samples
longley                  Longley's Economic Regression Data
lynx                     Annual Canadian Lynx trappings 1821-1934
mdeaths (UKLungDeaths)   Monthly Deaths from Lung Diseases in the UK
morley                   Michelson Speed of Light Data
mtcars                   Motor Trend Car Road Tests
nhtemp                   Average Yearly Temperatures in New Haven
nottem                   Average Monthly Temperatures at Nottingham,
                         1920-1939
npk                      Classical N, P, K Factorial Experiment
occupationalStatus       Occupational Status of Fathers and their Sons
precip                   Annual Precipitation in US Cities
presidents               Quarterly Approval Ratings of US Presidents
pressure                 Vapor Pressure of Mercury as a Function of
                         Temperature
quakes                   Locations of Earthquakes off Fiji
randu                    Random Numbers from Congruential Generator
                         RANDU
rivers                   Lengths of Major North American Rivers
rock                     Measurements on Petroleum Rock Samples
sleep                    Student's Sleep Data
stack.loss (stackloss)   Brownlee's Stack Loss Plant Data
stack.x (stackloss)      Brownlee's Stack Loss Plant Data
stackloss                Brownlee's Stack Loss Plant Data
state.abb (state)        US State Facts and Figures
state.area (state)       US State Facts and Figures
state.center (state)     US State Facts and Figures
state.division (state)   US State Facts and Figures
state.name (state)       US State Facts and Figures
state.region (state)     US State Facts and Figures
state.x77 (state)        US State Facts and Figures
sunspot.month            Monthly Sunspot Data, from 1749 to "Present"
sunspot.year             Yearly Sunspot Data, 1700-1988
sunspots                 Monthly Sunspot Numbers, 1749-1983
swiss                    Swiss Fertility and Socioeconomic Indicators
                         (1888) Data
treering                 Yearly Treering Data, -6000-1979
trees                    Diameter, Height and Volume for Black Cherry
                         Trees
uspop                    Populations Recorded by the US Census
volcano                  Topographic Information on Auckland's Maunga
                         Whau Volcano
warpbreaks               The Number of Breaks in Yarn during Weaving
women                    Average Heights and Weights for American Women
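
The listing above can also be captured as an R object instead of only being displayed. The following is a minimal sketch: data() returns an object of class "packageIQR" whose results component is a character matrix with the columns Package, LibPath, Item and Title. The variable names listing and items are illustrative only.

```r
# Capture the listing for the 'datasets' package instead of just printing it
listing <- data(package = "datasets")

# 'results' is a character matrix with columns Package, LibPath, Item, Title
items <- listing$results[, c("Item", "Title")]

# Show the first few entries and count how many are listed
print(head(items))
cat("Number of entries:", nrow(items), "\n")

# Check whether a particular dataset name appears in the listing
"cars" %in% items[, "Item"]
```

Working with the matrix makes it possible to search or filter the listing, which is handy when the printed list is long.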

The datasets in every installed package can be listed using the following syntax:

data(package = .packages(all.available = TRUE))

This call lists the datasets in all installed packages, whether or not they are currently loaded. The exact output depends on which packages are installed; an excerpt is shown below.

Data sets in package ‘boot’:  
  
acme                    Monthly Excess Returns  
aids                    Delay in AIDS Reporting in England and Wales  
aircondit               Failures of Air-conditioning Equipment  
aircondit7              Failures of Air-conditioning Equipment  
amis                    Car Speeding and Warning Signs  
aml                     Remission Times for Acute Myelogenous Leukaemia  
beaver                  Beaver Body Temperature Data  
bigcity                 Population of U.S. Cities  
brambles                Spatial Location of Bramble Canes  
breslow                 Smoking Deaths Among Doctors  
calcium                 Calcium Uptake Data  
cane                    Sugar-cane Disease Data  
capability              Simulated Manufacturing Process Data  
catsM                   Weight Data for Domestic Cats  
cav                     Position of Muscle Caveolae  
cd4                     CD4 Counts for HIV-Positive Patients  
cd4.nested              Nested Bootstrap of cd4 data  
channing                Channing House Data  
city                    Population of U.S. Cities  
claridge                Genetic Links to Left-handedness  
cloth                   Number of Flaws in Cloth  
co.transfer             Carbon Monoxide Transfer  
coal                    Dates of Coal Mining Disasters  
darwin                  Darwin's Plant Height Differences  
dogs                    Cardiac Data for Domestic Dogs  
downs.bc                Incidence of Down's Syndrome in British  
                        Columbia  
ducks                   Behavioral and Plumage Characteristics of  
                        Hybrid Ducks  
fir                     Counts of Balsam-fir Seedlings  
frets                   Head Dimensions in Brothers  
grav                    Acceleration Due to Gravity  
gravity                 Acceleration Due to Gravity  
hirose                  Failure Time of PET Film  
islay                   Jura Quartzite Azimuths on Islay  
manaus                  Average Heights of the Rio Negro river at  
                        Manaus  
melanoma                Survival from Malignant Melanoma  
motor                   Data from a Simulated Motorcycle Accident  
neuro                   Neurophysiological Point Process Data  
nitrofen                Toxicity of Nitrofen in Aquatic Systems  
nodal                   Nodal Involvement in Prostate Cancer  
nuclear                 Nuclear Power Station Construction Data  
paulsen                 Neurotransmission in Guinea Pig Brains  
poisons                 Animal Survival Times  
polar                   Pole Positions of New Caledonian Laterites  
remission               Cancer Remission and Cell Activity  
salinity                Water Salinity and River Discharge  
survival                Survival of Rats after Radiation Doses  
tau                     Tau Particle Decay Modes  
tuna                    Tuna Sighting Data  
urine                   Urine Analysis Data  
wool                    Australian Relative Wool Prices  
  
Data sets in package ‘cluster’:  
  
agriculture             European Union Agricultural Workforces  
animals                 Attributes of Animals  
chorSub                 Subset of C-horizon of Kola Data  
flower                  Flower Characteristics  
plantTraits             Plant Species Traits Data  
pluton                  Isotopic Composition Plutonium Batches  
ruspini                 Ruspini Data  
votes.repub             Votes for Republican Candidate in Presidential  
                        Elections  
xclara                  Bivariate Data Set with 3 Clusters  
  
Data sets in package ‘datasets’:  
  
AirPassengers           Monthly Airline Passenger Numbers 1949-1960  
BJsales                 Sales Data with Leading Indicator  
BJsales.lead (BJsales)  
                        Sales Data with Leading Indicator  
BOD                     Biochemical Oxygen Demand  
CO2                     Carbon Dioxide Uptake in Grass Plants  
ChickWeight             Weight versus age of chicks on different diets  
DNase                   Elisa assay of DNase  
EuStockMarkets          Daily Closing Prices of Major European Stock  
                        Indices, 1991-1998  
Formaldehyde            Determination of Formaldehyde  
HairEyeColor            Hair and Eye Color of Statistics Students  
Harman23.cor            Harman Example 2.3  
Harman74.cor            Harman Example 7.4  
Indometh                Pharmacokinetics of Indomethacin  
InsectSprays            Effectiveness of Insect Sprays  
JohnsonJohnson          Quarterly Earnings per Johnson & Johnson Share  
LakeHuron               Level of Lake Huron 1875-1972  
LifeCycleSavings        Intercountry Life-Cycle Savings Data  
Loblolly                Growth of Loblolly pine trees  
Nile                    Flow of the River Nile  
Orange                  Growth of Orange Trees  
OrchardSprays           Potency of Orchard Sprays  
PlantGrowth             Results from an Experiment on Plant Growth  
Puromycin               Reaction Velocity of an Enzymatic Reaction  
Seatbelts               Road Casualties in Great Britain 1969-84  
Theoph                  Pharmacokinetics of Theophylline  
Titanic                 Survival of passengers on the Titanic  
ToothGrowth             The Effect of Vitamin C on Tooth Growth in  
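
Because the full listing can be very long, it may be easier to summarise it. The sketch below counts the listed entries per package; it assumes that the call may emit warnings for the base and stats packages (whose datasets live in the datasets package), so they are suppressed. The names all_listing and per_package are illustrative.

```r
# List datasets across every installed package; any warnings about
# packages whose data has moved to 'datasets' are suppressed here
all_listing <- suppressWarnings(
  data(package = .packages(all.available = TRUE))
)

# Count the listed entries per package
per_package <- table(all_listing$results[, "Package"])

# Show the packages with the most entries first
print(sort(per_package, decreasing = TRUE))
```

The counts will differ from one machine to another, since they depend on the packages installed there.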

A specific dataset can be loaded with the data() function by passing the name of the dataset and the package where it is found, as follows.

data("cars", package = "datasets")

The cars dataset in package ‘datasets’:

> data("cars", package = "datasets")  
> head(cars)  
  speed dist  
1     4    2  
2     4   10  
3     7    4  
4     7   22  
5     8   16  
6     9   10  
>  

As we can see, the cars data frame contains two variables: the speed of the cars and their stopping distance.
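
Once a dataset is available, it can be inspected like any other data frame. The sketch below is one possible way to explore cars. It relies on the datasets package being lazy-loaded, so the object can also be used without calling data() first. The names mean_speed and mean_dist are illustrative.

```r
# Datasets in 'datasets' are lazy-loaded, so the object is available directly
dim(cars)   # number of rows and columns
str(cars)   # column names and types

# Columns can be accessed by name with the $ operator
mean_speed <- mean(cars$speed)
mean_dist  <- mean(cars$dist)
cat("Mean speed:", mean_speed, "| mean distance:", mean_dist, "\n")

# The package-qualified form avoids ambiguity if another object is named 'cars'
head(datasets::cars, 3)
```

The package-qualified form datasets::cars is useful in scripts, where a variable with the same name might already exist in the workspace.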

Summary

In this article, I demonstrated how to access a single dataset as well as how to list the datasets belonging to different packages in R. I discussed the syntax for accessing a dataset from a single package and for listing all the datasets across the installed packages. Code snippets and their output are also provided.
