Introduction
Data can come from many sources. A standard R installation ships with several built-in packages, and many of them include predefined datasets that can be used for analysis. R can also read data from a broad range of external sources and in a large number of formats.
In this article, I will discuss how to access a single predefined dataset as well as the datasets belonging to different packages in R. I will cover the syntax for listing and loading the datasets of a single package and the syntax for listing the datasets of all installed packages.
Accessing predefined datasets in R
R ships with a variety of predefined datasets, which are bundled in a package called datasets. This variety means that a suitable dataset is usually available for trying out a given analysis technique or for use in a small project.
Many other R packages also include datasets. The data() function, called with no arguments, lists the datasets available in the packages that are currently loaded.
To list the datasets in the datasets package, which is attached by default, we can use the syntax given below.
data()
The above syntax will give the following output.
Data sets in package ‘datasets’:
AirPassengers Monthly Airline Passenger Numbers 1949-1960
BJsales Sales Data with Leading Indicator
BJsales.lead (BJsales) Sales Data with Leading Indicator
BOD Biochemical Oxygen Demand
CO2 Carbon Dioxide Uptake in Grass Plants
ChickWeight Weight versus age of chicks on different diets
DNase Elisa assay of DNase
EuStockMarkets Daily Closing Prices of Major European Stock
Indices, 1991-1998
Formaldehyde Determination of Formaldehyde
HairEyeColor Hair and Eye Color of Statistics Students
Harman23.cor Harman Example 2.3
Harman74.cor Harman Example 7.4
Indometh Pharmacokinetics of Indomethacin
InsectSprays Effectiveness of Insect Sprays
JohnsonJohnson Quarterly Earnings per Johnson & Johnson Share
LakeHuron Level of Lake Huron 1875-1972
LifeCycleSavings Intercountry Life-Cycle Savings Data
Loblolly Growth of Loblolly pine trees
Nile Flow of the River Nile
Orange Growth of Orange Trees
OrchardSprays Potency of Orchard Sprays
PlantGrowth Results from an Experiment on Plant Growth
Puromycin Reaction Velocity of an Enzymatic Reaction
Seatbelts Road Casualties in Great Britain 1969-84
Theoph Pharmacokinetics of Theophylline
Titanic Survival of passengers on the Titanic
ToothGrowth The Effect of Vitamin C on Tooth Growth in
Guinea Pigs
UCBAdmissions Student Admissions at UC Berkeley
UKDriverDeaths Road Casualties in Great Britain 1969-84
UKgas UK Quarterly Gas Consumption
USAccDeaths Accidental Deaths in the US 1973-1978
USArrests Violent Crime Rates by US State
USJudgeRatings Lawyers' Ratings of State Judges in the US
Superior Court
USPersonalExpenditure Personal Expenditure Data
UScitiesD Distances Between European Cities and Between
US Cities
VADeaths Death Rates in Virginia (1940)
WWWusage Internet Usage per Minute
WorldPhones The World's Telephones
ability.cov Ability and Intelligence Tests
airmiles Passenger Miles on Commercial US Airlines,
1937-1960
airquality New York Air Quality Measurements
anscombe Anscombe's Quartet of 'Identical' Simple Linear
Regressions
attenu The Joyner-Boore Attenuation Data
attitude The Chatterjee-Price Attitude Data
austres Quarterly Time Series of the Number of
Australian Residents
beaver1 (beavers) Body Temperature Series of Two Beavers
beaver2 (beavers) Body Temperature Series of Two Beavers
cars Speed and Stopping Distances of Cars
chickwts Chicken Weights by Feed Type
co2 Mauna Loa Atmospheric CO2 Concentration
crimtab Student's 3000 Criminals Data
discoveries Yearly Numbers of Important Discoveries
esoph Smoking, Alcohol and (O)esophageal Cancer
euro Conversion Rates of Euro Currencies
euro.cross (euro) Conversion Rates of Euro Currencies
eurodist Distances Between European Cities and Between
US Cities
faithful Old Faithful Geyser Data
fdeaths (UKLungDeaths) Monthly Deaths from Lung Diseases in the UK
freeny Freeny's Revenue Data
freeny.x (freeny) Freeny's Revenue Data
freeny.y (freeny) Freeny's Revenue Data
infert Infertility after Spontaneous and Induced
Abortion
iris Edgar Anderson's Iris Data
iris3 Edgar Anderson's Iris Data
islands Areas of the World's Major Landmasses
ldeaths (UKLungDeaths) Monthly Deaths from Lung Diseases in the UK
lh Luteinizing Hormone in Blood Samples
longley Longley's Economic Regression Data
lynx Annual Canadian Lynx trappings 1821-1934
mdeaths (UKLungDeaths) Monthly Deaths from Lung Diseases in the UK
morley Michelson Speed of Light Data
mtcars Motor Trend Car Road Tests
nhtemp Average Yearly Temperatures in New Haven
nottem Average Monthly Temperatures at Nottingham,
1920-1939
npk Classical N, P, K Factorial Experiment
occupationalStatus Occupational Status of Fathers and their Sons
precip Annual Precipitation in US Cities
presidents Quarterly Approval Ratings of US Presidents
pressure Vapor Pressure of Mercury as a Function of
Temperature
quakes Locations of Earthquakes off Fiji
randu Random Numbers from Congruential Generator
RANDU
rivers Lengths of Major North American Rivers
rock Measurements on Petroleum Rock Samples
sleep Student's Sleep Data
stack.loss (stackloss) Brownlee's Stack Loss Plant Data
stack.x (stackloss) Brownlee's Stack Loss Plant Data
stackloss Brownlee's Stack Loss Plant Data
state.abb (state) US State Facts and Figures
state.area (state) US State Facts and Figures
state.center (state) US State Facts and Figures
state.division (state) US State Facts and Figures
state.name (state) US State Facts and Figures
state.region (state) US State Facts and Figures
state.x77 (state) US State Facts and Figures
sunspot.month Monthly Sunspot Data, from 1749 to "Present"
sunspot.year Yearly Sunspot Data, 1700-1988
sunspots Monthly Sunspot Numbers, 1749-1983
swiss Swiss Fertility and Socioeconomic Indicators
(1888) Data
treering Yearly Treering Data, -6000-1979
trees Diameter, Height and Volume for Black Cherry
Trees
uspop Populations Recorded by the US Census
volcano Topographic Information on Auckland's Maunga
Whau Volcano
warpbreaks The Number of Breaks in Yarn during Weaving
women Average Heights and Weights for American Women
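The listing above is printed for reading, but the same information can also be captured in a variable and searched. The following is a minimal sketch, assuming the object returned by data() carries a results matrix with Item and Title columns, as it does in current versions of R. The variable names listing and items are illustrative only.

```r
# Capture the dataset listing for the 'datasets' package instead of printing it.
listing <- data(package = "datasets")

# 'results' is a character matrix; keep the dataset name and its description.
items <- listing$results[, c("Item", "Title")]

# Show the first few entries.
head(items)

# Check whether a particular dataset is listed.
"cars" %in% items[, "Item"]
```

Working with the matrix rather than the printed listing makes it easy to filter the entries, for example with grepl() on the Title column.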
The datasets in every installed package can be listed using the following syntax.
data(package = .packages(all.available = TRUE))
This displays a complete list of the datasets in all packages installed in the R library, whether or not those packages are currently loaded. It gives the following output.
Data sets in package ‘boot’:
acme Monthly Excess Returns
aids Delay in AIDS Reporting in England and Wales
aircondit Failures of Air-conditioning Equipment
aircondit7 Failures of Air-conditioning Equipment
amis Car Speeding and Warning Signs
aml Remission Times for Acute Myelogenous Leukaemia
beaver Beaver Body Temperature Data
bigcity Population of U.S. Cities
brambles Spatial Location of Bramble Canes
breslow Smoking Deaths Among Doctors
calcium Calcium Uptake Data
cane Sugar-cane Disease Data
capability Simulated Manufacturing Process Data
catsM Weight Data for Domestic Cats
cav Position of Muscle Caveolae
cd4 CD4 Counts for HIV-Positive Patients
cd4.nested Nested Bootstrap of cd4 data
channing Channing House Data
city Population of U.S. Cities
claridge Genetic Links to Left-handedness
cloth Number of Flaws in Cloth
co.transfer Carbon Monoxide Transfer
coal Dates of Coal Mining Disasters
darwin Darwin's Plant Height Differences
dogs Cardiac Data for Domestic Dogs
downs.bc Incidence of Down's Syndrome in British
Columbia
ducks Behavioral and Plumage Characteristics of
Hybrid Ducks
fir Counts of Balsam-fir Seedlings
frets Head Dimensions in Brothers
grav Acceleration Due to Gravity
gravity Acceleration Due to Gravity
hirose Failure Time of PET Film
islay Jura Quartzite Azimuths on Islay
manaus Average Heights of the Rio Negro river at
Manaus
melanoma Survival from Malignant Melanoma
motor Data from a Simulated Motorcycle Accident
neuro Neurophysiological Point Process Data
nitrofen Toxicity of Nitrofen in Aquatic Systems
nodal Nodal Involvement in Prostate Cancer
nuclear Nuclear Power Station Construction Data
paulsen Neurotransmission in Guinea Pig Brains
poisons Animal Survival Times
polar Pole Positions of New Caledonian Laterites
remission Cancer Remission and Cell Activity
salinity Water Salinity and River Discharge
survival Survival of Rats after Radiation Doses
tau Tau Particle Decay Modes
tuna Tuna Sighting Data
urine Urine Analysis Data
wool Australian Relative Wool Prices
Data sets in package ‘cluster’:
agriculture European Union Agricultural Workforces
animals Attributes of Animals
chorSub Subset of C-horizon of Kola Data
flower Flower Characteristics
plantTraits Plant Species Traits Data
pluton Isotopic Composition Plutonium Batches
ruspini Ruspini Data
votes.repub Votes for Republican Candidate in Presidential
Elections
xclara Bivariate Data Set with 3 Clusters
Data sets in package ‘datasets’:
AirPassengers Monthly Airline Passenger Numbers 1949-1960
BJsales Sales Data with Leading Indicator
BJsales.lead (BJsales)
Sales Data with Leading Indicator
BOD Biochemical Oxygen Demand
CO2 Carbon Dioxide Uptake in Grass Plants
ChickWeight Weight versus age of chicks on different diets
DNase Elisa assay of DNase
EuStockMarkets Daily Closing Prices of Major European Stock
Indices, 1991-1998
Formaldehyde Determination of Formaldehyde
HairEyeColor Hair and Eye Color of Statistics Students
Harman23.cor Harman Example 2.3
Harman74.cor Harman Example 7.4
Indometh Pharmacokinetics of Indomethacin
InsectSprays Effectiveness of Insect Sprays
JohnsonJohnson Quarterly Earnings per Johnson & Johnson Share
LakeHuron Level of Lake Huron 1875-1972
LifeCycleSavings Intercountry Life-Cycle Savings Data
Loblolly Growth of Loblolly pine trees
Nile Flow of the River Nile
Orange Growth of Orange Trees
OrchardSprays Potency of Orchard Sprays
PlantGrowth Results from an Experiment on Plant Growth
Puromycin Reaction Velocity of an Enzymatic Reaction
Seatbelts Road Casualties in Great Britain 1969-84
Theoph Pharmacokinetics of Theophylline
Titanic Survival of passengers on the Titanic
ToothGrowth The Effect of Vitamin C on Tooth Growth in
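Because the full listing can be very long, it may be more convenient to summarise it. The sketch below converts the listing into a data frame and counts the datasets per package. It assumes the same results matrix as returned by data(); the names all_data, all_df, and counts are illustrative, and the packages found will depend on what is installed on your machine.

```r
# List datasets from every installed package. Some packages may trigger
# warnings, which are suppressed here to keep the output readable.
all_data <- suppressWarnings(
  data(package = .packages(all.available = TRUE))
)

# Convert the results matrix to a data frame for easier handling.
all_df <- as.data.frame(all_data$results, stringsAsFactors = FALSE)

# Count how many datasets each package provides, largest first.
counts <- sort(table(all_df$Package), decreasing = TRUE)
head(counts)
```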
A specific dataset can be loaded with the data() function by passing the name of the dataset and, optionally, the package in which it is found, as follows.
data("cars", package = "datasets")
The cars dataset in package ‘datasets’:
> data("cars", package = "datasets")
> head(cars)
speed dist
1 4 2
2 4 10
3 7 4
4 7 22
5 8 16
6 9 10
>
As we can see, the cars data frame contains two variables, speed and dist, which hold the speed and stopping distance of cars.
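By default, data() places the dataset in the current workspace. As a hedged sketch of two alternatives, the example below loads cars into a separate environment using the envir argument of data(), and then reads the same dataset directly with the :: operator. The name sandbox is illustrative only.

```r
# Load 'cars' into its own environment rather than the global workspace.
sandbox <- new.env()
data("cars", package = "datasets", envir = sandbox)

# Inspect the structure: a data frame with columns 'speed' and 'dist'.
str(sandbox$cars)

# Alternatively, refer to the dataset directly with the :: operator,
# without creating a variable in the workspace.
summary(datasets::cars)
```

Keeping the dataset in its own environment avoids overwriting an existing object that happens to be called cars.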
Summary
In this article, I demonstrated how to access a single dataset as well as the datasets belonging to different packages in R. I discussed the syntax for listing the datasets of a single package, listing the datasets of all installed packages, and loading a specific dataset. Code snippets and their output are also provided.