R Built-in Data Sets

########R Built-in Data Sets#########

# {datasets}

# > ls("package:datasets")

# [1] "ability.cov"          "airmiles"              "AirPassengers"       

# [4] "airquality"            "anscombe"              "attenu"             

# [7] "attitude"              "austres"              "beaver1"             

# [10] "beaver2"              "BJsales"              "BJsales.lead"       

# [13] "BOD"                  "cars"                  "ChickWeight"         

# [16] "chickwts"              "co2"                  "CO2"                 

# [19] "crimtab"              "discoveries"          "DNase"               

# [22] "esoph"                "euro"                  "euro.cross"         

# [25] "eurodist"              "EuStockMarkets"        "faithful"           

# [28] "fdeaths"              "Formaldehyde"          "freeny"             

# [31] "freeny.x"              "freeny.y"              "HairEyeColor"       

# [34] "Harman23.cor"          "Harman74.cor"          "Indometh"           

# [37] "infert"                "InsectSprays"          "iris"               

# [40] "iris3"                "islands"              "JohnsonJohnson"     

# [43] "LakeHuron"            "ldeaths"              "lh"                 

# [46] "LifeCycleSavings"      "Loblolly"              "longley"             

# [49] "lynx"                  "mdeaths"              "morley"             

# [52] "mtcars"                "nhtemp"                "Nile"               

# [55] "nottem"                "npk"                  "occupationalStatus" 

# [58] "Orange"                "OrchardSprays"        "PlantGrowth"         

# [61] "precip"                "presidents"            "pressure"           

# [64] "Puromycin"            "quakes"                "randu"               

# [67] "rivers"                "rock"                  "Seatbelts"           

# [70] "sleep"                "stack.loss"            "stack.x"             

# [73] "stackloss"            "state.abb"            "state.area"         

# [76] "state.center"          "state.division"        "state.name"         

# [79] "state.region"          "state.x77"            "sunspot.month"       

# [82] "sunspot.year"          "sunspots"              "swiss"               

# [85] "Theoph"                "Titanic"              "ToothGrowth"         

# [88] "treering"              "trees"                "UCBAdmissions"       

# [91] "UKDriverDeaths"        "UKgas"                "USAccDeaths"         

# [94] "USArrests"            "UScitiesD"            "USJudgeRatings"     

# [97] "USPersonalExpenditure" "uspop"                "VADeaths"           

# [100] "volcano"              "warpbreaks"            "women"               

# [103] "WorldPhones"          "WWWusage"     

#Ref.From:http://www.sthda.com/english/wiki/r-built-in-data-sets#preleminary-tasks

# Preliminary tasks

# List of pre-loaded data

# Loading a built-in R data set

# Most used R built-in data sets

    # mtcars: Motor Trend Car Road Tests

    # iris

    # ToothGrowth

    # PlantGrowth

    # USArrests

# Summary


#R comes with several built-in data sets, which are generally used as demo data for playing with R functions.

#In this article, we’ll first describe how to load and use R built-in data sets.

#Next, we’ll describe some of the most used R demo data sets: mtcars, iris, ToothGrowth, PlantGrowth and USArrests.

#Preliminary tasks

  #Launch RStudio as described here: Running RStudio and setting up your working directory

#List of pre-loaded data

  #To see the list of pre-loaded data, type the function data():

  data()

  #The output is as follows:

  #Data sets in package ‘datasets’:

# AirPassengers                        Monthly Airline Passenger Numbers 1949-1960

# BJsales                              Sales Data with Leading Indicator

# BJsales.lead (BJsales)                Sales Data with Leading Indicator

# BOD                                  Biochemical Oxygen Demand

# CO2                                  Carbon Dioxide Uptake in Grass Plants

# ChickWeight                          Weight versus age of chicks on different diets

# DNase                                Elisa assay of DNase

# EuStockMarkets                        Daily Closing Prices of Major European Stock Indices, 1991-1998

# Formaldehyde                          Determination of Formaldehyde

# HairEyeColor                          Hair and Eye Color of Statistics Students

# Harman23.cor                          Harman Example 2.3

# Harman74.cor                          Harman Example 7.4

# Indometh                              Pharmacokinetics of Indomethacin

# InsectSprays                          Effectiveness of Insect Sprays

# JohnsonJohnson                        Quarterly Earnings per Johnson & Johnson Share

# LakeHuron                            Level of Lake Huron 1875-1972

# LifeCycleSavings                      Intercountry Life-Cycle Savings Data

# Loblolly                              Growth of Loblolly pine trees

# Nile                                  Flow of the River Nile

# Orange                                Growth of Orange Trees

# OrchardSprays                        Potency of Orchard Sprays

# PlantGrowth                          Results from an Experiment on Plant Growth

# Puromycin                            Reaction Velocity of an Enzymatic Reaction

# Seatbelts                            Road Casualties in Great Britain 1969-84

# Theoph                                Pharmacokinetics of Theophylline

# Titanic                              Survival of passengers on the Titanic

# ToothGrowth                          The Effect of Vitamin C on Tooth Growth in Guinea Pigs

# UCBAdmissions                        Student Admissions at UC Berkeley

# UKDriverDeaths                        Road Casualties in Great Britain 1969-84

# UKgas                                UK Quarterly Gas Consumption

# USAccDeaths                          Accidental Deaths in the US 1973-1978

# USArrests                            Violent Crime Rates by US State

# USJudgeRatings                        Lawyers' Ratings of State Judges in the US Superior Court

# USPersonalExpenditure                Personal Expenditure Data

# UScitiesD                            Distances Between European Cities and Between US Cities

# VADeaths                              Death Rates in Virginia (1940)

# WWWusage                              Internet Usage per Minute

# WorldPhones                          The World's Telephones

# ability.cov                          Ability and Intelligence Tests

# airmiles                              Passenger Miles on Commercial US Airlines, 1937-1960

# airquality                            New York Air Quality Measurements

# anscombe                              Anscombe's Quartet of 'Identical' Simple Linear Regressions

# attenu                                The Joyner-Boore Attenuation Data

# attitude                              The Chatterjee-Price Attitude Data

# austres                              Quarterly Time Series of the Number of Australian Residents

# beaver1 (beavers)                    Body Temperature Series of Two Beavers

# beaver2 (beavers)                    Body Temperature Series of Two Beavers

# cars                                  Speed and Stopping Distances of Cars

# chickwts                              Chicken Weights by Feed Type

# co2                                  Mauna Loa Atmospheric CO2 Concentration

# crimtab                              Student's 3000 Criminals Data

# discoveries                          Yearly Numbers of Important Discoveries

# esoph                                Smoking, Alcohol and (O)esophageal Cancer

# euro                                  Conversion Rates of Euro Currencies

# euro.cross (euro)                    Conversion Rates of Euro Currencies

# eurodist                              Distances Between European Cities and Between US Cities

# faithful                              Old Faithful Geyser Data

# fdeaths (UKLungDeaths)                Monthly Deaths from Lung Diseases in the UK

# freeny                                Freeny's Revenue Data

# freeny.x (freeny)                    Freeny's Revenue Data

# freeny.y (freeny)                    Freeny's Revenue Data

# infert                                Infertility after Spontaneous and Induced Abortion

# iris                                  Edgar Anderson's Iris Data

# iris3                                Edgar Anderson's Iris Data

# islands                              Areas of the World's Major Landmasses

# ldeaths (UKLungDeaths)                Monthly Deaths from Lung Diseases in the UK

# lh                                    Luteinizing Hormone in Blood Samples

# longley                              Longley's Economic Regression Data

# lynx                                  Annual Canadian Lynx trappings 1821-1934

# mdeaths (UKLungDeaths)                Monthly Deaths from Lung Diseases in the UK

# morley                                Michelson Speed of Light Data

# mtcars                                Motor Trend Car Road Tests

# nhtemp                                Average Yearly Temperatures in New Haven

# nottem                                Average Monthly Temperatures at Nottingham, 1920-1939

# npk                                  Classical N, P, K Factorial Experiment

# occupationalStatus                    Occupational Status of Fathers and their Sons

# precip                                Annual Precipitation in US Cities

# presidents                            Quarterly Approval Ratings of US Presidents

# pressure                              Vapor Pressure of Mercury as a Function of Temperature

# quakes                                Locations of Earthquakes off Fiji

# randu                                Random Numbers from Congruential Generator RANDU

# rivers                                Lengths of Major North American Rivers

# rock                                  Measurements on Petroleum Rock Samples

# sleep                                Student's Sleep Data

# stack.loss (stackloss)                Brownlee's Stack Loss Plant Data

# stack.x (stackloss)                  Brownlee's Stack Loss Plant Data

# stackloss                            Brownlee's Stack Loss Plant Data

# state.abb (state)                    US State Facts and Figures

# state.area (state)                    US State Facts and Figures

# state.center (state)                  US State Facts and Figures

# state.division (state)                US State Facts and Figures

# state.name (state)                    US State Facts and Figures

# state.region (state)                  US State Facts and Figures

# state.x77 (state)                    US State Facts and Figures

# sunspot.month                        Monthly Sunspot Data, from 1749 to "Present"

# sunspot.year                          Yearly Sunspot Data, 1700-1988

# sunspots                              Monthly Sunspot Numbers, 1749-1983

# swiss                                Swiss Fertility and Socioeconomic Indicators (1888) Data

# treering                              Yearly Treering Data, -6000-1979

# trees                                Diameter, Height and Volume for Black Cherry Trees

# uspop                                Populations Recorded by the US Census

# volcano                              Topographic Information on Auckland's Maunga Whau Volcano

# warpbreaks                            The Number of Breaks in Yarn during Weaving

# women                                Average Heights and Weights for American Women

# Use ‘data(package = .packages(all.available = TRUE))’

# to list the data sets in all *available* packages.
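
# The same listing can also be captured as an object instead of being read on screen.
# A minimal sketch, assuming the default 'datasets' package is installed; the name
# `catalog` is illustrative and not part of the original article.

```r
# data() returns an object whose $results component is a character matrix
# with the columns "Package", "LibPath", "Item" and "Title".
info <- data(package = "datasets")

# Keep only the data set name and its description, as a data frame.
catalog <- as.data.frame(info$results[, c("Item", "Title")],
                         stringsAsFactors = FALSE)

# First few entries of the catalog
head(catalog)

# Search the descriptions for a keyword (case-insensitive)
catalog[grepl("iris", catalog$Title, ignore.case = TRUE), ]
```

# Having the catalog as a data frame makes it easy to filter or count entries with
# ordinary data frame operations.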

#Loading a built-in R data set

    #Load and print the mtcars data as follows:

      # Loading

  data(mtcars)#A data frame with 32 observations on 11 (numeric) variables.

  # [, 1] mpg Miles/(US) gallon

  # [, 2] cyl Number of cylinders

  # [, 3] disp Displacement (cu.in.)

  # [, 4] hp Gross horsepower

  # [, 5] drat Rear axle ratio

  # [, 6] wt Weight (1000 lbs)

  # [, 7] qsec 1/4 mile time

  # [, 8] vs Engine (0 = V-shaped, 1 = straight)

  # [, 9] am Transmission (0 = automatic, 1 = manual)

  # [,10] gear Number of forward gears

  # [,11] carb Number of carburetors


      # Print the first 6 rows

  head(mtcars, 6)

  #                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
  # Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
  # Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
  # Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
  # Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
  # Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
  # Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1


#If you want to learn more about the mtcars data set, type this:

  ?mtcars

#Most used R built-in data sets

  # mtcars: Motor Trend Car Road Tests

  #The data was extracted from the 1974 Motor Trend US magazine and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

#View the content of the mtcars data set:

  # 1. Loading

  data("mtcars")

  # 2. Print

  head(mtcars)

  #It contains 32 observations and 11 variables:

  # Number of rows (observations)

  nrow(mtcars)

  # [1] 32

  # Number of columns (variables)

  ncol(mtcars)

  # [1] 11
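
# Beyond nrow() and ncol(), a few other base R calls give a quick overview of a data
# frame. This is a small sketch using only base functions; the name `dims` is
# illustrative.

```r
data("mtcars")

# Rows and columns in one call
dims <- dim(mtcars)
dims

# Column names, types and the first few values of each column
str(mtcars)

# Five-number summary plus the mean of one column
summary(mtcars$mpg)
```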

#Description of variables:

# mpg: Miles/(US) gallon

# cyl: Number of cylinders

# disp: Displacement (cu.in.)

# hp: Gross horsepower

# drat: Rear axle ratio

# wt: Weight (1000 lbs)

# qsec: 1/4 mile time

# vs: Engine shape (0 = V-shaped, 1 = straight)

# am: Transmission (0 = automatic, 1 = manual)

# gear: Number of forward gears

# carb: Number of carburetors
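
# Because vs and am are stored as 0/1 numbers, it can help to turn them into labeled
# factors before summarizing. The sketch below works on a copy so that mtcars itself
# is left unchanged; the names `cars_labeled` and `mpg_by_am` are illustrative.

```r
# Work on a copy of the built-in data
cars_labeled <- mtcars

# Recode the 0/1 columns using the labels from the variable description
cars_labeled$am <- factor(cars_labeled$am, levels = c(0, 1),
                          labels = c("automatic", "manual"))
cars_labeled$vs <- factor(cars_labeled$vs, levels = c(0, 1),
                          labels = c("V-shaped", "straight"))

# Cross-tabulate transmission type against engine shape
table(cars_labeled$am, cars_labeled$vs)

# Mean fuel economy for each transmission type
mpg_by_am <- aggregate(mpg ~ am, data = cars_labeled, FUN = mean)
mpg_by_am
```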

  #If you want to learn more about mtcars, type this:    

  ?mtcars

  # iris

?iris

#iris is a data frame with 150 cases (rows) and 5 variables (columns) named Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, and Species.

#The iris data set gives the measurements in centimeters of the variables sepal length, sepal width, petal length and petal width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.

# sepal: n. (botany) one of the segments of the calyx of a flower

# petal: n. one of the segments of the corolla of a flower

  #Meaning of the species names: setosa (bristly), versicolor (variegated, multicolored; changing in color).

data("iris")

head(iris)

#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
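
# The balanced design described above (50 flowers per species) can be checked
# directly, and the four measurements can be averaged per species. A minimal sketch
# with base R; `species_counts` and `species_means` are illustrative names.

```r
data("iris")

# Number of flowers recorded for each species
species_counts <- table(iris$Species)
species_counts

# Mean of every measurement column, computed separately for each species
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
species_means
```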

# iris3

class(iris3)

# [1] "array"

str(iris3)

#  num [1:50, 1:4, 1:3] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#  - attr(*, "dimnames")=List of 3
#   ..$ : NULL
#   ..$ : chr [1:4] "Sepal L." "Sepal W." "Petal L." "Petal W."
#   ..$ : chr [1:3] "Setosa" "Versicolor" "Virginica"

#iris3 gives the same data arranged as a 3-dimensional array of size 50 by 4 by 3, as represented by S-PLUS. The first dimension gives the case number within the species subsample, the second the measurements with names Sepal L., Sepal W., Petal L., and Petal W., and the third the species.
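
# The three dimensions described above (case, measurement, species) can be indexed by
# position or by name. The sketch below uses only base R; the object names are
# illustrative.

```r
# Size of each dimension: cases, measurements, species
dim(iris3)

# All four measurements of the first Setosa flower
first_setosa <- iris3[1, , "Setosa"]
first_setosa

# Sepal length of all 50 Versicolor flowers
sepal_len_versicolor <- iris3[, "Sepal L.", "Versicolor"]

# Mean of each measurement within each species:
# apply() over dimensions 2 and 3 collapses the case dimension
mean_by_species <- apply(iris3, c(2, 3), mean)
mean_by_species
```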

?iris3

#Example: rebuild the iris data frame from the iris3 array

dni3 <- dimnames(iris3)

ii <- data.frame(matrix(aperm(iris3, c(1,3,2)), ncol = 4,

                        dimnames = list(NULL, sub(" L.",".Length",

                                                  sub(" W.",".Width", dni3[[2]])))),

                Species = gl(3, 50, labels = sub("S", "s", sub("V", "v", dni3[[3]]))))

all.equal(ii, iris) # TRUE

# ToothGrowth

?ToothGrowth

#The ToothGrowth data set contains the results of an experiment studying the effect of vitamin C on tooth growth in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).

# Format

# A data frame with 60 observations on 3 variables.

# [,1] len  numeric Tooth length

# [,2] supp factor  Supplement type (VC or OJ)

# [,3] dose numeric Dose in milligrams/day

data("ToothGrowth")

head(ToothGrowth)

#    len supp dose
# 1  4.2   VC  0.5
# 2 11.5   VC  0.5
# 3  7.3   VC  0.5
# 4  5.8   VC  0.5
# 5  6.4   VC  0.5
# 6 10.0   VC  0.5
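
# The experimental layout (two delivery methods crossed with three dose levels) can be
# tabulated, and the mean tooth length computed per cell. A minimal base R sketch;
# `design` and `mean_len` are illustrative names.

```r
data("ToothGrowth")

# Number of animals in each supplement-by-dose combination
design <- with(ToothGrowth, table(supp, dose))
design

# Mean tooth length for each supplement-by-dose combination
mean_len <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
mean_len
```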

# PlantGrowth

#Results obtained from an experiment to compare yields (as measured by dried weight of plants) obtained under a control and two different treatment conditions.

data("PlantGrowth")

PlantGrowth

head(PlantGrowth)

#   weight group
# 1   4.17  ctrl
# 2   5.58  ctrl
# 3   5.18  ctrl
# 4   6.11  ctrl
# 5   4.50  ctrl
# 6   4.61  ctrl

?PlantGrowth

#PlantGrowth {datasets}

# Description

# Results from an experiment to compare yields (as measured by dried weight of plants) obtained under a control and two different treatment conditions.

# Usage

# PlantGrowth

# Format

# A data frame of 30 cases on 2 variables.

# [, 1] weight numeric

# [, 2] group factor

# The levels of group are ‘ctrl’, ‘trt1’, and ‘trt2’.
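
# The format described above can be confirmed from the data itself, together with a
# per-group mean of the dried weights. A minimal sketch with base R; the object names
# are illustrative.

```r
data("PlantGrowth")

# Levels of the grouping factor
group_levels <- levels(PlantGrowth$group)
group_levels

# Number of plants in each group
group_sizes <- table(PlantGrowth$group)
group_sizes

# Mean dried weight per group
group_means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
group_means
```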

# USArrests

#This data set contains statistics about violent crime rates by US state.

data("USArrests")

head(USArrests)

#            Murder Assault UrbanPop Rape
# Alabama      13.2     236       58 21.2
# Alaska       10.0     263       48 44.5
# Arizona       8.1     294       80 31.0
# Arkansas      8.8     190       50 19.5
# California    9.0     276       91 40.6
# Colorado      7.9     204       78 38.7

# Murder: Murder arrests (per 100,000)

# Assault: Assault arrests (per 100,000)

# UrbanPop: Percent urban population

# Rape: Rape arrests (per 100,000)

?USArrests

#Violent Crime Rates by US State

#Description

#This data set contains statistics, in arrests per 100,000 residents for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percent of the population living in urban areas.

#Usage

# USArrests

#Format

#A data frame with 50 observations on 4 variables.

# [,1] Murder numeric Murder arrests (per 100,000)

# [,2] Assault numeric Assault arrests (per 100,000)

# [,3] UrbanPop numeric Percent urban population

# [,4] Rape numeric Rape arrests (per 100,000)
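
# In USArrests the state names are stored as row names rather than as a column. The
# sketch below shows one way to sort by a variable and to move the names into a
# regular column; `top_murder` and `arrests_df` are illustrative names.

```r
data("USArrests")

# State names live in the row names
head(rownames(USArrests))

# The five states with the highest murder arrest rate
top_murder <- USArrests[order(USArrests$Murder, decreasing = TRUE), ][1:5, ]
top_murder

# Copy with the state name as an ordinary column and default row names
arrests_df <- data.frame(State = rownames(USArrests), USArrests,
                         row.names = NULL)
head(arrests_df)
```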



#Summary

# data("dataset_name"): load a built-in R data set

# head(dataset_name): inspect the data set
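
# When the data set name is only known as a character string, the load step can be
# wrapped in a small helper. This is a sketch under stated assumptions, not part of
# the original article: `load_builtin` is a hypothetical function name, and it relies
# on the `list` and `envir` arguments of data().

```r
# Hypothetical helper: load a data set from the 'datasets' package by name
# and return it, without creating a variable in the global environment.
load_builtin <- function(name) {
  env <- new.env()
  # data() places the object into 'env' instead of the calling environment
  data(list = name, package = "datasets", envir = env)
  get(name, envir = env, inherits = FALSE)
}

# Load by name, then inspect as usual
cars_df <- load_builtin("mtcars")
head(cars_df)
```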
