R Data Structures 5: factor

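Categorical variables come in two types: nominal (unordered categories) and ordinal (ordered categories). In R both are called factors. The function factor() stores the category values as an integer vector whose values lie in the range [1...k], where k is the number of unique values of the variable, and keeps an internal vector of character strings (the original values) that is mapped to these integers.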

For example, suppose we have the vector:

diabetes <- c("type1", "type2", "type1", "type1")

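The statement diabetes <- factor(diabetes) stores this vector as (1, 2, 1, 1) and internally associates 1 = type1 and 2 = type2 (the codes are assigned in alphabetical order of the values). R functions that analyze the vector diabetes will then treat it as a nominal variable and choose statistical methods appropriate for that level of measurement.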
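The integer codes and the level labels described above can be inspected directly. The following is a minimal sketch using only base R; it reuses the diabetes vector from the text.

```r
# Build the factor from the character vector used in the text
diabetes <- c("type1", "type2", "type1", "type1")
diabetes <- factor(diabetes)

# The underlying integer codes, one per element
codes <- as.integer(diabetes)   # 1 2 1 1

# The character labels that the codes map to, in alphabetical order
labels <- levels(diabetes)      # "type1" "type2"

# Looking up each code in the levels recovers the original values
recovered <- labels[codes]
print(codes)
print(labels)
print(recovered)
```

Indexing levels(diabetes) by the integer codes is one way to see that a factor is just a pair of codes and labels.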


# Creating a factor

gender.vector <- c("Male", "Female", "Female", "Male", "Male")


factor.gender.vector <- factor(gender.vector)

factor.gender.vector


> factor.gender.vector

[1] Male   Female Female Male   Male  
Levels: Female Male



hair.color.vector <- c("Blonde", "Blonde", "Brunette", "Ginger", "Grey", "Brunette")

temperature.vector <- c("High", "Low", "High", "Low", "Medium")


factor.hair.color.vector <- factor(hair.color.vector)

factor.temperature.vector <- factor(temperature.vector, ordered = TRUE, levels = c("Low", "Medium", "High"))


factor.temperature.vector

factor.hair.color.vector


> factor.temperature.vector

[1] High   Low    High   Low    Medium
Levels: Low < Medium < High

> factor.hair.color.vector

[1] Blonde   Blonde   Brunette Ginger   Grey     Brunette
Levels: Blonde Brunette Ginger Grey
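The levels argument matters for ordered factors. As a small sketch (the names default.order and explicit.order are introduced here only for illustration), compare what happens when levels is omitted and R falls back to alphabetical order:

```r
temperature.vector <- c("High", "Low", "High", "Low", "Medium")

# Without `levels`, the order is alphabetical: High < Low < Medium,
# which is not the intended ranking
default.order <- factor(temperature.vector, ordered = TRUE)
print(levels(default.order))

# With explicit `levels`, the order is Low < Medium < High
explicit.order <- factor(temperature.vector, ordered = TRUE,
                         levels = c("Low", "Medium", "High"))
print(levels(explicit.order))

# is.ordered() reports whether a factor carries an ordering
print(is.ordered(explicit.order))
```

For ordinal data it is usually safer to spell out the levels than to rely on the alphabetical default.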


survey.vector <- c("M","F","F","M","M")

factor.survey.vector <- factor( survey.vector )

factor.survey.vector


levels(factor.survey.vector) <- c("Female","Male")


factor.survey.vector


> factor.survey.vector #Print to console

[1] M F F M M
Levels: F M

> factor.survey.vector

[1] Male   Female Female Male   Male  
Levels: Female Male
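Assigning to levels() renames the levels by position, so the new names must be listed in the same order as the existing levels (here "F" then "M"). A possible alternative, sketched below, is to state the mapping explicitly with the levels and labels arguments of factor(); the name labelled.survey is hypothetical.

```r
survey.vector <- c("M", "F", "F", "M", "M")

# Pair each original value with its label explicitly:
# "F" becomes "Female", "M" becomes "Male"
labelled.survey <- factor(survey.vector,
                          levels = c("F", "M"),
                          labels = c("Female", "Male"))
print(labelled.survey)

# Pitfall: listing the names in the wrong order silently swaps the labels
swapped <- factor(survey.vector)
levels(swapped) <- c("Male", "Female")   # "F" is now labelled "Male"
print(swapped)
```

Writing levels and labels side by side makes the intended mapping visible in the code rather than depending on the alphabetical order of the original values.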




> survey.vector <- c("M", "F", "F", "M", "M")


> factor.survey.vector <- factor(survey.vector)


> levels(factor.survey.vector) <- c("Female", "Male")


> factor.survey.vector

[1] Male   Female Female Male   Male  
Levels: Female Male

> # Summary of the character vector survey.vector

> summary(survey.vector)

 Length     Class      Mode
       5 character character

> # Summary of the factor factor.survey.vector

> summary(factor.survey.vector)

Female   Male
    2      3
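The two summaries differ because summary() on a character vector only reports its length and type, while on a factor it counts the observations in each level. A short sketch of how those counts might be used (the name counts is introduced for illustration):

```r
survey.vector <- c("M", "F", "F", "M", "M")
factor.survey.vector <- factor(survey.vector)
levels(factor.survey.vector) <- c("Female", "Male")

# summary() on a factor returns a named integer vector of counts per level
counts <- summary(factor.survey.vector)
print(counts[["Female"]])
print(counts[["Male"]])

# table() gives the same per-level counts
print(table(factor.survey.vector))
```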


speed.vector <- c("Fast","Slow","Slow","Fast","Ultra-fast")

factor.speed.vector <- factor(speed.vector, ordered = TRUE, levels = c("Slow", "Fast", "Ultra-fast"))

factor.speed.vector

summary(factor.speed.vector)


> factor.speed.vector

[1] Fast       Slow       Slow       Fast       Ultra-fast
Levels: Slow < Fast < Ultra-fast

> summary(factor.speed.vector)

    Slow       Fast Ultra-fast
      2          2          1



speed.vector <- c("Fast","Slow","Slow","Fast","Ultra-fast")

speed.factor.vector <- factor(speed.vector, ordered = TRUE, levels = c("Slow", "Fast", "Ultra-fast"))


speed.factor.vector

compare.them <- speed.factor.vector[2] > speed.factor.vector[5]


# Is data analyst 2 faster than data analyst 5?

compare.them


> speed.factor.vector

[1] Fast       Slow       Slow       Fast       Ultra-fast
Levels: Slow < Fast < Ultra-fast

> compare.them

[1] FALSE
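The comparison above only works because the factor is ordered. As a hedged sketch (the names unordered.speed and ordered.speed are introduced here for illustration), the same comparison on an unordered factor yields NA together with a warning:

```r
speed.vector <- c("Fast", "Slow", "Slow", "Fast", "Ultra-fast")

# Unordered factor: `>` is not meaningful, so R warns and returns NA
unordered.speed <- factor(speed.vector,
                          levels = c("Slow", "Fast", "Ultra-fast"))
result.unordered <- suppressWarnings(unordered.speed[2] > unordered.speed[5])
print(result.unordered)

# Ordered factor: the comparison follows Slow < Fast < Ultra-fast
ordered.speed <- factor(speed.vector, ordered = TRUE,
                        levels = c("Slow", "Fast", "Ultra-fast"))
result.ordered <- ordered.speed[2] > ordered.speed[5]
print(result.ordered)

# Ordered factors also support max() and min()
fastest <- as.character(max(ordered.speed))
print(fastest)
```

suppressWarnings() is used here only to keep the output quiet; in interactive use the warning is a helpful hint that the factor should have been created with ordered = TRUE.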

