Basic Data Structures in R

1. Vectors

1.1 Creating vectors

a <- c(1, 2, 3) # create a vector with c()

1.2 Generating sequences

b <- seq(from = -2, to = 2, by = 0.5) # seq() generates an arithmetic sequence with a step of 0.5
b <- rep(1, times = 5) # rep() repeats the value 1 five times
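
Both functions accept a few other commonly used arguments. The following is a minimal sketch; the variable names s1, r1 and r2 are illustrative and not part of the original notes.

```r
# seq() can take a target length instead of a step size
s1 <- seq(from = 0, to = 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00

# rep() can repeat a whole vector, or each element in turn
r1 <- rep(c(1, 2), times = 3)                 # 1 2 1 2 1 2
r2 <- rep(c(1, 2), each = 3)                  # 1 1 1 2 2 2

print(s1)
print(r1)
print(r2)
```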

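1.3 Vector indexing

# An index can be used to query or assign elements, and is itself a vector.
# There are three kinds of index: numeric, character and logical; numeric indexing is the default form.
x1 <- c(1, 2, NA, 4, 5) # the missing value must be written NA (uppercase); na is an undefined name
x1[3] # the third element
x1[1:3] # the first, second and third elements
x1[-1] # a negative index excludes that position: every element except the first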
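
The lines above only show numeric indexing. As a rough sketch of logical indexing and of assignment through an index, using the same vector:

```r
x1 <- c(1, 2, NA, 4, 5)

# Logical index: keep the positions where the condition is TRUE
x1[!is.na(x1)]          # drops the missing value: 1 2 4 5

# Assignment through an index: replace the missing value with 3
x1[is.na(x1)] <- 3
x1                      # 1 2 3 4 5

# A numeric index vector can select several positions at once
x1[c(1, 5)]             # 1 5
```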

1.4 Named vectors

The names() function makes it easy to query and assign vector elements by character name.

book <- c(50,200,100,20)
names(book) <- c("English","mathematics","physical","computer")
book

    English mathematics    physical    computer
         50         200         100          20
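
Querying and assigning by name could look like the sketch below, which reuses the book vector; the new value 25 is an arbitrary illustration.

```r
book <- c(50, 200, 100, 20)
names(book) <- c("English", "mathematics", "physical", "computer")

book["mathematics"]               # query one element by name
book[c("English", "computer")]    # query several elements by name
book["computer"] <- 25            # assign by name
book
```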

2. Matrices

2.1 Creating a matrix

matrix1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3) # define a matrix with 2 rows and 3 columns; by default it is filled column by column

matrix1
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
matrix1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3, byrow = TRUE) # fill the matrix row by row

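2.2 Matrix operations

t(matrix1) # transpose of the matrix
solve(matrix1 %*% t(matrix1)) # matrix inverse; solve() needs a square, invertible matrix, so the 2x3 matrix1 itself cannot be inverted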
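
To check an inverse, one option is to multiply it back and compare with the identity matrix. This is a minimal sketch; the 2x2 matrix m is a hypothetical example chosen to be invertible.

```r
# A hypothetical 2x2 invertible matrix, filled column by column
m <- matrix(c(2, 1, 1, 3), nrow = 2)

m_inv <- solve(m)       # inverse of m
m %*% m_inv             # matrix product; close to the identity matrix

# all.equal() compares with a numerical tolerance
all.equal(m %*% m_inv, diag(2))

# Transposing swaps the dimensions
dim(t(matrix(1:6, nrow = 2)))   # 3 2
```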

2.3 Matrix indexing

matrix1 <- matrix(1:6, nrow = 2)
matrix1
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
matrix1[1, ] # the first row
[1] 1 3 5
matrix1[, 1] # the first column
[1] 1 2
matrix1[1:2, c(1, 2)] # rows 1 and 2, columns 1 and 2
     [,1] [,2]
[1,]    1    3
[2,]    2    4
matrix_name <- matrix(c(1, 2, 3, 4), nrow = 2, dimnames = list(c('row1', 'row2'), c('col1', 'col2'))) # the dimnames argument sets the row and column names
matrix_name
     col1 col2
row1    1    3
row2    2    4
matrix_name[1, 'col2']
[1] 3
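
Matrix indices also work with logical conditions and with assignment. A short sketch using the same 2x3 matrix:

```r
matrix1 <- matrix(1:6, nrow = 2)

# Selecting a single row normally drops the result to a plain vector;
# drop = FALSE keeps it as a 1x3 matrix
first_row <- matrix1[1, , drop = FALSE]
dim(first_row)              # 1 3

# Logical index: all elements greater than 3, returned as a vector
matrix1[matrix1 > 3]        # 4 5 6

# Assignment through an index: set row 2, column 3 to 0
matrix1[2, 3] <- 0L
matrix1
```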

3. Factors

A factor is the data structure R uses for categorical variables. It is often used when working with character data.

age <- c('old','median','young','median','old','young')
age_f <- factor(age) # factor() converts the vector to a factor; by default the levels are sorted alphabetically (or numerically)
age_f
[1] old    median young  median old    young
Levels: median old young
age_f_ordered <- factor(age, levels = c('young', 'median', 'old'), ordered = TRUE) # levels sets the order of the levels; ordered = TRUE makes the factor ordered
age_f_ordered
[1] old    median young  median old    young
Levels: young < median < old
table(age_f)
age_f
median    old  young
     2      2      2
table(age_f_ordered)
age_f_ordered
 young median    old
     2      2      2
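
Internally a factor stores integer codes that point into its levels, and an ordered factor also supports comparisons. A minimal sketch with the same data:

```r
age <- c('old', 'median', 'young', 'median', 'old', 'young')
age_f_ordered <- factor(age, levels = c('young', 'median', 'old'), ordered = TRUE)

levels(age_f_ordered)              # "young" "median" "old"
codes <- as.integer(age_f_ordered) # underlying integer codes: 3 2 1 2 3 1

# Ordered factors can be compared against a level
at_least_median <- age_f_ordered >= 'median'
at_least_median                    # TRUE TRUE FALSE TRUE TRUE FALSE
```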

4. Lists

A list can hold several data types (and data structures) in a single object.

v1 <- c(2:8)
v2 <- c('a','b','c')
m1 <- matrix(c(1:9),nrow=3)
f1 <- factor(c('high','low','low','high','high'))
mylist <- list(v1,v2,m1,f1)
mylist
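
Components of a list are usually pulled out with [[ ]] or $. The sketch below names the components to make that easier; the names numbers, letters and mat are illustrative and not part of the original example.

```r
v1 <- c(2:8)
v2 <- c('a', 'b', 'c')
m1 <- matrix(c(1:9), nrow = 3)

# Naming the components allows access by name
mylist <- list(numbers = v1, letters = v2, mat = m1)

mylist[[1]]              # double brackets return the component itself: 2 3 4 5 6 7 8
mylist$letters           # $ accesses a component by name: "a" "b" "c"
mylist[["mat"]][2, 3]    # row 2, column 3 of the matrix component: 8
class(mylist[1])         # single brackets return a sub-list: "list"
```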

5. Data frames

5.1 Creating a data frame from a list

A list of equal-length vectors can be presented as a data frame.

list0 <- list(name= c('周鸿祎','刘强东','马云','马化腾'),age = c(41,45,52,51),company=c('360','jd','albb','tenct'))
data.frame(list0)
    name age company
1 周鸿祎  41     360
2 刘强东  45      jd
3   马云  52    albb
4 马化腾  51   tenct

5.2 Creating a data frame

name <- c('Jane', 'Alexa', 'Elant', 'Yolad')
english <- c(80, 82, 78, 90)
math <- c(88, 92, 78, 90)
art <- c(83, 83, 90, 90)
score <- data.frame(name, english, math, art) # create the data frame score with data.frame()
score
   name english math art
1  Jane      80   88  83
2 Alexa      82   92  83
3 Elant      78   78  90
4 Yolad      90   90  90

A data frame can be sliced with a numeric sequence or with an index vector built by c().

score[1:2, c('name', 'art')]
   name art
1  Jane  83
2 Alexa  83
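
Beyond numeric and name-based slicing, rows are often filtered with a logical condition. A minimal sketch, assuming the same score data; stringsAsFactors = FALSE is set so that name stays a character column on older R versions, and total is a hypothetical column name.

```r
name <- c('Jane', 'Alexa', 'Elant', 'Yolad')
english <- c(80, 82, 78, 90)
math <- c(88, 92, 78, 90)
art <- c(83, 83, 90, 90)
score <- data.frame(name, english, math, art, stringsAsFactors = FALSE)

score$math                                   # one column as a vector: 88 92 78 90
top_math <- score[score$math >= 90, 'name']  # names of rows where math is at least 90
top_math                                     # "Alexa" "Yolad"

# Add a derived column
score$total <- score$english + score$math + score$art
score
```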