This section covers data reshaping and functions.
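> # Adding columns and rows to a data frame
> # cbind() joins vectors column-wise; applied to plain vectors it returns a matrix rather than a data frame. rbind() appends rows and can combine two data frames.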
> #create vector objects
> city <-c("Tampa","Seattle","Hartford","Denver")
> state <- c("FL","WA","CT","CO")
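> # Note: the numeric literal 06161 is stored as 6161, so its leading zero is lost in the output below.
> zipcode <- c(33602,98104,06161,80294)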
>
> #combine the three vectors above column-wise (the result is a character matrix)
> addresses <- cbind(city,state,zipcode)
>
> #print a header
> cat("# # # # The first data frame # # # #
+ ")
# # # # The first data frame # # # #
> print(addresses)
city state zipcode
[1,] "Tampa" "FL" "33602"
[2,] "Seattle" "WA" "98104"
[3,] "Hartford" "CT" "6161"
[4,] "Denver" "CO" "80294"
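The `[1,]` row labels in the output above show that `addresses` is a matrix. The following is a minimal sketch of the difference, assuming the goal is a data frame that keeps the zip codes intact. The names `as_matrix` and `as_frame` are illustrative and do not appear in the original session.

```r
city <- c("Tampa", "Seattle", "Hartford", "Denver")
state <- c("FL", "WA", "CT", "CO")
# Zip codes stored as strings so the leading zero in "06161" survives.
zipcode <- c("33602", "98104", "06161", "80294")

# cbind() on plain vectors builds a character matrix.
as_matrix <- cbind(city, state, zipcode)
print(is.matrix(as_matrix))      # TRUE

# data.frame() builds a data frame with one column per vector.
as_frame <- data.frame(city, state, zipcode, stringsAsFactors = FALSE)
print(is.data.frame(as_frame))   # TRUE
print(as_frame$zipcode[3])       # "06161"
```

Building the data frame directly also keeps each column's own type, whereas a matrix coerces every cell to one type.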
> # Create another data frame with similar columns
> new.address <- data.frame(
+ city = c("Lowry","Charlotte"),
+ state = c("CO","FL"),
+ zipcode = c("80230","33949"),
+ stringsAsFactors = FALSE
+ )
>
> # Print a header.
> cat("# # # The Second data frame
+ ")
# # # The Second data frame
>
> # Print the data frame.
> print(new.address)
city state zipcode
1 Lowry CO 80230
2 Charlotte FL 33949
>
> # Combine rows from both data frames.
> all.addresses <- rbind(addresses,new.address)
>
> # Print a header.
> cat("# # # The combined data frame
+ ")
# # # The combined data frame
>
> # Print the result.
> print(all.addresses)
city state zipcode
1 Tampa FL 33602
2 Seattle WA 98104
3 Hartford CT 6161
4 Denver CO 80294
5 Lowry CO 80230
6 Charlotte FL 33949
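When both arguments are data frames, rbind() matches columns by name rather than by position. A small sketch with made-up one-row data frames (`first` and `second` are illustrative names) shows this:

```r
first <- data.frame(city = "Tampa", state = "FL", stringsAsFactors = FALSE)
# Same columns as `first`, but listed in a different order.
second <- data.frame(state = "CO", city = "Lowry", stringsAsFactors = FALSE)

# The result takes its column order from the first data frame.
combined <- rbind(first, second)
print(combined)
```

The second row ends up with city "Lowry" and state "CO", even though `second` lists `state` first.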
>
>
>
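> # Merging data frames
> # merge() joins two data frames on key columns. By default it uses the columns whose names appear in both; by.x and by.y name the key columns explicitly.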
> library(MASS)
> merged.Pima <- merge(x = Pima.te, y = Pima.tr,
+ by.x = c("bp", "bmi"),
+ by.y = c("bp", "bmi")
+ )
> print(merged.Pima)
bp bmi npreg.x glu.x skin.x ped.x age.x type.x npreg.y glu.y skin.y ped.y age.y type.y
1 60 33.8 1 117 23 0.466 27 No 2 125 20 0.088 31 No
2 64 29.7 2 75 24 0.370 33 No 2 100 23 0.368 21 No
3 64 31.2 5 189 33 0.583 29 Yes 3 158 13 0.295 24 No
4 64 33.2 4 117 27 0.230 24 No 1 96 27 0.289 21 No
5 66 38.1 3 115 39 0.150 28 No 1 114 36 0.289 21 No
6 68 38.5 2 100 25 0.324 26 No 7 129 49 0.439 43 Yes
7 70 27.4 1 116 28 0.204 21 No 0 124 20 0.254 36 Yes
8 70 33.1 4 91 32 0.446 22 No 9 123 44 0.374 40 No
9 70 35.4 9 124 33 0.282 34 No 6 134 23 0.542 29 Yes
10 72 25.6 1 157 21 0.123 24 No 4 99 17 0.294 28 No
11 72 37.7 5 95 33 0.370 27 No 6 103 32 0.324 55 No
12 74 25.9 9 134 33 0.460 81 No 8 126 38 0.162 39 No
13 74 25.9 1 95 21 0.673 36 No 8 126 38 0.162 39 No
14 78 27.6 5 88 30 0.258 37 No 6 125 31 0.565 49 Yes
15 78 27.6 10 122 31 0.512 45 No 6 125 31 0.565 49 Yes
16 78 39.4 2 112 50 0.175 24 No 4 112 40 0.236 38 No
17 88 34.5 1 117 24 0.403 40 Yes 4 127 11 0.598 28 No
> nrow(merged.Pima)
[1] 17
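The merge above keeps only rows whose `bp` and `bmi` values occur in both data sets, and it adds the `.x` and `.y` suffixes to non-key columns that share a name. A minimal sketch on made-up data (the data frames `left` and `right` are hypothetical) contrasts this default inner join with a full outer join:

```r
left <- data.frame(id = c(1, 2, 3), x = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
right <- data.frame(id = c(2, 3, 4), y = c("B", "C", "D"),
                    stringsAsFactors = FALSE)

# Default: inner join, keeps only ids present in both (2 and 3).
inner <- merge(left, right, by = "id")
print(inner)

# all = TRUE: full outer join, keeps ids 1 to 4 and fills gaps with NA.
outer <- merge(left, right, by = "id", all = TRUE)
print(outer)
```

`all.x = TRUE` or `all.y = TRUE` would give a left or right join instead.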
>
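> # Melting and casting: melt() stacks the measure columns into variable/value rows; cast() aggregates them back into a wide table.
> library(MASS)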
> print(ships)
type year period service incidents
1 A 60 60 127 0
2 A 60 75 63 0
3 A 65 60 1095 3
4 A 65 75 1095 4
5 A 70 60 1512 6
6 A 70 75 3353 18
7 A 75 60 0 0
8 A 75 75 2244 11
9 B 60 60 44882 39
10 B 60 75 17176 29
11 B 65 60 28609 58
12 B 65 75 20370 53
13 B 70 60 7064 12
14 B 70 75 13099 44
15 B 75 60 0 0
16 B 75 75 7117 18
17 C 60 60 1179 1
18 C 60 75 552 1
19 C 65 60 781 0
20 C 65 75 676 1
21 C 70 60 783 6
22 C 70 75 1948 2
23 C 75 60 0 0
24 C 75 75 274 1
25 D 60 60 251 0
26 D 60 75 105 0
27 D 65 60 288 0
28 D 65 75 192 0
29 D 70 60 349 2
30 D 70 75 1208 11
31 D 75 60 0 0
32 D 75 75 2051 4
33 E 60 60 45 0
34 E 60 75 0 0
35 E 65 60 789 7
36 E 65 75 437 7
37 E 70 60 1157 5
38 E 70 75 2161 12
39 E 75 60 0 0
40 E 75 75 542 1
>
> install.packages("Package Name")
Warning message:
package ‘Package Name’ is not available (for R version 3.4.1)
> install.packages("Reshape")
Warning messages:
1: package ‘Reshape’ is not available (for R version 3.4.1)
2: Perhaps you meant ‘reshape’ ?
> 2
[1] 2
>
> install.packages("reshape")
> install.packages("reshape2")
> install.packages("knitr")
> library(reshape2)
Warning message:
package ‘reshape2’ was built under R version 3.4.4
> library(knitr)
Warning message:
package ‘knitr’ was built under R version 3.4.4
> molten.ships <- melt(ships, id = c("type","year"))
> print(molten.ships)
type year variable value
1 A 60 period 60
2 A 60 period 75
3 A 65 period 60
4 A 65 period 75
5 A 70 period 60
6 A 70 period 75
7 A 75 period 60
8 A 75 period 75
9 B 60 period 60
10 B 60 period 75
11 B 65 period 60
12 B 65 period 75
13 B 70 period 60
14 B 70 period 75
15 B 75 period 60
16 B 75 period 75
17 C 60 period 60
18 C 60 period 75
19 C 65 period 60
20 C 65 period 75
21 C 70 period 60
22 C 70 period 75
23 C 75 period 60
24 C 75 period 75
25 D 60 period 60
26 D 60 period 75
27 D 65 period 60
28 D 65 period 75
29 D 70 period 60
30 D 70 period 75
31 D 75 period 60
32 D 75 period 75
33 E 60 period 60
34 E 60 period 75
35 E 65 period 60
36 E 65 period 75
37 E 70 period 60
38 E 70 period 75
39 E 75 period 60
40 E 75 period 75
41 A 60 service 127
42 A 60 service 63
43 A 65 service 1095
44 A 65 service 1095
45 A 70 service 1512
46 A 70 service 3353
47 A 75 service 0
48 A 75 service 2244
49 B 60 service 44882
50 B 60 service 17176
51 B 65 service 28609
52 B 65 service 20370
53 B 70 service 7064
54 B 70 service 13099
55 B 75 service 0
56 B 75 service 7117
57 C 60 service 1179
58 C 60 service 552
59 C 65 service 781
60 C 65 service 676
61 C 70 service 783
62 C 70 service 1948
63 C 75 service 0
64 C 75 service 274
65 D 60 service 251
66 D 60 service 105
67 D 65 service 288
68 D 65 service 192
69 D 70 service 349
70 D 70 service 1208
71 D 75 service 0
72 D 75 service 2051
73 E 60 service 45
74 E 60 service 0
75 E 65 service 789
76 E 65 service 437
77 E 70 service 1157
78 E 70 service 2161
79 E 75 service 0
80 E 75 service 542
81 A 60 incidents 0
82 A 60 incidents 0
83 A 65 incidents 3
84 A 65 incidents 4
85 A 70 incidents 6
86 A 70 incidents 18
87 A 75 incidents 0
88 A 75 incidents 11
89 B 60 incidents 39
90 B 60 incidents 29
91 B 65 incidents 58
92 B 65 incidents 53
93 B 70 incidents 12
94 B 70 incidents 44
95 B 75 incidents 0
96 B 75 incidents 18
97 C 60 incidents 1
98 C 60 incidents 1
99 C 65 incidents 0
100 C 65 incidents 1
101 C 70 incidents 6
102 C 70 incidents 2
103 C 75 incidents 0
104 C 75 incidents 1
105 D 60 incidents 0
106 D 60 incidents 0
107 D 65 incidents 0
108 D 65 incidents 0
109 D 70 incidents 2
110 D 70 incidents 11
111 D 75 incidents 0
112 D 75 incidents 4
113 E 60 incidents 0
114 E 60 incidents 0
115 E 65 incidents 7
116 E 65 incidents 7
117 E 70 incidents 5
118 E 70 incidents 12
119 E 75 incidents 0
120 E 75 incidents 1
> library(reshape)
Attaching package: ‘reshape’
The following objects are masked from ‘package:reshape2’:
colsplit, melt, recast
Warning message:
package ‘reshape’ was built under R version 3.4.4
> recasted.ship <- cast(molten.ships, type+year~variable,sum)
> print(recasted.ship)
type year period service incidents
1 A 60 135 190 0
2 A 65 135 2190 7
3 A 70 135 4865 24
4 A 75 135 2244 11
5 B 60 135 62058 68
6 B 65 135 48979 111
7 B 70 135 20163 56
8 B 75 135 7117 18
9 C 60 135 1731 2
10 C 65 135 1457 1
11 C 70 135 2731 8
12 C 75 135 274 1
13 D 60 135 356 0
14 D 65 135 480 0
15 D 70 135 1557 13
16 D 75 135 2051 4
17 E 60 135 45 0
18 E 65 135 1226 14
19 E 70 135 3318 17
20 E 75 135 542 1
>
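In the result above, `period` is 135 in every row because it was melted as a measure and then summed (60 + 75). A hedged sketch of one way to avoid that, assuming the reshape2 package is installed: list the measure columns explicitly and cast with `dcast()`, the reshape2 counterpart of `cast()`. The data frame `toy` is made up for illustration.

```r
toy <- data.frame(
  type = c("A", "A", "B", "B"),
  year = c(60, 60, 65, 65),
  period = c(60, 75, 60, 75),
  service = c(10, 20, 30, 40),
  incidents = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Melt only the two measure columns; period is left out of the long table.
molten <- reshape2::melt(toy, id.vars = c("type", "year"),
                         measure.vars = c("service", "incidents"))

# Cast back to wide form, summing the values within each type/year group.
wide <- reshape2::dcast(molten, type + year ~ variable,
                        fun.aggregate = sum, value.var = "value")
print(wide)
```

Calling the functions with the `reshape2::` prefix sidesteps the masking reported when reshape is attached after reshape2.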
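Functions
> # Built-in functions
> # Simple examples of built-in functions are seq(), mean(), max(), sum(x), and paste(...). User-written programs call them directly. See the R documentation for the most widely used functions.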
> # Create a sequence of numbers from 32 to 44.
> print(seq(32,44))
[1] 32 33 34 35 36 37 38 39 40 41 42 43 44
>
> # Find mean of numbers from 25 to 82.
> print(mean(25:82))
[1] 53.5
>
> # Find sum of numbers from 41 to 68.
> print(sum(41:68))
[1] 1526
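The list above names max() and paste() without showing them. A short sketch of those two, plus seq() with a step size (variable names are illustrative):

```r
# seq() with a step: from 32 to 44 in steps of 4.
steps <- seq(32, 44, by = 4)
print(steps)      # 32 36 40 44

# max() returns the largest element.
largest <- max(c(3, 9, 4))
print(largest)    # 9

# paste() joins its arguments with a space by default.
label <- paste("mean", "of", "x")
print(label)      # "mean of x"

# paste0() joins without a separator and is vectorised.
ids <- paste0("x", 1:3)
print(ids)        # "x1" "x2" "x3"
```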
>
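> # User-defined functions
> # R lets us define our own functions. They do whatever the user needs and, once created, can be called just like built-in functions. Below is an example of creating and using one.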
> new.function <- function(a) {
+ for(i in 1:a) {
+ b<- i^2
+ print(b)
+ }
+ }
>
> # Call the function
> new.function(6)
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
[1] 36
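The function above prints each square but returns nothing useful to the caller. A minimal alternative sketch that returns the squares as a vector, so the result can be reused (the name `squares` is illustrative):

```r
squares <- function(n) {
  # seq_len(n) is empty for n = 0, whereas 1:n would count down as c(1, 0).
  seq_len(n)^2
}

result <- squares(6)
print(result)        # 1 4 9 16 25 36
print(sum(result))   # 91
```

A function returns the value of its last evaluated expression, so no explicit return() is needed here.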
>
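> # Calling a function with argument values (by position and by name)
> # Arguments can be supplied in the order in which the function defines them, or in any other order if each value is assigned to its argument's name.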
> # Create a function with arguments.
> new.function <- function(a,b,c) {
+ result <- a * b + c
+ print(result)
+ }
>
> # Call the function by position of arguments.
> new.function(5,3,11)
[1] 26
>
> # Call the function by names of the arguments.
> new.function(a = 11, b = 5, c = 3)
[1] 58
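Named and positional arguments can also be mixed, and arguments may carry default values. A small sketch, where the function `affine` and its defaults are hypothetical:

```r
affine <- function(a, b = 2, c = 0) {
  a * b + c
}

# Only a is supplied; b and c fall back to their defaults.
r1 <- affine(5)
print(r1)   # 10

# Skip b by naming c.
r2 <- affine(5, c = 1)
print(r2)   # 11

# Named arguments are matched first, then the rest fill a and b by position.
r3 <- affine(c = 3, 11, 5)
print(r3)   # 58
```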
>
> # Lazy evaluation of function arguments
> # Function arguments are evaluated lazily: each one is evaluated only when the function body first needs it.
> # Create a function with arguments.
> new.function <- function(a, b) {
+ print(a^2)
+ print(a)
+ print(b)
+ }
>
> # Evaluate the function without supplying one of the arguments.
> new.function(6)
[1] 36
[1] 6
Error in print(b) : argument "b" is missing, with no default
>
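The error above appears only when `print(b)` is reached, after the first two lines have already run. Two further sketches of the same behaviour, with illustrative function names: an argument that is never used is never evaluated, and missing() lets a function check whether an argument was supplied.

```r
square_first <- function(a, b) {
  a^2   # b is never touched, so the expression passed for b never runs
}

# stop() would raise an error if it were evaluated; it is not.
ok <- square_first(6, stop("b was evaluated"))
print(ok)   # 36

describe <- function(a, b) {
  # missing(b) is TRUE when the caller gave no value for b.
  if (missing(b)) "b not supplied" else paste("b =", b)
}

without_b <- describe(1)
with_b <- describe(1, 2)
print(without_b)   # "b not supplied"
print(with_b)      # "b = 2"
```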