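Let's work through a complete example: counting how many values of a sample fall into each of several intervals, that is, computing factor frequencies. The data below are the magnitudes of all earthquakes worldwide between 22:00 and 24:00 on May 20, 2013, as published by the US seismic network.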
Time (UTC) | Magnitude
2013-05-20T23:57:12.000+00:00 | 1.6
2013-05-20T23:57:12.000+00:00 | 0.9
2013-05-20T23:52:59.000+00:00 | 2.1
2013-05-20T23:49:15.100+00:00 | 2.2
2013-05-20T23:46:36.000+00:00 | 2.3
2013-05-20T23:44:07.000+00:00 | 1.7
2013-05-20T23:38:17.000+00:00 | 1.3
2013-05-20T23:34:12.400+00:00 | 1.6
2013-05-20T23:33:43.440+00:00 | 4.7
2013-05-20T23:25:20.500+00:00 | 1.2
2013-05-20T23:23:35.100+00:00 | 0.9
2013-05-20T23:07:34.960+00:00 | 4.7
2013-05-20T23:06:42.800+00:00 | 0.6
2013-05-20T23:01:25.480+00:00 | 5.3
2013-05-20T22:59:58.000+00:00 | 1.1
2013-05-20T22:51:47.120+00:00 | 4.8
2013-05-20T22:48:40.570+00:00 | 4
2013-05-20T22:48:18.350+00:00 | 4.2
2013-05-20T22:36:27.310+00:00 | 4.6
2013-05-20T22:13:36.000+00:00 | 1.3
2013-05-20T22:13:09.000+00:00 | 2.1
2013-05-20T22:10:47.000+00:00 | 1.5
2013-05-20T22:09:33.600+00:00 | 3
Let's compute the frequency distribution of the magnitudes over a set of intervals:
First, put the magnitude data into a vector.
> mag<-c(1.6,0.9,2.1,2.2,2.3,1.7,1.3,1.6,4.7,1.2,0.9,4.7,0.6,5.3,1.1,4.8,4,4.2,4.6,1.3,2.1,1.5,3)
> mag
[1] 1.6 0.9 2.1 2.2 2.3 1.7 1.3 1.6 4.7 1.2 0.9 4.7 0.6 5.3 1.1 4.8 4.0 4.2
[19] 4.6 1.3 2.1 1.5 3.0
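Next, use the cut function to split the magnitudes into 5 equal-width intervals and build a factor. (cut already returns a factor, so the outer factor call is optional here.)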
> factor(cut(mag,5))
[1] (1.54,2.48] (0.595,1.54] (1.54,2.48] (1.54,2.48] (1.54,2.48]
[6] (1.54,2.48] (0.595,1.54] (1.54,2.48] (4.36,5.3] (0.595,1.54]
[11] (0.595,1.54] (4.36,5.3] (0.595,1.54] (4.36,5.3] (0.595,1.54]
[16] (4.36,5.3] (3.42,4.36] (3.42,4.36] (4.36,5.3] (0.595,1.54]
[21] (1.54,2.48] (0.595,1.54] (2.48,3.42]
Levels: (0.595,1.54] (1.54,2.48] (2.48,3.42] (3.42,4.36] (4.36,5.3]
>
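When the automatically chosen boundaries such as 1.54 or 2.48 are awkward to read, cut also accepts a vector of explicit breakpoints. The following is a minimal sketch, assuming whole-number magnitude classes from 0 to 6; the names edges and magclass and the choice of boundaries are illustrative and not part of the original example.

```r
# Magnitudes copied from the example above
mag <- c(1.6, 0.9, 2.1, 2.2, 2.3, 1.7, 1.3, 1.6, 4.7, 1.2, 0.9, 4.7,
         0.6, 5.3, 1.1, 4.8, 4, 4.2, 4.6, 1.3, 2.1, 1.5, 3)

# Assumed class boundaries (hypothetical, chosen only for readability)
edges <- 0:6

# right = FALSE closes each interval on the left: [0,1), [1,2), ...
magclass <- cut(mag, breaks = edges, right = FALSE)

# The interval labels become the levels of the resulting factor
print(levels(magclass))

# Integer codes show which class each observation falls into
print(as.integer(magclass))
```

Any value outside the supplied breakpoints would become NA, so the boundaries have to cover the whole range of the data.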
Finally, count the frequency of each factor level.
> magfactor<-factor(cut(mag,5))
> table(magfactor)
magfactor
(0.595,1.54]  (1.54,2.48]  (2.48,3.42]  (3.42,4.36]   (4.36,5.3]
           8            7            1            2            5
>
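The result shows that between 22:00 and 24:00 on May 20, 2013, 8 of the 23 earthquakes recorded worldwide had a magnitude in (0.595,1.54], 7 in (1.54,2.48], 1 in (2.48,3.42], 2 in (3.42,4.36], and 5 in (4.36,5.3].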
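The counts can also be expressed as proportions of the whole sample. The sketch below is one possible way to do this with prop.table and cumsum from base R; the variable names freq and relfreq are illustrative and do not appear in the original example.

```r
# Magnitudes copied from the example above
mag <- c(1.6, 0.9, 2.1, 2.2, 2.3, 1.7, 1.3, 1.6, 4.7, 1.2, 0.9, 4.7,
         0.6, 5.3, 1.1, 4.8, 4, 4.2, 4.6, 1.3, 2.1, 1.5, 3)

# Same five equal-width intervals as in the text
magfactor <- cut(mag, 5)

# Absolute frequency of each interval
freq <- table(magfactor)

# Relative frequency: each count divided by the total number of events
relfreq <- prop.table(freq)
print(round(relfreq, 3))

# Cumulative counts, from the lowest interval upward
print(cumsum(freq))
```

Relative frequencies make it easier to compare this two-hour window with samples of a different size.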
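The hist function draws a histogram of the same data. Note that a single number passed as breaks is only a suggestion: hist rounds the breakpoints to "pretty" values, so the bars need not match the five intervals produced by cut.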
> hist(mag,breaks=5)
>
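To see which bins hist actually uses, the histogram can be computed without drawing it. The following sketch assumes that the goal is to compare the histogram bins with the intervals from cut; the names h, edges and h2 are illustrative only.

```r
# Magnitudes copied from the example above
mag <- c(1.6, 0.9, 2.1, 2.2, 2.3, 1.7, 1.3, 1.6, 4.7, 1.2, 0.9, 4.7,
         0.6, 5.3, 1.1, 4.8, 4, 4.2, 4.6, 1.3, 2.1, 1.5, 3)

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(mag, breaks = 5, plot = FALSE)
print(h$breaks)   # breakpoints chosen by hist
print(h$counts)   # number of observations in each bin

# Passing explicit breakpoints forces five equal-width bins
# spanning min(mag) to max(mag), like the intervals from cut(mag, 5)
edges <- seq(min(mag), max(mag), length.out = 6)
h2 <- hist(mag, breaks = edges, plot = FALSE)
print(h2$counts)
```

Calling hist(mag, breaks = edges) without plot = FALSE would draw the histogram with those same bins.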