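In the previous article we covered how to create a DataFrame. This article shows how to print the schema of a DataFrame and how to inspect the data inside it.
After creating a DataFrame, we usually check the schema of its data first, which we can do with the printSchema function. It prints each column's name and type: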
students.printSchema
root
 |-- id: string (nullable = true)
 |-- studentName: string (nullable = true)
 |-- phone: string (nullable = true)
 |-- email: string (nullable = true)
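The same information is also available programmatically: a DataFrame exposes its schema as a StructType through students.schema. The sketch below is only an illustration under assumptions. It builds a StructType by hand as a stand-in for students.schema (so it needs the Spark SQL library on the classpath but no running Spark context), and the names SchemaSketch and describe are hypothetical.

```scala
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object SchemaSketch {
  // Assumption: a hand-built stand-in for students.schema, using the four
  // nullable string columns shown in the printSchema output.
  val schema: StructType = StructType(Seq(
    StructField("id", StringType, nullable = true),
    StructField("studentName", StringType, nullable = true),
    StructField("phone", StringType, nullable = true),
    StructField("email", StringType, nullable = true)
  ))

  // One line per column: name, type and nullability.
  def describe(s: StructType): Seq[String] =
    s.fields.toSeq.map { f =>
      s"${f.name}: ${f.dataType.simpleString} (nullable = ${f.nullable})"
    }

  def main(args: Array[String]): Unit =
    describe(schema).foreach(println)
}
```

Walking over schema.fields like this is handy when you want to check column names or types in code rather than read them off the console.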
If the DataFrame was created with the load approach instead (see the earlier article on DataFrame), students.printSchema prints the following:
root
 |-- id|studentName|phone|email: string (nullable = true)
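After printing the schema, the second thing to do is check that the data loaded into the DataFrame is correct. There are several ways to sample data from a newly created DataFrame, and we go through them here.
The simplest is the show method, which comes in four versions: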
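(1) The first takes the number of rows to sample: def show(numRows: Int);
(2) The second takes no arguments, in which case show displays 20 rows by default: def show();
(3) The third takes a Boolean that says whether column values longer than 20 characters should be truncated: def show(truncate: Boolean);
(4) The last takes both the number of rows and whether to truncate: def show(numRows: Int, truncate: Boolean). In fact, the first three versions are all implemented by calling this one.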
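The relationship between the four versions can be sketched in plain Scala. This is not Spark's source code, only a minimal model under assumptions: MiniTable and render are hypothetical names, and the truncation rule (values longer than 20 characters are cut to 17 characters plus "...") is inferred from the console output in this article.

```scala
// Hypothetical stand-in for a DataFrame: a header plus rows of string cells.
final case class MiniTable(header: Seq[String], rows: Seq[Seq[String]]) {

  // Cells longer than 20 characters are cut to 17 characters plus "...".
  private def cell(s: String, truncate: Boolean): String =
    if (truncate && s.length > 20) s.substring(0, 17) + "..." else s

  // The two-argument version does the real work.
  def render(numRows: Int, truncate: Boolean): Seq[Seq[String]] =
    header +: rows.take(numRows).map(row => row.map(c => cell(c, truncate)))

  // The other three versions only supply defaults and delegate.
  def render(numRows: Int): Seq[Seq[String]] = render(numRows, truncate = true)
  def render(): Seq[Seq[String]] = render(20)
  def render(truncate: Boolean): Seq[Seq[String]] = render(20, truncate)
}
```

Keeping all the logic in the two-argument version means the defaults (20 rows, truncation on) are each defined in exactly one place.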
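show differs from the other functions in that it prints not only the requested rows but also a header, and it writes directly to the default output stream (the console). Let's see how to use it: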
students.show() // prints the first 20 rows
+---+-----------+--------------+--------------------+
| id|studentName|         phone|               email|
+---+-----------+--------------+--------------------+
|  1|      Burke|1-300-746-8446|ullamcorper.velit...|
|  2|      Kamal|1-668-571-5046|pede.Suspendisse@...|
|  3|       Olga|1-956-311-1686|Aenean.eget.metus...|
|  4|      Belle|1-246-894-6340|vitae.aliquet.nec...|
|  5|     Trevor|1-300-527-4967|dapibus.id@acturp...|
|  6|     Laurel|1-691-379-9921|adipiscing@consec...|
|  7|       Sara|1-608-140-1995|Donec.nibh@enimEt...|
|  8|     Kaseem|1-881-586-2689|cursus.et.magna@e...|
|  9|        Lev|1-916-367-5608|Vivamus.nisi@ipsu...|
| 10|       Maya|1-271-683-2698|accumsan.convalli...|
| 11|        Emi|1-467-270-1337|        est@nunc.com|
| 12|      Caleb|1-683-212-0896|Suspendisse@Quisq...|
| 13|   Florence|1-603-575-2444|sit.amet.dapibus@...|
| 14|      Anika|1-856-828-7883|euismod@ligulaeli...|
| 15|      Tarik|1-398-171-2268|turpis@felisorci.com|
| 16|      Amena|1-878-250-3129|lorem.luctus.ut@s...|
| 17|    Blossom|1-154-406-9596|Nunc.commodo.auct...|
| 18|        Guy|1-869-521-3230|senectus.et.netus...|
| 19|    Malachi|1-608-637-2772|Proin.mi.Aliquam@...|
| 20|     Edward|1-711-710-6552|lectus@aliquetlib...|
+---+-----------+--------------+--------------------+
only showing top 20 rows
students.show(15)
+---+-----------+--------------+--------------------+
| id|studentName|         phone|               email|
+---+-----------+--------------+--------------------+
|  1|      Burke|1-300-746-8446|ullamcorper.velit...|
|  2|      Kamal|1-668-571-5046|pede.Suspendisse@...|
|  3|       Olga|1-956-311-1686|Aenean.eget.metus...|
|  4|      Belle|1-246-894-6340|vitae.aliquet.nec...|
|  5|     Trevor|1-300-527-4967|dapibus.id@acturp...|
|  6|     Laurel|1-691-379-9921|adipiscing@consec...|
|  7|       Sara|1-608-140-1995|Donec.nibh@enimEt...|
|  8|     Kaseem|1-881-586-2689|cursus.et.magna@e...|
|  9|        Lev|1-916-367-5608|Vivamus.nisi@ipsu...|
| 10|       Maya|1-271-683-2698|accumsan.convalli...|
| 11|        Emi|1-467-270-1337|        est@nunc.com|
| 12|      Caleb|1-683-212-0896|Suspendisse@Quisq...|
| 13|   Florence|1-603-575-2444|sit.amet.dapibus@...|
| 14|      Anika|1-856-828-7883|euismod@ligulaeli...|
| 15|      Tarik|1-398-171-2268|turpis@felisorci.com|
+---+-----------+--------------+--------------------+
only showing top 15 rows
students.show(true)
+---+-----------+--------------+--------------------+
| id|studentName|         phone|               email|
+---+-----------+--------------+--------------------+
|  1|      Burke|1-300-746-8446|ullamcorper.velit...|
|  2|      Kamal|1-668-571-5046|pede.Suspendisse@...|
|  3|       Olga|1-956-311-1686|Aenean.eget.metus...|
|  4|      Belle|1-246-894-6340|vitae.aliquet.nec...|
|  5|     Trevor|1-300-527-4967|dapibus.id@acturp...|
|  6|     Laurel|1-691-379-9921|adipiscing@consec...|
|  7|       Sara|1-608-140-1995|Donec.nibh@enimEt...|
|  8|     Kaseem|1-881-586-2689|cursus.et.magna@e...|
|  9|        Lev|1-916-367-5608|Vivamus.nisi@ipsu...|
| 10|       Maya|1-271-683-2698|accumsan.convalli...|
| 11|        Emi|1-467-270-1337|        est@nunc.com|
| 12|      Caleb|1-683-212-0896|Suspendisse@Quisq...|
| 13|   Florence|1-603-575-2444|sit.amet.dapibus@...|
| 14|      Anika|1-856-828-7883|euismod@ligulaeli...|
| 15|      Tarik|1-398-171-2268|turpis@felisorci.com|
| 16|      Amena|1-878-250-3129|lorem.luctus.ut@s...|
| 17|    Blossom|1-154-406-9596|Nunc.commodo.auct...|
| 18|        Guy|1-869-521-3230|senectus.et.netus...|
| 19|    Malachi|1-608-637-2772|Proin.mi.Aliquam@...|
| 20|     Edward|1-711-710-6552|lectus@aliquetlib...|
+---+-----------+--------------+--------------------+
only showing top 20 rows
students.show(false)
+---+-----------+--------------+-----------------------------------------+
|id |studentName|phone         |email                                    |
+---+-----------+--------------+-----------------------------------------+
|1  |Burke      |1-300-746-8446|ullamcorper.velit.in@ametnullaDonec.co.uk|
|2  |Kamal      |1-668-571-5046|pede.Suspendisse@interdumenim.edu        |
|3  |Olga       |1-956-311-1686|Aenean.eget.metus@dictumcursusNunc.edu   |
|4  |Belle      |1-246-894-6340|vitae.aliquet.nec@neque.co.uk            |
|5  |Trevor     |1-300-527-4967|dapibus.id@acturpisegestas.net           |
|6  |Laurel     |1-691-379-9921|adipiscing@consectetueripsum.edu         |
|7  |Sara       |1-608-140-1995|Donec.nibh@enimEtiamimperdiet.edu        |
|8  |Kaseem     |1-881-586-2689|cursus.et.magna@euismod.org              |
|9  |Lev        |1-916-367-5608|Vivamus.nisi@ipsumdolor.com              |
|10 |Maya       |1-271-683-2698|accumsan.convallis@ornarelectusjusto.edu |
|11 |Emi        |1-467-270-1337|est@nunc.com                             |
|12 |Caleb      |1-683-212-0896|Suspendisse@Quisque.edu                  |
|13 |Florence   |1-603-575-2444|sit.amet.dapibus@lacusAliquamrutrum.ca   |
|14 |Anika      |1-856-828-7883|euismod@ligulaelit.co.uk                 |
|15 |Tarik      |1-398-171-2268|turpis@felisorci.com                     |
|16 |Amena      |1-878-250-3129|lorem.luctus.ut@scelerisque.com          |
|17 |Blossom    |1-154-406-9596|Nunc.commodo.auctor@eratSed.co.uk        |
|18 |Guy        |1-869-521-3230|senectus.et.netus@lectusrutrum.com       |
|19 |Malachi    |1-608-637-2772|Proin.mi.Aliquam@estarcu.net             |
|20 |Edward     |1-711-710-6552|lectus@aliquetlibero.co.uk               |
+---+-----------+--------------+-----------------------------------------+
only showing top 20 rows
students.show(10, false)
+---+-----------+--------------+-----------------------------------------+
|id |studentName|phone         |email                                    |
+---+-----------+--------------+-----------------------------------------+
|1  |Burke      |1-300-746-8446|ullamcorper.velit.in@ametnullaDonec.co.uk|
|2  |Kamal      |1-668-571-5046|pede.Suspendisse@interdumenim.edu        |
|3  |Olga       |1-956-311-1686|Aenean.eget.metus@dictumcursusNunc.edu   |
|4  |Belle      |1-246-894-6340|vitae.aliquet.nec@neque.co.uk            |
|5  |Trevor     |1-300-527-4967|dapibus.id@acturpisegestas.net           |
|6  |Laurel     |1-691-379-9921|adipiscing@consectetueripsum.edu         |
|7  |Sara       |1-608-140-1995|Donec.nibh@enimEtiamimperdiet.edu        |
|8  |Kaseem     |1-881-586-2689|cursus.et.magna@euismod.org              |
|9  |Lev        |1-916-367-5608|Vivamus.nisi@ipsumdolor.com              |
|10 |Maya       |1-271-683-2698|accumsan.convallis@ornarelectusjusto.edu |
+---+-----------+--------------+-----------------------------------------+
only showing top 10 rows
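We can also sample data with head(n: Int). It takes the number of rows to sample and returns an array of Row, so we have to iterate over the result to print it. We can also call head() with no arguments, which returns only the first row, again as a Row.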
students.head(5).foreach(println)
[1,Burke,1-300-746-8446,ullamcorper.velit.in@ametnullaDonec.co.uk]
[2,Kamal,1-668-571-5046,pede.Suspendisse@interdumenim.edu]
[3,Olga,1-956-311-1686,Aenean.eget.metus@dictumcursusNunc.edu]
[4,Belle,1-246-894-6340,vitae.aliquet.nec@neque.co.uk]
[5,Trevor,1-300-527-4967,dapibus.id@acturpisegestas.net]
println(students.head())
[1,Burke,1-300-746-8446,ullamcorper.velit.in@ametnullaDonec.co.uk]
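Because head(n) hands back Row objects rather than printing a table, individual fields can be pulled out of each row by position. The following is a rough sketch, not the setup used in this article: it assumes Spark 2.x or later with a local SparkSession, builds a two-row students DataFrame by hand from the data shown above, and the names HeadRowsSketch and summarize are hypothetical.

```scala
import org.apache.spark.sql.{Row, SparkSession}

object HeadRowsSketch {
  // Picks studentName (position 1) and email (position 3) out of a Row.
  def summarize(row: Row): String =
    s"${row.getString(1)} <${row.getString(3)}>"

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, only for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("head-rows-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hand-built stand-in for the students DataFrame.
    val students = Seq(
      ("1", "Burke", "1-300-746-8446", "ullamcorper.velit.in@ametnullaDonec.co.uk"),
      ("2", "Kamal", "1-668-571-5046", "pede.Suspendisse@interdumenim.edu")
    ).toDF("id", "studentName", "phone", "email")

    // head(n) returns Array[Row]; each Row is then read field by field.
    val rows: Array[Row] = students.head(2)
    rows.map(summarize).foreach(println)

    spark.stop()
  }
}
```

Positional access with getString assumes the column is a string at that index, which matches the schema printed earlier; for other column types a different getter would be needed.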
Besides show and head, we can also use first and take, which call head() and head(n) respectively:
println(students.first())
[1,Burke,1-300-746-8446,ullamcorper.velit.in@ametnullaDonec.co.uk]
students.take(5).foreach(println)
[1,Burke,1-300-746-8446,ullamcorper.velit.in@ametnullaDonec.co.uk]
[2,Kamal,1-668-571-5046,pede.Suspendisse@interdumenim.edu]
[3,Olga,1-956-311-1686,Aenean.eget.metus@dictumcursusNunc.edu]
[4,Belle,1-246-894-6340