Big Data Spark "Mushroom Cloud" Campaign, Lesson 72: Project Implementation Based on Spark 2.0.1, Part 2
Source data format, plus a small bug fix in the code.
Rule of thumb: agg is usually preceded by a groupBy operation.
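
The groupBy-then-agg pattern can be sketched as follows. This is a minimal illustration, not the course's actual code: the object name `GroupByAggSketch`, the helper `summarize`, and the output column names `logCount` and `totalConsumed` are all hypothetical, and the input is limited to the three complete log records from the sample data. It assumes Spark 2.x is on the classpath and runs in local mode.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, count, sum}

// Hypothetical sketch object; not part of the original course code.
object GroupByAggSketch {

  // Group the log rows by userID, then aggregate within each group.
  // "consumed" arrives as a string in the sample JSON, so it is cast before summing.
  def summarize(logs: DataFrame): DataFrame =
    logs
      .groupBy("userID")
      .agg(
        count("logID").as("logCount"),
        sum(col("consumed").cast("long")).as("totalConsumed"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("GroupByAggSketch")
      .master("local[*]") // assumption: local run for illustration only
      .getOrCreate()
    import spark.implicits._

    // The three complete log records from the sample data, built inline
    // instead of being read from a file.
    val logs = Seq(
      ("logID1111", "userID1234", "20161103", "0", "shanghai", "100"),
      ("logID2222", "userID2234", "20161103", "0", "beijing", "200"),
      ("logID3333", "userID3234", "20161103", "0", "guangzhou", "300")
    ).toDF("logID", "userID", "time", "typed", "location", "consumed")

    summarize(logs).show()
    spark.stop()
  }
}
```

Calling agg directly on a DataFrame without groupBy is also legal, but it aggregates over the whole dataset as a single group, which is why per-key statistics normally start with groupBy.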
{"userID":"userID5234","Name":"zhangsan","Gender":"man","Occupation":"student"}
{"userID":"userID2234","Name":"lisi","Gender":"woman","Occupation":"teacher"}
{"userID":"userID4234","Name":"wangwu","Gender":"woman","Occupation":"student"}
{"userID":"userID5234","Name":"wangwu","Gender":"man","Occupation":"student"}
{"logID":"logID1111", "userID":"userID1234","time":"20161103","typed":"0","location":"shanghai","consumed":"100"}
{"logID":"logID2222", "userID":"userID2234","time":"20161103","typed":"0","location":"beijing","consumed":"200"}
{"logID":"logID3333", "userID":"userID3234","time":"20161103","typed":"0","location":"guangzhou","consumed":"300"}
{"logID":"logID4444", "userID"