SparkSql 06: Window Functions

row_number()

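The row_number() window function partitions rows by one column, orders the rows inside each partition by another column, and assigns each row a sequence number starting at 1. Filtering on that number returns the first few rows of every group, which is a per-group top N.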

Window function syntax:
row_number() over (partition by xxx order by xxx)
Java example:

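// Imports needed by the snippet below
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession sparkSession = SparkSession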
                .builder()
                .appName("window")
                .master("local")
                // Enable Hive support so that Hive tables can be used below.
                // This requires the Hive metastore service to be running.
                .enableHiveSupport()
                .getOrCreate();

   sparkSession.sql("use spark");
   sparkSession.sql("drop table if exists sales");
   sparkSession.sql("create table if not exists sales (riqi string,leibie string,jine Int) "
      + "row format delimited fields terminated by '\t'");
   sparkSession.sql("load data local inpath '/root/test/sales' into table sales");
   /**
    * Window function syntax:
    * row_number() over (partition by XXX order by XXX)
    */
   Dataset<Row> result = sparkSession.sql("select riqi,leibie,jine "
         + "from ("
         + "select riqi,leibie,jine,"
         + "row_number() over (partition by leibie order by jine desc) rank "
         + "from sales) t "
         + "where t.rank<=3");
   result.show();
   sparkSession.stop();
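
To make the semantics of the query above easier to follow without a Hive or Spark setup, here is a minimal plain-Java sketch of what "row_number() over (partition by leibie order by jine desc)" followed by "rank <= N" computes. It is only an illustration of the idea, not how Spark executes the query. The names RowNumberSketch, Sale and topNPerCategory are hypothetical, and the sample rows are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of
//   row_number() over (partition by leibie order by jine desc)
// RowNumberSketch, Sale and topNPerCategory are hypothetical names, not Spark APIs.
class RowNumberSketch {

    // One row of the sales table (column names taken from the example above).
    static final class Sale {
        final String riqi;    // date
        final String leibie;  // category
        final int jine;       // amount

        Sale(String riqi, String leibie, int jine) {
            this.riqi = riqi;
            this.leibie = leibie;
            this.jine = jine;
        }

        @Override
        public String toString() {
            return riqi + "\t" + leibie + "\t" + jine;
        }
    }

    // Returns the rows whose row number inside their category is <= n.
    static List<Sale> topNPerCategory(List<Sale> sales, int n) {
        // "partition by leibie": group rows by category, keeping first-seen order.
        Map<String, List<Sale>> partitions = new LinkedHashMap<>();
        for (Sale s : sales) {
            partitions.computeIfAbsent(s.leibie, k -> new ArrayList<>()).add(s);
        }

        List<Sale> result = new ArrayList<>();
        for (List<Sale> partition : partitions.values()) {
            // "order by jine desc": sort each partition by amount, largest first.
            partition.sort(Comparator.comparingInt((Sale s) -> s.jine).reversed());
            // row_number() numbers the rows 1, 2, 3, ... inside the partition;
            // the outer "where rank <= n" keeps only the first n of them.
            int rowNumber = 0;
            for (Sale s : partition) {
                rowNumber++;
                if (rowNumber <= n) {
                    result.add(s);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<Sale> sales = Arrays.asList(
                new Sale("1", "A", 10),
                new Sale("2", "A", 30),
                new Sale("3", "B", 5),
                new Sale("4", "A", 20),
                new Sale("5", "B", 7));
        for (Sale s : topNPerCategory(sales, 2)) {
            System.out.println(s);
        }
    }
}
```

With these sample rows and n = 2, the sketch keeps the amounts 30 and 20 for category A and 7 and 5 for category B. The order of the groups in the output is a property of this sketch only; the Spark query above has no outer order by, so the order of its result rows is not guaranteed.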
