First_value & Last_value (page 252)
First_value and last_value functions are useful in calculating the maximum and minimum values in an orered result set. The first_value function retrieves the column value from the first row in window of rows, and the last_value function retrieves the column value from the last row in that window. Queries generating reports such as Top Store by Sales for a product and market segment are classic use cases for these analytic functions. Usually, store details and sales amounts would be shown in the report together for the store with maximum value in the Sale column. With proper partition-clause specification, the first_value function can be used retrieve these values in an optimal manner. Essentially, any report calculating maxinmum and minimum values can utilize first_value and last_value functions.
First_value和last_value函数在计算一已排序的结果集中的最大值和最小值时是有用的。first_value函数检索窗口行第一行的列值,last_value函数检索窗口行最后一行列值。 查询生成诸如“按产品和细分市场的销售额(Sale)排名第一的商店(Store)”的报告都是分析函数经典的用例。通常,商店零售和销售总值将同Sale列的最大值一道展现在报告中。使用适当的partition-clause规范,first_value函数用一种优化的方式检索这些值。实质上,任何计算最大,最小值的报告能利用first_value和last_value函数。
Power of first_value and last_value functions emanates from the support for partitioning and windowing clasues. Multi-level aggregation can be implemented concisely using the partitioning clase. For example, if the goal is to fetch the rows with maximum or minimum column values aggregated at different levels such as country,product,or region from the Sales table, then implementing the multi-level aggregation in akin to deciding the coloumns to include in the partitioning clause.
first_value和last_value的(强大)能力 发源 自对分区(partitioning)和开窗(windowing)子句的支持。 多层聚合能使用 分区子句精练的执行。例如,如果目标是取出带有在不同层次,如Sale表中country,product,或者region,聚合的最大值或者最小值的行,然后执行多层次聚合,类似于 确定包含在分区子句中的列。
Utilizing windowing clause, you can define sliding dynamic window for these functions to operate. This window can be defined to include just a few prior and/or later rows or every row in a data partition. Specifically, queries computing metrics such as maximum sales so far can be implemented using these functions. As the window can be defined to be a sliding window, these two functions can be used to answer questions such as Which store had maximum sale in the past three weeks?, Which product had maximum returns in the last two weeks, etc.
利用开窗(windowing)子句,你能定义这些函数所操作的滑动动态窗口。窗口能被定义包含几行之前和/或之后的行或者是数据分区中的每一行。 特别的,到现在为止,就能使用这些函数执行查询计算如最大销售值这样的标准度量。因为窗口能被定义成“滑动窗”,这两个函数能被用来回答如“过去三周中哪家商店具有最大的销售值”,“在过去的两周中哪种产品返回最大值”等。
Syntax for the first_value function is: first_value函数的语法如下:
first_value (expression) over(partition-clause order-by-clause windowing-clause)
In Listing 8-8, the clause partition by product, country, region, year is partitioning the rows using the specified partitioning columns. The rows are sorted in a descending order on the Sale column values by the clause order by sale desc.
在列表8-8中,子句partition by product, country, region, year指定分区列定义分区行。这些行被子句order by sale desc按Sale列值进行降序排列。
The top and bottom boundary condition fo the window is specified by the clause rows betwwen unbounded preceding and unbounded following. In this example, you are retrieving the top sales value at a level of Product, Country, Region, and Year columns and hence, the window includes all rows in a data partition.
窗口的上下边界由子句rows betwwen unbounded preceding and unbounded following指定。在本例中,你检索在Product, Country, Region, 和Year 列层级上的sales值,因此,窗口包含在数据分区中的所有行。
Operationally, data is sorted by product, Country, Region, Year, and Sale columns. Sorting order for the Sale column is descending order, though. The first row in every data partition will have the highest value for the Sale column due to descending sort order specification for the Sale column. So, the first_value(sale) clause is fetching the maximum Sale column value in the data partition.
操作上,数据按照product, Country, Region, Year, 和Sale 列值排序。 Sale列的排列顺序是降序。每一数据分区的第一行将是Sale列的最大值,因为Sale列的排序规范是降序。如此,first_value(sale) 子句取出了在数据分区中的最大的Sale列值。
In addition to fetching the maxinmum column value, you might want to fetch other columns from that top row. For example, you might want to fetch the Year and Week column value in which the maximum sale occurred. In conventional SQL statement, implementing this would result in a join and subquery. But, with analytic functions it is simpler to fetch other attributes from that top row too. Hance, the first_value(week) clause, with other parts of the analytic function kept the same as the first_value(sale), will fetch the Week column value associated with that top row.
除了取出最大列值,你可能想要取出顶行的其他列。例如,你想要取出最大销售(Sale)值所在的Year和Week列值。若使用传统的SQL语句,完成这项任务将导致一个连接和子查询。但是使用分析函数,这是非常容易取出顶行的其他属性值。因此,first_value(week)子句,分析函数的其他部分同first_value(sale)一样,将取出(最大)Week列值关联顶行。
Example: First_value to Calculate Maximum
In Listing 8-8 output, the Top_sale_year column is an aggregated column calculating the maximum
value of the Sale column. The Sale column is non-aggregated column. Both aggregated and not-
aggregated column values are fetched in the same row without a self join.
在列表8-8输出中,Top_sale_year列是计算最大Sale列的聚合列。Sale列是非聚合列。聚合和非聚合列值没有通过自连接而在一行取出。
Aggregation can be performed at a different level with a different partitioning clause. For example,
to compute the maximum value at product, country, and region level, the partitioning clause would be
partition by product, country, region .
聚合能通过不同的分区子句在不同的层级上执行。例如,计算在product, country, 和region层的最大值,分区子句将是依照product, country, region分区。
Listing 8-8. First_value Function
1 select year, week,sale,
2 first_value (sale) over(
3 partition by product, country, region ,year
4 order by sale desc
5 rows between unbounded preceding and unbounded following
6 ) top_sale_value,
7 first_value (week) over(
8 partition by product, country, region ,year
9 order by sale desc
10 rows between unbounded preceding and unbounded following
11 ) top_sale_week
12 from sales_fact
13 where country in ('Australia') and product ='Xtend Memory'
14* order by product, country,year, week;
YEAR WEEK SALE TOP_SALE_VALUE TOP_SALE_WEEK
----- ---- ---------- -------------- -------------
2000 49 42.38 246.74 46
2000 50 21.19 246.74 46
2000 52 67.45 246.74 46
2001 1 92.26 278.44 16
2001 2 118.38 278.44 16
2001 3 47.24 278.44 16
2001 4 256.70 278.44 16
Example: Last_value to Calculate Minimum
Similarly, you can use last_value function to calculate minimum or maximum values. The last_value
function fetches the column values from the last row in a window of rows. For example, if you want to
calculate the minimum Sale column value, then you could use the combination of the clause
Last_value(sale) and the clause order by sale desc sorting order. The clause order by sale desc will
sort the rows by Sale column values in a descending order, and the clause Last_value(sale) will fetch
the Sale column value from the last row. Listing 8-9 provides an example for last_value function usage.
相似的,你能用last_value函数计算最小或最大值。last_value函数取出窗口行中的最后一行的列值。例如,如果你想要计算Sale列值的最小值,你可用子句Last_value(sale) 和子句 order by sale desc排序的组合。子句order by sale desc将按Sale列值降序排列行,子句 Last_value(sale)将取最后一行的Sale列值。列表8-9提供了一个使用last_value函数的例子。
Listing 8-9. last_value Function
1 select year, week,sale,
2 last_value (sale) over(
3 partition by product, country, region ,year
4 order by sale desc
5 rows between unbounded preceding and unbounded following
6 ) low_sale
7 from sales_fact
8 where country in ('Australia') and product ='Xtend Memory'
9* order by product, country,year, week
YEAR WEEK SALE LOW_SALE
----- ---- ---------- ------------
...
2000 49 42.38 19.84
2000 50 21.19 19.84
2000 52 67.45 19.84
2001 1 92.26 22.37
2001 2 118.38 22.37
2001 3 47.24 22.37
...
Granular control of window specification can be effectively utilized to produce complex reports. For
example, the clause rows between 10 preceding and 10 following is specifying a window of 21 rows to
calculate maximum or minimum value.
窗口规格的粒度控制能有效的用于生成复杂的报告。例如,子句rows between 10 preceding and 10 following 指定一个21行的窗口计算最小值和最大值。
Null values are handled by the clause [RESPECT NULLS|IGNORE NULLS] . Clause RESPECT NULLS is the default, and the first_value function returns the null value if the column value in the first row is null, by default. If the clause IGNORE NULLS is specified, then the first_value function returns the first non-null column value in a window of rows.
Null值由子句[RESPECT NULLS|IGNORE NULLS]处理。子句RESPECT NULLS是默认值,默认情况下,如果第一行的列值是null则first_value函数返回null。如果是子句IGNORE NULLS,则first_value函数将返回在窗口行中第一个的非null列值。