SQL Server - Window Function - 分析函数

概要

本文主要介绍分析窗口函数的使用方法,通过获取相邻两个股票交易日收盘价的例子,来介绍分析窗口函数在获取相邻值的用法。

分析窗口函数

本文主要介绍FIRST_VALUE和LAG两个分析窗口函数,与其类似的LAST_VALUE和LEAD两个函数,因为使用方法完全类似,所以不赘述,读者可以根据已有例子实现类似的操作。

FIRST_VALUE函数是获取有序集合的第一个值。

FIRST_VALUE ( [scalar_expression ] ) [ IGNORE NULLS | RESPECT NULLS ]
OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )

  1. 有序由order by语句决定
  2. 集合由具体窗口决定,包含partition和Framing子句

LAG 函数 获取当前数据的前一条数据内容,可以省去表的自联操作。

LAG (scalar_expression [,offset] [,default])
OVER ( [ partition_by_clause ] order_by_clause )

  1. 什么是前一条由order by决定
  2. 集合由具体窗口决定,包含partition和Framing子句

注意该方法在SQL Server 2012或其之后的版本才有支持。

需求介绍

本文以一个存储股票信息的数据表来说明分析窗口函数的使用方法。数据初始化见附录,建表语句如下:

  create table MicrosiftStockHistory(
	Id int primary key identity(1,1),
	StockDate date not null ,
	OpenPrice decimal(18,2) not null,
	High decimal(18,2) not null,
	Low decimal(18,2) not null,
	Volumn int not null,
	ClosePrice decimal(18,2) not null,
	AdjClose decimal(18,2) not null,
  )

该表只记录MS一只股票的数据,每天生成一条数据,节假日没有数据。ClosePrice 表示股票的收盘价,StockDate 是记录生成的日期。

当前需求是生成一个新列PreviousClosePrice 用于显示上一个交易日的收盘价。

解决方案

表的自联接解决方案

如果没有分析函数,我我们首先想到的是表格自联接,通过StockDate列去左联自己StockDate + 1 天的数据。

具体实现如下:

	SELECT 
		m1.[StockDate], m1.[ClosePrice],
		m2.[StockDate], m2.[ClosePrice] AS PreviousClosePrice 
	FROM [dbo].[MicrosiftStockHistory] m1
	LEFT JOIN [dbo].[MicrosiftStockHistory] m2
	ON m1.[StockDate] = DATEADD(DD,1, m2.[StockDate])
	ORDER BY m1.[StockDate]

执行结果如下:

SQL Server - Window Function - 分析函数_第1张图片
从结果我们看出很多行没有取到上一个交易日的收盘价,原因很明显,因为不是每天都有股票数据,周末和公共假期是没有的,所以我们改进一下左联的操作,增加更多的左联以应对多天的假期问题。

SELECT 
		m1.[StockDate], m1.[ClosePrice],
		m2.[StockDate], m2.[ClosePrice], 
		m3.[StockDate], m3.[ClosePrice],
		m4.[StockDate], m4.[ClosePrice],
		COALESCE(m2.[ClosePrice],m3.[ClosePrice],  m4.[ClosePrice]) AS PreviousClosePrice 
	FROM [dbo].[MicrosiftStockHistory] m1
	LEFT JOIN [dbo].[MicrosiftStockHistory] m2
	ON m1.[StockDate] = DATEADD(DD,1, m2.[StockDate])
	LEFT JOIN [dbo].[MicrosiftStockHistory] m3
	ON m1.[StockDate] = DATEADD(DD,2, m3.[StockDate])
	LEFT JOIN [dbo].[MicrosiftStockHistory] m4
	ON m1.[StockDate] = DATEADD(DD,3, m4.[StockDate])
	ORDER BY m1.[StockDate]

我们增加了两天假期和三天假期的处理。执行结果如下:

SQL Server - Window Function - 分析函数_第2张图片
从执行结果上看好了很多,但是如果假期更多,例如像国庆或者春节,可能要休息七天,难道要写7个左联操作 ?

表的自联加窗口函数的解决方案

现在最大的问题是因为假期,日期是不连续的,造成左联操作产生空值,如果我们认为制造一个连续的列代替日期列,问题也就解决了。

 ;WITH PreviousPrice as (
	SELECT 
		StockDate,
		ClosePrice,
		ROW_NUMBER() over (PARTITION BY 1 ORDER BY StockDate) as RN 
	from [dbo].[MicrosiftStockHistory]
  )

  SELECT  
       c.[StockDate]
       ,c.[ClosePrice]
	   ,p. ClosePrice as PreviousPrice
	from PreviousPrice c
	LEFT JOIN PreviousPrice p
	on c.RN = p.RN +1
  1. 我们定义了一个CTE,以该CTE进行左联操作,避免嵌套子查询。
  2. CTE中包含一个通过窗口函数ROW_NUMBER,生成序号列,该列是按照日期升序排序,且连续的。PARTITION BY 1表示窗口是全表数据。
  3. PreviousPrice左联自己,获得前一个交易日的收盘数据,左联条件是序号相差1的,也就是前一个交易日的。

分析函数FIRST_VALUE的解决方案

上面的解决方案相比于第一种解决方案,不再需要考虑具体是几天假期了,但是还是需要联表操作。通过分析函数FIRST_VALUE我们可以省去该操作。

SELECT 
    [StockDate],
    [ClosePrice],
    FIRST_VALUE([ClosePrice]) over(
        PARTITION BY 1
        ORDER BY [StockDate] ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) as PreviousClosePrice
FROM [dbo].[MicrosiftStockHistory]
  1. 分析函数FIRST_VALUE表示取当前窗口内有序数据的第一个
  2. 我们将当前Framing设置为从当前行的上一日到当前行,也就是每个窗口两条数据,按照日期升序排列,我们取第一条,执行结果如下:
    SQL Server - Window Function - 分析函数_第3张图片
    因为7月1号的数据就是最早的了,它没有前面一条,所以它的窗口只有一条,也就是它自己。

分析函数LAG的解决方案

上面的解决方案还需要设置Framing,使用LAG函数可以更加简单

SELECT 
    [StockDate],
    [ClosePrice],
    LAG([ClosePrice]) OVER(
        PARTITION BY 1
        ORDER BY [StockDate]
    ) AS PreviousClosePrice
FROM [dbo].[MicrosiftStockHistory]

LAG函数本身就具有取上面一条数据的功能,所以我们只需要设置好按照StockDate升序排列即可。

更复杂的需求

现在需求变更了,MicrosiftStockHistory表更名为StockHistory,不再只存MS的股票数据了,而是可能存储多只股票的数据,增加了一个StockId列表示不同的股票数据,建表语句如下:

create table StockHistory(
Id int primary key identity(1,1),
StockId int not null,
StockDate date not null ,
OpenPrice decimal(18,2) not null,
High decimal(18,2) not null,
Low decimal(18,2) not null,
Volumn int not null,
ClosePrice decimal(18,2) not null,
AdjClose decimal(18,2) not null,
)

初始化数据见附录。

需求变为获取每只股票每个交易日和前一个交易日的收盘价。

表的自联加窗口函数的解决方案

;WITH PreviousPrice as (
	SELECT 
	    [StockId],
		[StockDate],
		[ClosePrice],
		ROW_NUMBER() OVER (PARTITION BY [StockId] ORDER BY StockDate) as RN 
	FROM [dbo].[StockHistory]
  )

   SELECT 
	   c.[StockId]
       ,c.[StockDate]
       ,c.[ClosePrice]
	   ,p. ClosePrice as PreviousClosePrice
	FROM PreviousPrice c
	LEFT JOIN PreviousPrice p
	ON c.RN = p.RN +1 and c.StockId = p.StockId
  1. CTE中的窗口不再是整个数据表而是每个STockId一个窗口。
  2. 左联时候,增加STockId相等的限制。

分析函数FIRST_VALUE的解决方案

SELECT
    [StockId], 
    [StockDate],
    [ClosePrice],
    FIRST_VALUE([ClosePrice]) over(
        PARTITION BY [StockId]
        ORDER BY [StockDate] ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) as PreviousClosePrice
FROM [dbo].[StockHistory]

分析函数LAG的解决方案

SELECT
    [StockId],  
    [StockDate],
    [ClosePrice],
    LAG([ClosePrice]) OVER(
        PARTITION BY [StockId]
        ORDER BY [StockDate]
    ) AS PreviousClosePrice
FROM [dbo].[StockHistory]

附录

添加MS股票数据

insert into MicrosiftStockHistory (StockDate,OpenPrice, High, Low , ClosePrice, Volumn, AdjClose) values
  ('2010-07-01', 23.09,23.32, 23.73,23.16, 92239400, 19.67),
  ('2010-07-02', 23.36,23.48, 23.05,23.27, 62485100, 19.76),
  ('2010-07-06', 23.70,24.09, 23.58,23.82, 73592000, 20.23),
  ('2010-07-07', 23.82,24.32, 23.61,24.30, 79965300, 20.64),
  ('2010-07-08', 24.60,24.62, 23.97,24.41, 50758100, 20.73),
  ('2010-07-09', 24.33,24.41, 24.15,24.27, 53806100, 20.61),
  ('2010-07-12', 24.43,24.89, 24.42,24.83, 49854200, 21.09),
  ('2010-07-13', 25.14,25.30, 24.90,25.13, 61928700, 21.34),

  ('2010-07-14', 25.50,25.61, 25.12,25.44, 72808100, 21.61),
  ('2010-07-15', 25.50,25.59, 24.98,25.51, 56934700, 21.67),
  ('2010-07-16', 25.51,25.64, 24.88,24.89, 65064800, 21.14),
  ('2010-07-19', 24.96,25.30, 24.91,25.23, 38181800, 21.43),
  ('2010-07-20', 24.86,25.48, 24.70,25.48, 45530700, 21.64),

  ('2010-07-21', 25.60,25.65, 24.98,25.12, 73297300, 21.33),
  ('2010-07-22', 25.51,25.99, 25.47,25.84, 73016400, 21.95),
  ('2010-07-23', 25.84,26.02, 25.25,25.81, 108520000, 19.67),
  ('2010-07-26', 25.86,26.20, 25.80,26.10, 67249900, 22.17),
  ('2010-07-27', 26.14,26.24, 25.96,26.16, 60672100, 22.22)

添加所有股票数据

insert into StockHistory (StockId, StockDate,OpenPrice, High, Low , ClosePrice, Volumn, AdjClose) values
  (1,'2010-07-01', 23.09,23.32, 23.73,23.16, 92239400, 19.67),
  (1,'2010-07-02', 23.36,23.48, 23.05,23.27, 62485100, 19.76),
  (1,'2010-07-06', 23.70,24.09, 23.58,23.82, 73592000, 20.23),
  (1,'2010-07-07', 23.82,24.32, 23.61,24.30, 79965300, 20.64),
  (1,'2010-07-08', 24.60,24.62, 23.97,24.41, 50758100, 20.73),
  (1,'2010-07-09', 24.33,24.41, 24.15,24.27, 53806100, 20.61),
  (1,'2010-07-12', 24.43,24.89, 24.42,24.83, 49854200, 21.09),
  (1,'2010-07-13', 25.14,25.30, 24.90,25.13, 61928700, 21.34),

  (1,'2010-07-14', 25.50,25.61, 25.12,25.44, 72808100, 21.61),
  (1,'2010-07-15', 25.50,25.59, 24.98,25.51, 56934700, 21.67),
  (1,'2010-07-16', 25.51,25.64, 24.88,24.89, 65064800, 21.14),
  (1,'2010-07-19', 24.96,25.30, 24.91,25.23, 38181800, 21.43),
  (1,'2010-07-20', 24.86,25.48, 24.70,25.48, 45530700, 21.64),

  (1,'2010-07-21', 25.60,25.65, 24.98,25.12, 73297300, 21.33),
  (1,'2010-07-22', 25.51,25.99, 25.47,25.84, 73016400, 21.95),
  (1,'2010-07-23', 25.84,26.02, 25.25,25.81, 108520000, 19.67),
  (1,'2010-07-26', 25.86,26.20, 25.80,26.10, 67249900, 22.17),
  (1,'2010-07-27', 26.14,26.24, 25.96,26.16, 60672100, 22.22),


--------------------------------------------------------------------

  (2,'2010-07-01', 23.09,23.32, 23.73,13.16, 92239400, 19.67),
  (2,'2010-07-02', 23.36,23.48, 23.05,13.27, 62485100, 19.76),
  (2,'2010-07-06', 23.70,24.09, 23.58,13.82, 73592000, 20.23),
  (2,'2010-07-07', 23.82,24.32, 23.61,14.30, 79965300, 20.64),
  (2,'2010-07-08', 24.60,24.62, 23.97,14.41, 50758100, 20.73),
  (2,'2010-07-09', 24.33,24.41, 24.15,14.27, 53806100, 20.61),
  (2,'2010-07-12', 24.43,24.89, 24.42,14.83, 49854200, 21.09),
  (2,'2010-07-13', 25.14,25.30, 24.90,15.13, 61928700, 21.34),

  (2,'2010-07-14', 25.50,25.61, 25.12,15.44, 72808100, 21.61),
  (2,'2010-07-15', 25.50,25.59, 24.98,15.51, 56934700, 21.67),
  (2,'2010-07-16', 25.51,25.64, 24.88,14.89, 65064800, 21.14),
  (2,'2010-07-19', 24.96,25.30, 24.91,15.23, 38181800, 21.43),
  (2,'2010-07-20', 24.86,25.48, 24.70,15.48, 45530700, 21.64),

  (2,'2010-07-21', 25.60,25.65, 24.98,15.12, 73297300, 21.33),
  (2,'2010-07-22', 25.51,25.99, 25.47,15.84, 73016400, 21.95),
  (2,'2010-07-23', 25.84,26.02, 25.25,15.81, 108520000, 19.67),
  (2,'2010-07-26', 25.86,26.20, 25.80,16.10, 67249900, 22.17),
  (2,'2010-07-27', 26.14,26.24, 25.96,16.16, 60672100, 22.22),

  --------------------------------------------------------------------

  (3,'2010-07-01', 23.09,23.32, 23.73,33.16, 92239400, 19.67),
  (3,'2010-07-02', 23.36,23.48, 23.05,33.27, 62485100, 19.76),
  (3,'2010-07-06', 23.70,24.09, 23.58,33.82, 73592000, 20.23),
  (3,'2010-07-07', 23.82,24.32, 23.61,34.30, 79965300, 20.64),
  (3,'2010-07-08', 24.60,24.62, 23.97,34.41, 50758100, 20.73),
  (3,'2010-07-09', 24.33,24.41, 24.15,34.27, 53806100, 20.61),
  (3,'2010-07-12', 24.43,24.89, 24.42,34.83, 49854200, 21.09),
  (3,'2010-07-13', 25.14,25.30, 24.90,35.13, 61928700, 21.34),

你可能感兴趣的:(数据库,sql,数据库,sqlserver)