Pattern Matching (MATCH_RECOGNIZE) 学习

Row pattern matching was introduced by SQL:2016 with three optional features:

  • R010 is the basis: this feature requires the match_recognize clause as explained above with the aggregate functions min, max, sum, count, avg.
  • R020 allows using patterns in the over clause to define frames.
  • R030 supports all aggregate functions for row patterns (e.g. stddev_pop, …).

各个引擎的支持情况:

  • 主流的oltp引擎中,当前只有oracle支持match_recognize;
  • 但calcite本身支持是支持match_recognize语法的;
屏幕快照 2019-11-18 下午2.11.01.png

Oracle语法学习

CREATE TABLE Ticker (SYMBOL VARCHAR2(10), tstamp DATE, price NUMBER);
 
INSERT INTO Ticker VALUES('ACME', '01-Apr-11', 12);
INSERT INTO Ticker VALUES('ACME', '02-Apr-11', 17);
INSERT INTO Ticker VALUES('ACME', '03-Apr-11', 19);
INSERT INTO Ticker VALUES('ACME', '04-Apr-11', 21);
INSERT INTO Ticker VALUES('ACME', '05-Apr-11', 25);
INSERT INTO Ticker VALUES('ACME', '06-Apr-11', 12);
INSERT INTO Ticker VALUES('ACME', '07-Apr-11', 15);
INSERT INTO Ticker VALUES('ACME', '08-Apr-11', 20);
INSERT INTO Ticker VALUES('ACME', '09-Apr-11', 24);
INSERT INTO Ticker VALUES('ACME', '10-Apr-11', 25);
INSERT INTO Ticker VALUES('ACME', '11-Apr-11', 19);
INSERT INTO Ticker VALUES('ACME', '12-Apr-11', 15);
INSERT INTO Ticker VALUES('ACME', '13-Apr-11', 25);
INSERT INTO Ticker VALUES('ACME', '14-Apr-11', 25);
INSERT INTO Ticker VALUES('ACME', '15-Apr-11', 14);
INSERT INTO Ticker VALUES('ACME', '16-Apr-11', 12);
INSERT INTO Ticker VALUES('ACME', '17-Apr-11', 14);
INSERT INTO Ticker VALUES('ACME', '18-Apr-11', 24);
INSERT INTO Ticker VALUES('ACME', '19-Apr-11', 23);
INSERT INTO Ticker VALUES('ACME', '20-Apr-11', 22);

SELECT *
FROM Ticker MATCH_RECOGNIZE (
     PARTITION BY symbol
     ORDER BY tstamp
     MEASURES  STRT.tstamp AS start_tstamp,
               LAST(DOWN.tstamp) AS bottom_tstamp,
               LAST(UP.tstamp) AS end_tstamp
     ONE ROW PER MATCH
     AFTER MATCH SKIP TO LAST UP
     PATTERN (STRT DOWN+ UP+)
     DEFINE
        DOWN AS DOWN.price < PREV(DOWN.price),
        UP AS UP.price > PREV(UP.price)
     ) MR
ORDER BY MR.symbol, MR.start_tstamp;

SYMBOL     START_TST BOTTOM_TS END_TSTAM
---------- --------- --------- ---------
ACME       05-APR-11 06-APR-11 10-APR-11
ACME       10-APR-11 12-APR-11 13-APR-11
ACME       14-APR-11 16-APR-11 18-APR-11
 

其中SQL表达的意思是想找出,股市行情中V字型的3个点:

  • 其中PATTERN (STRT DOWN+ UP+)表示DOWN/UP可以为多行;
  • 在DEFINE中,DOWN表示price比之前的行要小的行,UP表示price要比之前的行大的行;
    在经过上述处理后,处理结果如下:


    GUID-8D36E8DD-50EA-469D-901E-64632B68D03B-default (1).png
  • AFTER MATCH SKIP TO LAST UP表示支取最后一个点,处理后结果如下:
    GUID-65EBE05A-EB5D-4931-B775-8534B5A6CDCD-default.png

后续

  • 除了上述介绍的简单使用方法,应该还有很多其他类型,后续使用过程中可以慢慢消化;

参考

  • https://modern-sql.com/feature/match_recognize

  • Oracle 官方文档: https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8966

  • https://oracle-base.com/articles/12c/pattern-matching-in-oracle-database-12cr1

  • https://www.pgcon.org/2018/schedule/attachments/494_standard-sql-gap-analysis.pdf

  • https://azure.microsoft.com/en-us/blog/azure-stream-analytics-now-supports-match-recognize/

你可能感兴趣的:(Pattern Matching (MATCH_RECOGNIZE) 学习)