
This article is based mainly on the Pipeline Aggregations chapter of the Elasticsearch Reference 6.3.

Common Features of Pipeline Aggregations

Pipeline aggregations work on the output of other aggregations rather than on document sets.
There are many types of pipeline aggregation, each computing something different from the results of other aggregations.
Broadly, they fall into two families:

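Parent : takes the output of its parent aggregation as input and computes new buckets, or new aggregation results that are added to the existing buckets.
Sibling : takes the output of a sibling (same-level) aggregation as input and computes a new aggregation result that sits at the same level as that sibling.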

Buckets Path Syntax

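Because a pipeline aggregation operates on the results of other aggregations, it needs a way to say which aggregation to read. Every pipeline aggregation therefore takes a buckets_path parameter that points at its input.
Pipeline aggregations cannot have sub-aggregations, but depending on the type, one can reference another pipeline aggregation through buckets_path, which allows pipeline aggregations to be chained.
The syntax of buckets_path is: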

AGG_SEPARATOR       =  '>' ;
METRIC_SEPARATOR    =  '.' ;
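AGG_NAME            =  <the name of the aggregation> ;
METRIC              =  <the name of the metric (in case of multi-value metrics aggregation)> ;
PATH                =  <AGG_NAME> [ <AGG_SEPARATOR>, <AGG_NAME> ]* [ <METRIC_SEPARATOR>, <METRIC> ] ;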

For example, the path my_bucket>my_stats.avg refers to the avg value of the my_stats metric aggregation inside the my_bucket bucket aggregation.
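To make the grammar concrete, here is a minimal Python sketch that splits a path into its aggregation names and optional metric. The function name parse_buckets_path is made up for illustration and is not part of any Elasticsearch client; the sketch only covers the grammar shown above and ignores the square-bracket form used for metric names that contain dots.

```python
def parse_buckets_path(path):
    """Split a buckets_path into (aggregation names, metric name or None)."""
    # '>' separates nested aggregation names.
    *parents, last = path.split(">")
    # '.' separates the final aggregation name from an optional metric.
    name, sep, metric = last.partition(".")
    return parents + [name], (metric if sep else None)


print(parse_buckets_path("my_bucket>my_stats.avg"))  # (['my_bucket', 'my_stats'], 'avg')
print(parse_buckets_path("sales_per_month>sales"))   # (['sales_per_month', 'sales'], None)
```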

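Two concrete examples follow. Paths are relative to the position of the pipeline aggregation, so the first example (a Parent type) names the_sum directly, while the second (a Sibling type) walks from sales_per_month down to sales: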

POST /_search
{
    "aggs": {
        "my_date_histo":{
            "date_histogram":{
                "field":"timestamp",
                "interval":"day"
            },
            "aggs":{
                "the_sum":{
                    "sum":{ "field": "lemmings" } 
                },
                "the_movavg":{
                    "moving_avg":{ "buckets_path": "the_sum" } 
                }
            }
        }
    }
}

POST /_search
{
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "sales": {
                    "sum": {
                        "field": "price"
                    }
                }
            }
        },
        "max_monthly_sales": {
            "max_bucket": {
                "buckets_path": "sales_per_month>sales" 
            }
        }
    }
}

Special Paths

Instead of pointing at a metric, buckets_path can use the special path _count. The pipeline aggregation then takes the document count of each bucket as its input.
For example:

POST /_search
{
    "aggs": {
        "my_date_histo": {
            "date_histogram": {
                "field":"timestamp",
                "interval":"day"
            },
            "aggs": {
                "the_movavg": {
                    "moving_avg": { "buckets_path": "_count" } 
                }
            }
        }
    }
}

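buckets_path can also use _bucket_count, which stands for the number of buckets produced by a multi-bucket aggregation, so the pipeline aggregation receives that bucket count as its input.
The example below uses it to keep only the histogram buckets whose inner terms aggregation produced at least one bucket: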

POST /sales/_search
{
  "size": 0,
  "aggs": {
    "histo": {
      "date_histogram": {
        "field": "date",
        "interval": "day"
      },
      "aggs": {
        "categories": {
          "terms": {
            "field": "category"
          }
        },
        "min_bucket_selector": {
          "bucket_selector": {
            "buckets_path": {
              "count": "categories._bucket_count" 
            },
            "script": {
              "source": "params.count != 0"
            }
          }
        }
      }
    }
  }
}

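Finally, some aggregations, such as the Percentiles Aggregation, produce results whose names contain a decimal point. To reference such a result, put its name in square brackets: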

"buckets_path": "my_percentile[99.9]"

Dealing with Gaps in the Data

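Real data is noisy and does not always match what we expect. Gaps, meaning data in which the required field is missing, are common. Every pipeline aggregation therefore accepts an optional gap_policy parameter that controls how gaps are handled.
gap_policy accepts one of two values: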

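skip : treats the missing data as if the bucket did not exist; the bucket is skipped and the calculation continues with the next available value.
insert_zeros : replaces the missing value with zero, and the calculation then proceeds as normal.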
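The difference between the two policies can be sketched in a few lines of Python. This is a simplified model rather than Elasticsearch's implementation: the helper name apply_gap_policy and the sample data are invented, and None stands in for a bucket with no value.

```python
def apply_gap_policy(values, gap_policy="skip"):
    """Return the values a pipeline calculation would see under a gap policy."""
    if gap_policy == "skip":
        # Missing values are dropped, as if their buckets did not exist.
        return [v for v in values if v is not None]
    if gap_policy == "insert_zeros":
        # Missing values are replaced with zero.
        return [0 if v is None else v for v in values]
    raise ValueError(f"unknown gap_policy: {gap_policy}")


daily_totals = [10, None, 30]  # hypothetical per-bucket metric values
print(apply_gap_policy(daily_totals, "skip"))          # [10, 30]
print(apply_gap_policy(daily_totals, "insert_zeros"))  # [10, 0, 30]
```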

Avg Bucket Aggregation

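A Sibling pipeline aggregation that computes the average of a specified metric across the buckets of a sibling aggregation. For details, see Avg Bucket Aggregation in the Elasticsearch Reference.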

Max Bucket Aggregation

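A Sibling pipeline aggregation that finds the maximum of a specified metric across the buckets of a sibling aggregation. For details, see Max Bucket Aggregation in the Elasticsearch Reference.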

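Min Bucket Aggregation

A Sibling pipeline aggregation that finds the minimum of a specified metric across the buckets of a sibling aggregation. For details, see Min Bucket Aggregation in the Elasticsearch Reference.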

Sum Bucket Aggregation

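A Sibling pipeline aggregation that computes the sum of a specified metric across the buckets of a sibling aggregation. For details, see Sum Bucket Aggregation in the Elasticsearch Reference.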

Stats Bucket Aggregation

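A Sibling pipeline aggregation that applies the Stats Aggregation calculation to a specified metric across the buckets of a sibling aggregation. For details, see Stats Bucket Aggregation and Stats Aggregation in the Elasticsearch Reference.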
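The sibling aggregations described so far all reduce one metric value per bucket to a single result. As a rough Python sketch (the helper name stats_bucket mirrors the aggregation name but is an illustrative function, not a client API), here is that reduction applied to the three monthly sales totals used in the examples later in this article:

```python
def stats_bucket(values):
    """Summarize one metric value per bucket, in the spirit of stats_bucket."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "sum": sum(values),
    }


monthly_sales = [550.0, 60.0, 375.0]  # one "sales" value per monthly bucket
print(stats_bucket(monthly_sales))
```

The min, max, avg and sum entries correspond to what min_bucket, max_bucket, avg_bucket and sum_bucket would each report on their own.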

Extended Stats Bucket Aggregation

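A Sibling pipeline aggregation that applies the Extended Stats Aggregation calculation to a specified metric across the buckets of a sibling aggregation. For details, see Extended Stats Bucket Aggregation and Extended Stats Aggregation in the Elasticsearch Reference.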

Percentiles Bucket Aggregation

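A Sibling pipeline aggregation that applies the Percentiles Aggregation calculation to a specified metric across the buckets of a sibling aggregation. For details, see Percentiles Bucket Aggregation and Percentiles Aggregation in the Elasticsearch Reference.
This feature was still marked experimental in 5.x.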

Derivative Aggregation

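A Parent pipeline aggregation that computes the derivative of a specified metric. This feature was still marked experimental in 5.x.
The Derivative Aggregation can only be used inside a histogram or date_histogram aggregation. The specified metric must be numeric, and the enclosing histogram must have min_doc_count set to 0 (the default for the Histogram Aggregation). This keeps empty buckets in the output, so the series has no holes between consecutive data points.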

First-Order Derivative

The following Derivative Aggregation example computes the derivative of the total monthly sales.

POST /sales/_search
{
    "size": 0,
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "sales_deriv": {
                    "derivative": {
                        "buckets_path": "sales" 
                    }
                }
            }
        }
    }
}

Response:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               } #1
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               },
               "sales_deriv": {
                  "value": -490.0 #2
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2, 
               "sales": {
                  "value": 375.0
               },
               "sales_deriv": {
                  "value": 315.0
               }
            }
         ]
      }
   }
}

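Notes on the response above:
#1 : the first bucket has no derivative, because a derivative needs at least two data points.
#2 : by default the unit of the derivative follows the parent histogram. In this example, if sales is measured in RMB, the derivative is in RMB per month.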
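The arithmetic behind sales_deriv is a plain first difference between consecutive buckets. A minimal Python sketch, using the three sales values from the response above (the function name derivative is just a label for this illustration):

```python
def derivative(values):
    """First difference of a series; the first position has no predecessor."""
    if not values:
        return []
    result = [None]
    for prev, cur in zip(values, values[1:]):
        result.append(cur - prev)
    return result


print(derivative([550.0, 60.0, 375.0]))  # [None, -490.0, 315.0]
```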

Second-Order Derivative

A Derivative Aggregation can be chained onto another Derivative Aggregation to obtain a second-order derivative. For example:

POST /sales/_search
{
    "size": 0,
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "sales_deriv": {
                    "derivative": {
                        "buckets_path": "sales"
                    }
                },
                "sales_2nd_deriv": {
                    "derivative": {
                        "buckets_path": "sales_deriv" 
                    }
                }
            }
        }
    }
}

Response:

{
   "took": 50,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               } 
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               },
               "sales_deriv": {
                  "value": -490.0
               } 
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               },
               "sales_deriv": {
                  "value": 315.0
               },
               "sales_2nd_deriv": {
                  "value": 805.0
               }
            }
         ]
      }
   }
}

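Neither the first nor the second bucket has a second-order derivative because, as noted earlier, a derivative needs at least two data points, and here those points must come from the first-order derivative.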
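Chaining can be mimicked in Python by feeding the output of a difference function back into itself. This is only a sketch of the arithmetic; None marks a bucket where no value is available.

```python
def derivative(values):
    """Difference between consecutive values; None where either side is missing."""
    result = [None] * len(values)
    for i in range(1, len(values)):
        if values[i] is not None and values[i - 1] is not None:
            result[i] = values[i] - values[i - 1]
    return result


sales = [550.0, 60.0, 375.0]
sales_deriv = derivative(sales)
sales_2nd_deriv = derivative(sales_deriv)
print(sales_deriv)      # [None, -490.0, 315.0]
print(sales_2nd_deriv)  # [None, None, 805.0]
```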

Units

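The Derivative Aggregation lets you specify the unit of the derivative. When a unit is given, each result gains an extra field, normalized_value, which holds the derivative expressed in that unit. The following example reports the derivative per day: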

POST /sales/_search
{
    "size": 0,
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "sales_deriv": {
                    "derivative": {
                        "buckets_path": "sales",
                        "unit": "day" 
                    }
                }
            }
        }
    }
}

Response:

{
   "took": 50,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               } 
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0 
               },
               "sales_deriv": {
                  "value": -490.0, #1
                  "normalized_value": -15.806451612903226 #2
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               },
               "sales_deriv": {
                  "value": 315.0,
                  "normalized_value": 11.25
               }
            }
         ]
      }
   }
}

#1 is still the derivative in the default unit (per month), while #2 is the same derivative expressed per day.
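The normalized values in the response are consistent with dividing each derivative by the distance between the two bucket keys, measured in the requested unit (31 days for January, 28 days for February 2015). The Python sketch below works under that assumption; the function name and its parameters are invented for illustration.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def normalized_derivative(prev_key, key, prev_value, value, unit_ms=MS_PER_DAY):
    """Return (value, normalized_value) for one bucket, given epoch-millisecond keys."""
    diff = value - prev_value
    # Assumption: normalization divides by the key distance expressed in the unit.
    units_elapsed = (key - prev_key) / unit_ms
    return diff, diff / units_elapsed


# Keys and sales values are taken from the response above.
print(normalized_derivative(1420070400000, 1422748800000, 550.0, 60.0))
print(normalized_derivative(1422748800000, 1425168000000, 60.0, 375.0))
```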

Moving Average Aggregation

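Given an ordered series of data, the Moving Average Aggregation slides a window of a fixed size across the series and computes the average of the values in each window.
For example, given the data [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], a simple moving average with a window size of 5 is computed as follows: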

(1 + 2 + 3 + 4 + 5) / 5 = 3
(2 + 3 + 4 + 5 + 6) / 5 = 4
(3 + 4 + 5 + 6 + 7) / 5 = 5
etc

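A moving average is a simple way to smooth sequential data. It is most commonly applied to time-based data such as server metrics (memory usage, CPU load, and so on). Smoothing removes high-frequency fluctuations and random noise, which makes the lower-frequency trends easier to see. An application always has peak and off-peak periods, for instance, so its server metrics tend to be more informative after a moving average has been applied.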
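The window calculation on the list 1 to 10 can be written as a short Python sketch. It only illustrates the simple model with full windows and is not how Elasticsearch lays out its output buckets.

```python
def simple_moving_average(values, window=5):
    """Average of every full window of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


print(simple_moving_average(list(range(1, 11))))  # [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```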

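Parameters of the Moving Average Aggregation:

Parameter	Description	Required	Default
buckets_path	Path to the metric to average; see the Buckets Path Syntax section	Required	
model	The moving average weighting model to use	Optional	simple
gap_policy	How gaps in the data are handled; see the section on dealing with gaps in the data	Optional	skip
window	Size of the moving window	Optional	5
minimize	Whether the model should be minimized algorithmically; see Minimization in the reference	Optional	false for most models
settings	Model-specific settings; the contents depend on the chosen model	Optional	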

The Moving Average Aggregation must be embedded inside a histogram or date_histogram aggregation.
Example:

POST /_search
{
    "size": 0,
    "aggs": {
        "my_date_histo":{                
            "date_histogram":{
                "field":"date",
                "interval":"1M"
            },
            "aggs":{
                "the_sum":{
                    "sum":{ "field": "price" } 
                },
                "the_movavg":{
                    "moving_avg":{ "buckets_path": "the_sum" } 
                }
            }
        }
    }
}

Response:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "my_date_histo": {
         "buckets": [
             {
                 "key_as_string": "2015/01/01 00:00:00",
                 "key": 1420070400000,
                 "doc_count": 3,
                 "the_sum": {
                    "value": 550.0
                 }
             },
             {
                 "key_as_string": "2015/02/01 00:00:00",
                 "key": 1422748800000,
                 "doc_count": 2,
                 "the_sum": {
                    "value": 60.0
                 },
                 "the_movavg": {
                    "value": 550.0
                 }
             },
             {
                 "key_as_string": "2015/03/01 00:00:00",
                 "key": 1425168000000,
                 "doc_count": 2,
                 "the_sum": {
                    "value": 375.0
                 },
                 "the_movavg": {
                    "value": 305.0
                 }
             }
         ]
      }
   }
}
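In this response, the_movavg of each bucket equals the average of the the_sum values of the buckets before it (550.0 for February, (550.0 + 60.0) / 2 = 305.0 for March), and the first bucket has no value. The Python sketch below reproduces those numbers under that reading; it is an inference from the response shown here, not a description of the actual implementation, and trailing_average is a made-up name.

```python
def trailing_average(values, window=5):
    """For each position, average up to `window` values that precede it."""
    result = []
    for i in range(len(values)):
        previous = values[max(0, i - window):i]
        # The first position has no preceding values, so it gets no average.
        result.append(sum(previous) / len(previous) if previous else None)
    return result


print(trailing_average([550.0, 60.0, 375.0]))  # [None, 550.0, 305.0]
```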

In practice the right model depends on the use case. The models are not covered here; for details, see Models in the Elasticsearch Reference.

Cumulative Sum Aggregation

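A Parent pipeline aggregation that computes the cumulative sum of a specified metric. The Cumulative Sum Aggregation can only be used inside a histogram or date_histogram aggregation. The specified metric must be numeric, and the enclosing histogram must have min_doc_count set to 0.
Example: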

POST /sales/_search
{
    "size": 0,
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "cumulative_sales": {
                    "cumulative_sum": {
                        "buckets_path": "sales" 
                    }
                }
            }
        }
    }
}

Response:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               },
               "cumulative_sales": {
                  "value": 550.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               },
               "cumulative_sales": {
                  "value": 610.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               },
               "cumulative_sales": {
                  "value": 985.0
               }
            }
         ]
      }
   }
}
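The cumulative_sales values are a running total of sales. In Python the same arithmetic is a one-liner with itertools.accumulate, shown here on the three values from the response:

```python
from itertools import accumulate

monthly_sales = [550.0, 60.0, 375.0]  # "sales" per bucket, from the response above
cumulative_sales = list(accumulate(monthly_sales))
print(cumulative_sales)  # [550.0, 610.0, 985.0]
```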

Bucket Script Aggregation

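A Parent pipeline aggregation that runs a script over specified metrics of each bucket in the parent multi-bucket aggregation. The referenced metrics must be numeric, and the script must return a numeric value.
A bucket_script aggregation looks roughly like this: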

{
    "bucket_script": {
        "buckets_path": {
            "my_var1": "the_sum", 
            "my_var2": "the_value_count"
        },
        "script": "params.my_var1 / params.my_var2"
    }
}

The script parameter defines the calculation, using the variables declared in buckets_path.

A complete example:

POST /sales/_search
{
    "size": 0,
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "total_sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "t-shirts": {
                  "filter": {
                    "term": {
                      "type": "t-shirt"
                    }
                  },
                  "aggs": {
                    "sales": {
                      "sum": {
                        "field": "price"
                      }
                    }
                  }
                },
                "t-shirt-percentage": {
                    "bucket_script": {
                        "buckets_path": {
                          "tShirtSales": "t-shirts>sales",
                          "totalSales": "total_sales"
                        },
                        "script": "params.tShirtSales / params.totalSales * 100"
                    }
                }
            }
        }
    }
}

Response:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "total_sales": {
                   "value": 550.0
               },
               "t-shirts": {
                   "doc_count": 1,
                   "sales": {
                       "value": 200.0
                   }
               },
               "t-shirt-percentage": {
                   "value": 36.36363636363637
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "total_sales": {
                   "value": 60.0
               },
               "t-shirts": {
                   "doc_count": 1,
                   "sales": {
                       "value": 10.0
                   }
               },
               "t-shirt-percentage": {
                   "value": 16.666666666666664
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "total_sales": {
                   "value": 375.0
               },
               "t-shirts": {
                   "doc_count": 1,
                   "sales": {
                       "value": 175.0
                   }
               },
               "t-shirt-percentage": {
                   "value": 46.666666666666664
               }
            }
         ]
      }
   }
}
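The script in this example is evaluated once per bucket with the two referenced metrics bound to its parameters. The Python sketch below repeats that per-bucket calculation on the values from the response; the dictionary keys are simplified names chosen for the sketch.

```python
# One entry per monthly bucket, with values copied from the response above.
buckets = [
    {"total_sales": 550.0, "t_shirt_sales": 200.0},
    {"total_sales": 60.0, "t_shirt_sales": 10.0},
    {"total_sales": 375.0, "t_shirt_sales": 175.0},
]


def t_shirt_percentage(bucket):
    """Same formula as the script: tShirtSales / totalSales * 100."""
    return bucket["t_shirt_sales"] / bucket["total_sales"] * 100


percentages = [t_shirt_percentage(b) for b in buckets]
print(percentages)
```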

Bucket Selector Aggregation

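A Parent pipeline aggregation, similar to the Bucket Script Aggregation. Instead of computing an extra value, however, its script decides whether each bucket of the parent aggregation is kept. The referenced metrics must again be numeric, but the script must return a boolean. If the script language is expression, a numeric return value is also allowed, where 0 means false and any other value means true.
Note that, like all pipeline aggregations, the Bucket Selector Aggregation runs after all other sibling aggregations have finished. Filtering buckets with it therefore does not reduce the time spent computing the aggregations.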

A bucket_selector aggregation looks roughly like this:

{
    "bucket_selector": {
        "buckets_path": {
            "my_var1": "the_sum", 
            "my_var2": "the_value_count"
        },
        "script": "params.my_var1 > params.my_var2"
    }
}

The script parameter defines the condition, and its variables come from buckets_path.

A complete example:

POST /sales/_search
{
    "size": 0,
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "total_sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "sales_bucket_filter": {
                    "bucket_selector": {
                        "buckets_path": {
                          "totalSales": "total_sales"
                        },
                        "script": "params.totalSales > 200"
                    }
                }
            }
        }
    }
}

Response:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "total_sales": {
                   "value": 550.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "total_sales": {
                   "value": 375.0
               }
            }
         ]
      }
   }
}

The February bucket is missing from the response because its total sales were below 200, so bucket_selector filtered it out.
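Conceptually, the selector is a filter applied to buckets that have already been computed. A small Python sketch of that idea follows; the bucket_selector function here is an illustrative helper, and the predicate plays the role of the script.

```python
buckets = [
    {"key_as_string": "2015/01/01 00:00:00", "total_sales": 550.0},
    {"key_as_string": "2015/02/01 00:00:00", "total_sales": 60.0},
    {"key_as_string": "2015/03/01 00:00:00", "total_sales": 375.0},
]


def bucket_selector(buckets, predicate):
    """Keep only the buckets for which the predicate returns True."""
    return [b for b in buckets if predicate(b)]


kept = bucket_selector(buckets, lambda b: b["total_sales"] > 200)
print([b["key_as_string"] for b in kept])
# ['2015/01/01 00:00:00', '2015/03/01 00:00:00']
```

Every bucket is fully built before the predicate sees it, which is why filtering in this way saves no computation.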

Bucket Sort Aggregation

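A Parent pipeline aggregation that sorts the buckets of its parent multi-bucket aggregation. Zero or more sort fields may be specified. Each bucket can be sorted by its _key, by its _count, or by its sub-aggregations. In addition, the from and size parameters can be set to truncate the result after sorting.
A bucket_sort aggregation looks roughly like this: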

{
    "bucket_sort": {
        "sort": [
            {"sort_field_1": {"order": "asc"}},
            {"sort_field_2": {"order": "desc"}},
            "sort_field_3"
        ],
        "from": 1,
        "size": 3
    }
}

Example:

POST /sales/_search
{
    "size": 0,
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "total_sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "sales_bucket_sort": {
                    "bucket_sort": {
                        "sort": [
                          {"total_sales": {"order": "desc"}}
                        ],
                        "size": 3
                    }
                }
            }
        }
    }
}

Response:

{
   "took": 82,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "total_sales": {
                   "value": 550.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "total_sales": {
                   "value": 375.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "total_sales": {
                   "value": 60.0
               }
            }
         ]
      }
   }
}

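Unlike the other pipeline aggregations, bucket_sort does not use buckets_path to name its inputs; the fields to sort by are given in the sort parameter instead.
Because size is set to 3 in bucket_sort, the response contains at most 3 buckets. The same mechanism can be used to truncate the result without sorting at all: leave out sort and use from and size to control which buckets are returned.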

POST /sales/_search
{
    "size": 0,
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "bucket_truncate": {
                    "bucket_sort": {
                        "from": 1,
                        "size": 1
                    }
                }
            }
        }
    }
}

Response:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2
            }
         ]
      }
   }
}

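Finally, remember that pipeline aggregations run only after all other sibling aggregations have completed. Even if size in bucket_sort is very small and only a few buckets are returned, the total time spent computing the aggregations does not get shorter.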
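Both uses of bucket_sort, sorting and truncating, can be sketched with one small Python helper. The function and its parameter names (start stands in for from, which is a reserved word in Python) are assumptions made for this illustration.

```python
def bucket_sort(buckets, sort_key=None, descending=False, start=0, size=None):
    """Optionally sort the buckets, then keep `size` of them starting at `start`."""
    ordered = list(buckets)
    if sort_key is not None:
        ordered = sorted(ordered, key=lambda b: b[sort_key], reverse=descending)
    end = None if size is None else start + size
    return ordered[start:end]


buckets = [
    {"month": "2015/01", "total_sales": 550.0},
    {"month": "2015/02", "total_sales": 60.0},
    {"month": "2015/03", "total_sales": 375.0},
]

top = bucket_sort(buckets, sort_key="total_sales", descending=True, size=3)
print([b["month"] for b in top])        # ['2015/01', '2015/03', '2015/02']

truncated = bucket_sort(buckets, start=1, size=1)
print([b["month"] for b in truncated])  # ['2015/02']
```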

Serial Differencing Aggregation

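A Parent pipeline aggregation that can only be used inside a histogram or date_histogram aggregation. Serial differencing subtracts from each value in the series the value that lies n buckets earlier, which yields a new series of differences. Each point is computed as f(x) = f(x_t) - f(x_{t-n}), where n is the lag.
A serial_diff aggregation looks roughly like this: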

{
    "serial_diff": {
        "buckets_path": "the_sum",
        "lag": "7"
    }
}

A more complete example:

POST /_search
{
   "size": 0,
   "aggs": {
      "my_date_histo": {                  
         "date_histogram": {
            "field": "timestamp",
            "interval": "day"
         },
         "aggs": {
            "the_sum": {
               "sum": {
                  "field": "lemmings"     
               }
            },
            "thirtieth_difference": {
               "serial_diff": {                
                  "buckets_path": "the_sum",
                  "lag" : 30
               }
            }
         }
      }
   }
}

Note the lag parameter: it is the n in the formula f(x) = f(x_t) - f(x_{t-n}) and specifies how many buckets apart the two values being subtracted are.
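The formula translates directly into Python. The sketch below uses made-up sample data and a lag of 2 so the result is easy to check by hand; positions that have no value lag buckets earlier are reported as None.

```python
def serial_diff(values, lag=1):
    """values[t] - values[t - lag]; None for the first `lag` positions."""
    return [
        None if t < lag else values[t] - values[t - lag]
        for t in range(len(values))
    ]


daily_sums = [10, 12, 15, 11, 18]  # hypothetical the_sum values, one per day
print(serial_diff(daily_sums, lag=2))  # [None, None, 5, -1, 3]
```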
