Esper

http://esper.codehaus.org/

http://www.infoq.com/cn/news/2007/10/esper

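Esper (InfoQ reported the release of its 1.0 version a year ago) is an Event Stream Processing (ESP) and Complex Event Processing (CEP) system. It monitors event streams and triggers actions when specific events occur. It can be thought of as a database turned upside down: the statements are fixed, and the data flows in and out. Event processing is a growing trend in the software industry, and several large vendors as well as many startups have entered the market. Common applications include automated trading systems, BAM, RFID, advanced monitoring systems, fraud detection, and even direct integration into SOA. InfoQ caught up with Esper's founder to learn about the project's recent progress and the recent benchmark discussion.

According to the Esper development team, Esper is currently the only pure-Java open source ESP/CEP engine. Commercial support is provided by EsperTech, which also maintains an equivalent .NET project.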

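BEA has licensed Esper and, after modifying it, is including it in the WebLogic Event Server released in June. Regarding the feedback received from various sides, Thomas told InfoQ:

I believe the fact that Esper has a place in BEA's product helps Esper's development in several ways. First, the feedback we receive plays an important role in improving Esper. Second, BEA's product raises the overall visibility of CEP/ESP technology and thereby broadens market awareness. Third, it is the best proof of the openness, extensibility, and enterprise readiness of Esper technology. The Esper community and user base are truly proud of this.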

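As the market grows and competition emerges among multiple implementations, standardization could benefit the industry. Thomas commented on the potential and background of CEP language standardization:

The CEP community clearly regards CEP and ESP as complementary, and believes that other approaches (such as Bayesian networks or neural networks) can also be applied to CEP problems. Given the variety of implementation techniques, with each vendor holding its own view, the "pattern matching in sequences of rows" proposed by the ANSI SQL standardization committee as an extension to SQL seems to be the most promising development.

This early-stage topic will certainly see further study, and standardization may well extend beyond the scope of ESP/CEP languages.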

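The most prominent recent news from Esper was the mid-August release of a performance benchmark kit together with published performance results:

On a dual-CPU 2 GHz Intel test system, Esper processes more than 500,000 events per second. In the VWAP benchmark with 1000 statements registered, engine latency averages below 3 microseconds (below 10 us with more than 99% predictability), peaking at 70 Mbit/s of traffic at 85% CPU usage.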

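Although this benchmark is based on a fairly simple use case, its publication was meant to shake up the industry, because it provides a complete toolset for reproducing the results. An Esper event server listens for stock market events sent over the network by remote clients. The Esper engine computes the volume-weighted average price (VWAP) of the incoming data in real time over a sliding time window or length window. When asked why the benchmark was necessary, Esper responded:

The whole CEP market is surrounded by vague information. Every vendor plays with its own marketing material and avoids giving details about actual performance and latency. No benchmark yet exists in this space for comparing them.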

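Vague performance claims in this industry have already been criticized by Progress Apama and others. Here are the complaints quoted on the Apama blog:

……Skyler processes up to 200,000 messages per second……Key feature: Coral8 processes from thousands to millions of events per second……StreamBase leads in performance, processing more than 1 million events per second with near-zero response time……Aleri Labs breaks the sub-millisecond response time barrier……

Apama itself has said things like "a high-performance, scalable engine capable of processing thousands of events per second". Similar wording can be found at BEA, where the WebLogic Event Server announcement gave a seemingly lower but more precise figure: "when our product is ready, it will deliver a processing rate of 50,000 complex events per second".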

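These results seem to confirm that "hundreds of thousands" of events per second is the norm in this field, without exception. They also show how Esper performs in one specific scenario. In addition, they give the user community a valuable tool for assessing real performance, instead of relying on vendors' arbitrary and dubious claims or on the common prejudice against capable open source software.

The Esper team has published the full details of all runs on its wiki and has updated the performance and performance best practices sections of its website. Another source of benchmarks is the recently formed STAC Benchmark Council, whose goal is to provide customer-driven benchmark standards for trading technology.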

InfoQ's earlier coverage of Esper and CEP background is available at http://infoq.com/esper.

Original English article: Catching up with Esper: Event Stream Processing Framework

Event Processing with Esper and NEsper

Esper is a component for complex event processing (CEP), available for Java as Esper, and for .NET as NEsper.

Esper and NEsper enable rapid development of applications that process large volumes of incoming messages or events. Esper and NEsper filter and analyze events in various ways, and respond to conditions of interest in real-time.

Technology Introduction

Complex event processing (CEP) delivers high-speed processing of many events across all the layers of an organization, identifying the most meaningful events within the event cloud, analyzing their impact, and taking subsequent action in real time (source: Wikipedia).

Esper offers a Domain Specific Language (DSL) for processing events. The Event Processing Language (EPL) is a declarative language for dealing with high-frequency, time-based event data.

Some typical examples of applications are:

  • Business process management and automation (process monitoring, BAM, reporting exceptions, operational intelligence)
  • Finance (algorithmic trading, fraud detection, risk management)
  • Network and application monitoring (intrusion detection, SLA monitoring)
  • Sensor network applications (RFID reading, scheduling and control of fabrication lines, air traffic)

Quick Start

This quick start guide provides step-by-step instructions for creating a first event-driven application using Esper.

Installation

Esper is easy to install and run. The first step is to download and unpack the distribution zip or tar file. Provided you have a Java VM installed, you can then run an example as described in the online documentation.

Esper consists of a jar file named "esper-version.jar" that can be found in the root directory of the distribution.

Dependent libraries to Esper are in the "lib" folder.

Esper includes several examples and a benchmark kit that are documented in the reference manual. These can be run from the command line. Esper itself does not include a GUI or server middleware, other than the benchmark client and server components.

Creating a Java Event Class

Java classes are a good choice for representing events; however, Map-based or XML event representations can also be good choices, depending on your architectural requirements.

A sample Java class that represents an order event is shown below. A simple plain-old Java class that provides getter-methods for access to event properties works best:

package org.myapp.event;

public class OrderEvent {
    private String itemName;
    private double price;

    public OrderEvent(String itemName, double price) {
        this.itemName = itemName;
        this.price = price;
    }

    public String getItemName() {
        return itemName;
    }

    public double getPrice() {
        return price;
    }
}

Creating a Statement

A statement is a continuous query registered with an Esper engine instance. It provides results to listeners as new data arrives, in real time, or on demand via the iterator (pull) API.

The next code snippet obtains an engine instance and registers a continuous query. The query returns the average price over all OrderEvent events that arrived in the last 30 seconds:

EPServiceProvider epService = EPServiceProviderManager.getDefaultProvider();
String expression = "select avg(price) from org.myapp.event.OrderEvent.win:time(30 sec)";
EPStatement statement = epService.getEPAdministrator().createEPL(expression);

Adding a Listener

Listeners are invoked by the engine in response to one or more events that change a statement's result set. Listeners implement the UpdateListener interface and act on EventBean instances as the next code snippet outlines:

public class MyListener implements UpdateListener {
    public void update(EventBean[] newEvents, EventBean[] oldEvents) {
        EventBean event = newEvents[0];
        System.out.println("avg=" + event.get("avg(price)"));
    }
}

By attaching the listener to the statement the engine provides the statement's results to the listener:

MyListener listener = new MyListener();
statement.addListener(listener);

Sending events

The runtime API accepts events for processing. As a statement's results change, the engine delivers the new results to listeners as soon as it has processed the events.

Sending events is straightforward as well:

OrderEvent event = new OrderEvent("shirt", 74.50);
epService.getEPRuntime().sendEvent(event);

Configuration

Esper runs out of the box and no configuration is required. However, configuration can help make statements more readable, and it provides the opportunity to plug in extensions and to configure relational database access.

One useful configuration item specifies Java package names from which to take event classes.

This snippet uses the configuration API to make the Java package of the OrderEvent class known to an engine instance:

Configuration config = new Configuration();
config.addEventTypeAutoAlias("org.myapp.event");
EPServiceProvider epService = EPServiceProviderManager.getDefaultProvider(config);

In order to query the OrderEvent events, we can now remove the package name from the statement:

String epl = "select avg(price) from OrderEvent.win:time(30 sec)";
EPStatement statement = epService.getEPAdministrator().createEPL(epl);

Tutorial

Introduction

Esper is an Event Stream Processing (ESP) and event correlation engine (CEP, Complex Event Processing). Targeted to real-time Event Driven Architectures (EDA), Esper is capable of triggering custom actions written as Plain Old Java Objects (POJO) when event conditions occur among event streams. It is designed for high-volume event correlation where millions of events coming in would make it impossible to store them all to later query them using classical database architecture.

A tailored Event Processing Language (EPL) allows expressing rich event conditions, correlation, possibly spanning time windows, thus minimizing the development effort required to set up a system that can react to complex situations.

Esper is a lightweight kernel written in Java which is fully embeddable into any Java process, JEE application server or Java-based Enterprise Service Bus. It enables rapid development of applications that process large volumes of incoming messages or events.

Introduction to event streams and complex events

Information is critical for making wise decisions. This is true in real life but also in computing, and it is especially critical in areas such as finance, fraud detection, business intelligence, and battlefield operations. Information flows in from different sources in the form of messages or events, each giving a hint about the state at a given time, such as a stock price. That said, looking at such discrete events in isolation is usually meaningless. A trader needs to look at the stock trend over a period, possibly combined with other information, to make the best deal at the right time.

While discrete events looked at one by one might be meaningless, event streams--that is, an infinite set of events--considered over a sliding window and further correlated are highly meaningful, and reacting to them with minimal latency is critical for effective action and competitive advantage.

Introduction to Esper

Relational databases and message-based systems such as JMS make it hard to deal with temporal data and real-time queries. Databases require explicit querying to return meaningful data and are not suited to pushing data as it changes. JMS systems are stateless and require developers to implement the temporal and aggregation logic themselves. By contrast, the Esper engine provides a higher level of abstraction and can be thought of as a database turned upside-down: instead of storing the data and running queries against stored data, Esper allows applications to store queries and run the data through them. The Esper engine responds in real time when conditions occur that match user-defined queries. The execution model is thus continuous, rather than triggered only when a query is submitted.
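
The "database turned upside-down" idea can be illustrated without Esper at all. The following is a minimal plain-Java sketch, not Esper's implementation: the class StoredQueryEngine and its methods are hypothetical names chosen for this illustration. Queries (a condition plus a callback) are registered first, and each event is then pushed through them without being stored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical illustration only: queries are stored, data is pushed through them. */
final class StoredQueryEngine<E> {

    /** A registered query: a condition plus the callback to invoke when it matches. */
    private static final class Query<T> {
        final Predicate<T> condition;
        final Consumer<T> listener;

        Query(Predicate<T> condition, Consumer<T> listener) {
            this.condition = condition;
            this.listener = listener;
        }
    }

    private final List<Query<E>> queries = new ArrayList<>();

    /** Stores a query up front; nothing is evaluated yet. */
    void register(Predicate<E> condition, Consumer<E> listener) {
        queries.add(new Query<>(condition, listener));
    }

    /** Runs one event through every stored query; the event itself is not kept. */
    void send(E event) {
        for (Query<E> query : queries) {
            if (query.condition.test(event)) {
                query.listener.accept(event);
            }
        }
    }
}
```

A caller would register, for example, the condition price > 100 once and then call send for every incoming price; only matching prices reach the listener. A real engine adds windows, aggregation, and joins on top of this basic push model.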

Such concepts are a key foundation of EDA and have been under active research for more than 10 years. Awareness of the importance of such systems in real-world architectures has started to emerge only recently.

In Esper, a tailored EPL allows registering queries in the engine. A listener class--which is basically a POJO--is then called by the engine when the EPL condition is matched as events flow in. EPL can express complex matching conditions that include temporal windows and joins of different event streams, as well as filtering, aggregation, and sorting. Esper statements can also be combined with "followed by" conditions, thus deriving complex events from simpler events. Events can be represented as JavaBean classes, legacy Java classes, XML documents, or java.util.Map instances, which promotes reuse of existing systems acting as message publishers.

A trivial yet meaningful example is as follows: assume a trader wants to buy Google stock as soon as the price goes below some floor value, judged not on each individual tick but on a computation over a sliding time window of, say, 30 seconds. Given a StockTick event bean with price and symbol properties and the EPL "select avg(price) from StockTick.win:time(30 sec) where symbol='GOOG'", a listener POJO would be notified as ticks come in and could trigger the buy order.

Developing event-driven applications

Developing event-driven applications is not hard using Esper. You may want to roughly follow these steps:

  1. Define the mission of your application by analyzing your business domain and defining the situations to be detected or information to be reported
  2. Define your performance requirements, specifically throughput and latency
  3. Identify where events are coming from
  4. Identify lower level event formats and event content that is applicable to your domain
  5. Design the event relationships leading to complex events
  6. Instrument your sources of events
  7. Design how you want to represent events: as Java classes, as Maps, or as XML events
  8. Define EPL statements for patterns and stream processing
  9. Use the CSV adapter as an event simulation tool, to test situations to be detected, or to generate load
  10. Test against your performance requirements: throughput and latency in your target environment

    By "Instrument your sources of events" (step 6) we mean planning, designing, and implementing hooks in the event source systems so that they can generate events in the format defined in step 4. Instrumentation roughly means placing hooks that do not change the nominal execution flow. There are several techniques for this, ranging from custom code to aspect-oriented technology, with a whole range of component-dependent and framework-dependent technologies in between (servlet filters, proxy objects, decorators, topic-based architectures, etc.). Instrumentation can also be implemented at a more coarse-grained level (for example, pipeline derivation in an enterprise service bus or a BPEL/BPM process).

Designing event representations

Java classes are a simple, rich and versatile way to represent events in Esper. Java classes offer inheritance and polymorphism via interfaces and super-classes, and can represent a complex business domain via an object graph. Maps and XML are an alternative way of representing events.

Event Stream Analysis

EPL statements are used to derive and aggregate information from one or more streams of events, to join or merge event streams, and to feed results from one event stream to subsequent statements.

EPL is similar to SQL in its use of the select clause and the where clause. However, instead of tables, EPL statements use event streams and a concept called views. Similar to tables in an SQL statement, views define the data available for querying and filtering. Views can represent windows over a stream of events. Views can also sort events, derive statistics from event properties, group events, or handle unique event property values.

This is a sample EPL statement that computes the average price for the last 30 seconds of stock tick events:

  select avg(price) from StockTickEvent.win:time(30 sec) 
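
To make the semantics of a sliding time window concrete, here is a minimal plain-Java sketch of what such a statement computes. It is not how Esper implements win:time; the class name TimeWindowAverage is hypothetical, and timestamps are passed in explicitly as an assumption so the behaviour is deterministic. Unlike a real engine, the sketch only re-evaluates when a new price arrives, not when an old one expires.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of an average over a sliding time window; not Esper code. */
final class TimeWindowAverage {

    /** One observed price together with its arrival time. */
    private static final class Entry {
        final long timestampMillis;
        final double price;

        Entry(long timestampMillis, double price) {
            this.timestampMillis = timestampMillis;
            this.price = price;
        }
    }

    private final long windowMillis;
    private final Deque<Entry> window = new ArrayDeque<>();
    private double sum;

    TimeWindowAverage(long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be positive");
        }
        this.windowMillis = windowMillis;
    }

    /** Adds a price observed at nowMillis and returns the average over the window. */
    double add(long nowMillis, double price) {
        window.addLast(new Entry(nowMillis, price));
        sum += price;
        // Expire entries that are at least windowMillis older than the newest one.
        // The newest entry has age 0, so the window never becomes empty here.
        while (nowMillis - window.peekFirst().timestampMillis >= windowMillis) {
            sum -= window.removeFirst().price;
        }
        return sum / window.size();
    }
}
```

With a 30-second window, prices 10 at t=0 s, 20 at t=10 s and 30 at t=35 s yield averages 10, 15 and 25: by the third tick the first price has left the window.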

A sample EPL statement that returns the average price per symbol for the last 100 stock ticks:

select symbol, avg(price) as averagePrice
    from StockTickEvent.win:length(100)
group by symbol
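
The combination of a length window with group by can be sketched in plain Java as follows. This is an illustration of the semantics as described above (one window of the last N ticks across all symbols, with an average maintained per symbol), not Esper's implementation; the class LengthWindowAverages is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: per-symbol averages over the last N ticks; not Esper code. */
final class LengthWindowAverages {

    private static final class Tick {
        final String symbol;
        final double price;

        Tick(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    /** Running sum and count for one symbol inside the window. */
    private static final class Stats {
        double sum;
        int count;
    }

    private final int capacity;
    private final Deque<Tick> window = new ArrayDeque<>();
    private final Map<String, Stats> bySymbol = new HashMap<>();

    LengthWindowAverages(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds a tick and returns the average price of its symbol within the window. */
    double add(String symbol, double price) {
        window.addLast(new Tick(symbol, price));
        Stats current = bySymbol.computeIfAbsent(symbol, key -> new Stats());
        current.sum += price;
        current.count++;

        // Evict the oldest tick once the window holds more than `capacity` ticks.
        if (window.size() > capacity) {
            Tick oldest = window.removeFirst();
            Stats evicted = bySymbol.get(oldest.symbol);
            evicted.sum -= oldest.price;
            evicted.count--;
            if (evicted.count == 0) {
                bySymbol.remove(oldest.symbol);
            }
        }
        // The newest tick is still in the window, so current.count is at least 1.
        return current.sum / current.count;
    }
}
```

Keeping a running sum and count per symbol avoids rescanning the window on every tick; the trade-off is possible floating-point drift over very long runs.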

This example joins 2 event streams. The first event stream consists of fraud warning events for which we keep the last 30 minutes (1800 seconds). The second stream is withdrawal events for which we consider the last 30 seconds. The streams are joined on account number.

select fraud.accountNumber as accntNum, fraud.warning as warn, withdraw.amount as amount,
        MAX(fraud.timestamp, withdraw.timestamp) as timestamp, 'withdrawlFraud' as desc
from FraudWarningEvent.win:time(30 min) as fraud,
        WithdrawalEvent.win:time(30 sec) as withdraw
where fraud.accountNumber = withdraw.accountNumber
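
The join above can be pictured as two sliding windows that are matched against each other whenever an event arrives on either side. The plain-Java sketch below illustrates that idea under several assumptions: the class FraudWithdrawalJoin, its method names, and the "account|warning|amount" output format are all hypothetical, timestamps are supplied by the caller, and a linear scan stands in for whatever indexing a real engine would use.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a join between two sliding time windows; not Esper code. */
final class FraudWithdrawalJoin {

    private static final class Row {
        final long time;
        final String account;
        final String value;

        Row(long time, String account, String value) {
            this.time = time;
            this.account = account;
            this.value = value;
        }
    }

    private static final long WARNING_WINDOW_MS = 30 * 60 * 1000L;  // 30 minutes
    private static final long WITHDRAWAL_WINDOW_MS = 30 * 1000L;    // 30 seconds

    private final Deque<Row> warnings = new ArrayDeque<>();
    private final Deque<Row> withdrawals = new ArrayDeque<>();

    /** A fraud warning arrives; returns one "account|warning|amount" row per match. */
    List<String> onWarning(long now, String account, String warning) {
        expire(now);
        warnings.addLast(new Row(now, account, warning));
        List<String> joined = new ArrayList<>();
        for (Row withdrawal : withdrawals) {
            if (withdrawal.account.equals(account)) {
                joined.add(account + "|" + warning + "|" + withdrawal.value);
            }
        }
        return joined;
    }

    /** A withdrawal arrives; returns one "account|warning|amount" row per match. */
    List<String> onWithdrawal(long now, String account, String amount) {
        expire(now);
        withdrawals.addLast(new Row(now, account, amount));
        List<String> joined = new ArrayList<>();
        for (Row warning : warnings) {
            if (warning.account.equals(account)) {
                joined.add(account + "|" + warning.value + "|" + amount);
            }
        }
        return joined;
    }

    /** Drops rows that have aged out of their respective windows. */
    private void expire(long now) {
        while (!warnings.isEmpty() && now - warnings.peekFirst().time >= WARNING_WINDOW_MS) {
            warnings.removeFirst();
        }
        while (!withdrawals.isEmpty() && now - withdrawals.peekFirst().time >= WITHDRAWAL_WINDOW_MS) {
            withdrawals.removeFirst();
        }
    }
}
```

The asymmetric window lengths mean that a withdrawal joins with any warning from the previous 30 minutes, while a new warning only joins with withdrawals from the previous 30 seconds.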

Event Pattern Matching

Event patterns match when an event or multiple events occur that match the pattern's definition. Patterns can also be temporal (time-based). Pattern matching is implemented via state machines.

Pattern expressions can consist of filter expressions combined with pattern operators. Expressions can contain further nested pattern expressions by including the nested expression(s) in round brackets.

There are 5 types of operators:

  1. Operators that control pattern finder creation and termination: every
  2. Logical operators: and, or, not
  3. Temporal operators that operate on event order: -> (followed-by)
  4. Guards are where-conditions that filter out events and cause termination of the pattern finder, such as timer:within
  5. Observers observe time events as well as other events, such as timer:interval, timer:at

A sample pattern that alerts on each IBM stock tick with a price greater than 80 that arrives within the next 60 seconds:

every StockTickEvent(symbol="IBM", price>80) where timer:within(60 seconds)

A sample pattern that alerts at 5 minutes past every hour:

every timer:at(5, *, *, *, *)
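
The timer:at observer takes crontab-like parameters (minutes, hours, days of month, months, days of week), so the pattern above fires when the minute is 5, in every hour of every day. The schedule this produces can be sketched in plain Java; the helper below is a hypothetical illustration using java.time and is not part of Esper.

```java
import java.time.LocalDateTime;

/** Hypothetical sketch of the schedule produced by timer:at(5, *, *, *, *); not Esper code. */
final class FiveMinutesPastTheHour {

    private FiveMinutesPastTheHour() {
    }

    /** Returns the first time strictly after `now` that falls at minute 5, second 0. */
    static LocalDateTime nextFire(LocalDateTime now) {
        LocalDateTime candidate = now.withMinute(5).withSecond(0).withNano(0);
        if (!candidate.isAfter(now)) {
            // Minute 5 of the current hour has already passed (or is now): use the next hour.
            candidate = candidate.plusHours(1);
        }
        return candidate;
    }
}
```

For example, a clock reading of 10:00 gives 10:05 as the next firing time, while 23:30 gives 00:05 on the following day.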

A sample pattern that alerts when event A occurs, followed by either event B or event C:

A -> ( B or C )
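
Since patterns are implemented via state machines, the pattern above can be pictured as a machine with three states. The following plain-Java sketch is an illustration only, with a hypothetical class name and event types reduced to strings; it is not Esper's pattern engine. It assumes the followed-by semantics in which unrelated events may occur between A and the following B or C, and in which a pattern without "every" fires only once.

```java
/** Hypothetical sketch of the pattern "A -> (B or C)" as a state machine; not Esper code. */
final class FollowedByPattern {

    private enum State { WAITING_FOR_A, WAITING_FOR_B_OR_C, MATCHED }

    private State state = State.WAITING_FOR_A;

    /** Feeds one event type name; returns true exactly when the pattern completes. */
    boolean onEvent(String type) {
        switch (state) {
            case WAITING_FOR_A:
                if ("A".equals(type)) {
                    state = State.WAITING_FOR_B_OR_C;
                }
                return false;
            case WAITING_FOR_B_OR_C:
                if ("B".equals(type) || "C".equals(type)) {
                    state = State.MATCHED;
                    return true;
                }
                // Any other event is ignored; followed-by does not require adjacency.
                return false;
            default:
                // Already matched: without "every" the pattern does not restart.
                return false;
        }
    }
}
```

Feeding the sequence B, A, D, C reports a match on C only. Prefixing the pattern with every would correspond to creating a fresh machine for each arriving A.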

An event pattern where a property of a following event must match a property from the first event:

every a=EventX -> every b=EventY(objectID=a.objectID)

Combining Pattern Matching with Event Stream Analysis

Patterns match when a sequence (or absence) of events is detected. Pattern match results are available for further analysis and processing.

The pattern below detects a situation where a Status event is not followed by another Status event with the same id within 10 seconds. The statement further counts all such occurrences grouped per id.

select a.id, count(*) from pattern [
        every a=Status -> (timer:interval(10 sec) and not Status(id=a.id))
] group by a.id
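
Detecting the absence of an event comes down to keeping one pending deadline per id. The plain-Java sketch below illustrates the idea behind the statement above; it is not Esper code. The class MissingStatusDetector is hypothetical, the clock is advanced explicitly by the caller, and a Status event arriving exactly at the deadline is treated as too late, which is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch: counts, per id, how often no follow-up Status arrived in time. */
final class MissingStatusDetector {

    private static final long TIMEOUT_MS = 10_000L;

    private final Map<String, Long> deadlines = new HashMap<>();
    private final Map<String, Integer> missedCounts = new HashMap<>();

    /** Advances the clock: every id whose deadline has passed counts as one miss. */
    void advanceTo(long now) {
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= now) {
                missedCounts.merge(entry.getKey(), 1, Integer::sum);
                it.remove();
            }
        }
    }

    /** A Status event cancels the pending deadline for its id and starts a new one. */
    void onStatus(String id, long now) {
        advanceTo(now);
        deadlines.put(id, now + TIMEOUT_MS);
    }

    /** Number of 10-second gaps observed so far for the given id. */
    int missedCount(String id) {
        return missedCounts.getOrDefault(id, 0);
    }
}
```

A Status for id "x" at 0 s followed by another at 5 s produces no miss; if nothing further arrives, one miss is counted once the clock reaches 15 s.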

Named windows

A named window is a global data window that can take part in many statement queries, and that can be selected from, inserted into, and deleted from by multiple statements. A named window is similar to a table in a relational database system.

One can create a named window for example as follows:

create window AlertNamedWindow as (origin string, priority string, alarmNumber long)

One can trigger a select, update or delete when an event arrives. Here we show a select that simply counts the number of rows:

on TriggerEvent select count(*) from AlertNamedWindow

Named windows can also be queried with fire-and-forget queries through the API and the inward-facing JDBC driver.

Match-Recognize Pattern Matching

A match-recognize pattern uses a regular-expression-based pattern-matching syntax that has been proposed for inclusion in the SQL standard.

The query below is a sample match-recognize pattern. It detects a pattern that may be present in the events held by the named window declared above. It looks for two events that immediately follow each other, i.e. with no events in between for the same origin. The first of the two events must have high priority and the second must have medium priority.

select * from AlertNamedWindow
  match_recognize (
    partition by origin
    measures a1.origin as origin, a1.alarmNumber as alarmNumber1, a2.alarmNumber as alarmNumber2
    pattern (a1 a2)
    define
      a1 as a1.priority = 'high',
      a2 as a2.priority = 'medium' 
)
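
The logic of this query (partition by origin, then look for a high-priority alert immediately followed by a medium-priority one) can be sketched in plain Java by remembering the previous alert per origin. This is an illustration only: the class HighThenMediumMatcher is hypothetical, and it processes alerts as they arrive instead of querying the contents of a named window.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the "a1 a2" match per origin; not Esper code. */
final class HighThenMediumMatcher {

    private static final class Alert {
        final String priority;
        final long alarmNumber;

        Alert(String priority, long alarmNumber) {
            this.priority = priority;
            this.alarmNumber = alarmNumber;
        }
    }

    // "partition by origin": only the previous alert of the same origin matters.
    private final Map<String, Alert> previousByOrigin = new HashMap<>();

    /**
     * Feeds one alert. Returns {alarmNumber1, alarmNumber2} when a high-priority
     * alert is immediately followed by a medium-priority one for the same origin,
     * or null when there is no match.
     */
    long[] onAlert(String origin, String priority, long alarmNumber) {
        Alert previous = previousByOrigin.put(origin, new Alert(priority, alarmNumber));
        if (previous != null
                && "high".equals(previous.priority)
                && "medium".equals(priority)) {
            return new long[] { previous.alarmNumber, alarmNumber };
        }
        return null;
    }
}
```

Alerts from other origins arriving in between do not break a match, because each origin is tracked separately; a second alert of the same origin with any other priority does.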

Variables

A variable is a scalar, object or event value that is available for use in all statements including patterns. Variables can be used in an expression anywhere in EPL.
