Dagstuhl#10201 on EP: Experiences of others in the CEP space

Opher Etzion has blogged a good summary of one of the more “interesting” sessions at Dagstuhl, a.k.a. the “industry feedback” evening session. My list of the “most interesting points raised” includes:

  1. Marco Seiriö from Sweden talked about his ASP/SaaS model for delivering event processing – proof, if needed, that event processing applications can be deployed as services or in the cloud. His RuleCore application does track-and-trace for trans-European shipments, with customer rules sent as events directly from his customer to the system.
    • The “rule updates are just events” philosophy makes perfect sense in my mind, albeit one requiring considerably more event validation, security, etc. than the simple sensor events that drive the application. A similar approach is used to update the template-defined rules in TIBCO Active Service Gateway, for example.
    • Marco mentioned one side effect of the application: the shipping company discovered that truckers were apt to make stops outside, and therefore effectively associate their ad-emblazoned trailers with, er, “establishments they would not normally want to be associated with”. Real life meets monitored life?
    • Some of the “issues” he finds customers have in learning to “do CEP” are:
      • understanding the declarative (rather than procedural) model, and
      • understanding asynchronous events (rather than “request-reply”, a.k.a. simple SOA).
  2. Richard Tibbetts talked about some of his “observations” on developer practices in building CEP systems, drawn from his customers at StreamBase, and surely applicable to all CEP development and customers.
    • Lack of discipline in MOM message design: developers create too many message types and replicate information inside messages. I have heard this before, and continue to wonder whether there is now a need for “Master Event Management” …
    • A tendency towards premature optimisation: some EP operations are simply not used enough to warrant much effort in optimising them. Use a profiler to see what needs optimising *after* the system is running …
    • Use incremental development – don’t try and replicate the “big bang” and wait until functional completion before unit-testing!
    • Event patterns tend NOT to be favored by developers, compared to lower-level developer constructs like queries.
  3. Badrish Chandramouli from Microsoft explained the genesis of the Microsoft stream processing engine (MSSI) from its roots in the CEDR research project in Microsoft Research.
    • MSSI is used internally for web analytics on the Microsoft Hotmail system, doing user classification for the on-line ad system at a rate of 1M events per day (i.e. about 12 per sec on average, so probably 20-50 per sec at peak), within a system that uses map reduce as a part of the machine learning.
    • The MSSI product is validated internally by translating sample continuous queries into SQL Server queries via some temporal algebra translator, and event streams into database tables; if the outputs from SQL Server and MSSI don’t match, they know they have a problem!
    • Although MSSI is brand new to the market, they are already looking at High Availability features for a future release, and topics like real-time data mining…
  4. IBM covered 2 of their (at least 3, maybe 4) CEP technologies:
    • Martin Hirzel from the IBM System S (a.k.a. InfoSphere Streams) research group talked about their progression from research tool to product, and the development of their Streams Processing Language.
    • Udo Pletat talked about using the IBM RFID tool that exploits the IBM Labs’ AMiT ECA rule engine. I hadn’t realised AMiT had gone into real-world use, so will need to do more updates on the CEP Market time-graph!
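
Going back to the “rule updates are just events” idea from Marco's talk: here is a minimal sketch of what that could look like, purely as an illustration. It is not RuleCore's design or API. The `Event` and `RuleEngine` names, the `trusted_senders` check and the `max_delay_minutes` rule format are all assumptions of mine. The point is only that rule updates and sensor readings arrive through the same `handle` entry point, with the rule updates getting the extra validation mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event envelope: the same shape for sensor data and rule updates."""
    kind: str      # "sensor" or "rule_update" (assumed event kinds)
    sender: str    # who emitted the event
    payload: dict  # kind-specific content


class RuleEngine:
    """Toy engine: rules are per-name delay limits, updated by events."""

    def __init__(self, trusted_senders):
        self.trusted_senders = set(trusted_senders)
        self.rules = {}   # rule name -> maximum allowed delay in minutes
        self.alerts = []  # (rule name, shipment id) pairs

    def handle(self, event):
        # Single entry point: rule updates travel the same path as sensor events.
        if event.kind == "rule_update":
            self._apply_rule_update(event)
        elif event.kind == "sensor":
            self._evaluate(event)
        else:
            raise ValueError(f"unknown event kind: {event.kind!r}")

    def _apply_rule_update(self, event):
        # Rule updates change system behaviour, so they are checked more
        # strictly than plain sensor readings.
        if event.sender not in self.trusted_senders:
            raise PermissionError(f"{event.sender!r} may not update rules")
        name = event.payload.get("rule")
        limit = event.payload.get("max_delay_minutes")
        if not isinstance(name, str) or not isinstance(limit, (int, float)) or limit < 0:
            raise ValueError("malformed rule update")
        self.rules[name] = limit

    def _evaluate(self, event):
        delay = event.payload["delay_minutes"]
        for name, limit in self.rules.items():
            if delay > limit:
                self.alerts.append((name, event.payload["shipment"]))
```

A customer would send something like `Event("rule_update", "customer-a", {"rule": "late", "max_delay_minutes": 30})`, after which later sensor events are checked against the new limit without any redeployment.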
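
The validation trick Badrish described (run the same logic through a database and compare the outputs) is easy to try at small scale. The sketch below is my own toy version, not Microsoft's tooling: SQLite stands in for SQL Server, a hand-written `GROUP BY` stands in for their temporal algebra translator, and the “continuous query” is an assumed example (a count per user class per 60-second tumbling window). The function names and the sample events are made up for illustration.

```python
import sqlite3
from collections import Counter

WINDOW_SECONDS = 60  # assumed tumbling-window size


def streaming_counts(events):
    """Incremental version: process (timestamp, user_class) events one at a time."""
    counts = Counter()
    for ts, user_class in events:
        counts[(ts // WINDOW_SECONDS, user_class)] += 1
    return dict(counts)


def sql_counts(events):
    """Reference version: load the stream into a table and ask the database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (ts INTEGER, user_class TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?)", events)
        rows = conn.execute(
            "SELECT ts / ? AS win, user_class, COUNT(*) "
            "FROM events GROUP BY win, user_class",
            (WINDOW_SECONDS,),
        ).fetchall()
    finally:
        conn.close()
    return {(win, user_class): n for win, user_class, n in rows}


def outputs_match(events):
    """True when the streaming result agrees with the database result."""
    return streaming_counts(events) == sql_counts(events)


# Made-up sample stream: (seconds since start, user class).
sample_events = [(0, "a"), (59, "a"), (60, "b"), (61, "a"), (125, "b")]
```

Calling `outputs_match(sample_events)` compares the two paths; a mismatch means one of the two implementations (or the translation between them) has a problem, which is exactly the signal described above. Note that the comparison assumes non-negative integer timestamps, so that SQLite's integer division and Python's `//` agree.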