Brenda Michelson over on ebizQ’s Business-Driven-Architect blog had an interesting post on an article published in Communications of the ACM titled “Data in Flight”. The article tries to explain the idea of “data in motion” and the benefits of event stream processing, presumably assuming an audience of database folks.
Brenda’s quotes from the article are probably a good place for a bit of commentary:
“The streaming query engine is a new technology … “
… if by “new” you mean 10 years old or more. But maybe the author means “new” to the presumed reader? Although I would have thought that readers of this particular journal would be well informed about such technology trends…
“CEP has been used within the industry as a blanket term to describe the entire field of streaming query systems.”
Actually, CEP covers multiple types of event processing, including continuous queries on event streams. Is there something fundamentally different between an event stream (from some external source) and a “data stream” (from some internal source)? No, I don’t think so.
“This is regrettable because it has resulted in a religious war between SQL-based and non-SQL-based vendors…”
Religious war? Most people accept that there are many types of event processing languages, and that these are best suited to different types of event operations – and we see this in a variety of leading CEP vendors. Indeed, two of the top three market-leading CEP vendors (including TIBCO as the market leader) do not rely on SQL-based continuous queries (although these are indeed available, for tasks like stream processing, in TIBCO BusinessEvents).
“… in overly focusing on financial services applications, has caused other application areas to be neglected.”
Overly focusing on financial services applications? Well, it is certainly true that many CEP vendors focus on algorithmic trading, probably contributing to some vendors’ eventual demise during the recent financial downturn. But TIBCO’s main CEP customer base is in transport, logistics, and telecom, closely followed by government and then financial services.
“Because of their shared SQL language, streaming query engines and relational databases can collaborate to solve problems in monitoring and realtime business intelligence. SQL makes them accessible to a large pool of people with SQL expertise.”
Well, continuous query languages tend to be loosely based on SQL, but have different semantics. For example, in TIBCO BusinessEvents Query Language (BQL), there is a policy statement that defines things like time window, resultset size, etc. Such a continuous query, executing asynchronously, is probably not something a synchronous SQL / stored procedure developer will be automatically comfortable with.
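To make the difference in semantics a little more concrete, here is a minimal Python sketch of a windowed continuous query. It is not BQL and not any vendor's API: the class name `SlidingWindowAverage`, its parameters, and the callback are all hypothetical, chosen purely for illustration. The point is only that the query is registered once, a time-window policy bounds what it sees, and results are pushed to a callback as events arrive rather than returned from a synchronous call.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class SlidingWindowAverage:
    """Hypothetical continuous query: average of the values seen
    within the last `window_seconds` (illustration only, not BQL)."""

    def __init__(self, window_seconds: float,
                 on_result: Callable[[float, float], None]) -> None:
        self.window_seconds = window_seconds   # the "policy": time window size
        self.on_result = on_result             # results are pushed, not pulled
        self._events: Deque[Tuple[float, float]] = deque()

    def on_event(self, timestamp: float, value: float) -> None:
        """Called for every arriving event; timestamps must be non-decreasing."""
        self._events.append((timestamp, value))
        # Evict events that have fallen out of the time window.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()
        # Re-evaluate the query over the current window and emit the result.
        average = sum(v for _, v in self._events) / len(self._events)
        self.on_result(timestamp, average)


# Register the query once, then feed it events as they arrive.
results: List[Tuple[float, float]] = []
query = SlidingWindowAverage(10.0, lambda ts, avg: results.append((ts, avg)))
for ts, price in [(0, 10.0), (5, 20.0), (12, 30.0)]:
    query.on_event(ts, price)
```

Contrast this with a conventional SQL call, where the developer issues a query, blocks, and gets one result set back. Here the same query produces a new answer on every event, and the answer depends on when it is evaluated as much as on what data exists.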
“…streaming query systems can support patterns such as enterprise messaging, complex event processing, continuous data integration, and new application areas that are still being discovered.”
Well, this is truly where the ACM article reviewers missed some hype. As a subset of CEP, for sure, continuous queries “support” operations such as routing events, identifying event patterns, merging event data, and so forth. But so do other event processing technologies (and TIBCO customers do these with both production rules and queries…).
So one might conclude that this article is some sort of clever marketing from a stream-processing start-up. Which, indeed, it probably was – and all kudos to them for their success in getting this published in Communications of the ACM. I believe it was aimed at encouraging database users out of their “transactional shells” into the “real-time world of events” – which is possibly a frightful proposition to the CIOs and DBAs used to the staid and stable world of “data at rest” (and presumably in a database somewhere).
And hopefully someone (maybe from EPTS?) will counter with a more balanced article on event processing applications and use cases in the near future.