Big Data vs Fast Data

There was an interesting, but not uncommon, comment (on tibbr) recently about a Proof of Concept in the CEP space that was completed in 3 days by a 3-person TIBCO team while the competitor’s team was still struggling at the 3-week mark – this despite the competitor being, per certain analysts’ reports, one of the “big guns” in CEP. In the past, some could argue that high productivity and high performance / scalability are opposing requirements; I would counter that in event processing they are closely related. Consider:

  • TIBCO BusinessEvents remains today one of the few CEP technologies to include integrated high-performance datagrid technology – you develop the concept model with the necessary metadata and methods for interacting with that data, but have no need to step out into a different (database) environment
  • Large (TB-level) datasets can be accommodated in the DataGrid simply by organising several DataGrid service instances (and a fast interconnect!)
  • Without such integrated data interaction, development teams are forced to bring in new skillsets and tackle the problems of integrating (at best) a separate cache or datagrid technology or (at worst) a high-latency database.
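To make the second point concrete, here is a minimal sketch of how a large keyed dataset can be spread across several grid service instances using hash-based partitioning. The node names and the `node_for` helper are purely illustrative assumptions, not the TIBCO DataGrid API:

```python
# Sketch: spreading a large keyed dataset across several grid nodes
# via simple hash partitioning. Node names and this helper are
# illustrative only, not TIBCO DataGrid API.
from hashlib import md5

NODES = ["grid-node-0", "grid-node-1", "grid-node-2"]

def node_for(key: str) -> str:
    """Deterministically map a key to one of the grid service instances."""
    digest = int(md5(key.encode()).hexdigest(), 16)
    return NODES[digest % len(NODES)]

# Every caller computes the same placement, so a read can go straight
# to the owning node with no central lookup or database hop.
assert node_for("customer:42") == node_for("customer:42")
```

Because placement is a pure function of the key, adding capacity is a matter of adding instances and repartitioning, which is what makes TB-level datasets feasible in memory.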

Now not all CEP applications require fast data mechanisms – the commonly cited CEP example being “trading applications” in Capital Markets, using event stream processing, where missing a trade opportunity because a trade pattern was not in memory simply means missing a quick profit. However, in most “business event processing”, the context required for effective decisions involves various dimensions of customers, past transactions, services and so on. Such context will often be too much data to hold entirely in situ. So it’s no surprise that while most of the CEP competition historically remains focused on the simple-data requirements of stream processing, TIBCO’s growth and success in this market comes from capabilities for processing high rates of events against large volumes of contextual data (via “fast data”) through states, decisions and rules.
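The pattern described above – a rule evaluating each event against contextual state held in memory rather than querying a database – can be sketched as follows. The context layout, the fraud-style rule, and the field names are assumptions chosen for illustration, not TIBCO BusinessEvents constructs:

```python
# Sketch: deciding on an incoming event using contextual data held in
# memory (a stand-in for a datagrid cache). The rule and field names
# are illustrative assumptions, not a TIBCO BusinessEvents API.
context = {
    "customer:42": {"avg_order": 120.0, "orders_this_hour": 1},
}

def on_event(event: dict) -> str:
    """Return a decision for one event with no database round-trip."""
    ctx = context.get(event["customer"],
                      {"avg_order": 0.0, "orders_this_hour": 0})
    # Rule: flag orders far above this customer's usual spend.
    if ctx["avg_order"] and event["amount"] > 5 * ctx["avg_order"]:
        return "review"
    ctx["orders_this_hour"] += 1  # update state in place for the next event
    return "accept"

print(on_event({"customer": "customer:42", "amount": 1000.0}))  # review
```

The decision latency here is dominated by a dictionary lookup; swap the lookup for a database query and the same rule inherits the database’s latency, which is the gap fast data closes.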

So if fast data is driving the success of CEP, why is so much attention being paid to Big Data? The two are of course simply different ends of the same continuum of event data being collected across businesses today. Old data is collected in repositories and analysed for long-term trends (using data mining and predictive analytics), which can then be applied to high-speed, short-term contextual processing of important (business) events. Exploiting Big Data needs Fast Data too – assuming, that is, that you need fast operational decisions too (or “2SA“) …