How to Drive Big Data Architectures Forward with Event-Driven Rules

Reading Time: 10 minutes

Typical big data technology is not enough to build a comprehensive solution… you need event-driven rules as part of the overall picture. Join TIBCO’s global strategic solutions lead Nelson Petracek and host, tech journalist Ellis Booker, as they talk about big data and why just having the enabling technology (say, a Hadoop cluster) doesn’t answer business requirements.

Transcript:

Welcome to the TIBCO podcast. I’m your host, Ellis Booker. Today’s topic is driving your big data architecture forward with event-driven rules. We’re going to talk about big data and why just having the enabling technology, say, a Hadoop cluster, doesn’t answer all the important business requirements. Joining us is Nelson Petracek, Global Lead of TIBCO Software’s Strategic Solutions Group. Hello, Nelson.

Nelson: Hi Ellis. Glad to be here and looking forward to the discussion.

Ellis: Nelson, for starters, can you briefly describe your role at TIBCO?

Nelson: Sure. I manage a group within TIBCO called the Strategic Solutions Group. We’re a global team of domain experts focused on technology areas such as business process management, case management, event processing, and operational intelligence. Our job is really to work with customers to help them understand how these technologies can fit into areas such as IoT and big data.

Ellis: Great. Let’s drill down into today’s topic, the need for event-driven rules for big data architectures. Why is that important?

Nelson: When you look at big data architectures, traditionally they evolved from the need to execute batch-style processing better. A lot of organizations were looking to reduce the cost of processing ETL jobs or analytical jobs on top of a data warehouse, and they wanted to move to a cheaper, more effective, and more scalable environment in which to execute this style of batch processing. What I found was that organizations then said, “Well, we are going to have batch, but we also need near real-time or real-time capabilities.”

And this is where you get into a lot of discussions nowadays around technologies like Spark. There are other open source or big data technologies that are now looking to offer near real-time processing capabilities. This is often referred to as streaming analytics: the ability to apply math to data as it’s being ingested. However, we’re forgetting one key aspect, and that aspect is the fact that the business tends to think in rules.

So when I’m a business person, I’m thinking in terms of: well, if the propensity to buy a product under certain conditions exceeds this value, and the situation is triggered while the customer is browsing my website, and if I have sufficient inventory, and so on, then I want to do this. They’re thinking in terms of rules. So we need to add that capability to our big data architecture and to our solutions in order to maximize the value that we want to provide to the business.
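To make that concrete, here is a minimal sketch of that kind of rule in Python. The event fields, the 0.8 propensity threshold, and the resulting action are hypothetical illustrations, not any particular product’s API:

```python
# A hypothetical business rule expressed as code: if propensity to buy is
# high while the customer is browsing, and inventory is sufficient, act.

def offer_rule(event, inventory):
    if (event["type"] == "page_view"
            and event["propensity_to_buy"] > 0.8
            and inventory.get(event["product_id"], 0) > 0):
        return {"action": "send_offer", "customer": event["customer_id"]}
    return None

# Evaluated per event as the customer browses, not in a nightly batch.
event = {"type": "page_view", "customer_id": "c42",
         "product_id": "p7", "propensity_to_buy": 0.91}
print(offer_rule(event, {"p7": 12}))  # -> {'action': 'send_offer', 'customer': 'c42'}
```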

Ellis: You’ve written elsewhere that big data rule-processing environments must include the following: context, time, and state. Can you unpack all that for us?

Nelson: Sure. Let me drill down on each one of those in a little more detail. Typically, when you refer to context, you’re really talking about adding outside information into your event-driven rule processing. So for example, I may have a set of rules concerned with how I offer products or manage inventory in a particular geographical area, but if it happens to be in the middle of a snowstorm in that geographical area, obviously those rules need to know that information so that they can respond accordingly.
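As a rough sketch of that snowstorm example, the rule below pulls an outside weather signal into an inventory decision; the weather lookup, field names, and doubled reorder point are all assumptions made for illustration:

```python
# Context enrichment sketch: outside weather data changes how the rule fires.
WEATHER = {"denver": "snowstorm", "austin": "clear"}  # stand-in for a real feed

def reorder_rule(store):
    threshold = store["reorder_point"]
    if WEATHER.get(store["region"]) == "snowstorm":
        threshold *= 2  # reorder earlier: deliveries will be slower
    return store["stock_level"] < threshold

print(reorder_rule({"region": "denver", "stock_level": 80, "reorder_point": 50}))  # True
```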

So context is all about bringing in additional data. It might be information from a master data management environment, or it may be additional information from my big data environment, just to provide that additional context in which the rules themselves will execute. And then, of course, there’s time. You often think about rules being executed at a point in time.

So I ask a bunch of rules a question and they fire back an answer. In a lot of cases, the decisions we need to make actually need to span a period of time. The fact that a temperature reading went over a certain threshold once may or may not have a lot of value to you, but the fact that it went over a threshold and is continuing to exceed that threshold over a particular time window, that’s when I’m actually interested.
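Here is a minimal sketch of that time-window idea in Python; the 60-second window, the threshold value, and the minimum of three readings are illustrative assumptions:

```python
# A single over-threshold reading is ignored; a sustained breach across the
# window raises an alert.
from collections import deque
from time import time

WINDOW_SECONDS = 60
THRESHOLD = 90.0
readings = deque()  # (timestamp, value) pairs within the window

def on_temperature(value, now=None):
    now = time() if now is None else now
    readings.append((now, value))
    while readings and readings[0][0] < now - WINDOW_SECONDS:
        readings.popleft()  # drop readings that have aged out of the window
    if len(readings) >= 3 and all(v > THRESHOLD for _, v in readings):
        return "ALERT: sustained over-threshold condition"
    return None
```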

And so the notion of time often needs to be taken into account when determining whether or not certain aspects of my business need to be addressed. And then state. State is really about the system remembering the state of key business entities and then being able to respond differently depending on what state an entity is in. So for example, if I have a shipment, that shipment can go through a bunch of different states. It may be in the picked-up state, it might be in the in-transit state, and then it might be in a being-delivered state.

So it might be the same set of rules, but they’re going to behave differently depending on what state this business entity is in. And so it’s really the combination of these three aspects that becomes key when building a big data event-driven rule system. One of the things that I always tell people not to forget is that, in a lot of cases, you’re using these three aspects to identify things that do not happen. So how do I look for missing events? In a lot of cases, that’s actually more important than the events themselves.
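Here is a small sketch of both ideas for the shipment example: rule behavior keyed to the entity’s current state, and a stalled shipment surfaced by the event that did not arrive. The state names and SLA timeouts are hypothetical:

```python
# State-dependent rules plus missing-event detection for a shipment.
from time import time

SLA_SECONDS = {"picked_up": 3600, "in_transit": 86400, "being_delivered": 7200}

class Shipment:
    def __init__(self, shipment_id):
        self.id = shipment_id
        self.state = "picked_up"
        self.last_event_at = time()

    def on_event(self, new_state):
        # The same rule set behaves differently per state via the SLA table.
        self.state = new_state
        self.last_event_at = time()

    def check_missing_event(self, now=None):
        # Often the key signal is the event that did NOT happen: no update
        # within the time allowed for the current state.
        now = time() if now is None else now
        if now - self.last_event_at > SLA_SECONDS.get(self.state, 86400):
            return f"ALERT: shipment {self.id} stalled in state {self.state}"
        return None
```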

Ellis: So existing applications, business applications, operational applications, have taken these lessons and baked these ideas in, right? But now we’re layering on big data and large volumes and fast-moving events and so forth. So the question I’m sure everyone’s wondering is, how difficult is it to take those existing business rules that we’ve created and migrate them into this big data environment? What can stay the same? What needs updating?

Nelson: All right. It’s an interesting conversation, typically. It really depends on the characteristics of the rules themselves and on the use cases that you’re trying to solve. So in certain situations, the rules, for the most part, can be migrated over; the logic still applies. Maybe what I’m doing is moving away from batch-style processing of the rules into more of an event-driven style. Doing so just means that I’m not waiting until midnight to process my rule set against 3 million records; instead, I’m doing it record by record throughout the day so that I can deliver more timely information.

So those types of rules, in many cases, can be mostly brought over. However, in other cases, if I want to incorporate the notions of context, time, and state, as we just discussed, then these rules may require some enhancements. I may need to add, again, a notion of time, or I may need to add additional information to my rules so that they can execute smarter.
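A sketch of that migration: the rule logic is unchanged, and only the trigger moves from a nightly batch run to per-record evaluation. The function and field names are illustrative:

```python
# Same business logic, two execution styles.
def over_limit(record):
    return record["balance"] > record["credit_limit"]

def nightly_batch(records):
    # Old style: wait until midnight and scan millions of rows at once.
    return [r for r in records if over_limit(r)]

def on_record(record, alert):
    # Event-driven style: evaluate each record as it arrives during the day.
    if over_limit(record):
        alert(record)
```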

Ellis: Okay. Is that expensive? That’s the next question every potential customer is going to have. Is it expensive? Do you need new tools? Do you need new skills? Do you need new people to create these things?

Nelson: It really depends on your definition of expensive, I guess. What I always ask people is, how expensive is it to not do it? So for example, what is the cost of missing a series of indicators that an expensive piece of equipment is going to fail, or what is the cost of providing a poor customer experience? If I walk into a store and they don’t know anything about me and don’t communicate with me consistently across multiple channels, I walk out of that store, go across the street, and buy something else. What is the cost of that?

So there’s an expense to not doing this as well. And I always tell people, you don’t have to go and implement everything all of a sudden in an event-driven fashion. Start small; you can incrementally add capabilities. Start with a certain use case that has a large amount of business value, build that core foundation, and then add additional business use cases as time progresses. So don’t try to do everything in one shot right up front; do it incrementally over a period of time.

Ellis: I’m glad you mentioned use cases, and that retail scenario, just a minute ago. I really want to look at some real-world examples. I think often, when people talk about this, they’re thinking of credit card companies or giant insurance companies dealing with lots of fast-moving information. So in your choice of use cases, let’s pick scenarios that aren’t those usual suspects, if you could.

Nelson: In terms of use cases, this concept of applying event-driven rules to big data architectures can really span any vertical. I talked a little bit about logistics and tracking packages; this is typically a scenario or use case category that I refer to as track and trace. So you’re tracking some key business entity as it moves from, in this case, point A to point B, and you’re also tracking and monitoring the equipment that’s doing so. So for the trucks themselves, correlating information about the trucks with roads and with traffic, all those things can be brought together in order to make that system more intelligent.

I always use airlines as an example. Most of us fly, and most of us have probably had different experiences while flying, whether good or bad. But of course, airlines are all about a whole series of events coming from a variety of sources, where I need to make decisions quickly in order to respond to changing conditions. Those conditions may be related to crew scheduling, gate changes, maintenance, baggage handling, and so on. Fraud detection and cybersecurity form another set of use cases where I see this type of technology being really applicable.

And then there are use cases that are more unexpected. We’ve talked in the past about oil and gas use cases, where oil and gas companies, especially as they’re drilling, have traditionally been seen as very batch-oriented. They do a lot of their data reporting monthly, and they’re often not seen as organizations that deal with a lot of event-driven data. But in certain cases, when you’re getting into monitoring expensive equipment, they need exactly that capability: to monitor this information in real time, apply math to it, and execute rules against it in order to try to predict equipment failure. And then the last one is really in health care.

We’ve got organizations that have taken clinical rules and made them part of an event-driven rule system in order to improve patient care. And we’re going to start seeing more of this with the proliferation of wearables and the amount of data they’re generating. I need to be able to collect that information and make more intelligent decisions, again, related to better patient care. And as health companies move toward more personalized experiences, collaboration, and moving care from hospitals to the home, all of these things are going to lead to a greater need for event-driven rules within a big data architecture.

Ellis: You mentioned two things just now that caught my attention. One was the predictive analytics that companies are going to begin using: if the oil and gas people know when their pumps are going to fail and can get ahead of that before the failure happens, they’re going to save a lot of money. And the other one is the wearables case, where you’re personalizing things more and more and a lot of data is flowing from these systems. Again, there’s probably a predictive component in that as well.

We’re not waiting for the person to have a heart attack and then dealing with it; we may be able to tell if something is starting to go off the rails. So speak a little, if you could, about the predictive component of this from an event-driven perspective. We didn’t talk about it earlier.

Nelson: It really evolves from your historical data. The cycle is typically: you collect a series of information, and you may put that into your big data architecture. You’re going to use that to store not only the raw events, but perhaps also a series of derived data layers. What you want to do then is execute various analytics against that information. You’re going to test out different models and look at applying different algorithms against the data that you’ve collected. The whole purpose here is to identify the model, or set of models, that is most applicable and the best representation of the use case you’re trying to capture.

So if I’m looking at trying to predict equipment failure, I’m looking at those variables that are the best indicators as to when that piece of equipment is going to fail and within some time period, as an example. If I’m looking to predict customer churn, I’m looking to see what are the indicators of churn, how does the customer interact with me on my website, what type of behavior do they exhibit, in order for me to identify earlier, rather than later, when a customer is going to leave my organization.

So typically I’m taking the data that’s in my big data architecture, executing analytics against that data, generating a series of models and then taking those models and executing them within the event-driven real-time rule context. So now, my system is actively monitoring those signals and applying those signals or variables and data against my model and using the results of that to determine what action it should take proactively, rather than reactively.
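Here is a compact sketch of that train-offline, score-online cycle, using scikit-learn purely as an illustrative stand-in for the analytics layer; the features, the tiny training set, and the 0.7 risk cutoff are invented for the example:

```python
# Offline: fit a failure model on historical sensor data from the big data store.
from sklearn.linear_model import LogisticRegression

history_X = [[70, 0.2], [85, 0.6], [95, 0.9], [65, 0.1]]  # temperature, vibration
history_y = [0, 0, 1, 0]                                   # 1 = failure followed
model = LogisticRegression().fit(history_X, history_y)

# Online: score each incoming event and act proactively on high risk.
def on_sensor_event(event):
    risk = model.predict_proba([[event["temp"], event["vibration"]]])[0][1]
    if risk > 0.7:
        return f"schedule maintenance for {event['equipment_id']}"
    return None

print(on_sensor_event({"equipment_id": "pump-3", "temp": 96, "vibration": 0.88}))
```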

Ellis: So I guess one of the questions people are going to have is, where do I start? You said up top that you’ve got to be aware of those business requirements, in terms of your rules, the ones that are important to capture in this environment. But what about a company that hasn’t gone down this road, that does have business rules in its traditional relational databases or inventory systems, but has now heard a lot about big data and real-time data? Is there any advice you can give them on how to frame the question, even, to know where to begin?

Nelson: Well, the key here is to look for that business problem, that use case, where if you had the information sooner, it would lead to a better customer experience and better predictability in your business; it would lead to the standard measures around more revenue, less cost, and so on. You want to try to find the use case that has the right amount of visibility within your organization and the right amount of potential value to move, typically, away from batch into this event-driven rule system. And again, it comes back to those concepts: traditional request-reply rules are not enough; I need to apply the notions of context, time, and state.

So I need a situation or a scenario where those same principles apply, but I want to scope that use case to the level where it’s not going to be a multi-month project. I want to measure the project duration in weeks rather than months. I need to be able to get that quick win, demonstrate that win to the business, and then build upon that core foundation.

So there’s really no one magic question or single thing you can look for. It’s more: here’s a scenario where today I do it in batch, or where today I don’t have enough visibility when I’m doing things like track and trace, or where I need more predictability into how my business is executing. Then ask: in what business area can I identify that particular use case, what’s the value to me as an organization and to the business, and how can I execute it to build that core foundation that I can then build upon?

Ellis: Great. And Nelson, that’s going to be the last answer, that’s all the time we have for this episode of the TIBCO podcast. If you’d like to hear other podcasts or suggest show topics, please visit our website. Thanks for visiting and thanks for listening.