Mastering Big Data with MapR and Syntelli


Using Big Data the right way can give your business the competitive edge it needs to succeed. Unfortunately, it’s all too easy to get lost in the complex Big Data jungle. Before you know it, you’re stuck dealing with Hadoop infrastructure, software, installation, and configuration issues that pop up every step of the way. If this happens, you can lose your way and never see the true benefits of Big Data.

Even when you succeed in creating an enterprise data lake that combines structured, semi-structured, and unstructured data, it may not be available to end users for analytics and self-service reporting until it has been cleaned up and massaged.

The Internet of Things (IoT) adds to the challenge. It is critical for organizations to use data efficiently to be competitive. IoT sensor data and social media data typically arrive as complex JavaScript Object Notation (JSON), which business users cannot query directly with ANSI SQL. To make the data usable, IT has to run expensive Extract, Transform, and Load (ETL) cycles and maintain schemas. Any time a schema changes or a new attribute needs to be added, the full development cycle starts again.

Spotfire uses Apache Drill to provide analytics on top of unstructured data. Drill supports a variety of NoSQL databases and file systems, including:

  • HBase
  • MongoDB
  • MapR-DB
  • HDFS
  • MapR-FS
  • Amazon S3
  • Azure Blob Storage
  • Google Cloud Storage
  • Swift
  • NAS
  • Local files

A single query can join data from multiple data stores. For example, you can join a user profile collection in MongoDB with a directory of event logs in Hadoop.
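As a rough illustration of such a cross-store join, the sketch below builds a Drill SQL query and wraps it in a request for Drill's REST endpoint (`/query.json`). The storage plugin names `mongo` and `dfs` follow Drill's usual defaults, but the database `app`, the collection `users`, the log directory `/logs/events`, the column names, and the `localhost:8047` address are all assumptions made up for this example, not details from any particular deployment.

```python
import json
import urllib.request

# Hypothetical names: database "app", collection "users",
# log directory "/logs/events", and every column below.
JOIN_QUERY = """
SELECT u.name, e.event_type, e.event_time
FROM mongo.app.`users` AS u
JOIN dfs.`/logs/events` AS e
  ON u.user_id = e.user_id
""".strip()


def build_drill_request(query, base_url="http://localhost:8047"):
    """Build (but do not send) a POST request for Drill's REST query endpoint."""
    body = json.dumps({"queryType": "SQL", "query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, base_url="http://localhost:8047"):
    """Send the query to a running Drillbit and return the result rows."""
    with urllib.request.urlopen(build_drill_request(query, base_url)) as resp:
        return json.load(resp).get("rows", [])
```

Calling `run_query(JOIN_QUERY)` would only work against a running Drill instance with both storage plugins enabled; in practice a BI tool such as Spotfire would more likely connect through a Drill ODBC or JDBC driver than through raw HTTP.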

Spotfire and Apache Drill also offer:

Self-service raw data exploration: You can explore and analyze raw data sets of any complexity as they arrive on Hadoop using Spotfire. There is no need for expensive ETL cycles by IT to maintain schemas; you can do instant joins across newly ingested data and find insights.

Insights on structured/semi-structured data: You can now use SQL to natively query and manipulate complex/semi-structured JSON data originating from NoSQL applications, such as web/mobile and sensor-equipped Internet of Things (IoT) devices. Apache Drill allows instant flattening and native querying of complex nested data.
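To make "flattening" concrete, here is a small pure-Python sketch of the idea: each element of a nested array becomes its own output row, with the parent fields repeated. It only illustrates the concept and is not how Drill implements it; the `device` and `samples` field names and the sample readings are invented for the example.

```python
def flatten(records, array_field):
    """Yield one output row per element of each record's array field.

    Records whose array is missing or empty produce no rows
    (an assumption of this sketch).
    """
    for record in records:
        for element in record.get(array_field, []):
            row = dict(record)          # copy the parent fields
            row[array_field] = element  # replace the array with one element
            yield row


# Hypothetical IoT readings with a nested "samples" array.
readings = [
    {"device": "sensor-1", "samples": [{"t": 0, "temp": 20.5}, {"t": 1, "temp": 20.7}]},
    {"device": "sensor-2", "samples": [{"t": 0, "temp": 18.1}]},
]

rows = list(flatten(readings, "samples"))
for row in rows:
    print(row["device"], row["samples"]["t"], row["samples"]["temp"])

# A comparable Drill query (file path is hypothetical) might look like:
#   SELECT device, FLATTEN(samples) AS sample FROM dfs.`/iot/readings.json`
```

The two input records become three rows, one per sample, which is the shape a SQL-based reporting tool can filter and aggregate directly.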

Join us for our September 22nd webinar, where you will learn how to:

  • make it easier to access and analyze Hadoop data
  • blend Hadoop with RDBMS data, social media, and other disparate data to improve insight and gain a competitive edge
  • distribute your key metrics throughout your organization to enable more actionable collaboration
  • interactively blend and explore your data for breakthrough insights