Whitepaper: The Characteristics of NoSQL Analytics Systems
by John De Goes, CTO and Co-Founder of SlamData
Semistructured data, called NoSQL data in this paper, is growing at an unprecedented rate. This growth is fueled, in part, by the proliferation of web and mobile applications, APIs, event-oriented data, sensor data, machine learning, and the Internet of Things, all of which are disproportionately powered by NoSQL technologies and data models.
NoSQL operational databases continue to gain ground on relational databases, with MongoDB recently becoming the 4th most popular database in the world. Hadoop is well on its way to becoming the de facto “data lake” for company-wide data, regardless of structure, and the rapid maturation of machine learning has provided robust ways to turn unstructured data like videos, audio, and images into NoSQL data.
Recently, there has been a flurry of discussion about the implications of the rise of NoSQL for analytics and decision support systems. These discussions often revolve around use cases, whether non-tabular data models permit analytics (and if so, what kind), and whether NoSQL systems can participate in analytic workloads.
As expected for any early-stage technology, these discussions are often imprecise and conflate a wide range of concerns, including semantics, architecture, performance, technology, use cases, and user interface.
In contrast, this paper carves out a single concern by focusing on the system-level capabilities required to derive maximum analytic value from a generalized model of NoSQL data. This approach leads to eight well-defined, objective characteristics, which collectively form a precise capabilities-based definition of a NoSQL analytics system.
These capabilities are motivated by use cases, but all other considerations are explicitly ignored. They are ignored not because they are unimportant (quite the contrary), but because they are orthogonal to the raw capabilities a system must possess to derive analytic value from NoSQL data.
Table of Contents
- The Nature of NoSQL Data
- NoSQL Databases
- Big Data
- A Generic Data Model for NoSQL
- Approaches to NoSQL Analytics
- Coding & ETL
- Real-Time Analytics
- Relational Model Virtualization
- First-Class NoSQL Analytics
- Characteristics of NoSQL Analytics Systems
- Generic Data Model
- Isomorphic Data Model
- Unified Schema/Data
- Polymorphic Queries
- Dynamic Type Discovery & Conversion
- Structural Patterns