Why Modern BI is Failing
Jeff Carr CEO & Co-Founder
May 16, 2017
Legacy BI tools were not built to handle modern NoSQL data models like JSON or XML. Change starts with real innovation, not tweaking the status quo.
Almost daily I hear from companies asking how we can help them connect their relational BI tool (Tableau, Power BI, etc.) to their NoSQL database, whether MongoDB, Couchbase, MarkLogic, or another. It’s understandable: these tools have been around a long time, and many people are comfortable using them. The problem is that these tools were not built to handle modern NoSQL data models like JSON or XML, and their support for them is extremely poor. Nevertheless, people are determined to make them work. The results in most cases are poor, and after spending valuable time and money trying to force the square peg into the round hole, they ultimately look for other ways to solve the problem.
If You Have A Hammer, Everything Looks Like A Nail
The relational data model was first defined in the early 1970s, and it has been the dominant model for databases and analytics for more than 40 years. In recent years, new and more complex data models have taken hold to support modern applications, including IoT, social media, and SaaS. Developers have openly embraced this change because it made their jobs easier and more efficient. Yet when analytics on these modern applications comes up, the immediate response is to make ALL the data relational, regardless of its current shape. It’s as if we have decided there can be only one data model for analytics for the rest of eternity.
First, ask yourself why you are using a NoSQL database in the first place. For many, the answer is schema flexibility: the ability to build an application without designing a completely fixed schema upfront, which makes development much easier and more agile. If you value this feature for building your app, you should value it for analytics too. Unfortunately, the absolute requirement of ALL traditional BI tools is a fixed schema. From that point forward, most of your analytics effort will not be actual analytics at all, but work to make flexible-schema JSON data fit a fixed-schema model so your tool can understand it. If you don’t need the flexible schema, use a relational database like Postgres or MySQL; your legacy BI tool will work fine. Otherwise you need to seriously rethink your analytics approach.
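To make the mismatch concrete, here is a minimal sketch (the documents and field names are hypothetical, not from any real dataset) of why a fixed column set is a moving target when documents don’t share a shape:

```python
# Hypothetical event documents, as a flexible-schema NoSQL store might hold them.
# Each one is valid JSON, but no two share the same shape.
docs = [
    {"user": "ana",  "action": "login",    "device": {"os": "ios", "ver": "10.3"}},
    {"user": "ben",  "action": "purchase", "items": [{"sku": "A1", "qty": 2}]},
    {"user": "carl", "action": "login"},  # no device or items at all
]

# A fixed-schema BI tool needs one column set declared up front. The union
# of keys keeps growing as new document shapes arrive, so the "schema" you
# declared yesterday is wrong today.
columns = sorted({key for doc in docs for key in doc})
print(columns)  # ['action', 'device', 'items', 'user']

# Worse, 'device' is a nested object and 'items' is an array of objects;
# neither fits a flat cell without further transformation.
```

Every new document shape forces a schema change, which is exactly the upfront rigidity the application developers chose NoSQL to avoid.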
How Did We Get Here?
Many non-developers don’t fully understand the implications of modern NoSQL datastores. If you’re a business analyst or marketing person who had no hand in picking the datastore, why would you? Most people confronted with the need to gain insights from data simply default to a tool they know. I was recently trying to explain to an analyst why traditional BI tools are not a great fit for NoSQL; his response was “I’m comfortable with Tableau.” That is the crux of the problem. Modern databases have exploded over the last 10 years, popping up everywhere, and gaining insights from data models like JSON requires tools not built solely for the relational model. Because existing BI tools were never designed for non-relational data, the focus has been on changing the data to fit the tools, instead of building better tools designed for JSON and other non-relational data.
Need evidence? The number one indicator that someone will find a solution like SlamData is whether they have already failed trying to use an existing BI tool. We have seen companies spend weeks or even months looking for a “bigger” hammer for that square peg, only to surrender and realize that powerful analytics on NoSQL data is a very different beast, not just a few degrees from what they already know.
Solving the problem of modern data analytics is as much an exercise in human nature as it is a technical problem. Users grow attached to a particular analytics tool and become determined to use it on every possible data problem, regardless of the probability it will work.
Don’t Fix Your Data, Fix Your BI Tool
An entire industry has thrived in the last decade on the simple idea that all data should fit the existing tools, regardless of where it started. A large percentage of “data prep” is really just making non-relational data like JSON fit into tables so traditional BI tools can work. Sure, a small percentage is data cleaning or combining disparate sources, but the majority is not. Studies have shown that 80% of “data science” is really just data prep, most of which is forcing data into a fixed schema of perfectly flat, homogeneous tables. That is what your BI tool expects, regardless of the original form of the data. Conventional thinking is to just “make” the data fit the tool. Unfortunately, this approach creates a number of problems: not just the effort needed to change the data, but the fact that by changing it you actually lose data fidelity. You are “dumbing down” the data.
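The fidelity loss is easy to see in a toy example. The sketch below (a hypothetical order document and a deliberately naive flattening function, not any real pipeline) shows how forcing a nested document into one flat row throws information away:

```python
order = {
    "id": 42,
    "customer": {"name": "Ana", "tier": "gold"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

def flatten_for_bi(doc):
    """Naive 'data prep': force the document into one flat row.
    Nested objects get prefixed columns; the items array is reduced
    to a count, discarding every per-item field."""
    return {
        "id": doc["id"],
        "customer_name": doc["customer"]["name"],
        "customer_tier": doc["customer"]["tier"],
        "item_count": len(doc["items"]),   # the SKUs and quantities are gone
    }

row = flatten_for_bi(order)
print(row)
# The round trip is lossy: from the flattened row you can no longer answer
# "which SKUs did this order contain, and in what quantities?"
```

Exploding the array into child tables preserves more, but then every question spanning the order and its items becomes a join the analyst has to reconstruct by hand.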
The correct approach is building better BI tools that actually take full advantage of complex data models like JSON as they were designed. Easy to say, hard to do.
I get asked all the time why nobody else has tried this approach. The answer is simple: it’s really hard to build this kind of solution, so in most cases it’s easier to make users change the data.
“The Hard Thing About Hard Things”
The heading quotes the title of one of my favorite books, by VC Ben Horowitz. The point is that hard things may be hard, but they still need to get done. So the answer is to build better, smarter BI tools and stop “prepping” your data to the point of uselessness.
Creating a solution like SlamData required us to invent an entirely new algebra for working with multi-dimensional data, and to build a much more flexible data model for handling modern data. We had to match the schema flexibility of today’s modern datastores with an analytic capability that embraces that flexibility. Much more than a simple visualization tool, we had to let users create complex analytic workflows that reshape, combine, and aggregate data no matter how complex it is, and then deliver an interactive analytic experience.
Nobody else has done this. That’s right, nobody. I hear from a new vendor every week claiming to solve this problem. I start reading their documentation, and usually within the first paragraph I find the tipoff: “Declare your fixed schema.” If you find these words, or anything like them, you are heading in the wrong direction fast.
While human nature and conventional thinking are always hard to change, it’s important. Significant innovations don’t generally happen fast or easily, but they happen. Data complexity is increasing, and our analytic tools need to keep up. The wrong assumption that all data must be a certain shape for analytics, forever, needs to change. Change starts with real innovation, not tweaking the status quo.