Why Organizations Want To Solve The JSON Problem

JSON stands for JavaScript Object Notation, and it’s a way to format data. It was developed in the early 2000s, but it’s only in the last few years that it’s really caught on. The JavaScript specification now includes a built-in JSON object, and many developers treat JSON as a sort of subset of the language itself.

What’s The Hype About?

JSON is just represented as text, so you can’t input functions or dynamic date values. There are no methods or other functionality in JSON — it’s just text. But that simplicity is a good thing, since it’s what makes JSON so flexible.
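To make the "it’s just text" point concrete, here is a minimal sketch using Python's standard json module. The event fields (device, pressed, and so on) are made up for illustration. It shows that plain values survive a round trip through text, while a function or a date object is rejected until it is converted to a string.

```python
import json
from datetime import datetime, timezone

# A hypothetical event made only of JSON's built-in types:
# strings, numbers, booleans, null, arrays, and objects.
event = {"device": "doorbell", "pressed": True, "count": 3,
         "tags": ["front", "door"], "note": None}

text = json.dumps(event)      # serialize to plain text
restored = json.loads(text)   # parse the text back into Python values


def can_serialize(value):
    """Return True if the standard encoder accepts the value as-is."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


# Functions and date objects are not JSON types, so the encoder refuses them.
function_ok = can_serialize({"callback": len})
date_ok = can_serialize({"at": datetime(2020, 1, 1, tzinfo=timezone.utc)})

# The usual workaround is to store the date as a string, for example ISO 8601.
stamped = json.dumps({"at": datetime(2020, 1, 1, tzinfo=timezone.utc).isoformat()})
```

Storing the timestamp as a string keeps the document valid JSON, but it also means the reader has to know that the field holds a date and parse it again.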

JSON objects can also be nested inside each other to any depth, allowing users to express data in as many categories and subcategories as they want. And since it’s all stored as text, it can be stored in basically any way a user wants: in a database, a text file, client storage, or even as its own file using the .json file extension.
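As a small illustration of nesting, the sketch below parses a made-up document (the user, devices, and settings fields are hypothetical) and counts how many levels of objects and arrays it contains. The depth helper is written for this example only.

```python
import json

# A hypothetical document: an object holding an object holding an array
# of objects, each of which holds another object.
raw = """
{"user": {"id": 42,
          "devices": [{"type": "thermostat", "settings": {"target_c": 21.5}},
                      {"type": "doorbell", "settings": {}}]}}
"""


def depth(value):
    """Count nesting levels; scalars count as 0, each container adds 1."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0


doc = json.loads(raw)
levels = depth(doc)
target = doc["user"]["devices"][0]["settings"]["target_c"]
```

Reaching a single value such as target already takes four lookups here, which hints at why deeply nested data is awkward to turn into flat tables.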

The practical upshot of all this is that the JSON format can be used to log almost any kind of data, from any source and of any complexity. And given the state of the internet these days, that’s a very big deal.

The Complex Data Problem

In the last ten years, there has been an explosion of SaaS businesses, online gaming, mobile apps, and IoT devices — billions of data points are being generated by millions of users on a daily basis, and the companies that make these devices want to know what their users are doing with them. The more complicated and numerous the devices get, the more complicated the data they produce and the more difficult it is for marketers, CIOs, or data engineers to get any useful insights out of it.

User event data can range from how users play games, use apps, and interact with each other, to the devices they use to do so, the times and locations they use them, and how long they use them for. IoT sensors generate even more data — every time a video doorbell is pressed, a smart thermostat is adjusted, a voice assistant is triggered, or any number of other connected devices is used, a data point is generated.

The Old Solution

The flexibility of JSON data means that all these apps and devices can work in similar formats, producing data in the same syntax. The downside is that since the data being tracked is coming from such disparate sources, it’s nearly impossible to use one single solution to turn it into analytics-ready tables.

There are tools available, including ETL (Extract, Transform, and Load) tools, that allow data engineers to transform data for analytics and put it in a data warehouse. The problem is that none of these tools can handle the complexity of modern JSON data.

To make the data work for the tools, companies have been writing code manually. Customized data parsers can turn complex data into more usable forms, but writing and maintaining them is slow and expensive. It’s also a process that only data engineers can access.
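For a sense of what such a hand-written parser looks like, here is a minimal sketch in Python. It is not any particular company's code: the event shapes, field names, and the flatten and events_to_rows helpers are all assumptions made for illustration. It reads one JSON event per line and collapses nested objects into flat rows with dotted column names.

```python
import json


def flatten(value, prefix=""):
    """Collapse nested objects into one flat dict with dotted column names."""
    row = {}
    for key, item in value.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(item, dict):
            row.update(flatten(item, name))
        else:
            # Arrays and scalars are kept as-is in this sketch.
            row[name] = item
    return row


def events_to_rows(text):
    """Parse one JSON event per non-empty line and flatten each into a row."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(flatten(json.loads(line)))
    return rows


# Two hypothetical events from different devices, with different fields.
sample = """
{"event": "press", "device": {"type": "doorbell", "id": "d1"}, "ts": "2020-01-01T08:00:00Z"}
{"event": "adjust", "device": {"type": "thermostat", "id": "t7"}, "reading": {"target_c": 21.5}}
"""

rows = events_to_rows(sample)
# The table needs the union of all columns seen across the events.
columns = sorted({name for row in rows for name in row})
```

Even in this toy case the two events do not share the same columns, and arrays are not handled at all. Each new device or field means another round of changes to code like this.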

Data For Everyone

The downside of using custom data parsers, individually tailored to each organization’s needs, is that no one else can use them. That obviously includes everyone else in the industry, which is a problem in itself: companies can’t just buy a tool built for a similar company unless the data is formatted in exactly the same way.

But it also means that the people who actually want to use the data to draw actionable conclusions have to go through data engineers. It’s an incredibly tedious process — queries have to be submitted and tools have to be built to extract the relevant data points from the huge amount of data that’s been produced. This can take weeks or even months.

Worse still, that data isn’t easy to update. If the analysts or marketers who asked the questions in the first place have follow-up questions or need to run the same queries a year later to compare numbers, they won’t be able to without the same laborious process by the engineers.

The New Solution

So is there an answer? Not a simple one. The underlying problem is that the mathematical foundation these tools are built on, relational algebra, can never fully encompass the complexity of modern JSON data.

That’s why SlamData created a new system built on a novel underlying algebra called multi-dimensional relational algebra (MRA). Our tool can natively understand any JSON thrown at it, from any source and of any complexity, displaying it in an interface that anyone who’s used the filesystem on their computer can use.

With SlamData, it’s easy for any user to access complex JSON data stored in S3, Azure, Wasabi or just about anywhere and turn that data into tables ready for analytics. Better still, they can edit and iterate on these tables quickly and easily as they discover what they need. Hopefully, we can solve the JSON problem and open up a whole new world of detailed data analytics for years to come.