Existing ETL tools rely on relational algebra, which was formulated almost 50 years ago and requires data to be flat and uniform before it can be analyzed. Modern data isn’t like that, so we had to come up with something new.
A mathematical approach to semi-structured data that accounts for complexity in range and variability.
SlamData’s solution includes native support for variable schema and data types.
Built to integrate with modern SaaS, IoT apps, and APIs.
Relational algebra, which was originally created in 1972, requires data to be completely flat and uniform, with a strong, fixed schema throughout, before any analytics can be performed. This requirement is fundamentally in conflict with modern data formats such as JSON, XML, log files, and even CSV, which are highly heterogeneous, have varying degrees of dynamic schema, and often contain multiple levels of nested data.
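To make the conflict concrete, here is a minimal Python sketch. The records and the helper names (`paths`, `schema_of`) are invented for illustration and are not part of SlamData. It lists the leaf paths and value types in three JSON-like records and shows that no single flat, fixed schema describes all of them.

```python
# Hypothetical sample records: same feed, three different shapes.
records = [
    {"id": 1, "name": "sensor-a", "reading": 21.5},
    {"id": "2", "tags": ["lab", "north"], "reading": {"value": 19, "unit": "C"}},
    {"id": 3, "events": [{"type": "boot"}, {"type": "ping", "latency_ms": 12}]},
]

def paths(value, prefix=""):
    """Yield (path, type name) for every leaf value in a nested structure."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for child in value:
            yield from paths(child, f"{prefix}[*]")
    else:
        yield prefix, type(value).__name__

def schema_of(record):
    """The set of (path, type) pairs observed in one record."""
    return frozenset(paths(record))

# Each record yields a different schema, so a single flat table cannot hold them as-is.
schemas = {schema_of(r) for r in records}
print(len(schemas))  # 3
```

Note that even the shared field `id` is an integer in one record and a string in another, and `reading` is a number in one record and a nested object in the next.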
Multidimensional relational algebra (MRA) is a novel meta-model almost a decade in development. It is derived from ZF set theory and can express traditional relational algebra as a special case (i.e., when the data happens to be homogeneous and mono-dimensional). In fact, not only can MRA express all of classic relational algebra, it also formalizes critical SQL operations (such as reductions) that Codd’s original work leaves unspecified. When data is heavily nested, or heterogeneous (with every row potentially having a completely different schema from the previous one), or both at the same time, MRA is unprecedented in its ability to elegantly define a stable, reliable, and consistent evaluation model.
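The "special case" idea can be pictured with a small Python sketch. This is not MRA's formal definition. It assumes one possible convention, namely that a row contributes nothing when the requested field is undefined for it, and the function names `select` and `project` are chosen here for illustration. On uniform rows the operators behave like ordinary relational selection and projection, and the same code keeps working when the rows stop being uniform.

```python
def select(rows, field, predicate):
    """Keep rows where `field` exists and satisfies `predicate` (assumed convention)."""
    for row in rows:
        if field in row and predicate(row[field]):
            yield row

def project(rows, fields):
    """Keep only the requested fields; rows that have none of them are dropped."""
    for row in rows:
        out = {f: row[f] for f in fields if f in row}
        if out:
            yield out

def over_20(price):
    # Guard against non-numeric prices so heterogeneous values do not raise.
    return isinstance(price, (int, float)) and price > 20

# Homogeneous, flat rows: the classic relational case.
uniform = [{"id": 1, "price": 10}, {"id": 2, "price": 30}]
flat_result = list(project(select(uniform, "price", over_20), ["id"]))

# Heterogeneous rows: missing fields and mixed value types.
mixed = uniform + [{"id": 3}, {"sku": "x", "price": "n/a"}, {"id": "4", "price": 25.0}]
mixed_result = list(project(select(mixed, "price", over_20), ["id"]))

print(flat_result)   # [{'id': 2}]
print(mixed_result)  # [{'id': 2}, {'id': '4'}]
```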
Modern SaaS and IoT applications, as well as most web APIs, deliver JSON data payloads with a high degree of heterogeneity. This type of data can be very hard to work with for end users and anyone who is not a data engineer, and it will not work with any popular BI or data science tool without significant transformation and preparation. MRA solves this issue.
MRA defines an elegant unification of structure and identity, which allows SlamData to blur the lines between rows, values, structure, and depth. From the standpoint of the mathematics, there is no difference (up to isomorphism) between a pair of values that occur in two separate rows and a pair of values that occur in two separate structures within a single row. A powerful and unique dependent type system allows us to efficiently compile expressions defined in terms of MRA down to incredibly fast and scalable data transformation pipelines.
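One way to picture the rows-versus-structures claim is the following Python sketch. It is an illustration under stated assumptions, not SlamData's implementation: the sample data, the idea of tagging each value with an identity tuple, and all function names are invented here. The same two values are stored once as two rows and once as a nested array inside a single row. After tagging each value with an identity, both shapes become a collection of (identity, value) pairs, and a reduction over the values gives the same answer.

```python
# Shape A: two values that live in two separate rows.
as_rows = [{"reading": 3}, {"reading": 4}]

# Shape B: the same two values inside one row's nested array.
as_nested = [{"readings": [3, 4]}]

def identified_from_rows(rows):
    """Tag each value with its row index as identity."""
    return [((i,), row["reading"]) for i, row in enumerate(rows)]

def identified_from_nested(rows):
    """Tag each value with (row index, position inside the array) as identity."""
    return [
        ((i, j), value)
        for i, row in enumerate(rows)
        for j, value in enumerate(row["readings"])
    ]

def total(identified):
    """A reduction that looks only at values, never at how they were nested."""
    return sum(value for _identity, value in identified)

a = identified_from_rows(as_rows)
b = identified_from_nested(as_nested)

# Identities differ in shape, but the values and the reduction result agree.
print(total(a), total(b))  # 7 7
```

The identities have different lengths, `(0,)` and `(1,)` versus `(0, 0)` and `(0, 1)`, yet there is a one-to-one correspondence between them, which is the sense of "no difference up to isomorphism" that this sketch tries to convey.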
You can apply SlamData to data that has tens of millions of rows, to data where each row has tens of millions of values (current row limits allow a maximum of roughly 50 GB per row on a single commodity server or VM), or even to data that is both large and wide at the same time! All of these transformations are evaluated in a streaming fashion with substantially higher performance than hand-optimized Python or even C. And with SlamData, you won’t be writing any code whatsoever: MRA powers a sophisticated and intuitive point-and-click user experience for exploring and transforming your data.
MRA lets users work with complex data as it is rather than with some watered-down version. It is a novel and much-needed update of the core algebra that has powered database analytics for well over 40 years.