The Definitive Guide To JOINs On MongoDB

The Obvious: NoSQL has been around a while. Same with MongoDB. If you’re not using it now, you’re probably going to bump into it soon.

The Reality: Developers pushed a lot of businesses down the path of using MongoDB because the benefits were innumerable. And there’s no going back.

The Ghastly! Truth: The “business” side of the house kinda got the short end of the stick. In plain English: they got screwed. In “Relational Land” they (analysts, BI folks) enjoyed plenty of robust tools for analytics. They could analyze loads of data in a single click, all without intervention from “IT”. In “Non-relational Land”? No such luck. Basically, “switching over to MongoDB” meant building BI from scratch.

The Denouement: This guide lays out the facts and the options for querying MongoDB. It’s bleak up front but gets way better as you think outside the box. Stick with it because it’s awesome.

The ABCs of JOINs

We’re not gonna bore you, but let’s lay out the basics. JOINs are at the center of data analysis. They’re the workhorse of analytics. And there are only a few types of them.

They are: Cross, Inner and Outer. Outer comes in a few forms: Left, Right and Full.
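To make those types concrete, here is a minimal sketch in plain Python. The `customers` and `orders` lists and the helper function names are made up for illustration; they are not from any database or library.

```python
from itertools import product

# Hypothetical sample data, invented for illustration only.
customers = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
]


def cross_join(left, right):
    """Every left row paired with every right row."""
    return list(product(left, right))


def inner_join(left, right, left_key, right_key):
    """Only the pairs whose keys match."""
    return [(l, r) for l, r in product(left, right) if l[left_key] == r[right_key]]


def left_outer_join(left, right, left_key, right_key):
    """Every left row; left rows with no match are paired with None."""
    result = []
    for l in left:
        matches = [r for r in right if r[right_key] == l[left_key]]
        if matches:
            result.extend((l, r) for r in matches)
        else:
            result.append((l, None))
    return result


def right_outer_join(left, right, left_key, right_key):
    """A left outer join with the two sides swapped."""
    swapped = left_outer_join(right, left, right_key, left_key)
    return [(l, r) for r, l in swapped]


def full_outer_join(left, right, left_key, right_key):
    """Left outer join plus the right rows that matched nothing."""
    result = left_outer_join(left, right, left_key, right_key)
    left_keys = {l[left_key] for l in left}
    result.extend((None, r) for r in right if r[right_key] not in left_keys)
    return result
```

With this sample data, the inner join keeps only Ada's order, the left outer join also keeps Grace (paired with `None`), and the full outer join additionally keeps the order whose customer is missing.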

When made available to a pro analyst, JOINs are like… it’s like combining a scalpel and a microscope into one tool. JOINs simply make it easy to get the right subset of data in front of your eyes. 

Out of the Box: JOINs On MongoDB

You might be excited to know that you can do all JOIN types on MongoDB. YES!

But practically speaking only LEFT OUTER JOINs work. NO!

Here’s why.

Option A: Oh Yes! Wait! What the…

MongoDB built pretty good infrastructure for doing native queries. It’s called the Aggregation Framework. But the framework’s join stage, $lookup, only does a LEFT OUTER JOIN.
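For a feel of what that looks like, here is a rough sketch of a `$lookup` pipeline written as a Python list, the shape PyMongo's `Collection.aggregate()` accepts. The collection and field names (`customers`, `orders`, `customer_id`) are assumptions for illustration. The `emulate_lookup` helper is a hypothetical in-memory approximation of the basic equality-match form only, so the example runs without a database; it is not how MongoDB implements the stage.

```python
# Pipeline with a single $lookup stage. Collection and field names are hypothetical.
pipeline = [
    {
        "$lookup": {
            "from": "orders",               # collection to join against
            "localField": "_id",            # field on the input documents
            "foreignField": "customer_id",  # field on the "from" documents
            "as": "orders",                 # output array field
        }
    }
]


def emulate_lookup(local_docs, foreign_docs, spec):
    """Approximate a basic $lookup in memory (equality match on scalar fields)."""
    joined = []
    for doc in local_docs:
        key = doc.get(spec["localField"])
        matches = [f for f in foreign_docs if f.get(spec["foreignField"]) == key]
        # Every input document is kept, even when matches is empty.
        # That is what makes this a LEFT OUTER JOIN.
        joined.append({**doc, spec["as"]: matches})
    return joined


customers = [{"_id": 1, "name": "Ada"}, {"_id": 2, "name": "Grace"}]
orders = [{"_id": 10, "customer_id": 1}, {"_id": 11, "customer_id": 1}]

joined = emulate_lookup(customers, orders, pipeline[0]["$lookup"])
```

Against a live database the same `pipeline` would be passed to something like `db.customers.aggregate(pipeline)`. Note that the customer with no orders still shows up, with an empty `orders` array.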

Is that a half-finished job? That’s the nice way of putting it. But it’s more a limitation of the underlying technology than unfinished work. The need is certainly there.

Regardless, you likely need more than LEFT OUTER JOIN. If by some odd fluke that’s all you need, then you’re good to go. But don’t kick up your heels too fast: you’ll still need to learn, know, and wield Mongo Query Language. You’ll need to dedicate a resource to learning it and keeping up with it. That’s a lot of overhead. All for LEFT OUTER JOIN…

Option B: MapReduce Hell

No reason to sugarcoat anything here, right? We’re all adults. If you need or want the full gamut of queries (and you do), then you’re going to have to go down this path. Bring water, extra food, and a flashlight. Or, more literally, take out the checkbook. It’s gonna hurt. Here’s the brutal truth:

1. Brush up on your MapReduce. It’s fundamental to working with queries on MongoDB. No one really likes MapReduce for a lot of reasons. But it’s all you got so crack the books. 

2. Find someone who is an expert at writing JavaScript.

3. Dedicate someone to learning Mongo Query Language.

4. Start keeping tabs on your schema changes. Why? Well, if you do pull off some good query writing with the three technologies listed above, you’re going to get just one minute of fame and fortune, because once your schema changes everything will fall apart. Your queries will break. Your reports will break. That equals more development time, more upkeep, more money. That’s a hamster wheel.
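To see why this path hurts, here is a toy sketch of a MapReduce-style join, written in Python so it runs anywhere. On MongoDB itself the map and reduce functions are written in JavaScript; the function names, field names and data below are all hypothetical. It shows the general pattern only: tag each document with its source, group by the join key, then pair things up in the reduce step.

```python
from collections import defaultdict

# Hypothetical sample data, invented for illustration only.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
orders = [{"order_id": 10, "customer_id": 1}, {"order_id": 11, "customer_id": 3}]


def map_customer(doc):
    # Emit (join key, tagged value). The field name "id" is hard-coded.
    yield doc["id"], ("customer", doc)


def map_order(doc):
    # The field name "customer_id" is hard-coded too.
    yield doc["customer_id"], ("order", doc)


def reduce_join(key, tagged_values):
    # Pair every customer with every order that shares the key (an INNER JOIN).
    left = [v for tag, v in tagged_values if tag == "customer"]
    right = [v for tag, v in tagged_values if tag == "order"]
    return [(c, o) for c in left for o in right]


def run_join(customer_docs, order_docs):
    groups = defaultdict(list)
    for doc in customer_docs:
        for key, value in map_customer(doc):
            groups[key].append(value)
    for doc in order_docs:
        for key, value in map_order(doc):
            groups[key].append(value)
    joined = []
    for key in sorted(groups):
        joined.extend(reduce_join(key, groups[key]))
    return joined


result = run_join(customers, orders)
```

That is a lot of ceremony for one inner join. And because the field names are baked into the map functions, renaming `customer_id` to `customerId` in the data makes `map_order` raise a `KeyError`. That is the schema-change breakage from point 4 in miniature.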

The sum total here is that you invested in great technology, and if you stick with the status quo for analytics you’re going to pay twice. But it doesn’t have to be that way!

JOINs On MongoDB When You’re Running SlamData

SlamData’s mission is to tame the polyglot world (databases gone wild!). How? By offering the world a lingua franca. SlamData is a platform that allows you to simply get to work on data wherever it is without doing anything special. That sounds casual but it was actually the output of a large development team working for a few years.

Here’s how it works with MongoDB:

Have Your Cake and Eat It Too!

  1. Fire up SlamData
  2. Connect it to MongoDB
  3. Write a query using SQL
  4. Run the query*
  5. Enjoy (share) the results

*The special sauce is this: SlamData translates the query into a highly optimized native query and then sends it down to MongoDB via the Aggregation Framework or MapReduce. It’s smart enough to figure out the best route on the fly. It’s 100% automated.

Did You Notice Anything Different?

Oh yeah you did.

  • Did you touch Mongo Query Language?
  • Did you write JavaScript?
  • Did you think about MapReduce?
  • Did you worry yourself at all about schema this or schema that?

No, you didn’t!

When you use SlamData as the analytics layer on MongoDB what you get is radical simplicity and bulletproof queries that work all the time.

The bottom line is this: get out of the plumbing business. Or the janitor business. Or both. Stop maintaining one-off, homegrown apps, stop prepping data, stop cleaning up data, and most importantly stop waiting for insight. 

 

Bottlenecks or Robust Analytics On Live Data?

The significance of this… blockage… (that’s really the right word) is worth a visual metaphor. Let’s have some fun… It’s like the difference between a tricycle and a jet.

 

The tricycle actually can get you wherever you want to go, right? Say, from Denver to Boulder, CO. Go ahead and use it! It’s cheap. It’s practical. You’ll get there in a few days or weeks. Maybe longer. You’ll be a wreck by the time you arrive but you’ll arrive.

But, wait, there’s a jet fighter on the tarmac right next to the trike. It’s fueled up and ready to go. You’ll get to Boulder in 2 minutes!

Butter knife vs. Swiss Army Knife? Dust pan vs. Hoover? It’s hard to fully capture it visually so let me break it down one last time: based on a few premises, the final arithmetic is simple.

The Premises

1. You (your business) have finite resources.
2. Analysis of data is critical.
3. You’re using MongoDB.
4. Prep work is a cost, and a big one (see “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights”).

The Final Calculus

  • If you could give up the overhead of learning a bunch of specialty technologies…
  • If you could instead continue to rely on SQL, the ‘tried and true’ workhorse of analytics…
  • If you could never worry about schema changes again…
  • If you could get developers back to working on development projects instead of pitching in on ad hoc biz-related queries or maintaining a custom application…
  • And if you could get insight today instead of next week…

Would you do it?

Actually, the best question is: “Why wouldn’t you?”

It’s Actually Going To Get Even Better

When SlamData releases its next version later this year, you’ll see radical performance improvements. SlamData engineers have been hard at work building the infrastructure to make MongoDB analytics (NoSQL analytics!) as fast and as easy as what the world was used to when it was just RDBMS.

Not only speed, though!

In addition, you’re going to get the ability to do JOINs across collections in a number of different NoSQL DBs. If you’re reading between the lines then you know we’ve figured out how to transcend the limitations of the native query APIs for the different NoSQL data sources. In other words, awesome JOIN power regardless of whether the native APIs support them.

That’s a game-changer.

This is an exciting time for NoSQL analytics. Actually, for analytics in general.

In fact, SlamData is starting to dismantle the line between SQL and NoSQL… because having one tool for all of your data — wherever it is — will change the way you work. It will change the way you think about data. You’ll no longer care about what kind of db you’re using — you’ll always go for the best db that fits the data — because you know you’ll get your analytics just the way you like ’em.

What Our Customers Are Saying

We use SlamData to build custom reports and have found the tool is exceptionally easy to use and very powerful. We recently needed to engage the support team and we were very pleased with the turn-around time and the quality of support that we received.

Troy Thompson
Director of Software Engineering
Intermap Technologies, Inc.

When our company migrated from SQL database to MongoDB, all our query tools became obsolete. SlamData saved the day! I was able to easily write SQL2 queries. Plus the sharing, charting, and interactive reports were a game changer.

Michael Melmed
VP, Ops and Strategy
US Mobile

SlamData helped shine the light on how our new product was being used. The support staff was awesome and we saved engineering cycles in building all the analytics in-house. I am using it to change the mindset in the teams and shift the focus from product launches to product landings.

Engineering Lead
Cisco Systems

© 2017 SlamData, Inc.

