The following is a transcript of the video above:
Qin from CouchBase:
Good afternoon, and welcome. I’m Qin from Couchbase. I run the product management team. And I’d like to welcome our partners from SlamData, Damon and Chris. One of the fundamental building blocks of Couchbase Server is support for the JSON document data model. All services support JSON, from the data service to SQL, which is N1QL, and we’re introducing full text search, again running natively on the JSON data type. And we’re previewing an analytics service, the talk that’s coming next, again working natively on JSON data. So therefore, it’s natural and strategic for us to partner with SlamData.
They are the leading BI and reporting tool that works natively on top of JSON data. So, there’s no need for data extraction, transformation, and loading. It can work directly against data that’s stored in your Couchbase database. With that, I’ll turn it over to Damon and Chris for their presentation and demo.
Thank you. Thank you, Qin.
My name is Chris, and I’m part of the SlamData team from outside of Philadelphia, and we are nominally based in Colorado. We have an office there. Most of our developers work remotely. We have a couple of folks in Boulder, but I think they go to the office. I think it’s the CEO who goes in. So, let me do some impromptu market research real quick. Be honest. Have you heard of SlamData prior to today? Raise your hand if you have. Money well spent then.
Okay, second question. Are you doing BI on Couchbase now? Are you involved with the project? Raise your hand if the answer is yes. That’s also I guess good news because you’re coming here to learn maybe how to do it. So, for those folks who did raise their hand, there was like three of you, off the shelf, or totally bespoke? Maybe just answer me.
Oop, did you … Whoever here answered yes that they’re involved in a project. You disappeared on me? Anybody wanna say whether it’s a … From the back. Okay, thank you. Alright, that was really helpful actually. So, the title of our presentation is “Available Now” (fancy, I did that graphic this morning), “Full Blown Native Analytics Everywhere You Use Couchbase.” I’m going to talk for 10 minutes. Damon is gonna do a demo for 15 minutes, a real demo, and then we’ll do some Q&A. So, SlamData 101.
We are direct analytics of modern data. Most of that means JSON, but we support other types as well. When I say direct, I mean it requires no understanding of what that data is. We require no mapping. You can query right against JSON. And it’s a novel innovation. It took us many years to do a lot of math. Our CTO John De Goes is also a mathematician, or I’d say he’s a mathematician first, technology second. He spent about three to five years extending relational algebra to make it possible to simply connect to Couchbase and see your data structures, and to start querying without any understanding of what that schema is, without moving, extracting data, and no mapping. So, that, simply put, is fundamentally our product. It is not a service. It’s software that you can download and run wherever you want. It runs on Mac, Windows, Linux. You can use it as a desktop app. Most of our clients put it on the server.
So, currently these are the data sources that we support. At the end there it’s really Hadoop via Spark. But, we are positioned where our mission is to be the lingua franca. A polyglot world emerged seven or eight years ago, and that genie is not going back in the bottle. So, for CEOs and CTOs looking to have a simple way of doing analytics across their business, a new kind of model, new thinking is required. And through SlamData you can access any of these data sources. Later in the year, we’re adding an RDBMS connector. You could even hit an API.
Ultimately, it takes us about 48 weeks to add a connector, because we chose the route of going and creating a compiler, not an additional computation engine. So we compile queries: if you create one in our visual builder or write in SQL, we compile and shoot it to the database via the native, or the supported API. It’s the database that does the computation.
The results come back. So, from a risk mitigation perspective or from a scalability perspective over the next 10, 20 years, we believe that having a compiler as the strategic analytics tool or platform is advantageous. So, just picking out funny pictures with my kids. This one stuck. Think about that for a second. We go through a lot of demos and a lot of explanations of this, and we had the head of … NoSQL analytics at a big consultancy in North America, and 45 minutes in he said, “So where do you move the data? When do you move the data?” I was like, “No, no, no, you don’t move the data.” So for the last 40 years you brought the data to the analytics. So, we’re saying that’s not required. We can send the analytics to the data.
So, given the amount of data that’s been created just in the last couple of years, maybe just sit and kind of recognize that as a real flip with the way things have been done, and we think there’s a real advantage to doing that. So even The New York Times reports that 80% of big data analytics projects is pretty much the mapping, the ETL, the cleanup, the data prep, and we’re trying to do away with that completely. And we’ve done a pretty good job and you’ll see that. So, here’s basically how it happens, simple. Number one … I had in there earlier today query interface, but really it’s an ask a question interface. You can do a little bit of drag and drop. You can do straight SQL. That goes into our compiler. It translates that into N1QL or SQL++ or another data source language, shoot that to the database, computation happens there, we return the results, and you get your visualization. So, take that into practice.
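The compile-and-push-down flow described here can be sketched in a few lines. This is only an illustration under stated assumptions, not SlamData’s compiler: `compile_query` and `run_on_database` are hypothetical names, the table and field names are invented, and SQLite from Python’s standard library stands in for Couchbase so the example runs anywhere. What it shows is that the tool only produces query text, while the database performs the aggregation and returns a handful of result rows.

```python
import sqlite3

ALLOWED_AGGREGATES = {"AVG", "SUM", "MIN", "MAX", "COUNT"}


def compile_query(table, category, measure, aggregate="AVG"):
    """Toy 'compiler': turn a chart request into query text for the backend."""
    if aggregate not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate}")
    return (
        f'SELECT "{category}", {aggregate}("{measure}") '
        f'FROM "{table}" GROUP BY "{category}" ORDER BY "{category}"'
    )


def run_on_database(conn, query_text):
    """Ship the query text to the database and fetch only the results."""
    return conn.execute(query_text).fetchall()


# Stand-in database with a few invented rows (SQLite instead of Couchbase).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (state TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [("CO", 30), ("CO", 50), ("PA", 40)],
)

query_text = compile_query("patients", "state", "age")
rows = run_on_database(conn, query_text)
print(query_text)
print(rows)
```

The raw rows never leave the database in this sketch; only the grouped averages do, which is the sense in which the analytics are sent to the data.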
Start over here, explore. So, within five minutes or less you’re connected. You’re seeing the structures of your data without telling us anything. You can start exploring through the drag and drop interface. If you know SQL, you can start writing SQL. You build your visualizations, and you can easily send out a link from SlamData. You can click on embed. If you’ve ever embedded a YouTube video it’s that simple. Take that HTML, stick it in your own app. Scale that to your internal users, your seven million customers.
It doesn’t matter to us. And so it embeds simply, and that’s what we call full blown analytics, full blown BI. This is kind of a simple use case. Someone else’s app, powered by our charts, our graphs, our interactivity. We use markdown to create that interactivity, but it’s really a non technical way to build your own customer report. And this actually, I photoshopped a somewhat legitimate data set or use case. So, this was a healthcare app tracking people, a person. So, what are the benefits of going with a solution like ours? Well, I should’ve put up here, it works better than Tableau, Qlik, and Power BI, because they don’t support JSON. So, out of the box you get BI on your most complicated nested data.
There’s no latency. Well, truth be told, if you have a monster query and it takes a couple of seconds to run, then you’ve got to wait a couple seconds, or maybe even a couple of minutes. But it’s fresh data. This one, for those of you who stopped by today, these graphs, these charts, it’s on live Couchbase data. So, you can embed it anywhere. You can scale it. It’s white label, open source, and the bold were open source. What are those bold, Damon? Advanced edition, thank you. So, that’s how we make money. We sell advanced features like multi tenant security, authentication, authorization, and auditing. So, everyone knows the Couchbase context. It’s not just operational. You have NoSQL powering apps of all sorts and all sorts of verticals. And it’s not just developers anymore. It’s end users, customers, analysts, architects, executive folks. And our contention is that these folks will benefit from fresh insight from Couchbase. So, don’t settle. The old way of dumping, cleaning, mapping, that’s onerous, expensive, time consuming.
Doing a custom map, I’m sure that’s, I expect a lot more of you to say that you are involved and that you are doing something custom. Use us and the whole thing is solved for you. And don’t settle 2X or Version 2. So the world went JSON, but guess who didn’t come along? This is the Magic Quadrant from 2016 and 2017. The top half is like a wasteland, right? And the one on the right is 2017. So, Tableau and Microsoft, according to Gartner, are the leaders in BI, but they don’t touch JSON. Is that weird? I see one person nodding yes, so that is–
Male Audience Member:
That is weird.
That is weird. That’s because … We’re not there yet. That was 471, we can kind of go down. Say that again, I’m sorry. Yes, so maybe our timeline here. That’s more of our future timeline. So, it took us two and a half years to build the technology. We went through our Series A. And we just landed our Series A, and we’re just coming onto the market really. People have been downloading our, we started with Mongo probably a year and a half ago. So, give us a year. So, what’s coming for the remainder of the year?
I mentioned the RDBMS connector, streaming, and the real party animal here at the end is the distributed join. So, imagine doing a join across Oracle and Couchbase. That’s gonna be something that you can do. And here are some of our customers. And … So, partnering with Couchbase, our thesis is when we work with Couchbase we are together greater than the sum of our parts. So, I had 10 there, but I felt like that was a little ostentatious, I didn’t want to show you guys up, but it’s definitely more than two. And we’ve had a lot of great conversations and we hope to follow up with every one of you and to continue to work with Couchbase. We’re really excited. And it’s showtime for Damon.
Thanks for your time.
Okay, so we can see here, you are probably familiar with this interface, the Couchbase Query Console. I just wanted to show you an example of one of the documents here inside of the different data sets that we’re going to play with. This one is from the Patients data set. Let’s see here. We’ve got lots of arrays, we’ve got sub documents, we’ve got arrays with sub documents, different data types, etc. So, just keep that in mind as we move forward here.
One of the key things, like Chris said, was that we don’t do ETL and we understand this nested data natively. So we don’t need to do any kind of ETL or time consuming process. So, this is our interface. This is actually one interface behind, so our newer one is a little cleaner than this, but I figured doing a demo probably rely on something tried and true rather than the latest greatest. So, you can see here we’ve got a number of different buckets. I’ll even show you one with the Beer sample in a little bit. Has everyone installed the different sample buckets? Has anyone done that with Couchbase before? You go in the Settings tab and you install the sample buckets? Yeah, so the Beer sample, Game Sim, those are all included with Couchbase. So, let me go in, and now that I’ve shown you the JSON aspect of it, if I go into the Patients bucket … Now keep in mind, this is all live. I didn’t do any mapping. I just told it … In fact, I’ll show you. I just told it basically the credentials, and the host, and the port, and it figures everything else out for you.
So, I click down into those areas. If I click on the Patients data set, it’s gonna ask me to create a workspace, and a workspace is where you do all of your analytics. It’s 95% of what you’re going to do with SlamData the product. Click Explore, and the first thing it does is come up with a preview table. I’m gonna zoom out a little bit here. And what you see is that immediately you can tell that the schema is along the top. And we see that we have age and its numeric value. City is a string.
So, if I wanted to do something like give the average age of men and women across the states, I can do that without knowing anything about the schema and without running any SQL. So, I’ll do that now. If I click on Setup Chart, these blue buttons are what we call cards. Those are things that you can take action on now based off of the previous step you just took. So, if I click on Setup Chart, we have 15 or 16 different chart types. Each one has its own configuration screen and applies to different data models. If I click on Bar Chart, what it’s doing right now is getting a quick sample of the data and analyzing the schema. Then it comes back, and now what I can do is choose the category. Click on Choose Category, and you’ll notice it gives me pretty much the entire schema here. I can go down into codes. I can go down to the location. These are all the arrays; if I wanted to go down into them I could get more data out. But, if I want to categorize on state, I click on State, click Confirm. And notice when I click on Measure it’s gonna have to be a numeric value. So, SlamData automatically limits the choices to numeric values. Again, I just point this out because we didn’t have to specify those fields as numeric values. Oh yeah. Yeah, thanks for bringing that up. Alright, so I’m gonna measure on age.
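The sample-then-analyze step described here, where only numeric fields are offered as measures, can be illustrated roughly as follows. This is a sketch under assumptions, not how SlamData actually inspects data: `infer_schema` and `numeric_fields` are hypothetical helpers, and the sample documents and their field names are invented to resemble the nested patient records in the demo.

```python
from numbers import Number


def infer_schema(docs):
    """Map each field path to the set of value kinds seen in the sample."""
    schema = {}

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            # Every element of an array shares one path, marked with [*].
            for child in value:
                walk(child, f"{path}[*]")
        else:
            is_number = isinstance(value, Number) and not isinstance(value, bool)
            kind = "number" if is_number else type(value).__name__
            schema.setdefault(path, set()).add(kind)

    for doc in docs:
        walk(doc, "")
    return schema


def numeric_fields(schema):
    """Paths whose sampled values were always numeric: candidate measures."""
    return sorted(path for path, kinds in schema.items() if kinds == {"number"})


# Invented sample documents with sub documents and arrays.
sample = [
    {"age": 34, "gender": "female", "state": "CO", "city": "Boulder",
     "codes": [{"code": "X1", "desc": "flu"}], "loc": [40.0, -105.3]},
    {"age": 51, "gender": "male", "state": "PA", "city": "Philadelphia",
     "codes": [], "loc": [39.9, -75.2]},
]

schema = infer_schema(sample)
print(numeric_fields(schema))
```

Because the kinds come from the sampled values themselves, nobody has to declare `age` as numeric up front, which is the behavior pointed out in the demo.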
And I don’t want the sum of age. I want the average age. And then, finally, we can do a series match, which is we can stack them on top of each other or do them side by side. Since I wanted to compare men and women, I’m gonna choose parallel and select gender. Gonna click Confirm. This all looks good, so I slide this again. And now my only option really is this Show Chart button. I click that, and now you can kind of see a relatively simple visualization of average age of men and women in different states. As you hold your mouse over and hover, it can give you the values. You can turn on or off different series of data. So, from going to just typing in your credentials and connecting to SlamData you can create something like this in about two or three minutes without knowing anything about the schema or without learning SQL. You can take this now that you’ve created it and you can click this icon and then click Publish. And what that does is it gives you a read only URL, which you can share with people in your company or something. Click Preview. And what you’re going to see is that exact same screen but with one difference, which is that this is a read only version of it, meaning I can’t do anything.
I can’t go backwards or forwards in this workspace, whereas if I go back to what I call the author’s deck I can go back and change the configuration, etc. So, this gives you an easy way to show a report. And this is all live. So, if you sent this to one of your coworkers and they went to it tomorrow it would be updated with that data. And, it’s just as easy to embed that into your own application: instead of clicking Publish you can click Embed Deck, and Embed Deck gives you the code to literally paste into your own application. And as Chris said, it’s very similar to the YouTube URL, if you want to just pop that into your own application or webpage and it shows the video. Almost the same approach to that. So, that’s as easy as it gets in creating simple visualizations. Now we’ll go and show you a couple different types that we have. So, time series. So, this is a very flat data set, but it’s good for demoing, as you can see. I’m gonna have a series of data. So, if I click on Series or click on Time Series, click Explore, just like we did last time, we see the data. Very simple, right?
We’ve got the time stamp, sensor, and value. There’s five sensors in here. So, I’m going to swipe over, set up a chart. This one works pretty well with the area chart. On the dimension side, that’s gonna be the bottom part, so I’m gonna click DT, which is date time. Measure is automatically selected, because that was the only numeric value. And then Series, I want to see all the sensors separately. And then these are just a couple of options, and each of these screens for the chart types are different. And I’m going to advance, and show chart. And so this is what time series looks like visualized. So, this could also be nested data. It doesn’t have to be flat, of course. And this is very simple. Again, no SQL involved, really not even any understanding of the schema itself. We’ll get into a little bit more complex workflows here in a moment. And everything you see here is interactive. And, we also have the ability to make it truly interactive. This is a workflow, or a worksheet, workspace that I’ve already created. It’s similar to what you just saw, zoom out a little bit, except that now we have some dropdowns that help define the beginning and the end of the demo or of the workspace. So, you can see that’s actually updated. You can hover over and see all the different values. These are still live. So, this was easy to create. You can use what we call our form builder, and it’ll create different fields. You can have dropdowns, date pickers, time, entry, numeric values, whatever you like. And those values then get fed into the query that produces that. Alright, so now let me show you one here. So, if you never looked at this before, we’ve got two different data sets. We’ve got Beer and Brewery, and these both come with the Couchbase default.
You can install them in the Settings tab. So, the reason this is an important workspace, and I’ll show it to you, is because this actually does a join across both of those data sets. As you can see here, this is gonna show the average, what is it, ABV, average alcohol by volume? Is that what the acronym is? ABV. So, the Beer data set has all the different beers, but you don’t know anything about the country unless you combine it with the Brewery. So, that’s why we’re doing it here. So, if I select something like German ale, I guess, it goes out there, queries it, comes back, and it just did a join and it shows that on average I guess Brazil has the highest ABV. Now, it’s not super striking, but this is a really easy thing to create, and I’ll show you the query behind it. And I’m going kind of into edit mode now where you can see how to modify and adjust these workspaces. You can drag and drop, and add new cards. You can set up something where you can download the tech or download the data behind the visualization, as well. I’m gonna go back a little bit, and now you can see the actual query that I’m running behind the scenes. So, we’re just selecting it from the Brewery data set, the Beer data set, and we’re joining it on what we call the meta ID tag, which is one level up, and then we just connect it to the Beer data set.
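To make the join concrete, here is a small in-memory sketch of what that query computes. It is not the query from the demo and not SlamData code: the documents are invented, `average_abv_by_country` is a hypothetical helper, and the field names (`brewery_id`, `abv`, `category`, `country`) are assumed to follow the layout of Couchbase’s beer-sample bucket, where a beer refers to its brewery by the brewery’s document ID.

```python
from collections import defaultdict

# Breweries keyed by document ID, playing the role of the meta ID in the join.
breweries = {
    "brewery_a": {"type": "brewery", "name": "Brewery A", "country": "Germany"},
    "brewery_b": {"type": "brewery", "name": "Brewery B", "country": "Brazil"},
}

beers = [
    {"type": "beer", "name": "Ale 1", "abv": 5.0, "category": "German Ale", "brewery_id": "brewery_a"},
    {"type": "beer", "name": "Ale 2", "abv": 6.0, "category": "German Ale", "brewery_id": "brewery_a"},
    {"type": "beer", "name": "Ale 3", "abv": 7.0, "category": "German Ale", "brewery_id": "brewery_b"},
    {"type": "beer", "name": "Lager 1", "abv": 4.5, "category": "German Lager", "brewery_id": "brewery_a"},
    {"type": "beer", "name": "Orphan", "abv": 9.0, "category": "German Ale", "brewery_id": "missing"},
]


def average_abv_by_country(beers, breweries, category):
    """Join each beer to its brewery by ID, then average ABV per country."""
    totals = defaultdict(lambda: [0.0, 0])  # country -> [sum of abv, count]
    for beer in beers:
        if beer.get("category") != category:
            continue
        brewery = breweries.get(beer.get("brewery_id"))
        if brewery is None:
            continue  # inner join: beers without a matching brewery drop out
        acc = totals[brewery["country"]]
        acc[0] += beer["abv"]
        acc[1] += 1
    return {country: total / count for country, (total, count) in totals.items()}


result = average_abv_by_country(beers, breweries, "German Ale")
print(result)
```

In the demo this work is done by Couchbase rather than in application code; the sketch only shows why the country is unavailable until the beer is matched with its brewery.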
This category is inherited from whatever they select here inside of this dropdown. So, you can obviously pass in variables from dropdowns, but you can also pass in variables if you’re embedding this what we call deck into a third party application. So, as you pull in this, or as you pull this in through an iframe you can pass in your own variables which will then be inserted into the query. So, it provides a lot of functionality and a lot of flexibility. And what this looks like embedded wise, we’ve got this little application called Health Track, and what it does, it’s a really simple little application, but the reason for its existence is that we have this Reports section here, and if I click on Patients Dashboard, it’s gonna pull up something I’ve already created. And this is pulling it live from SlamData from Couchbase. Zoom out a little bit. So, this is the kind of stuff that you can do without knowing much. Now some of these, like this one here actually has SQL running behind the scenes.
Some of these others are just simply drag and drop and click and I didn’t do any SQL. But, this one actually went down into the sub array of codes and looked for the sub string called flu. So, we used SQL, the sub string %flu%, and then converted that into N1QL and ran all the computation on Couchbase and just sent the results back, and then we’re visualizing it here. So again, all these are interactive. You can turn on and off the different data series. And it’s as simple as that. You saw earlier the different code that you can paste into your own application. We didn’t do any ETL for this, and this probably took about 30, maybe 40 minutes to create all of this.
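The sub array search mentioned here, a `%flu%` match inside the nested codes, amounts to the following when written out by hand. This is a plain Python illustration with invented documents; `mentions` is a hypothetical helper, and the field names `codes` and `desc` are assumptions, since the transcript does not show the actual document layout.

```python
def mentions(doc, needle, array_field="codes", text_field="desc"):
    """True if any entry in the nested array contains the substring."""
    entries = doc.get(array_field) or []
    return any(
        isinstance(entry, dict) and needle in str(entry.get(text_field, ""))
        for entry in entries
    )


# Invented patient documents; the codes are placeholders, not real diagnoses.
patients = [
    {"name": "A", "codes": [{"code": "X1", "desc": "influenza (flu) with fever"},
                            {"code": "X2", "desc": "hypertension"}]},
    {"name": "B", "codes": [{"code": "X2", "desc": "hypertension"}]},
    {"name": "C", "codes": []},
    {"name": "D"},  # no codes array at all
]

flu_patients = [p["name"] for p in patients if mentions(p, "flu")]
print(flu_patients)
```

Documents with an empty or missing array simply do not match, which is the kind of irregularity a nested data set tends to have.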
Now, think about what you’d have to do on the, not just the SQL side or the Couchbase side, but also the development side of the house to get something like this to easily embed it also with security authentication and authorization. So, that’s kind of the whirlwind tour of the technical side of SlamData. I’ve got a couple of prepared questions if nobody has questions, but if you have any questions I’d be happy to take them. Yes. Yes, yeah. So in fact, if I created a new workspace like this, one of the things you can do is a setup variables card. If I click that you can have all sorts of different object types that you can pass in.
So, if I selected date time then you could have a default, and you would select that default so that if nobody passed in a value for that variable it would use this default. But, if they did pass something in then you could reference this by myVariable or something, and you could put that inside of your query. So, depending … As an example, if you got third party app and you’re pulling in a generic workspace, you can change that workspace based off of who’s looking at it, and it’s very flexible that way. Good question, thanks. Yes. So, we don’t store any data. So, we talk to Couchbase directly with N1QL, so there wouldn’t have to be … That’s part of our next advanced edition is to actually create indexes based off of the queries that you’re running, but right now we don’t create any indexes. In fact, we’re a read only SQL, or read only language, so the only thing we would ever really do is to create indexes like that. We would never update the actual documents. But yeah, we rely right now on whoever set up Couchbase to have the right indexes in place.
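The variable behavior from the first answer above, where a default is used unless the caller passes a value, can be sketched like this. Everything here is an assumption made for illustration: `resolve_variables` and `render` are hypothetical helpers, the `:name` placeholder style is simply a convenient notation for the sketch, and a real system would normally bind parameters rather than splice literals into query text.

```python
import re


def resolve_variables(defaults, passed):
    """Start from the declared defaults and override with passed-in values."""
    unknown = set(passed) - set(defaults)
    if unknown:
        raise KeyError(f"undeclared variables: {sorted(unknown)}")
    return {**defaults, **passed}


def render(query, values):
    """Replace :name placeholders with quoted literals (illustration only)."""
    def literal(match):
        value = values[match.group(1)]
        if isinstance(value, (int, float)):
            return str(value)
        return "'" + str(value).replace("'", "''") + "'"

    return re.sub(r":(\w+)", literal, query)


query = "SELECT * FROM visits WHERE state = :state AND age >= :min_age"
defaults = {"state": "CO", "min_age": 18}

# Nothing passed in: the defaults apply.
with_defaults = render(query, resolve_variables(defaults, {}))
# A host application passes its own value for one variable.
with_override = render(query, resolve_variables(defaults, {"state": "PA"}))
print(with_defaults)
print(with_override)
```

This is how one generic workspace can show different results depending on who is looking at it: the embedding application supplies the values, and anything it leaves out falls back to the default.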
Yes. If I heard you correctly, we always hit live data. We don’t cache anything. We have the ability to cache, and in fact if I move forward one you can see here we’ve got a Cache button, and that materializes the view if that’s what you want. But, one of the things about using SlamData is that you’re using this for live data. So, if you wanted to cache something, there are some use cases for that, but especially if you wanted it to be interactive where a user kind of clicks things and changes the result set, then you probably wouldn’t want to cache that. But, we give you the option for sure. Physical data will always be stored in the target data source unless you cache it. Right, right, yeah. In fact, so if we have a preview table that shows like the first bit of data and you wanted to page through the rest of it it makes another call to fetch the next set, yeah. Right. I’ve not run into that. In fact, maybe … could answer a little bit more about that when we get in the analytics questions probably later about read locks and if you’ve ever experienced something like that. But, we’ve never experienced anything like that.
Do you want to talk a little bit about the set up of the parallel cluster?
Well yeah, so with Couchbase you know how you have clusters and have different services running on those? You could always point this directly to a system or a server that maybe isn’t as heavily processed and maybe isn’t considered your go to production server or something. I mean, they’re all kind of available, right? But, if you know that one isn’t being hit as much you could always use that for your analytic service so you’re not too worried about it, because again, like you said, it’s read only and we don’t write, much less percent chance of running into a lock. And I’m not really sure, unfortunately, about Couchbase lock mechanisms, so I can’t speak in any way authoritatively about that. Yes. Create variables based off of values and such? Yeah, in a way, yeah. So, you could change variables that you’ve already defined. You can do case statements inside of our SQL.
You can do if case, if when case when and create things. You can even change the SQL dynamically based off of variables as well. In fact, I’ve got some other examples. I don’t have them on my local data set, but on our cloud one basically we have this thing that shows that it’s pulling from multiple data sets, and if it’s a different data type it’ll actually change the SQL to do a different type of select as it comes across it. So, it’s very dynamic in that respect. Thanks for the question, it’s good. And that’s one I’d probably want to reiterate is that, again, it doesn’t matter what the data type is.
Any of the fields will adjust dynamically to it, especially when it comes to visualizing it or just displaying the values. And if you want to do some intelligence on it, we have both SQL, where you can do some manual type of checking of the values or data types, but in our next version, 4.2.2, which is coming out in about a month, we’ll have this … It’s similar to a transformation, the extract transform and load kind of stuff, but it’s a dynamic type of thing. So, as the data comes through this pipeline or this workflow you can do different things on it while it comes through. So, if it’s a string you can check off, it has this format, then turn it into a date. Or, if it is a date, maybe just give me the quarter, or the year, or the week of the month type of thing.
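The kind of check-then-convert step described here (if it is a string in a known format, turn it into a date; if it is a date, take the quarter) looks roughly like this in plain Python. This is not the Structure Editor and makes no claim about how it works; `to_date` and `quarter` are hypothetical helpers and the date format is an assumed example.

```python
from datetime import date, datetime


def to_date(value, fmt="%Y-%m-%d"):
    """Return a date if the value is one or parses as one, otherwise None."""
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    if isinstance(value, str):
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            return None  # a string, but not in the expected format
    return None


def quarter(d):
    """Calendar quarter (1 to 4) for a date."""
    return (d.month - 1) // 3 + 1


# Mixed values of the sort a schemaless field might hold.
values = ["2017-05-09", date(2017, 11, 2), "not a date", 42]
quarters = [quarter(d) if (d := to_date(v)) is not None else None for v in values]
print(quarters)
```

Values that cannot be converted come back as `None` instead of raising, so one pass can handle a field whose type varies from document to document.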
So, there will be all those types of different things. And that’s called the Structure Editor, which we actually have. It’s the Structure Viewer right now, and it helps quite a bit, but the Structure Editor will be available probably by next month. So, some of you probably noticed that when I was up here you’re probably wondering what these actually are. If you look at Beer and Brewery, those are logical kind of data types that are in Couchbase when you load up that data set. So, this is a virtual file system inside of SlamData. So, when you run a query and you’re doing select star from something, it would be something like /couchbase/beer-sample/beer, and what it does is actually make Beer look like a single file, but that’s actually a collection of all the different documents that have the type of beer. That’s one way we approach it. And let’s see. So, this runs, the front end and the back end typically run on the same box, the different server parts of it.
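One way to picture the virtual file system is as a mapping from a path to a bucket plus a document type. The sketch below is a guess at the idea, not the connector’s implementation: `path_to_n1ql` is a hypothetical function, and filtering on a `type` field is an assumption based on the description of Beer as the collection of documents that have the type of beer.

```python
def path_to_n1ql(path):
    """Translate /mount/bucket/type into a N1QL query over that document type."""
    parts = [part for part in path.split("/") if part]
    if len(parts) != 3:
        raise ValueError(f"expected /mount/bucket/type, got: {path!r}")
    _mount, bucket, doc_type = parts
    # Refuse characters that would break out of the quoting used below.
    if "`" in bucket or '"' in doc_type:
        raise ValueError("unsupported character in path")
    return f'SELECT * FROM `{bucket}` WHERE type = "{doc_type}"'


n1ql = path_to_n1ql("/couchbase/beer-sample/beer")
print(n1ql)
```

Under this reading, the file called `beer` never exists as a single object; it is a view over every document in the bucket whose type matches.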
We have our advanced edition Chris was kind of talking about earlier as authentication, authorization, and auditing. And what you can do with that is use OAuth 2 or an OpenID Connect provider, and you can use that for a single sign on. And once it authenticates we use an internal authorization mechanism. So, we don’t go out to like LDAP or Active Directory and pull down like your permissions. We have that as an internal model. So if you wanted to you could limit exactly who can look at what, whether it’s a full workspace or whether it’s a collection, or if it’s a directory, whether they can mount or unmount different databases with a very fine grain kind of security model built in there.
And so to provision new users we will have an admin interface. Right now it’s an API interface that you can use, but in Version 4, excuse me, 4.2.5, this summer, we’ll have a full admin UI. And the reason we started with an API first is because most people have this identity management or provisioning, an access management provisioning, where when you create a new user or a new hire comes on they call out and they make an account here, an account there, change passwords here and there. And so we had an API, because we didn’t want to reinvent the wheel just for analytics. But, it integrates fairly well to most things.
Yes, sir. So, how does it pull up as a beer and brewery basically? I wish I had our engineer here that wrote our connector. So, everything you see here is being done through our connector to Couchbase, and it has to do a little bit of manipulation to make it look a certain way up here. And all I really know is that these are kind of logical document types. So, after you create the index on there, I think it might actually look at the index, the type index. So like I created two different indexes on here. I think it probably samples the data and then comes up with it. That is my guess.
Yeah, I’m sure we’re gonna have a meeting in the next couple weeks, you and I and a couple other folks, and we’ll be able to get a firm response on that. But, since I’m not in engineering unfortunately I can’t tell you. Yeah, sorry. Okay, if that’s it. So, we’ve got … Do you have anything else on your slide deck that you wanted to go through?
Maybe talk about the Department of Homeland Security, the kind of use case that’s maybe non traditional that you can do when you bring together NoSQL and analytics and the data hub.
Yeah, so one of the options is you don’t have to just … Let’s say you have multiple different database types out there, an Oracle, and a Couchbase, and something else. A lot of people use NoSQL, especially Couchbase, to do like a data hub model where you have a lot of different fields from different databases pouring into Couchbase.
And then what Department of Homeland Security did is they have maybe X number of systems and they compile all that data down into a single system, and then they use SlamData to actually pull that data out and to make educated decisions based off of different things. So, they have all sorts of different systems being pushed into the database, and then we use it to pull it out and visualize all that data. So it’s–
Well not pull it out technically.
Well no, we don’t pull it out, being we read it, yeah.
But that use case really just showed up at our door. We didn’t build it thinking that it would be used that way, but all of a sudden people were saying, “This is a really good way to do a data warehouse but do it in a day without the expense or overhead. Just slap sync it in and put SlamData on top.” So, I’d say probably half of our customers have that as a use case. I think healthcare is also, I’d say all of our healthcare are doing that, because they’re stuck in record hell with different data format types, and they find it a lot easier than creating the unifying schema to just throw it into a bucket and read it as is. So … I guess a third use case is like if you built a SaaS company or a SaaS product using SlamData you get to see in every part reporting across different functions of the business.
One of our customers uses SlamData to create reports for the CFO, reports for product management, and really I remember the first meeting where we tapped into their data and they were like, “Wow, we’ve had this product for two years. We’ve never seen this.” Like they did it in ad hoc ways and there’s no process. So they were thrilled to actually see their business, as funny as that sounds. But, I think overall the decision to buy the technology, the NoSQL technology, was made over here by this person, and then all of a sudden it comes online and then the business folks are like, “Where’s our data? Where’s our analytics?” So we’re in that exciting space where everyone loves NoSQL data storage, and everyone wants what they’ve always had. Fortunately for us, Tableau hasn’t figured out how to do it.
Okay. Anything else? Our website is SlamData.com. This is open source software, except for the advanced edition, of course. So, you can go to GitHub.com/slamdata and download it. Easy build instructions. Or, if you don’t want to bother with building it, you can go to SlamData.com and download a 30 day trial.
Yeah, we have different types of engagement levels for commercial purposes, so if you’ve got certain SLAs, or support requirements, or services requirements, get in touch with us. Otherwise, feel free to download it and go to town with this. It’s a fun product. I think you’ll enjoy it. Alright, thank you.
And thanks to Qin and Couchbase for hosting us. Thanks.