Guest blog post by Bill Vorhies
It’s taken a lot of effort to keep up with the changes in RDBMS and NoSQL database design. We chose to live in this fast-changing profession, so it’s up to us to keep up. But just about the time I thought I had a handle on this, the field fragments again.
NoSQL: Basically we’re talking about Hadoop here. The architecture is MPP and horizontally scalable, but only eventually consistent. The NoSQL name hardly fits anymore, though, given the essential demise of MapReduce and the proliferation of SQL-on-Hadoop tools including Hive, Presto, Drill, Impala, and a few others I’ve probably missed here.
RDBMS: This hardly warrants a review since it remains the go-to architecture for many if not most applications. Yes, even in this NoSQL age architects may still prefer this model. You get SQL and ACID. What you don’t get is (easy) horizontal scaling on commodity hardware.
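To make “SQL and ACID” concrete, here is a minimal sketch of the atomicity part using Python’s built-in sqlite3 module. The accounts table, the transfer helper, and the balances are all invented for illustration; the point is only that a transfer which fails halfway leaves no partial update behind.

```python
import sqlite3

def make_demo_db():
    """Build a throwaway in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Credit dst, then debit src, as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back along with it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return {account_id: balance} for inspection."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Calling transfer(conn, 1, 2, 500) on the demo database returns False and leaves both balances untouched, even though the credit to account 2 had already been applied inside the transaction. Getting that same guarantee across many commodity nodes is the hard part that the NewSQL vendors set out to solve.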
NewSQL (now “Avant Garde RDBMS”???): Here’s where this starts to break down. I used to think I had a good handle on NewSQL. It was RDBMS on steroids because you got SQL, ACID, and distributed MPP architecture. Now Gartner wants to change the name! I wish they’d stop doing that.
In mid-November Gartner’s Adam Ronthal released a report entitled “When to Use New RDBMS Offerings in a Dynamic Data Environment,” in which he renames this category “Avant Garde” RDBMS. He says, “emerging RDBMS vendors are pushing the boundaries of scalability, distributed processing, and hybrid on-premises and cloud deployments, offering new functionality and capabilities for information leaders.” (Well, is it ‘emerging RDBMS’ or ‘Avant Garde’? How about some consistency here.) He goes on to predict “Through 2019, 70% of new projects requiring scale-out elasticity, distributed processing and hybrid cloud capabilities for relational applications, as well as multi-data-center transactional consistency, will prefer an emerging RDBMS over a traditional RDBMS.” He may be completely correct, but what was wrong with just sticking with the NewSQL name?
Hybrid Transactional Analytic Platforms (HTAPs): Yes, this is still a new category, emerging over the last year or so. Most of the developers don’t even call them DBs, preferring the term ‘platform’. The common factor is that they are completely in-memory (mostly DRAM but potentially a little SSD around the edges for less active storage). For more background see The Need for Speed. The thing that truly sets these apart is that they are optimized for BOTH transactional and analytic tasks SIMULTANEOUSLY. This left most of us scratching our heads, but yes: full ACID and dually optimized. It wasn’t supposed to be possible, but here it is. All the majors now offer these, including SAP, Oracle, Microsoft, IBM, and Teradata. And most of the developers that used to be NewSQL have entered this space as well, including VoltDB, NuoDB, Clustrix, and MemSQL.
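For readers who want a feel for what “both at once” means, here is a deliberately tiny Python sketch. It is not how any of the vendors above actually do it; the ToyHybridStore class and its order data are invented, and it ignores ACID, concurrency, and durability entirely. It only illustrates the basic idea that a single in-memory write path can keep a row view (for transactional lookups) and an aggregate view (for analytic queries) current at the same moment.

```python
from collections import defaultdict

class ToyHybridStore:
    """Toy in-memory table: one write path feeds a row view and an aggregate view."""

    def __init__(self):
        self.rows = {}                  # order_id -> (region, amount), for point lookups
        self.totals = defaultdict(int)  # region -> running sum, for analytic queries

    def upsert(self, order_id, region, amount):
        """Insert or update an order, keeping both views in step."""
        old = self.rows.get(order_id)
        if old is not None:
            # Back out the previous version so the aggregate stays correct.
            self.totals[old[0]] -= old[1]
        self.rows[order_id] = (region, amount)
        self.totals[region] += amount

    def get(self, order_id):
        """Transactional-style read: fetch one order by key."""
        return self.rows.get(order_id)

    def total_by_region(self, region):
        """Analytic-style read: answered without scanning the rows."""
        return self.totals.get(region, 0)
```

In this toy, the analytic answer is always as fresh as the latest write, with no overnight ETL into a separate warehouse. The real platforms have to deliver that while also honoring full ACID across a cluster, which is where the magic comes in.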
Digging a little deeper into just how it is possible to do both these things at once, I came across a December 2014 IDC report, “The Analytic Transactional Data Platform: Enabling the Real Time Enterprise”. You may be able to find it gratis from one of the developers mentioned above. If you’re a motivated architecture type, the report details five different architectural strategies used by the various developers to create this magic. No doubt these tech strategies will mutate over time.
Here’s one thing Gartner probably got right: “By 2017, all leading operational DBMSs will offer multiple data models, relational and NoSQL, in a single platform.” None of the major players can afford to be without any of these capabilities. So stay tuned. Another major change is underway.
December 3, 2015
Bill Vorhies, President & Chief Data Scientist – Data-Magnum - © 2015, all rights reserved.
About the author: Bill Vorhies is President & Chief Data Scientist at Data-Magnum and has practiced as a data scientist and commercial predictive modeler since 2001. Bill is also Editorial Director for Data Science Central. He can be reached at: