Andrew Lion posted a blog post
Originally posted on Data Science Central
Orientation
In both semantic model standards, Topic Maps and RDF/OWL, and in many other NoSQL approaches to efficiently solving the problem of how to represent relations and relationships, one major stumbling blo…
Aug 12, 2015
Andrew Lion posted a blog post
Originally posted on Data Science Central
There are at least two definitions of Big Data: a broad-sense definition and a strict-sense definition. Under the broad-sense definition, Big Data includes all the available data on earth. For the…
Aug 5, 2015
Andrew Lion posted a blog post
Originally posted on Data Science Central
In my last post, I covered setting up the basic tools to start doing machine learning (Python, NumPy, Matplotlib and Scikit-Learn).  Now, you are probably wondering how to do this on a very large scale, invol…
Jul 28, 2015
Andrew Lion posted a blog post
Summary:  Here are five additional factors you’ll want to consider when putting together your short list.
If you stuck with us through the first eight lessons, you now know most of the major considerations in selecting the right NOSQL or NewSQL datab…
Jul 22, 2015
Andrew Lion posted a blog post
Summary:  Graph databases are your go-to choice when a relationship among the data items is key.
Up to about 1999 web search engines evaluated each web page as a standalone entity, ranking them based on content without regard to any other pages.  Bu…
Jul 14, 2015
Andrew Lion posted a blog post
Summary:  Unless you have special needs, Document Oriented DBs are your most likely default choice.
Second in popularity in the business world behind Key-Value-Stores are Document Oriented Databases.  Here an entire document is treated as a record. …
Jul 13, 2015
Andrew Lion posted a blog post
Summary:  These are the features common to most NOSQL databases.  Be on the lookout for any fundamental differences.
Before we get to the specific pros and cons of the four NOSQL database types there are some features and capabilities true of most of…
Jul 7, 2015
Andrew Lion posted a blog post
Summary:  In general, it is true that NOSQL databases can do everything that RDBMS can do.  And almost always when data is ‘big’ they can do it faster and cheaper.  There is one exception where you’ll need to pay close attention.
In a technical disc…
Jun 30, 2015
Andrew Lion posted a blog post
I am not an expert in database design, since for most of my career I have worked with alternate data storage / data access solutions. But one of the very first projects I had to do back in 1985, when I was a student, was to write the code for a fully func…
Jun 24, 2015
Andrew Lion posted a blog post
With big data, one sometimes has to compute correlations involving thousands of buckets of paired observations or time series. For instance, a data bucket corresponds to a node in a decision tree, a customer segment, or a subset of observations havin…
Jun 17, 2015
Andrew Lion posted a blog post
Guest blog post by William Vorhies
Summary:  If you’re making the decision to use NoSQL, how do you quantify the value of the investment?
If you are exploring NoSQL, once you become educated on the basics there are two questions that will rapidly move…
Jun 10, 2015
Andrew Lion posted a blog post
Guest blog post by Vincent Granville
In this article, I propose a simple metric to measure predictive power. It is used for combinatorial feature selection, where a large number of feature combinations need to be ranked automatically and very fast,…
Jun 3, 2015
Andrew Lion posted a blog post
Guest blog post by Fari Payandeh

These are not ordinary times for CIOs. The news about Netflix's Big Data success, and Big Data's involvement in predicting the last election results, might have created a buzz among IT workers, but it was the NS…
May 27, 2015
Andrew Lion posted a blog post
Guest blog post by Bernard Marr
Big Data still causes a lot of confusion in people's heads: What really is it? What is new and what is old wine in new bottles? In order to bring a little more clarity to the concept I thought it might help to describe…
May 25, 2015
Andrew Lion posted a blog post
Guest blog post by Ashu Kumar
Let's get over the hype. Unless you are a data-driven giant like Amazon or Yahoo, there is no need to spend millions to dig and fill a big data lake. You might be better off with a data junk yard strategy. It will let you…
May 19, 2015
Andrew Lion posted a blog post
Guest blog post by Mohammad Tariq Iqbal
Quite often, while working with HBase, I used to think how cool it would be to have a database that could replicate my data to datacenters across the world consistently, so that I could take the pleasure of global…
May 12, 2015