Guest blog post by Kumar Chinnakali
The term Data Lake has been gaining popularity recently as most enterprises have incorporated it into their analytics software. Every word and phrase used to describe a Data Lake tells us something useful about how it is interpreted.
At dataottam, we set out to understand the various ways a Data Lake can be defined. We conducted a survey and received very interesting thoughts, words, and phrases from respondents ranging from developers to founders, including Kirk Borne (Principal Data Scientist, Booz Allen Hamilton), Sonal Goyal (Founder, Nube Technologies), Sanjay Pande (CTO, RapidGenDS), Brian Krpec (CEO, Big Data Elephants), and Dhivya R (Chief Data Scientist, Grandatos).
From the overwhelming response, we have selected the top 16 definitions of Data Lake as a new beginning for 2016.
We thank each and every one of you for your time and for sharing your definition of Data Lake.
Below are 3 of these 16 definitions.
- The Data Lake is an integrated big data storage system, enriched with metatags, into which you pour all data, for discovery, decision support, and innovation. - Kirk Borne, Principal Data Scientist at Booz Allen Hamilton and founder of RocketDataScience.org
- A Data Lake is scalable storage and processing that is capable of retaining disparate data of virtually limitless volume, while at the same time providing insight into its content (metadata) and lineage. - Ryan Lane, Senior IT Systems Analyst at Genworth Financial
- The repository where all kinds of raw data are stored and processed for the enterprise to drive actionable insights. - Jithin S L, Hadoop Analytics Lead - Walmart, CoE at IBM