All Posts (382)

The Phoenix framework has been growing in popularity at a quick pace, offering the productivity of frameworks like Ruby on Rails while also being one of the fastest frameworks available. It breaks the myth that you have to sacrifice performance in order to increase productivity.

So what exactly is Phoenix?

Phoenix is a web framework built with the Elixir programming language. Elixir, built on the Erlang VM, is used for building low-latency, fault-tolerant, distributed systems, which are increasingly necessary qualities of modern web applications. You can learn more about Elixir from this blog post or their official guide.

If you are a Ruby on Rails developer, you should definitely take an interest in Phoenix because of the performance gains it promises. Developers of other frameworks can also follow along to see how Phoenix approaches web development.

Meet Phoenix on Elixir: A Rails-like Framework for Modern Web Apps

In this article we will learn some of the things in Phoenix you should…

Read more…

A Guide to Managing Webpack Dependencies

The concept of modularization is an inherent part of most modern programming languages. JavaScript, though, lacked any formal approach to modularization until the arrival of ES6, the latest version of ECMAScript.

In Node.js, one of today’s most popular JavaScript runtimes, module bundlers allow loading NPM modules in web browsers, and component-oriented libraries (like React) encourage and facilitate modularization of JavaScript code.

Webpack is one of the…

Read more…

Where & Why Do You Keep Big Data & Hadoop?

Guest blog post by Manish Bhoge

I am back! Yes, I am back on my learning track. Sometimes it is really necessary to take a break and introspect on why we learn before learning. Ah! It was a 9-month safe refuge to learn how Big Data & Analytics can contribute to a Data Product.

Data strategy has always been expected to generate revenue. As Big Data and Hadoop enter the enterprise data strategy, the big data infrastructure is also expected to add revenue. This is a really tough expectation of the new entrant (Hadoop) when the established candidate (Data Warehouse & BI) itself mostly struggles for its existence. So it is very pertinent for solution architects to ask WHERE and WHY to bring Big Data (obviously Hadoop) into the data strategy. And, the safe…

Read more…

Guide To Budget Friendly Data Mining

Unlike traditional application programming, where API functions are changing every day, database programming basically remains the same. The first version of Microsoft Visual Studio .NET was released in February 2002, with a new version released about every two years, not including Service Pack releases. This rapid pace of change forces IT personnel to re-evaluate their corporation’s applications every couple of years, leaving the functionality of each application intact but with completely different source code in order to stay current with the latest techniques and technology.

The same cannot be said about your database source code. A standard query of SELECT/FROM/WHERE/GROUP BY,…
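As a rough illustration of the stable SELECT/FROM/WHERE/GROUP BY shape mentioned above, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, and the sample rows are hypothetical and are not taken from the article.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ann", 10.0, "paid"),
        ("ann", 5.0, "paid"),
        ("bob", 7.0, "paid"),
        ("bob", 2.0, "refunded"),
    ],
)

# The classic query shape: filter rows, group them, aggregate per group.
rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE status = ?
    GROUP BY customer
    ORDER BY customer
    """,
    ("paid",),
).fetchall()

for customer, total in rows:
    print(customer, total)
```

The same statement would run largely unchanged on most relational databases, which is the kind of stability the excerpt describes.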

Read more…

Associative Data Modeling Demystified - Part2

Guest blog post by Athanassios Hatzis

Association in Topic Map Data Model

Introduction

In the previous article of this series we examined the association construct from the perspective of the Entity-Relationship data model. In this post we demonstrate how the Topic Map data model represents associations. In order to link the two, we continue with another SQL query from our relational database:

```
SELECT suppliers.sid,
suppliers.sname,
suppliers.scountry,…
```

Read more…

Associative Data Modeling Demystified - Part1

Guest blog post by Athanassios Hatzis

Relation, Relationship and Association

While most players in the IT sector adopted graph or document databases and Hadoop-based solutions (Hadoop being an enabler of the HBase column store), it went almost unnoticed that several new DBMSs based on associative technology, such as AtomicDB (previously the database engine of X10SYS) and Sentences, appeared on the scene. We have introduced and discussed the…

Read more…

Originally posted on Data Science Central

In a previous post, we reviewed a path to leverage legacy Excel data and import CSV files through MySQL into Spark 2.0.1. This may apply frequently in businesses where data retention did not always take the database route… However, we demonstrate here that the same result can be achieved in a more direct fashion. We’ll illustrate this on…

Read more…

Guest blog post by Marc Borowczak

Moving legacy data to a modern big data platform can be daunting at times. It doesn’t have to be. In this short tutorial, we’ll briefly review an approach and demonstrate it on my preferred data set. This isn’t an ML repository or a Kaggle competition data set, simply the data I accumulated over decades to keep track of my plastic model collection, and as such it definitely meets the legacy standard!

We’ll describe the steps followed on a laptop VirtualBox machine running Ubuntu 16.04.1 LTS Gnome. The following steps…

Read more…

I first heard of Spark in late 2013 when I became interested in Scala, the language in which Spark is written. Some time later, I did a fun data science project trying to predict survival on the Titanic. This turned out to be a great way to get further introduced to Spark concepts and programming. I highly recommend it for any aspiring Spark developers looking for a place to get started.

Today, Spark is being adopted by major players like Amazon, eBay, and Yahoo! Many organizations run Spark on clusters with thousands of nodes. According to the Spark FAQ, the largest known cluster has over 8000 nodes. Indeed, Spark is a technology well worth taking note of and learning about.

This article provides an introduction to Spark including use cases and examples. It contains…

Read more…

Ember Data (a.k.a. ember-data or ember.data) is a library for robustly managing model data in Ember.js applications. The developers of Ember Data state that it is designed to be agnostic to the underlying persistence mechanism, so it works just as well with JSON APIs over HTTP as it does with streaming WebSockets or local IndexedDB storage. It provides many of the facilities you’d find in server-side object relational mappings (ORMs) like ActiveRecord, but is designed specifically for the unique environment of JavaScript in the browser.

While Ember Data may take some time to…

Read more…

Fast Forward transformation with SPARK

Fast forward transformation process in data science with Apache Spark

Data Curation:

Curation is a critical process in data science that prepares data for feature extraction, so that it can be used with machine learning algorithms. Curation generally involves extracting, organising, and integrating data from different sources. It may be a difficult and time-consuming process, depending on the complexity and volume of the data involved.
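The extract, organise, and integrate steps described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two sources (`crm` and `billing`), their field names, and the `curate` helper are hypothetical and do not come from the article, which uses Apache Spark.

```python
from collections import defaultdict


def curate(crm_rows, billing_rows):
    """Integrate two hypothetical sources into one record per customer."""
    merged = defaultdict(dict)

    # Extract and organise: normalise the join key (email) and tidy names.
    for row in crm_rows:
        key = row["email"].strip().lower()
        merged[key]["name"] = row["name"].strip().title()

    # The second source uses a different column name and string amounts.
    for row in billing_rows:
        key = row["Email"].strip().lower()
        previous = merged[key].get("total_spent", 0.0)
        merged[key]["total_spent"] = previous + float(row["amount"])

    # Integrate: emit one consistent record per key, sorted for stability.
    return [
        {
            "email": key,
            "name": fields.get("name"),
            "total_spent": fields.get("total_spent", 0.0),
        }
        for key, fields in sorted(merged.items())
    ]


crm = [{"email": " Ann@Example.com ", "name": "ann lee"}]
billing = [
    {"Email": "ann@example.com", "amount": "10.50"},
    {"Email": "ann@example.com", "amount": "4.50"},
    {"Email": "bob@example.com", "amount": "3"},
]
records = curate(crm, billing)
print(records)
```

On large volumes the same steps would typically be expressed as distributed transformations rather than in-memory loops.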

Most of the time, data won't be readily available for the feature extraction process; it may be hidden in unstructured and complex data sources and has to undergo multiple transformation steps before feature extraction.

Also, when the volume of data is huge, this becomes a hugely time-consuming process and can be a bottleneck for the…

Read more…

11 Great Hadoop, Spark and Map-Reduce Articles

This reference is part of a new series of DSC articles, offering selected tutorials, references/resources, and interesting articles on subjects such as deep learning, machine learning, data science, deep data science, artificial intelligence, Internet of Things, algorithms, and related topics. It is designed for the busy reader who does not have a lot of time to dig into long lists of advanced publications.

Read more…

Google formally announced Android 7.0 a few weeks ago, but as usual, you’ll have to wait for it. Thanks to the Android update model, most users won’t get their Android 7.0 over-the-air (OTA) updates for months. However, this does not mean developers can afford to ignore Android Nougat. In this article, Toptal Technical Editor Nermin Hajdarbegovic takes a closer look at Android 7.0, outlining new features and changes. While Android 7.0 is by no means revolutionary, the introduction of a new graphics API, a new JIT compiler, and a range of UI and performance tweaks will undoubtedly unlock more potential and generate a few new possibilities.

Read more…

Java versus Python

Originally posted on Data Science Central

Interesting picture that went viral on Facebook. We've had plenty of discussions about Python versus R on DSC. This picture is trying to convince us that Python is superior to Java. It is about a tiny piece of code to draw a pyramid.
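For readers who cannot see the picture, here is a minimal sketch of what a short pyramid-drawing snippet in Python might look like. It is not the code from the picture; the `pyramid` function and its `height` parameter are assumptions made for illustration.

```python
def pyramid(height: int) -> str:
    """Return a centred pyramid of asterisks with the given number of rows."""
    rows = []
    for level in range(1, height + 1):
        padding = " " * (height - level)  # left padding centres the row
        stars = "*" * (2 * level - 1)     # odd widths: 1, 3, 5, ...
        rows.append(padding + stars)
    return "\n".join(rows)


print(pyramid(4))
```

A Java version needs a class and a `main` method around the same loop, which is largely where the difference in length comes from.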

This raises several questions:

  • Is Java faster than Python? If yes, under what circumstances? And by how much?
  • Does the speed of an algorithm depend more on the…

Read more…

Originally posted on Data Science Central

These are the findings from a CrowdFlower survey. Data preparation accounts for about 80% of the work of data scientists. Cleaning data is the least enjoyable and most time-consuming data science task, according to the survey. Interestingly, when we put the question to our own data scientist, his answer was:

Automating the task of cleaning data is the most time consuming aspect of data science, though once done, it applies to most data sets; it is also the most enjoyable because as you automate more and more, it frees a lot of time to focus on other things.

Below are the three charts…

Read more…

Why Not So Hadoop?

Guest blog post by Kashif Saiyed

Does Big Data mean Hadoop? Not really. However, when one thinks of the term Big Data, the first thing that comes to mind is Hadoop, along with heaps of unstructured data. It is an exceptional lure: data scientists get the opportunity to work with large amounts of data to train their models, and businesses gain knowledge previously never imagined. But has it lived up to the hype? In this article, we will look at a brief history of Hadoop and see how it stands today.

2015 Hype Cycle – Gartner

 
Some key takeaways from the Hype cycle of 2015:

  1. ‘Big Data’ was at the Trough of Disillusionment stage in 2014, but is not seen in the 2015 Hype cycle.
  2. Another interesting point is that the ‘Internet of Things’, which suggests a network of interconnected devices around us, is at peak for 2 years consistently…

Read more…

Originally posted on Data Science Central

Summary

Introducing Data Science teaches you how to accomplish the fundamental tasks that occupy data scientists. Using the Python language and common Python libraries, you'll experience firsthand the challenges of dealing with data at scale and gain a solid foundation in data science.

About the Technology

Many companies need developers with data science skills to work on projects ranging from social media marketing to machine learning. Discovering what you need to learn to begin a career as a data scientist can seem bewildering. This book is designed to help you get started.

About the Book

Introducing Data Science explains vital data science concepts and teaches you…

Read more…

Originally posted on Data Science Central

Summary:  This is the first in a series of articles aimed at providing a complete foundation and broad understanding of the technical issues surrounding an IoT or streaming system so that the reader can make intelligent decisions and ask informed questions when planning their IoT system. 

In This Article

In Lesson 2

In Lesson 3

Is…

Read more…

Originally posted on Data Science Central

Cloud giants like Amazon, Google, Azure and IBM have rushed into the big data analytics cloud market. They claim their tools will make developer tasks simple. For machine learning, they say their cloud products will free data scientists and developers from implementation details so they can focus on business logic. …

Read more…

Featured Blog Posts - DSC