Here I posted only the two categories of greatest interest to DSC readers, but the full list covers plenty of other topics, including:

  • Distributed Programming
  • Graph Data Model
  • NewSQL Databases
  • Time-Series Databases
  • SQL-like processing
  • Data Ingestion
  • R-Studio - IDE for R.
  • Service Programming
  • Scheduling
  • Benchmarking
  • Security
  • Search engine and framework
  • Memcached forks and evolutions
  • Embedded Databases
  • Business Intelligence
  • Data Visualization
  • Interesting Readings
  • Interesting Papers

Click here to access the very long list (possibly too long, in my opinion). Other interesting lists include the data science cheat sheet and DSC's list of lists. Note that, at first glance, the list seems to cover open source and publicly shared code, not enterprise software (I found neither Tableau nor RapidMiner, for instance). There was no section on data mining; still, it's a very impressive list.

Data Visualization

  • Arbor - graph visualization library using web workers and jQuery.
  • Chart.js - open source HTML5 Charts visualizations.
  • Cubism - JavaScript library for time series visualization.
  • D3 - JavaScript library for manipulating documents.
  • Envisionjs - dynamic HTML5 visualization.
  • Grafana - Graphite dashboard frontend, editor and graph composer.
  • Graphite - scalable Realtime Graphing.
  • Google Charts - simple charting API.
  • Highcharts - simple and flexible charting API.
  • Matplotlib - plotting with Python.
  • NVD3 - chart components for d3.js.
  • Peity - Progressive bar, line and pie charts.
  • Recline - simple but powerful library for building data applications in pure JavaScript and HTML.
  • Sigma.js - JavaScript library dedicated to graph drawing.
  • Vega - a visualization grammar.

Machine Learning

  • Apache Mahout - machine learning library for Hadoop.
  • brain - Neural networks in JavaScript.
  • Cloudera Oryx - real-time large-scale machine learning.
  • Concurrent Pattern - machine learning library for Cascading.
  • convnetjs - Deep Learning in JavaScript. Train Convolutional Neural Networks (or ordinary ones) in your browser.
  • Decider - Flexible and Extensible Machine Learning in Ruby.
  • etcML - text classification with machine learning.
  • Etsy Conjecture - scalable Machine Learning in Scalding.
  • H2O - statistical, machine learning and math runtime for Hadoop.
  • MLbase - distributed machine learning libraries for the BDAS stack.
  • MLPNeuralNet - Fast multilayer perceptron neural network library for iOS and Mac OS X.
  • nupic - Numenta Platform for Intelligent Computing: a brain-inspired machine intelligence platform, and biologically accurate neural network based on cortical learning algorithms.
  • PredictionIO - machine learning server built on Hadoop, Mahout and Cascading.
  • scikit-learn - machine learning in Python.
  • Spark MLlib - a Spark implementation of some common machine learning (ML) functionality.
  • Vowpal Wabbit - learning system sponsored by Microsoft and Yahoo!.
  • WEKA - suite of machine learning software.

Originally posted on Data Science Central
