Data Wrangling Handbook
by Open Knowledge Foundation
Publisher: School of Data 2012
The Data Wrangling Handbook is a companion text to the School of Data. Its function is something like a traditional textbook -- it will provide the detail and background theory to support the School of Data courses and challenges.
by Jan Bodnar - ZetCode
MySQL is a leading open source database management system. This is a MySQL tutorial. It covers the MySQL database, various mysql command line tools, and the SQL language supported by the database engine. It is an introductory tutorial for beginners.
by Alan F Gates - O'Reilly Media
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs. The structure of Pig programs is amenable to parallelization, which enables them to handle very large data sets.
by Jacek Laskowski - GitBook
This collection of notes (what some may rashly call a 'book') serves as my ultimate place to collect all the nuts and bolts of using Apache Spark. The notes aim to help me design and develop better products with Apache Spark.
by Eric Redmond - GitBook
This is a free little book about Riak, a scalable, high-availability NoSQL datastore. Riak is an open-source, distributed key/value database for high availability and near-linear scalability. Riak has remarkably high uptime and grows with you.