Big Data Book Reviews

Hadoop: The Definitive Guide


  • Level Ent.
  • Level Mid.
  • Level Adv.

Finished reading this book in Dec. 2012. This is a really good Hadoop book, and I recommend it. I have read both the 2nd and 3rd editions. The latest, the 3rd edition, is based on Hadoop 1.0. It covers almost everything in Hadoop, including YARN. The author also has a GitHub site that shares the code. Here are my book notes.


Hadoop In Practice


  • Level Mid.
  • Level Adv.

Finished reading this book in May 2013. This is a pretty good book, especially the data science chapter. The Hive part is a little older than what the latest Hadoop books cover. The reading experience is also good. I like the way it presents a number of "TECHNIQUE" sections. It touches on some newer big data tools that other books do not cover, such as Cloudera Crunch. There are no comments in the source code (at the publisher's request), but there are enough comments added inline in the book.


Hadoop Real World Solution Cookbook


  • Level Mid.
  • Level Adv.

Finished reading this book in May 2013. The code has no comments; the explanation comes below it instead. I really do not like this approach. If the comments were put in the code, the book might have fewer pages to read. In addition, there are logic mistakes in the book, I think because of copy-and-paste errors: at least three to five in the hundreds of pages I read, e.g. on p. 143 "hashset" should be "hashmap". Chapter 7 starts to look good and deep, and it requires knowledge of data mining and graph processing. Still, this is a good tool reference book.


Hadoop 实战


  • Level Ent.
  • Level Mid.

Finished reading this book in May 2013. This is a Chinese book that shares its name with the one below but is totally different. It covers the majority of Hadoop components and is reader friendly. I only read the 1st edition, so things are a little out of date. The 2nd edition is also on the shelves now. Generally, it is just an introduction and lacks detail and advanced techniques.


Hadoop In Action


  • Level Ent.
  • Level Mid.

Finished reading this book in Sep. 2012. I have a hard copy of this. The book is a little old, being based on Hadoop 0.19. It covers the majority of Hadoop components. There is also a Chinese version.


Hadoop MapReduce Cookbook


  • Level Ent.
  • Level Mid.
  • Level Adv.

Finished reading this book in Nov. 2012. Some sample Hadoop commands are missing the necessary spaces between the command and its parameters. Chapter 8 provides some data analytics implementations using Java and MapReduce, in a level of detail I did not see in other books. This part is worth more reading time.


MapReduce Design Patterns


  • Level Mid.
  • Level Adv.

Finished reading this book in Dec. 2012. The topic is really focused. The patterns are not as exciting as Java's in their description. It adds little value if you have already read the other books below. There are typos and mistakes. I cannot find the source code either.


Programming Pig


  • Level Mid.
  • Level Adv.

Finished reading this book in Dec. 2012. This is a tiny book about Pig, around 200 pages. It covers everything. The parts on extending Pig with UDFs lack enough examples, and they are also a little hard to read. I have also read the translated edition, which is so-so. You will not find more Pig examples anywhere else. However, I believe another book could, and should, cover more practical examples and hands-on scripts.


Hadoop Mapreduce Internals


  • Level Adv.

Finished reading this book in Dec. 2013. This book explains how map and reduce are implemented at the source code level. It covers lots of detail that other books never mention, and it can help with reading the source code. It is the kind of book that helps you understand rather than practice something. The book comes with few code samples. The figures and comparison tables in this book are really good for reading and understanding.


HBase Administration Cookbook


  • Level Mid.
  • Level Adv.

Finished reading this book in Feb. 2014. This book is for HBase administrators and developers, and will even help Hadoop administrators. You are not required to have HBase experience, but you are expected to have a basic understanding of Hadoop and MapReduce. This is a very practical toolkit book for HBase admins. It does not say much about the API and focuses on administration only.


HBase: The Definitive Guide


  • Level Ent.
  • Level Mid.
  • Level Adv.

Finished reading this book in Aug. 2014. This is a really good HBase book, and I recommend it. This is the 1st edition, and it shows you how Apache HBase can fulfill your needs. As the open source implementation of Google's BigTable architecture, HBase scales to billions of rows and millions of columns, while ensuring that write and read performance remain constant. The author also has a GitHub site that shares the code. I am still reading it for now, and it is a little hard.


Big Data: A Revolution That Will Transform How We Live, Work, and Think


  • Level Ent.

Finished reading this book in Aug. 2014. It is one of the few big data books that uses real examples to show what revolution big data brings. The characteristics of big data it describes are really impressive. This book motivates readers to explore the value behind big data. It is a good book for encouraging people to explore the big data area.


Instant Apache Hive Essentials How-to


  • Level Ent.
  • Level Mid.

Finished reading this book in Oct. 2014. The book offers a fast way to learn how to query data using Hive in a few hours. This is better than searching through the fragmented help documents on the Apache Confluence wiki, especially for new Hive users. The book is short and easy to understand. The author also gives a complexity level for each chapter so that users at different levels can quickly pick up what they need.


Hadoop Application Architectures: Designing Real-World Big Data Applications


  • Level Mid.
  • Level Adv.

Finished reading this book in Dec. 2015. This book shares best practices for architecting an enterprise Hadoop platform. It covers a variety of big data tools used in each area of big data, as well as comparisons between them. Overall, it is a great book, especially for its three case studies. The pity is that the case studies still lack detail and are gone through quickly. For people who are junior in big data, this book may be a little difficult to understand. Since there are already lots of books about big data, this book does not bring much surprise in terms of content or writing style.