srgsanky/HadoopBook


Hadoop Book Example Code

This repository contains the example code for Hadoop: The Definitive Guide, Fourth Edition by Tom White (O'Reilly, 2014).

Code for the First, Second, and Third Editions is also available.

Note that the chapter names and numbering have changed between editions; see Chapter Numbers By Edition.

Building and Running

To build the code, first install Maven and Java. Then type:

% mvn package -DskipTests

This will do a full build and create example JAR files in the top-level directory (e.g. hadoop-examples.jar).
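The build step above can be wrapped in a small prerequisite check. This is only a sketch: the function names `missing_tools` and `build_examples` are hypothetical and not part of the repository, and it assumes that Maven and Java are exposed on the `PATH` as `mvn` and `java`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository):
# print the name of each given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical wrapper: run the build only when both Maven and Java
# are available; otherwise report what is missing and return non-zero.
build_examples() {
  missing=$(missing_tools mvn java)
  if [ -n "$missing" ]; then
    echo "Missing required tools: $missing" >&2
    return 1
  fi
  # Same command as shown above, without the "%" prompt.
  mvn package -DskipTests
}
```

After sourcing these definitions, running `build_examples` from the top-level directory of the checkout should either start the Maven build or list the tools that still need to be installed.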

To run the examples from a particular chapter, first install the components needed for that chapter (e.g. Hadoop, Pig, or Hive), then run the command lines shown in the chapter.

Sample datasets are provided in the input directory, but the full weather dataset is not included there due to size restrictions. You can find information about how to obtain the full weather dataset on the book's website at http://www.hadoopbook.com/.

Hadoop Component Versions

This edition of the book works with Hadoop 2. It has not been tested extensively with Hadoop 1, although most of it should work.

For the precise versions of each component that the code has been tested with, see book/pom.xml.
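One quick way to skim those versions from the command line is sketched below. The function name `list_versions` is hypothetical, and the sketch assumes that versions are recorded in XML elements whose names end in `version`, each on a single line; check the layout of book/pom.xml before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository):
# print every line of a POM file that opens an element whose name
# ends in "version", with leading whitespace removed.
list_versions() {
  grep -E '<[A-Za-z0-9_.-]*version>' "$1" | sed 's/^[[:space:]]*//'
}
```

From the top-level directory, `list_versions book/pom.xml` would then print the matching lines, if the file follows the assumed layout.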

Copyright

Copyright (C) 2014 Tom White