This project uses the "Census Income" dataset from the UCI repository, which contains census records for predicting whether a person's income exceeds $50K per year.
The dataset can be downloaded from here (specifically, the file adult.data).
For this project we implemented a Naive Bayes classifier on Hadoop and tested it on the "Census Income" dataset.
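As a rough illustration of how Naive Bayes training fits the MapReduce model, the following is a minimal local sketch in plain Python, not the project's Hadoop code. It assumes every attribute is categorical and that the class label is the last field of each record. The function names (`map_record`, `reduce_counts`, `train`, `predict`), the Laplace smoothing parameter `alpha`, and the toy records are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
import math


def map_record(record):
    """Map step: emit (key, 1) pairs for one record; the label is the last field."""
    *features, label = record
    yield ("class", label), 1
    for i, value in enumerate(features):
        yield ("feat", label, i, value), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each key."""
    counts = defaultdict(int)
    for key, n in pairs:
        counts[key] += n
    return counts


def train(records):
    """Run the map and reduce steps locally over all records."""
    return reduce_counts(kv for r in records for kv in map_record(r))


def predict(counts, features, alpha=1.0):
    """Return the most probable class, using Laplace smoothing (assumed here)."""
    labels = [k[1] for k in counts if k[0] == "class"]
    total = sum(counts[("class", c)] for c in labels)
    # Collect the distinct values seen for each attribute index.
    values = defaultdict(set)
    for k in counts:
        if k[0] == "feat":
            values[k[2]].add(k[3])
    best, best_score = None, -math.inf
    for c in labels:
        # Log prior plus the sum of smoothed log likelihoods.
        score = math.log(counts[("class", c)] / total)
        for i, v in enumerate(features):
            num = counts.get(("feat", c, i, v), 0) + alpha
            den = counts[("class", c)] + alpha * len(values[i])
            score += math.log(num / den)
        if score > best_score:
            best, best_score = c, score
    return best


# Toy records (invented for illustration, not taken from adult.data).
toy_records = [
    ("Private", "Bachelors", ">50K"),
    ("Private", "HS-grad", "<=50K"),
    ("Self-emp", "Bachelors", ">50K"),
    ("Private", "HS-grad", "<=50K"),
]
toy_counts = train(toy_records)
print(predict(toy_counts, ("Private", "Bachelors")))
```

On a real cluster the map and reduce functions would run as separate distributed tasks; here they are chained in one process only to show which counts the classifier needs.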
For the numerical attributes, we implemented a discretization step with MapReduce on Hadoop.
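The text does not say which discretization method was used, so the sketch below assumes simple equal-width binning purely for illustration. It runs locally in plain Python: a map step emits each numeric value keyed by its attribute index, a reduce step finds the minimum and maximum per attribute, and a second pass replaces each value with its bin index. The names `map_numeric`, `reduce_min_max`, `to_bin`, `discretize`, and the parameter `n_bins` are hypothetical.

```python
def map_numeric(record, numeric_idx):
    """Map step: emit (attribute index, value) for each numeric attribute."""
    for i in numeric_idx:
        yield i, float(record[i])


def reduce_min_max(pairs):
    """Reduce step: track the (min, max) range of each attribute."""
    ranges = {}
    for i, x in pairs:
        lo, hi = ranges.get(i, (x, x))
        ranges[i] = (min(lo, x), max(hi, x))
    return ranges


def to_bin(x, lo, hi, n_bins):
    """Equal-width bin index in [0, n_bins - 1]; an assumption, not the source's method."""
    if hi == lo:
        return 0
    idx = int((x - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls into the last bin


def discretize(records, numeric_idx, n_bins=4):
    """First pass computes ranges; second pass rewrites numeric fields as bin indices."""
    ranges = reduce_min_max(
        kv for r in records for kv in map_numeric(r, numeric_idx)
    )
    out = []
    for r in records:
        row = list(r)
        for i in numeric_idx:
            lo, hi = ranges[i]
            row[i] = to_bin(float(r[i]), lo, hi, n_bins)
        out.append(tuple(row))
    return out


# Toy rows (invented): a numeric first field followed by a categorical one.
toy_rows = [(20, "a"), (30, "b"), (40, "c"), (60, "d")]
toy_binned = discretize(toy_rows, numeric_idx=[0], n_bins=4)
print(toy_binned)
```

After this step every attribute is categorical, so the same counting approach used for Naive Bayes training can be applied to all fields.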