Applying machine learning techniques to a known dataset to predict unknown income

sefeoglu/AdultsDataSetUCI

Adults DataSet UCI

Problem Setting

A polling institute wants to estimate an individual’s income from their personal data (see einkommen.train). To this end, 30,000 individuals were interviewed about the features summarized below. For some individuals, not all features are available. Crucially, the income of only 5,000 of the interviewees is known.
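
Since only part of the interviewees have a known income, a first practical step is to separate labeled rows (usable for training) from unlabeled rows (to be predicted). The sketch below illustrates this on a tiny made-up table. The column names, the toy values, and the assumption that an unknown income appears as a missing value are all hypothetical, because the layout of einkommen.train is not described here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table; the real einkommen.train columns may differ.
df = pd.DataFrame({
    "age": [39, 50, 38, 53, 28, 37],
    "workclass": ["State-gov", "Self-emp", "Private", np.nan, "Private", "Private"],
    "income": ["<=50K", ">50K", np.nan, np.nan, "<=50K", np.nan],
})


def split_by_label(frame, target="income"):
    """Split rows into (labeled, unlabeled) by whether the target is known."""
    known = frame[target].notna()
    labeled = frame[known]
    # Unlabeled rows carry no target information, so drop the column.
    unlabeled = frame[~known].drop(columns=[target])
    return labeled, unlabeled


labeled, unlabeled = split_by_label(df)
print(len(labeled), "labeled rows,", len(unlabeled), "unlabeled rows")
```

If the file encodes unknown values with a placeholder string instead, that placeholder would first need to be mapped to a missing value before this split applies.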

Steps:

  • Data Integration
  • Feature Representation
  • EDA Pairplot
  • Correlation of Numeric Attributes
  • Missing Value Representation
  • Data Cleaning: convert categorical variables to numerical
  • Check missing values
  • Feature Selection
  • Model Selection and Evaluation
    • 'Logistic Regression'
    • 'Random Forest'
    • 'Neural Network'
    • 'GaussianNB'
    • 'DecisionTreeClassifier'
    • 'SVM'
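
The cleaning and model-comparison steps above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: the data is synthetic, the feature names (`age`, `hours_per_week`, `education`) are hypothetical, and median/most-frequent imputation, one-hot encoding, and 3-fold cross-validated accuracy are choices made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): two numeric and one categorical feature.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "age": rng.integers(18, 70, n).astype(float),
    "hours_per_week": rng.integers(10, 60, n).astype(float),
    "education": rng.choice(["HS", "Bachelors", "Masters"], n).astype(object),
})
# Binary target loosely tied to the numeric features, plus noise.
y = (X["age"] + X["hours_per_week"] + rng.normal(0, 10, n) > 80).astype(int)
# Knock out some values to mimic incomplete interviews.
X.loc[rng.choice(n, 20, replace=False), "age"] = np.nan
X.loc[rng.choice(n, 20, replace=False), "education"] = np.nan

numeric = ["age", "hours_per_week"]
categorical = ["education"]

# Impute missing values, scale numeric columns, one-hot encode categoricals.
# sparse_threshold=0 forces dense output, which GaussianNB requires.
preprocess = ColumnTransformer(
    [
        ("num", make_pipeline(SimpleImputer(strategy="median"),
                              StandardScaler()), numeric),
        ("cat", make_pipeline(SimpleImputer(strategy="most_frequent"),
                              OneHotEncoder(handle_unknown="ignore")), categorical),
    ],
    sparse_threshold=0,
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                    random_state=0),
    "GaussianNB": GaussianNB(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
}


def compare_models(X, y, cv=3):
    """Return the mean cross-validated accuracy for each candidate model."""
    scores = {}
    for name, model in models.items():
        pipe = Pipeline([("prep", preprocess), ("clf", model)])
        scores[name] = cross_val_score(pipe, X, y, cv=cv,
                                       scoring="accuracy").mean()
    return scores


scores = compare_models(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Wrapping the preprocessing and the classifier in one `Pipeline` means the imputation and encoding are refit inside each cross-validation fold, so no statistics from the held-out fold leak into training.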
