Challenge #4 - ML Trained to Infer Investment Areas #55
AktivGroupDS started this conversation in Show and tell
Summary
I built a classification model to infer whether an area is an investment area or not. The model I trained was XGBoost. Overall accuracy was 82.29%. Precision was 86% for the YES class and 78.26% for the NO class. These numbers are very good for this type of model.
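For readers who want to see how accuracy and per-class precision relate, here is a minimal sketch that derives them from a 2x2 confusion matrix. The counts are made up for illustration and are not the counts from my model; the function names are hypothetical too.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct (one value for the whole model)."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision_yes(tp, fp):
    """Of everything predicted YES, the share that really was YES."""
    return tp / (tp + fp)


def precision_no(tn, fn):
    """Of everything predicted NO, the share that really was NO."""
    return tn / (tn + fn)


# Illustrative counts only (NOT the real confusion matrix from this post).
tp, fp, fn, tn = 40, 10, 5, 45

acc = accuracy(tp, fp, fn, tn)
p_yes = precision_yes(tp, fp)
p_no = precision_no(tn, fn)

print(f"accuracy={acc:.2%} precision_yes={p_yes:.2%} precision_no={p_no:.2%}")
```

Accuracy is a single number for the whole model, which is why it is the same for YES and NO, while precision is computed separately for each class.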
Approach
My goal was to infer if a location is an Investment Area or not. Using AWS SageMaker Canvas I worked on some attributes of the table (about 30 attributes). Using the Data Quality functionality I discovered the attributes that have the greatest impact on the model. Then I trained an XGBoost model with 80% of the data, about 900 rows, and used the remaining 20% for testing. Below I present my findings.
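Canvas handled the split for me, but the 80/20 idea can be sketched in plain Python. This is only an illustration under assumptions: `train_test_split_rows` is a hypothetical helper, the seed is arbitrary, and the 900 integers stand in for table rows.

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then hold out test_fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    # First n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder row ids standing in for the real table (assumption).
rows = list(range(900))
train_rows, test_rows = train_test_split_rows(rows)

print(len(train_rows), len(test_rows))
```

Shuffling before splitting avoids a test set that only contains rows from the end of the table.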
Findings
I am surprised to find that the feature (attribute) with the most impact is median_prop_value, and that any area with a median property value under $385,000 is classified as an investment area while any area above that value is NOT.
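Read as a rule, this finding is a single cutoff on median_prop_value. A minimal sketch of that rule follows; it simplifies the trained XGBoost model (which uses more attributes), the function name is hypothetical, and treating a value of exactly $385,000 as NO is my assumption since the finding only covers "under" and "above".

```python
THRESHOLD_USD = 385_000  # cutoff reported in the finding above


def predict_investment_area(median_prop_value, threshold=THRESHOLD_USD):
    """Return "YES" when the value is under the cutoff, otherwise "NO".

    Assumption: a value exactly equal to the threshold is treated as "NO".
    """
    return "YES" if median_prop_value < threshold else "NO"


# Illustrative values only, not rows from the real data set.
examples = [250_000, 384_999, 385_000, 600_000]
predictions = [predict_investment_area(v) for v in examples]

for value, label in zip(examples, predictions):
    print(f"${value:,} -> {label}")
```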
Some charts from the model
Precision Recall Curve
Confusion Matrix Values
Files
Sagemaker Notebook
SageMakerAutopilotCandidateDefinitionNotebook.ipynb.zip