Stroke-Predictions

XGBoost model to predict stroke risk from clinical patient information, aiding in early detection and prevention.

Data

Data is from: https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset

10 clinical features
- Gender, Age, Hypertension, Heart Disease, Marriage Status, Work Type, Residence Type, Average Glucose Level, BMI, Smoking Status
1 binary target variable
- Stroke
5110 observations
Class Imbalance: 7% of samples are positive for having had a stroke

Exploratory Analyses

Those who have had a stroke are ~26 years older than those who have not, on average.

Those who have been married are ~4x more likely to have had a stroke than those who have not been married, on average.

Final Model

XGBClassifier with eta = 0.3, lambda = 1, max_depth = 5, and min_child_weight = 2

Model Evaluation

The final model was chosen to maximize recall, since false negatives are worse than false positives when predicting stroke. False negatives may lead to a lack of proper treatment for those who might be at risk for stroke, while false positives only lead to an unnecessary checkup.

The final model showed a recall of 0.07 and an accuracy of 0.94. The model also showed a precision of 0.40 and an f1-score of 0.13.

Recommendations

Further research and model development is required to obtain a suitable model for production use. The final model's recall of 0.07 is still low for stroke predictions, and may lead to dangerous false negative errors. The model has not learnt to predict the positive class well because of the low proportion of positive samples (7%) present in the dataset. A future project could explore using resampling techniques, such as oversampling with SMOTE, to reduce class imbalances in the target and increase recall.

Contact

For further information, contact [email protected]

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
LICENSE		LICENSE
Project 2 Presentation.pptx		Project 2 Presentation.pptx
README.md		README.md
age_stroke.png		age_stroke.png
cf_matrix.png		cf_matrix.png
marriage_stroke.png		marriage_stroke.png
stroke_predictions.ipynb		stroke_predictions.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Stroke-Predictions

Data

Exploratory Analyses

Final Model

Model Evaluation

Recommendations

Contact

About

Releases

Packages

Languages

License

adey4/Stroke-Predictions

Folders and files

Latest commit

History

Repository files navigation

Stroke-Predictions

Data

Exploratory Analyses

Final Model

Model Evaluation

Recommendations

Contact

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages