Welcome to my GitHub portfolio. As a Data Scientist with an engineering background, I focus on generating insights and developing models that provide the information needed to make data-driven decisions.
My growing portfolio of projects demonstrates my expertise in EDA and ML modelling. It features R, Python, and SQL for data manipulation, Tableau, Seaborn, and Matplotlib visualisations, and a range of supervised and unsupervised models, reflecting my passion for turning complex data into relevant insights.
Feel free to explore my repositories and contact me if you have any questions!
Description: Detailed, easy-to-understand guide documenting my learning, covering the fundamental theory of data science, including exploratory data analysis (EDA), statistics, regression and machine learning.
Objective: Create a living document that simplifies complex concepts as much as possible using real-world examples, serving as a complete yet accessible compendium for beginners entering the field.
( Python - Logistic Regression, Naive Bayes, Decision Tree, Random Forest, Gradient Boosting )
Description: The HR department of Salifort Motors is concerned about employee turnover and wants to understand which features contribute most to it, to help identify employees at risk of leaving.
Objective: Perform an exploratory data analysis (EDA) on employee data to identify key factors that contribute to employee turnover. Create various machine learning models to predict who is most at risk.
( Python - NLP, ML modelling )
Description: TikTok's moderation team wants to streamline the review of flagged content by developing an ML model that pre-processes it, allowing them to prioritize user reports more effectively.
Objective: Classify videos as either opinion-based or claim-based using the data gathered.
( SQL, R, Python )
Description: A bike-sharing company in Chicago aims to move away from its current strategy and has set a new goal: converting its casual pay-per-ride customers into annual membership subscribers.
Objective: Perform an exploratory data analysis (EDA) on the customer base and riding habits of casual riders vs annual membership users. Identify key differences and potential opportunities to encourage customers to upgrade their subscription plan.
Google Data Analytics Professional Certificate (Apr 2024)
Python for Everybody (Jan 2024)
Master of Science in Chemical Engineering - Instituto Superior Técnico, Lisbon, Portugal (Dec 2022)
ERASMUS+ - Technical University of Denmark, Copenhagen, Denmark (Jan 2021)
Bachelor of Science in Chemical Engineering - Instituto Superior Técnico, Lisbon, Portugal (Jul 2020)
Here is my CV