This project involves conducting Exploratory Data Analysis (EDA) on a dataset that contains information about students' study hours and their corresponding marks or grades. The goal is to analyze the relationship between study hours and academic performance, identify patterns or trends, and derive meaningful insights.
The dataset can be downloaded from https://www.kaggle.com/datasets/samira1992/student-scores-simple-dataset. It contains the following columns:
- `Hours`: Number of hours spent studying per week.
- `Scores`: Academic performance or marks obtained by the student.
- Open the `EDA student study hours.ipynb` Jupyter Notebook.
- Load the dataset and perform data cleaning and preprocessing if necessary.
- Explore descriptive statistics, distributions, correlations, and visualizations to understand the relationship between study hours and marks.
- Conduct hypothesis testing or statistical analysis if applicable to validate findings.
- Summarize key insights, trends, and conclusions drawn from the analysis.
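
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `summarize_hours_vs_scores` is hypothetical, the only thing taken from the dataset description is the pair of column names `Hours` and `Scores`, and the toy values are placeholders rather than real data. The Pearson test is one possible choice of hypothesis test.

```python
import pandas as pd
from scipy import stats


def summarize_hours_vs_scores(df: pd.DataFrame) -> dict:
    """Return basic EDA numbers for the Hours/Scores relationship.

    Hypothetical helper; assumes `df` has numeric `Hours` and `Scores` columns.
    """
    # Simple cleaning step: drop rows with a missing value in either column.
    clean = df[["Hours", "Scores"]].dropna()

    # Pearson correlation coefficient and its two-sided p-value
    # (null hypothesis: no linear correlation between the two columns).
    r, p_value = stats.pearsonr(clean["Hours"], clean["Scores"])

    return {
        "n_rows": len(clean),
        "describe": clean.describe(),  # count, mean, std, min, quartiles, max
        "pearson_r": float(r),
        "p_value": float(p_value),
    }


# Placeholder values for illustration only; NOT taken from the Kaggle dataset.
toy = pd.DataFrame(
    {"Hours": [1.0, 2.0, 3.0, 4.0, None], "Scores": [12, 25, 31, 48, 50]}
)
summary = summarize_hours_vs_scores(toy)
print(summary["describe"])
print("Pearson r:", round(summary["pearson_r"], 3))
```

With the real data, the same function could be called on the result of `pd.read_csv(...)`, pointing at wherever the downloaded CSV was saved; the file name depends on the download and is not specified here.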
- Python 3.x
- Jupyter Notebook
- Pandas
- Matplotlib
- Seaborn
- NumPy
- SciPy (for statistical analysis, if needed)
- Launch Jupyter Notebook by running `jupyter notebook` in the project directory.
- Open the `EDA student study hours.ipynb` notebook and follow the instructions to execute code cells and analyze the data.
- Modify the analysis as needed, add new visualizations or statistical tests, and document your findings.
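
As one example of adding a new visualization, the sketch below draws a scatter plot of `Scores` against `Hours` with a least-squares line. It is an assumption about what such a cell might look like, not code from the notebook: `plot_hours_vs_scores` is a hypothetical helper, and the toy values are placeholders rather than rows from the dataset.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_hours_vs_scores(df: pd.DataFrame, ax=None):
    """Scatter Scores against Hours and overlay a degree-1 least-squares fit.

    Hypothetical helper; assumes numeric `Hours` and `Scores` columns
    with no missing values.
    """
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 4))

    ax.scatter(df["Hours"], df["Scores"], alpha=0.8)

    # np.polyfit returns coefficients from highest degree to lowest,
    # so for deg=1 the result is (slope, intercept).
    slope, intercept = np.polyfit(df["Hours"], df["Scores"], deg=1)
    xs = np.linspace(df["Hours"].min(), df["Hours"].max(), 50)
    ax.plot(xs, slope * xs + intercept, color="tab:red", label="least-squares fit")

    ax.set_xlabel("Hours")
    ax.set_ylabel("Scores")
    ax.legend()
    return ax, float(slope), float(intercept)


# Placeholder values for illustration only; NOT taken from the Kaggle dataset.
toy = pd.DataFrame({"Hours": [1.0, 2.0, 3.0, 4.0], "Scores": [10.0, 20.0, 30.0, 40.0]})
ax, slope, intercept = plot_hours_vs_scores(toy)
ax.figure.savefig("hours_vs_scores.png", dpi=100)
```

Seaborn's `regplot` would be a reasonable alternative for the same picture; plain Matplotlib and NumPy are used here only to keep the fitted slope and intercept directly accessible.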
This project is licensed under the MIT License. See the `LICENSE` file for more details.
- Dataset source
- Pandas, Matplotlib, Seaborn, NumPy, SciPy, and other open-source libraries used in the analysis