Data Literacy Research

In this research project, I collaborate with Professor Sandra Cannon to measure data literacy expectations from the ways employers describe jobs and the people they are looking for. We examine the discriminatory power between how employers describe a job and what the actual work on that job entails.

Current Pipeline

Files:

linkedin.py: Use this file to generate the DataFrame of LinkedIn postings. Results are stored in data/scraping_results (tagged with "linkedin").

indeed.py: Same as linkedin.py, but for Indeed postings. Results are stored in data/scraping_results (tagged with "indeed").

Notes

Data files:

merged_headings_df: Contains both the LinkedIn and Indeed postings in a single DataFrame
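A minimal sketch of how two scraping results could be combined into one DataFrame like merged_headings_df. The column names (`heading`, `source`) and the sample rows are assumptions made for illustration; the actual schema produced by linkedin.py and indeed.py is not documented here.

```python
import pandas as pd

# Hypothetical stand-ins for the two scraping results; real column names may differ.
linkedin_df = pd.DataFrame(
    {"heading": ["Requirements"], "source": ["linkedin"]}
)
indeed_df = pd.DataFrame(
    {"heading": ["Qualifications", "Responsibilities"], "source": ["indeed", "indeed"]}
)

# Stack the two frames row-wise and rebuild a clean 0..n-1 index.
merged = pd.concat([linkedin_df, indeed_df], ignore_index=True)

print(merged)
```

Keeping a `source` column (an assumed name) makes it possible to split the merged frame back into LinkedIn and Indeed postings later.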

Utility Functions (in `utilities.utils`)

to_wcdf: Applies sklearn's CountVectorizer
preprocess_heading_text: Takes heading text (initially intended for merged_headings_df) and applies a preprocessing pipeline to it
visualize_counts: Takes a Pandas Series of string row entries and uses Seaborn to visualize the top n words in that corpus
visualize_seq_lengths: Visualizes the distribution of word lengths in a sequence

