4. Analyze Data

In this module, we analyze the normalized training data features from 3.normalize_data.

Feature Analysis

We use UMAP to analyze the features. UMAP was introduced in McInnes, L., Healy, J., 2018 as a manifold learning technique for dimension reduction. We use UMAP to reduce the feature data to 1 and 2 dimensions, and we use Matplotlib to visualize the resulting 1D and 2D UMAP embeddings.

For each UMAP reduction, we create two types of visualizations. The first colors every point by its phenotypic class. The second colors only the points from selected phenotypic classes and shows all other classes in gray.

Note: The phenotypic classes colored in the second visualization can be changed with the classes_2 variable in analyze_data.ipynb.
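The two coloring schemes can be sketched in plain Python. This is an illustrative assumption about how such a mapping might work, not the notebook's implementation: the class labels, the palette, and the helper names are hypothetical, and only the name `classes_2` comes from the notebook.

```python
GRAY = "lightgray"  # color for classes that are not highlighted


def color_by_class(labels, palette):
    """First visualization: every point takes the color of its class."""
    return [palette[label] for label in labels]


def color_highlighted(labels, palette, highlighted):
    """Second visualization: only highlighted classes keep their color."""
    return [palette[label] if label in highlighted else GRAY for label in labels]


# Hypothetical labels and palette, for illustration only.
labels = ["class_a", "class_b", "class_c", "class_a"]
palette = {"class_a": "tab:blue", "class_b": "tab:orange", "class_c": "tab:green"}

# Plays the role of the classes_2 variable mentioned in the note above.
classes_2 = ["class_b"]

all_colors = color_by_class(labels, palette)
subset_colors = color_highlighted(labels, palette, classes_2)

print(all_colors)
print(subset_colors)
```

Either list can be passed as the `c` argument of Matplotlib's `scatter`, with one color per plotted point.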

Step 1: Analyze Data

Use the commands below to analyze the training data.

# Make sure you are located in 4.analyze_data
cd 4.analyze_data

# Activate mitocheck_data conda environment
conda activate mitocheck_data

# Analyze data
bash analyze_data.sh