This is the repository for our paper "Interpreting Deep NLP Models using Concept Activation Vectors". It contains the code to implement the methodology and to reproduce the results. We propose a plug-and-play methodology for analyzing representations in deep neural network (NN) models.
There are several configurations in which the models can be run and interpreted; they are listed below:
- With activations, with [MASK]
- With activations, without [MASK]
- With gradients, without [MASK]
- With gradients, with [MASK]
We currently experiment with activations and gradients of trained language models (BERT, XLNet). Our goal is to associate the learned representations with concepts defined at a higher level of abstraction.
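As a rough illustration of the underlying idea (this is a minimal sketch of the general CAV/TCAV recipe, not the exact code in this repository; the function and variable names are illustrative), a Concept Activation Vector can be obtained by training a linear classifier that separates activations of concept examples from activations of base/random examples, and the TCAV score is the fraction of inputs whose directional derivative along that vector is positive:

```python
# Minimal sketch of the CAV/TCAV idea (illustrative only; the actual pipeline lives in code/).
# Assumes concept_acts and base_acts are numpy arrays of shape (num_tokens, hidden_dim)
# holding activations from a single layer of the model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def compute_cav(concept_acts, base_acts):
    X = np.concatenate([concept_acts, base_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(base_acts))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    # The CAV is the vector orthogonal to the decision boundary, pointing towards the concept.
    return clf.coef_[0]

def tcav_score(gradients, cav):
    # Fraction of test examples whose directional derivative along the CAV is positive.
    return float(np.mean(gradients @ cav > 0))
```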
The code base is in the `code/` folder. The configuration is defined in the `code/run.sh` file and can be altered to run any of the listed experiments.

The file `code/run.sh` is divided into two parts. The first part consists of the paths to the activations extracted from all layers of the trained models:
model="bert-base-cased" # this is where the model is defined
base_folder="final_experiments/with_acts/with_mask/layer_1" # change the path to your code files here. For every layer, a new folder is constructed.
DIR="/alt/mt/work/durrani/Causation_Analysis/ProbingClassifiers/POS/Representations/${model}" # this is the base directory containing the representations
data_folder="/alt/mt/tcav/data" # the folder that has the dataset
concept_path="${DIR}/wsj.train.conllx.word" # the concept sents
concept_labels="${DIR}/wsj.train.conllx.label" # the concept labels
concept_activations="${data_folder}/${model}/wsj.train.conllx.json" # the concept activations
base_path="${DIR}/wsj.20.test.conllx.word" # the base sents
base_labels="${DIR}/wsj.20.test.conllx.label" # the base labels
base_activations="${data_folder}/${model}/wsj.20.test.conllx.json" # the base activations
output_directory="/alt/mt/tcav/${model}/${base_folder}" # the directory to store the pickle files in
layer_wise_cav_pickle_path="/alt/mt/tcav/${model}/${base_folder}/layer_wise_cavs.pickle" # the path to store layer wise CAV pickle file in
controlled_layer_wise_cav_path="/alt/mt/tcav/${model}/${base_folder}/controlled_layer_wise_cavs.pickle" # the path to store controlled layer wise CAV pickle file
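The `*.word` and `*.label` files are plain-text sentence and label files, and the `*.json` files hold the extracted activations. Their exact internal format is determined by the extraction step; the sketch below only illustrates how line-aligned word/label files of this kind are commonly read (the one-sentence-per-line, whitespace-separated layout is an assumption, not a description of the repository's loaders):

```python
# Illustrative only: reading line-aligned sentence/label files.
# The actual loaders live in code/.
def load_sentences(word_file, label_file):
    with open(word_file) as wf, open(label_file) as lf:
        sents = [line.split() for line in wf]
        labels = [line.split() for line in lf]
    assert len(sents) == len(labels), "word and label files must be line-aligned"
    return sents, labels

# Example, using the paths configured in run.sh:
# concept_sents, concept_tags = load_sentences(concept_path, concept_labels)
```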
The second part defines the model and experiment configuration:
mode_tcav="wm" # this it the mode, wm -> word mask
word="[MASK]" # the type of mask [MASK] for BERT, for roBERTa, change to <mask> here.
model_type="LR" # for the classifier.
process_mode="1" # 0 is for non-MASK, 1 is for MASK.
use_grad="0" # 0 for acts, 1 for grad.
if_controlled="1" # 0 for no controlled experiment, 1 for controlled experiment as well.
workers=8 # the parallel workers.
runs=50 # the number of runs.
if_rand=1 # if you want to do the t-test.
name="fcontrolled_experiment_L1"
Once the experiment is complete, there should be several files in the folder you set as `base_folder` in `code/run.sh`. The results are contained in the file named `inference_word_mode_masked_test.pickle`.
Once `inference_word_mode_masked_test.pickle` has been generated, pass it to `inference/Inference_Plots_TCAV.ipynb`; the notebook displays the TCAV matrices showing the TCAV score for each test and [MASK] concept.
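If you prefer to inspect the results outside the notebook, the pickle can also be loaded directly. The exact structure of the stored object is defined by the repository's inference code, so the nested-dictionary layout assumed below (`{concept: {test_class: score}}`) is only an illustration and may differ from the actual one:

```python
# Illustrative only: loading and visualising TCAV scores from the results pickle.
# The real plotting code is in inference/Inference_Plots_TCAV.ipynb.
import pickle
import numpy as np
import matplotlib.pyplot as plt

with open("inference_word_mode_masked_test.pickle", "rb") as f:
    results = pickle.load(f)  # assumed: {concept: {test_class: score}}

concepts = sorted(results.keys())
tests = sorted(next(iter(results.values())).keys())
matrix = np.array([[results[c][t] for t in tests] for c in concepts])

plt.imshow(matrix, cmap="viridis", vmin=0.0, vmax=1.0)
plt.xticks(range(len(tests)), tests, rotation=90)
plt.yticks(range(len(concepts)), concepts)
plt.colorbar(label="TCAV score")
plt.tight_layout()
plt.show()
```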