Runpod.sh Refactor and new implementations. #14
base: master
Conversation
Was getting errors due to missing files; testing a refactor.
runpod.sh refactor
testing new approach in main.sh
fixing main.py link
Refactor GPU Parallelization and Benchmark Execution Logic
changed to agieval branch to fix requirements
Added a debug function that lets you know the script is still active.
shortening test
added gptq and 4bit options
fixed indentation
added gekko requirement
add bitsandbytes
add dtype to gptq
Changes will need to be made to the Colab notebook to add the 4-bit function.
Adding 4-bit increased processing times; I will be looking into how to fix this. I think I need to push a variable to GPTQ.
Need to push the variable bnb_4bit_compute_dtype="auto" (first attempt did not work); may need to add a function to detect the dtype and then pass it to GPTQ. Future ideas...
Running without issue; failing to run? (Need to test further.)
Converting to draft until I am able to solve the discovered issues.
change debug print statement
Attempting to pass through bnb_dtype.
fixed truthful qa and bigbench
adjusting for error: ImportError: /usr/local/lib/python3.10/dist-packages/vllm/_C.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
TL;DR: Refactored the code in Runpod.sh and fixed any issues this caused in downstream files.
Updated the evaluation-harness repo from https://github.com/dmahan93/lm-evaluation-harness to https://github.com/EleutherAI/lm-evaluation-harness and made the necessary changes to Table.py to fix any formatting issues.
Runpod.sh:
GPU Detection and Parallelization Logic:
Added logic to set a flag for parallelization if multiple GPUs are detected.
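A minimal sketch of what such a check could look like; the NUM_GPUS and PARALLELIZE variable names are illustrative, not necessarily the ones used in the script:

```bash
# Count the GPUs visible to the pod and only enable parallelization when more than one is found.
NUM_GPUS=$(nvidia-smi --list-gpus | wc -l)

if [ "$NUM_GPUS" -gt 1 ]; then
    PARALLELIZE="True"
else
    PARALLELIZE="False"
fi
echo "Detected $NUM_GPUS GPU(s); parallelize=$PARALLELIZE"
```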
Test Quantization:
Set up the ability to run models quantized with GPTQ or loaded in 4-bit for testing.
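For illustration, the quantization choice could be folded into the model arguments roughly like this; the MODEL and MODEL_ARGS names and the exact option strings are assumptions, not the script's actual values:

```bash
# Append the requested quantization option to the model_args string passed to the harness.
MODEL_ARGS="pretrained=$MODEL,dtype=auto"

if [ "$AUTOGPTQ" == "True" ]; then
    MODEL_ARGS="$MODEL_ARGS,autogptq=True"      # load a GPTQ-quantized checkpoint
elif [ "$LOAD_IN_4BIT" == "True" ]; then
    MODEL_ARGS="$MODEL_ARGS,load_in_4bit=True"  # load with bitsandbytes 4-bit
fi
```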
Package Installation:
Added installation of the deepspeed, gekko, and auto-gptq libraries.
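Along the lines of (bitsandbytes is also added per the commit list):

```bash
# Extra dependencies needed for DeepSpeed and the quantized (GPTQ / 4-bit) paths.
pip install deepspeed gekko auto-gptq bitsandbytes
```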
Environment Variable Handling:
Introduced LOAD_IN_4BIT and AUTOGPTQ environment variables with default fallbacks.
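A sketch of the fallback pattern; the default values shown here are assumptions:

```bash
# Use the value provided by the Runpod template if set, otherwise fall back to a default.
LOAD_IN_4BIT="${LOAD_IN_4BIT:-False}"
AUTOGPTQ="${AUTOGPTQ:-False}"
```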
Script Structure:
Introduced the functions run_benchmark_nous and run_benchmark_openllm to encapsulate the benchmark-running logic so future benchmarks can follow the same pattern.
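A rough outline of the shape of these functions; the CLI flags follow the lm-evaluation-harness main.py interface, but the exact arguments, paths, and output locations here are illustrative:

```bash
run_benchmark_nous() {
    local task="$1"
    # Run one Nous-suite task against the configured model and write results to its own JSON file.
    python main.py \
        --model hf-causal \
        --model_args "$MODEL_ARGS" \
        --tasks "$task" \
        --output_path "./${task}.json"
}

run_benchmark_openllm() {
    local task="$1"
    local num_fewshot="$2"
    # Open LLM Leaderboard tasks additionally take a few-shot count.
    python main.py \
        --model hf-causal \
        --model_args "$MODEL_ARGS" \
        --tasks "$task" \
        --num_fewshot "$num_fewshot" \
        --output_path "./${task}.json"
}
```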
Refactoring the Benchmark Execution:
Replaced benchmark execution blocks with calls to the newly defined functions.
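The call sites might then look something like this; the BENCHMARK variable and the task names and few-shot counts are examples, not a definitive list:

```bash
# Dispatch to the wrappers instead of repeating the full python invocation for every task.
if [ "$BENCHMARK" == "nous" ]; then
    run_benchmark_nous "agieval"
    run_benchmark_nous "bigbench"
    run_benchmark_nous "truthfulqa"
elif [ "$BENCHMARK" == "openllm" ]; then
    run_benchmark_openllm "arc_challenge" 25
    run_benchmark_openllm "hellaswag" 10
fi
```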
Improved Error Handling and Logging:
Added a check for the presence of main.py before attempting to run it.
Replaced the paths for main.py and other files to resolve "Cannot find file" errors.
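A minimal sketch of the check, assuming the harness is cloned into a lm-evaluation-harness directory (the path is illustrative):

```bash
# Fail early with an explicit message instead of a bare "Cannot find file" error later on.
if [ ! -f "lm-evaluation-harness/main.py" ]; then
    echo "ERROR: lm-evaluation-harness/main.py not found; did the clone succeed?"
    exit 1
fi
```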
Infinite Sleep for Debug Mode:
Added a message to indicate that the script is in debug mode, providing clarity during script execution.
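Something along these lines, assuming a DEBUG environment variable gates the behavior:

```bash
# Announce debug mode and keep the pod alive so logs and outputs can be inspected.
if [ "$DEBUG" == "True" ]; then
    echo "DEBUG mode: benchmarks finished, sleeping indefinitely so the pod stays up."
    sleep infinity
fi
```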