Runpod.sh Refactor and new implementations. #14

Draft
wants to merge 27 commits into base: master
Conversation

Steel-skull
Contributor

TL;DR: Refactored the code in Runpod.sh and fixed the issues this caused in downstream files.

Updated the evaluation repo from https://github.com/dmahan93/lm-evaluation-harness to https://github.com/EleutherAI/lm-evaluation-harness and made the necessary changes to Table.py to fix format issues.

Runpod.sh:

GPU Detection and Parallelization Logic:
Added logic to set a flag for parallelization if multiple GPUs are detected.
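A minimal sketch of such a detection block. The use of `nvidia-smi -L` as the probe and the variable names `GPU_COUNT`/`PARALLELIZE` are assumptions; the PR's actual names may differ:

```shell
# Count visible GPUs; nvidia-smi -L prints one line per device.
# Fall back to 0 when the binary is absent (e.g. CPU-only pods).
count_gpus() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L 2>/dev/null | wc -l
  else
    echo 0
  fi
}

GPU_COUNT=$(count_gpus)
if [ "$GPU_COUNT" -gt 1 ]; then
  PARALLELIZE="True"    # flag later consumed by the benchmark calls
else
  PARALLELIZE="False"
fi
echo "Detected $GPU_COUNT GPU(s); parallelize=$PARALLELIZE"
```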

Test Quantization:
Setup ability to quantize models for testing.

Package Installation:
Added installation of the deepspeed, gekko, and auto-gptq libraries.

Environment Variable Handling:
Introduced LOAD_IN_4BIT and AUTOGPTQ environment variables with default fallbacks.
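The fallback pattern can be sketched with POSIX parameter expansion; the default value "False" shown here is an assumption, not necessarily the PR's actual default:

```shell
# Use the caller's exported value when present, otherwise fall back
# to a default ("False" is assumed here for illustration).
LOAD_IN_4BIT="${LOAD_IN_4BIT:-False}"
AUTOGPTQ="${AUTOGPTQ:-False}"
echo "LOAD_IN_4BIT=$LOAD_IN_4BIT AUTOGPTQ=$AUTOGPTQ"
```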

Script Structure:
Introduced functions run_benchmark_nous and run_benchmark_openllm to encapsulate benchmark execution so future benchmarks can reuse the same logic.
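One way such wrappers could look. The shared runner, task lists, model args, and output paths below are illustrative sketches, not the PR's exact values:

```shell
# Shared runner; both wrappers delegate here with their own task list.
run_benchmark() {
  benchmark_name="$1"
  tasks="$2"
  echo "Running $benchmark_name ..."
  python main.py \
    --model hf-causal-experimental \
    --model_args "pretrained=$MODEL,use_accelerate=$PARALLELIZE" \
    --tasks "$tasks" \
    --batch_size "$BATCH_SIZE" \
    --output_path "./${benchmark_name}.json"
}

# Wrapper names match the PR; the task lists here are placeholders.
run_benchmark_nous()    { run_benchmark "nous"    "agieval_aqua_rat"; }
run_benchmark_openllm() { run_benchmark "openllm" "arc_challenge,hellaswag"; }
```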

Refactoring the Benchmark Execution:
Replaced benchmark execution blocks with calls to the newly defined functions.

Improved Error Handling and Logging:
Added a check for the presence of main.py before attempting to run it.
Corrected the paths to main.py and other files to resolve "Cannot find file" errors.
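The existence check might look like this; the helper name `require_file` and the path in the usage comment are illustrative:

```shell
# Fail fast with an explicit message instead of a vague "Cannot find file".
require_file() {
  if [ ! -f "$1" ]; then
    echo "ERROR: required file not found: $1" >&2
    return 1
  fi
  return 0
}

# Usage in the script (path is illustrative):
#   require_file "lm-evaluation-harness/main.py" || exit 1
```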

Infinite Sleep for Debug Mode:
Added a message to indicate that the script is in debug mode, providing clarity during script execution.
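A common shape for that debug idle loop; the function name, message wording, and 5-minute interval are assumptions:

```shell
# Print a periodic heartbeat so the pod log shows the script is idle by
# design rather than hung; call this instead of a bare `sleep infinity`.
debug_keepalive() {
  while true; do
    echo "[debug] $(date -u +%H:%M:%S) container held open for inspection"
    sleep 300   # interval is an assumption
  done
}
```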

was getting errors due to unfound files testing a refactor.
testing new approach in main.sh
fixing main.py link
Refactor GPU Parallelization and Benchmark Execution Logic
changed to agieval branch to fix requirements
added a function to debug that lets you know its still active
shortening test
added gptq and 4bit options
fixed indentation
added gekko requirement
add bitsandbytes
add dtype to gptq
@Steel-skull
Contributor Author

Changes will need to be made to the Colab to add the 4-bit function.

@Steel-skull
Contributor Author

Steel-skull commented Jan 30, 2024

Adding 4-bit quantization increased processing times. I will be looking into how to fix this; I think I need to push a variable through to GPTQ.

Warning in runpod:
UserWarning: Input type into Linear4bit is torch.float16, but bnb_4bit_compute_dtype=torch.float32 (default). This will lead to slow inference or training speed.

Need to pass the variable bnb_4bit_compute_dtype="auto" (first attempt did not work).

May need to add a function to detect the model's dtype and then pass it through to GPTQ.
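One candidate fix is to thread an explicit compute dtype into model_args so bitsandbytes stops defaulting to float32. Whether the harness forwards bnb_4bit_compute_dtype through to from_pretrained depends on the harness and transformers versions in use, so this is a sketch, not a confirmed fix; BNB_DTYPE and the model name are placeholders:

```shell
# Default the compute dtype to float16 to match the Linear4bit input
# type from the warning; override by exporting BNB_DTYPE.
BNB_DTYPE="${BNB_DTYPE:-float16}"
MODEL="${MODEL:-some/model}"   # placeholder model name
MODEL_ARGS="pretrained=$MODEL,load_in_4bit=True,bnb_4bit_compute_dtype=$BNB_DTYPE"
echo "$MODEL_ARGS"
```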

future ideas....

@Steel-skull
Contributor Author

Running without issue:
- AGIEval
- GPT4All

Failing to run? (Need to test further.)
- TruthfulQA
- Bigbench

@Steel-skull
Contributor Author

Converting to draft until I am able to solve the discovered issues.

@Steel-skull Steel-skull marked this pull request as draft February 2, 2024 16:44
change debug print statement
attempting to passthrough bnb_dtype
fixed truthful qa and bigbench
adjusting for error:
ImportError: /usr/local/lib/python3.10/dist-packages/vllm/_C.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi