
Python Clean-Up Pass 2: Tests & Code Improvements #1293

Closed
wants to merge 15 commits into from

Conversation

@cosmic-snow (Collaborator) commented Jul 30, 2023

Describe your changes

  • Python tests:
    • Add pytest markers for better test selection control
    • Add the ability to set model folders other than the default via the new PYTEST_MODEL_PATH environment variable (tests only)
    • Add tests for the gpt4all.gpt4all module / GPT4All class
    • Add tests for backend code (ctypes related) in the gpt4all.pyllmodel module
      • not exhaustive but the important parts are there
      • uncovered minor bugs (embeddings memory management) and a minor missing feature (embeddings memory requirement)
    • Add tests for the rest of the gpt4all.pyllmodel module in test_pyllmodel.py (focused on the LLModel class)
  • Many more type hints:
    • pyllmodel module
    • gpt4all module
  • Improve embedding logic in LLModel class
  • Factor out some logic into separate methods in GPT4All to make it more flexible:
    • Model initialization logic: so the heavy lifting could be handled differently in a subclass (e.g. deferred).
    • Chat session open/close: so it's easier to manage a session if there are separate localities (e.g. when using a framework).

There are no API breaking changes.

Pytest Markers

  • For the documentation on how markers work see: https://docs.pytest.org/en/7.4.x/how-to/mark.html
  • The following markers are now defined (see pytest.ini):
    • inference: all tests that do inference on a model. These are typically very taxing on both CPU and RAM
    • online: tests that require an internet connection (e.g. for models.json or the embeddings model)
    • slow: tests which typically take more than a few seconds; e.g. the heavier inference tests
    • c_api_call: only in test_backend.py, on tests which call the C API. While other Python tests can ultimately call the backend, too, the idea is that these can be run after backend changes.
  • Example: pytest -m "not inference and not online"
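Registered markers like these live in the `markers` section of pytest.ini. A sketch of what the definitions could look like (the marker names come from this PR; the exact description wording is illustrative):

```ini
[pytest]
markers =
    inference: tests that run inference on a model (taxing on both CPU and RAM)
    online: tests that require an internet connection (e.g. models.json, embeddings model)
    slow: tests that typically take more than a few seconds
    c_api_call: tests in test_backend.py that call the C API directly
```

Registering markers this way lets `pytest -m` expressions select subsets, and `--strict-markers` can then catch typos in marker names.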

Specifying Model Path with PYTEST_MODEL_PATH

Tests now look for an environment variable PYTEST_MODEL_PATH which can be used to specify model folders. Example Linux:

$ export PYTEST_MODEL_PATH="${HOME}/.local/share/nomic.ai/GPT4All:${HOME}/.cache/gpt4all"

More details in tests/__init__.py in the docstring.
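Since the variable follows the usual PATH convention, splitting it into folders is straightforward. A minimal sketch of how tests could consume it (hypothetical helper, not the actual code in tests/__init__.py; `get_model_folders` is an illustrative name):

```python
import os


def get_model_folders():
    """Split the optional PYTEST_MODEL_PATH variable into a list of folders.

    Returns None when the variable is unset, so the caller can fall back
    to the GPT4All default model directory.
    """
    raw = os.environ.get("PYTEST_MODEL_PATH")
    if raw is None:
        return None  # let GPT4All trigger its own default-directory fallback
    # Same separator convention as PATH: ':' on Linux/macOS, ';' on Windows.
    return [folder for folder in raw.split(os.pathsep) if folder]
```

With the Linux example above, this would yield the two folders in order, so tests can probe each until a model file is found.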

Issue ticket number and link

None

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • I have added thorough documentation for my code.
  • I have tagged PR with relevant project labels. I acknowledge that a PR without labels may be dismissed.
  • If this PR addresses a bug, I have provided both a screenshot/video of the original bug and the working solution.

Demo

None

Steps to Reproduce

mostly: pytest gpt4all-bindings/python 😉

Notes

I'm not sure yet about everything I'll include in this one; I might split it up further.

@cosmic-snow added the python-bindings label (gpt4all-bindings Python specific issues) on Jul 30, 2023
@cosmic-snow (Collaborator, Author) commented Jul 30, 2023

The first few commits are still from the previous PR. Ignore them; they will be rebased eventually. (Edit: Done.)

@cosmic-snow (Collaborator, Author):

This has become quite big even before refactoring.

At the moment, it's basically all tests. I think I'll do just a few code improvements now. And I won't worry about everything else I wanted to do here.

I'll open it up for review in any case. From here, it'll be some changes and squashing a bunch of commits.

@cosmic-snow cosmic-snow marked this pull request as ready for review August 13, 2023 21:50
@cosmic-snow (Collaborator, Author) commented Aug 14, 2023

I think I'll leave it at this, just some squashing still to do. Maybe some very minor touch-ups, as well.

  - Mark decorators in test_gpt4all.py
  - Add note to the makefile
  - pytest marks need to be defined in a config file, here: pytest.ini
  - see: https://docs.pytest.org/en/7.4.x/how-to/mark.html
  - Specify optional PYTEST_MODEL_PATH environment variable.
  - Format is like other PATH env vars, i.e. multiple folders allowed.
  - If set, these are the folder(s) in which models will be looked up
    for the tests.
  - Tests are focused on the backend, i.e. the C API (through ctypes)
  - Add another pytest marker specifically for the backend: c_api_call
  - Tests for the rest of the pyllmodel module which are mostly focused
    on the LLModel class.
  - module gpt4all
  - module pyllmodel
  - edit gpt4all_python.md to match the _format_chat_prompt_template() change

  - Separate _open_chat_session() and _close_chat_session() methods so
    it's easier to manage a session if there are separate
    localities (e.g. when using a framework).
…i#1293)

  - Make it more flexible so e.g. the heavy lifting could be handled
    differently in a subclass.
@apage43 (Member) commented Aug 24, 2023

It appears that if PYTEST_MODEL_PATH is not set, many tests fail with 'Invalid model path' instead of using a reasonable default:

======== short test summary info ========
FAILED gpt4all/tests/test_gpt4all.py::test_inference - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_long_orca_3b - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_long_falcon - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_long_llama_7b - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_long_llama_13b - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_long_mpt - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_long_replit - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_long_groovy - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_hparams - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_falcon - ValueError: Invalid model directory: None
FAILED gpt4all/tests/test_gpt4all.py::test_inference_mpt - ValueError: Invalid model directory: None
ERROR gpt4all/tests/test_backend.py::test_llmodel_model_create - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_model_create2 - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_model_destroy - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_required_mem - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_loadModel_valid - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_loadModel_invalid - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_isModelLoaded - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_prompt - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_embedding - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_valid[0] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_valid[1] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_valid[2] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_valid[4] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_valid[8] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_valid[65] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_exception[None] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_exception[10011] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_exception[sixteen] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_invalid[-1] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_invalid[-4] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'
ERROR gpt4all/tests/test_backend.py::test_llmodel_get_set_threadCount_invalid[-257] - TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'

@apage43 (Member) commented Aug 25, 2023

With a directory specified, some tests that check output fail on Mac because the Metal implementation does not produce the same result:

======== short test summary info ========
FAILED gpt4all/tests/test_gpt4all.py::test_inference_long_groovy - requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(2487988224 bytes read, 1297260057 more expected)', IncompleteRead(2487988224 bytes read, 1297260057 more expected))
FAILED gpt4all/tests/test_gpt4all.py::test_inference_hparams - AssertionError: assert 'Paris' in '1.\n'
FAILED gpt4all/tests/test_gpt4all.py::TestGPT4AllInstance::test_generate - AssertionError: assert ' peaches and...instructions?' == ' apples and ... and bananas.'
FAILED gpt4all/tests/test_gpt4all.py::TestGPT4AllInstance::test_generate_streaming - AssertionError: assert ' peaches and...instructions?' == ' apples and ... and bananas.'
FAILED gpt4all/tests/test_pyllmodel.py::TestLLModelInstance::test_prompt_model - assert ' And let him...e bed." In an' == '\nTheir eyes... on the prize'
FAILED gpt4all/tests/test_pyllmodel.py::TestLLModelInstance::test_prompt_model_streaming - assert ' And let him...e bed." In an' == '\nTheir eyes... on the prize'
======== 6 failed, 87 passed, 1 warning in 1638.82s (0:27:18) ========

@cosmic-snow (Collaborator, Author) commented Aug 25, 2023

Thanks for having a look.

It appears that if PYTEST_MODEL_PATH is not set, many tests fail with 'Invalid model path' instead of using a reasonable default:

Good catch, I failed to consider that. Although it already handles the "reasonable default" case by returning None, since None is the default value that makes the GPT4All class trigger its own fallback. I need to do the same for the backend tests, for now at least.

With a directory specified, some tests that check output fail on Mac because the Metal implementation does not produce the same result:

Ah, interesting. I'd wager I don't have Metal support on CI? @acrastt was kind enough to help me run some tests on a Mac because I don't have one, but that was without Metal, too. I guess I'll just make the tests more general.

Edit:

  • Not sure what that first error in test_gpt4all.py::test_inference_long_groovy is about, though. That's not something I've touched, I think; I've only changed the model folder logic there. And the test passes here. Was that on a machine without internet access?
  • I had not changed the logic in test_gpt4all.py::test_inference_hparams either, and it's not deterministic (no temp=0).

I have inference off most of the time here, so that slipped through. By the way, is there a way to make it truly deterministic across all architectures? temp=0 is not enough.

@cosmic-snow (Collaborator, Author) commented Aug 25, 2023

By the way, looking at those pre-existing tests, I don't understand what do_long_input() is supposed to achieve in test_gpt4all. It might be a holdover from earlier, I'll investigate that later.

Edit: I guess it's fine, in that it tests whether longer inputs are possible. Although it's not very useful anymore, since n_predict changed from 128 to 4096 in LLModel.

@apage43 (Member) commented Sep 2, 2023

that it tests whether longer inputs are possible

This is what it's for: specifically, I added it to test that memory management for long contexts in the backend has not broken.

@apage43 (Member) left a review comment:

Outputs are not necessarily 100% consistent across backends or build configurations. Tests should probably not check for exact output strings; if they do, they should force running on the CPU (though even with Metal off, the macOS build may get different results, as ggml on macOS still differs significantly from the Windows and Linux builds).

test_inference_hparams still fails on my machine because 'Paris' does not appear in the first 3 tokens.
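One way to avoid brittle exact-string checks is a tolerant assertion helper that looks for expected keywords instead of comparing full outputs. A hypothetical sketch (not code from this PR; `assert_contains_any` is an illustrative name):

```python
def assert_contains_any(output: str, keywords: list[str]) -> None:
    """Pass if any of the expected keywords appears in the model output.

    Exact model output varies across backends (CPU vs. Metal) and build
    configurations, so checking for keywords is more robust than
    comparing exact strings.
    """
    lowered = output.lower()
    assert any(keyword.lower() in lowered for keyword in keywords), (
        f"none of {keywords!r} found in output: {output!r}"
    )
```

A test like test_inference_hparams could then assert that "Paris" appears anywhere in a longer completion, rather than within the first few tokens, which is what breaks on macOS here.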

@cosmic-snow (Collaborator, Author):

Alright, I hope that's that.

If it's ok like this, I think I'll rebase onto main and merge those fixes into their respective commits.

@apage43 (Member) commented Sep 2, 2023

All tests now pass for me if I do set PYTEST_MODEL_PATH, but some still fail without it: https://gist.github.com/apage43/75247c0111363cde4dedd81b80217caa

Weird that this doesn't repro for you. If you're sure the var is not set and it all passes, it could be a macOS-specific thing, in which case I'm okay with calling that out in the Python bindings README and leaving it that way for now; I'll try to debug it myself later.

@cosmic-snow (Collaborator, Author) commented Sep 2, 2023

Weird that this doesn't repro for you. If you're sure the var is not set and it all passes, it could be a macOS-specific thing, in which case I'm okay with calling that out in the Python bindings README and leaving it that way for now; I'll try to debug it myself later.

I'm not sure about it, I'll have to try again. I'm usually running all of these through VS Codium, which comes with .env file support out of the box. It's very convenient, but there's still a possibility I didn't test it quite right because of that.

If you don't set the PYTEST_MODEL_PATH, the fallback will be ~/.cache/gpt4all. Are all of your models in that directory?

Actually, this is strange:

        if model_path is None:
            try:
                os.makedirs(DEFAULT_MODEL_DIRECTORY, exist_ok=True)
            except OSError as exc:
                raise ValueError(
                    f"Failed to create model download directory at {DEFAULT_MODEL_DIRECTORY}: {exc}. "
                    "Please specify model_path."
                )
            model_path = DEFAULT_MODEL_DIRECTORY
        else:
            model_path = model_path.replace("\\", "\\\\")
    
        if not os.path.exists(model_path):
>           raise ValueError(f"Invalid model directory: {model_path}")
E           ValueError: Invalid model directory: None

How is it still None at that point? None should trigger the creation and setting of the DEFAULT_MODEL_DIRECTORY.

Maybe I'll have to remove all my models and paths to find out what's going on here.

Edit: Alright, I see the problem. I guess I just failed to remove the env var here when I tried previously. There's a conversion from None to 'None' happening there, too. 🤦‍♂️

The idea is to eventually accept files/folders both as Path and str, but this version isn't there yet.
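The None-to-'None' conversion mentioned above is easy to reproduce in isolation. A minimal sketch of the pitfall (hypothetical, not the actual test code; the env var name is illustrative):

```python
import os

# When an environment variable is unset, os.environ.get() returns None.
model_path = os.environ.get("SOME_UNSET_MODEL_PATH_VAR")

# Converting it to a string produces the literal 'None', not a real path,
# which later fails the os.path.exists() check with
# "ValueError: Invalid model directory: None".
as_string = str(model_path)

# Safer: only convert when a value is actually present, so the caller's
# DEFAULT_MODEL_DIRECTORY fallback can still kick in on None.
safe_path = None if model_path is None else str(model_path)
```

This is also why the traceback above shows `Invalid model directory: None`: the check `if model_path is None` never fires once the value has been stringified.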

@apage43 (Member) commented Sep 5, 2023

If you don't set the PYTEST_MODEL_PATH, the fallback will be ~/.cache/gpt4all. Are all of your models in that directory?

They are, though for me it is a symlink to the path the GUI puts them in:

lrwxr-xr-x 1 aaron staff 57 Jun 30 14:20 /Users/aaron/.cache/gpt4all -> /Users/aaron/Library/Application Support/nomic.ai/GPT4All

@cosmic-snow (Collaborator, Author):

Alright, so it looks like the version that finally brings the project near upstream llama.cpp again (i.e. 2.5.0 of the chat GUI; GGUF support) is about to be ready.

I guess the PR is outdated now (I haven't followed everything closely), but the idea is still the same: if/when it gets accepted, refine things in a branch of the repo itself so there's access to CI, i.e. CI can run when commits are pushed.

How exactly that happens is up to you.

@manyoso (Collaborator) commented Mar 6, 2024

This is outdated and to be closed?

@manyoso manyoso closed this Mar 6, 2024
@cosmic-snow (Collaborator, Author):

This was outdated a while ago because you guys didn't want to move it forward or merge it when it was ready.

@cebtenzzre (Member):

This was outdated a while ago because you guys didn't want to move it forward or merge it when it was ready.

If you're still interested in updating this PR to GGUF and the latest bindings, I could run CI for you and review/merge it. apage43 hasn't been assigned to GPT4All for a while.

@cosmic-snow (Collaborator, Author) commented Mar 10, 2024

If you're still interested in updating this PR to GGUF and the latest bindings, I could run CI for you and review/merge it. apage43 hasn't been assigned to GPT4All for a while.

Why not talk to me on Discord while I'm there? Here's the summary of what has been going on and the state this is in:

  • It was basically ready by mid-August 2023
  • I had to pipe up on Discord to even get someone to look at it
  • There was one issue with PYTEST_MODEL_PATH which I didn't catch, although I thought I had it covered, because of my setup (the variable was set in multiple places, so removing it once was not enough to uncover the issue)
  • However, I fixed that with 21744b8 (early September)
  • I never heard back from anyone even when bringing it up.

Now I get that you people have been busy, but it's simply at a point where someone would have needed to spend some time on it. That shouldn't have been too much to ask.

I can't decide this for you.

  • I don't have a Mac myself, so the only way I could have the tests run was through another volunteer, @acrastt (no one at Nomic volunteered). And this wasn't even on one of the newer M-series machines, but it became clear there were differences between the platforms.

So what I would have liked to see was the following:

  • Running the latest commit (21744b8) to see that it's OK.
  • Figuring out if and how to move it to separate branch in the project itself, so it could run on CI. Note:
    • It's likely that not all the tests should be active for that, some are resource-heavy.
    • The work I did with Pytest markers was exactly to be able to select a subset of tests.
    • I don't know what resource constraints there are on CI nor do I know what it costs you, so this is something you'd need to decide.
  • With that in place, try to bring everything up-to-date (that means current main)
  • Only at that point, see whether it'd be a good idea to actually merge it into main.

If you're still interested in updating this PR to GGUF and the latest bindings, I could run CI for you and review/merge it.

You'll have to excuse me, but at this point in time, my motivation to contribute any more time to the project is at an all-time low.

And regardless, the things I've outlined above have never been addressed and you guys basically refuse to talk to me about it.
