
SHARK Tank

WARNING: This is an early, in-progress preview. It is not ready for general use.

Lightweight, inference-optimized layers and models for popular GenAI applications.

This sub-project is a work in progress. It is intended to be a repository of layers, model recipes, and conversion tools for models produced by popular LLM quantization tooling.

Project Status

CI - Perplexity

Examples

The repository will ultimately grow into a curated set of models and tools for constructing them, but for the moment it largely contains CLI examples. These are all under active development and should not yet be expected to work.
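
These examples assume sharktank and its dependencies are already installed in your Python environment. One way to do that (a sketch, assuming the standard pip-installable layout in this directory) is an editable install from the sharktank directory:

python3 -m pip install -e .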

Perform batched inference in PyTorch on a paged, Llama-derived LLM:

python -m sharktank.examples.paged_llm_v1 \
  --hf-dataset=open_llama_3b_v2_f16_gguf \
  "Prompt 1" \
  "Prompt 2" ...

Export an IREE-compilable batched LLM for serving:

python -m sharktank.examples.export_paged_llm_v1 \
  --hf-dataset=open_llama_3b_v2_f16_gguf \
  --output-mlir=/tmp/open_llama_3b_v2_f16.mlir \
  --output-config=/tmp/open_llama_3b_v2_f16.json
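
The exported MLIR is meant to be compiled with IREE. A minimal sketch of that step, assuming a CPU target and that iree-compile is on your PATH (exact flags depend on your IREE version and target):

iree-compile /tmp/open_llama_3b_v2_f16.mlir \
  --iree-hal-target-backends=llvm-cpu \
  -o /tmp/open_llama_3b_v2_f16.vmfb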

Dump parsed information about a model from a GGUF file:

python -m sharktank.tools.dump_gguf --hf-dataset=open_llama_3b_v2_f16_gguf

Package Python Release Builds

  • To build wheels for Linux:

    ./build_tools/build_linux_package.sh

    That should produce build_tools/wheelhouse/sharktank-{X.Y.Z}.dev0-py3-none-any.whl, which can then be installed with

    python3 -m pip install build_tools/wheelhouse/sharktank-{X.Y.Z}.dev0-py3-none-any.whl
  • To build a wheel for your host OS/arch manually:

    # Build sharktank.*.whl into the dist/ directory
    #   e.g. `sharktank-3.0.0.dev0-py3-none-any.whl`
    python3 -m pip wheel -v -w dist .
    
    # Install the built wheel.
    python3 -m pip install dist/*.whl
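
After installing the wheel by either method, a quick import check confirms the package is usable (run it outside the source tree so the installed copy, not the checkout, is imported):

python3 -c "import sharktank; print(sharktank.__file__)"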