AI Singapore SEA-LION model served by vLLM inference server with Docker Compose

Requirements

Docker
GPU: compute capability 7.0 or higher (e.g., V100, T4, RTX20xx, A100, L4, H100)
CUDA 12.1
Check https://docs.vllm.ai/en/latest/getting_started/installation.html#requirements for the up-to-date requirements.
80 GB of disk storage for the model and Docker image
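
The GPU requirement can be checked up front. The following is a minimal sketch, assuming a recent NVIDIA driver whose `nvidia-smi` supports the `compute_cap` query field; the helper names `cap_ok` and `check_gpus` are hypothetical and not part of this repository.

```shell
# Hypothetical helper: succeeds when a compute capability string
# (e.g. "7.5") meets the 7.0 minimum listed above.
cap_ok() {
  awk -v cap="$1" 'BEGIN { exit !(cap + 0 >= 7.0) }'
}

# Assumption: the installed nvidia-smi supports the compute_cap field.
# Prints one line per GPU that nvidia-smi reports.
check_gpus() {
  nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
    while IFS=, read -r name cap; do
      if cap_ok "$cap"; then
        echo "OK: $name ($cap)"
      else
        echo "TOO OLD: $name ($cap)"
      fi
    done
}
```

Run `check_gpus` on the host that will run the container.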

SEA-LION

This section describes the setup of the SEA-LION models.

SEA-LION v2.1

Download LLaMA3 8B CPT SEA-LIONv2.1 Instruct.
Copy the model into the models directory, or add a symbolic link to it there, so that it is available at ./models/llama3-8b-cpt-sea-lionv2.1-instruct. For example, if the model was downloaded to ~/downloads/llama3-8b-cpt-sea-lionv2.1-instruct, add the symbolic link with:
```
ln -s ~/downloads/llama3-8b-cpt-sea-lionv2.1-instruct models/
```
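
The linking step can be wrapped in a small function that resolves the download location to an absolute path first, since a relative symlink target is resolved relative to the directory that holds the link. This is a sketch only; `link_model` is a hypothetical name.

```shell
# Hypothetical helper: link a downloaded model directory into ./models
# under its own directory name. Run it from the repository root.
link_model() {
  src="$(cd "$1" && pwd)" || return 1   # resolve to an absolute path
  mkdir -p models
  # -s: symbolic link, -f: replace an existing link, -n: do not follow it
  ln -sfn "$src" "models/$(basename "$src")"
}
```

For example, `link_model ~/downloads/llama3-8b-cpt-sea-lionv2.1-instruct`.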

SEA-LION v3

Download Gemma2 9B CPT SEA-LIONv3 Instruct.

Set these values to null in config.json in the model directory.

"attn_logit_softcapping": null,
"final_logit_softcapping": null,

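The edit can also be scripted. The sketch below uses `sed` rather than a JSON parser, so it assumes config.json is laid out with one `"key": value` pair per line, as in the snippet above; `null_softcapping` is a hypothetical helper name.

```shell
# Hypothetical helper: set both softcapping values to null in place.
# Argument: path to the model's config.json.
null_softcapping() {
  cfg="$1"
  tmp="$(mktemp)" || return 1
  # Replace whatever follows the colon, up to the next comma or brace.
  if sed -E 's/("(attn|final)_logit_softcapping"[[:space:]]*:[[:space:]]*)[^,}]+/\1null/g' \
      "$cfg" > "$tmp"; then
    cat "$tmp" > "$cfg"   # write back without replacing the file itself
    status=$?
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}
```

For example, `null_softcapping ~/downloads/gemma2-9b-cpt-sea-lionv3-instruct/config.json`. Keep a copy of the original file if in doubt.
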
Copy the model into the models directory, or add a symbolic link to it there, so that it is available at ./models/gemma2-9b-cpt-sea-lionv3-instruct. For example, if the model was downloaded to ~/downloads/gemma2-9b-cpt-sea-lionv3-instruct, add the symbolic link with:
```
ln -s ~/downloads/gemma2-9b-cpt-sea-lionv3-instruct models/
```

Set MODEL_NAME.

 export MODEL_NAME=gemma2-9b-cpt-sea-lionv3-instruct
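
Before starting the service, it may help to confirm that MODEL_NAME matches a directory under ./models on the host. The sketch below assumes the model is in Hugging Face format and therefore ships a config.json; `check_model` is a hypothetical helper name.

```shell
# Hypothetical helper: confirm that ./models/<name> exists on the host
# and contains a config.json. Uses $MODEL_NAME when no argument is given.
check_model() {
  name="${1:-${MODEL_NAME:-}}"
  if [ -z "$name" ]; then
    echo "MODEL_NAME is not set" >&2
    return 1
  fi
  if [ ! -f "models/$name/config.json" ]; then
    echo "models/$name/config.json not found" >&2
    return 1
  fi
  echo "models/$name looks usable"
}
```

Run `check_model` from the repository root after exporting MODEL_NAME.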

Start vLLM

Start the service.
```
docker compose up
```
vLLM runs as a server that implements the OpenAI API protocol. By default, the server listens at http://localhost:8000 and can be queried in the same format as the OpenAI API. For example, to list the models:
```
curl http://localhost:8000/v1/models
```
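
The endpoint is unlikely to answer until the model has finished loading, so a script that starts the service and then queries it may need to wait. The following is a minimal polling sketch; `wait_for_vllm` and its default attempt count and delay are assumptions, not part of this repository.

```shell
# Hypothetical helper: poll the models endpoint until it answers or the
# attempt budget runs out. Arguments: base URL, attempts, delay in seconds.
wait_for_vllm() {
  base="${1:-http://localhost:8000}"
  attempts="${2:-60}"
  delay="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: non-zero exit on HTTP errors, -s: quiet, -m: per-request timeout
    if curl -fs -m 5 -o /dev/null "$base/v1/models"; then
      echo "server is up at $base"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "server did not answer after $attempts attempts" >&2
  return 1
}
```

For example, run `docker compose up -d` to start the service in the background, then `wait_for_vllm` before sending requests.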

Test the service with a completion request. Replace the model name with the one you are serving.

curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
      "model": "llama3-8b-cpt-sea-lionv2.1-instruct",
      "prompt": "Artificial Intelligence is",
      "max_tokens": 20,
      "temperature": 0.8,
      "repetition_penalty": 1.2
  }'
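
To avoid editing the model name by hand, the request body can be generated from a variable. This is a sketch under a stated assumption: the model name and prompt contain no double quotes or backslashes, because the hypothetical helper `completion_payload` does no JSON escaping.

```shell
# Hypothetical helper: print the request body for /v1/completions.
# Arguments: model name, prompt, optional max_tokens (default 20).
# Assumption: the arguments need no JSON escaping.
completion_payload() {
  printf '{"model": "%s", "prompt": "%s", "max_tokens": %d, "temperature": 0.8, "repetition_penalty": 1.2}\n' \
    "$1" "$2" "${3:-20}"
}

# Usage against a running server:
#   curl http://localhost:8000/v1/completions \
#     -H "Content-Type: application/json" \
#     -d "$(completion_payload "$MODEL_NAME" "Artificial Intelligence is")"
```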

Customisation

To use another model:
- Download the model to the models directory.
- Update the $MODEL_NAME environment variable. For example, if the model is downloaded to ./models/foo-model-30b:
```
export MODEL_NAME=foo-model-30b
```
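
An exported variable lasts only for the current shell session. Docker Compose also reads a `.env` file in the project directory when substituting variables, so the value can be persisted there, assuming docker-compose.yml picks up MODEL_NAME through variable substitution. The helper `set_env_var` below is a hypothetical sketch.

```shell
# Hypothetical helper: set KEY=VALUE in an env file, replacing an existing
# line for KEY or appending a new one. Arguments: file, key, value.
set_env_var() {
  file="$1"; key="$2"; value="$3"
  touch "$file"
  tmp="$(mktemp)" || return 1
  # Keep every line that does not assign KEY, then append the new value.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `set_env_var .env MODEL_NAME foo-model-30b`. A value exported in the shell still takes precedence over the one in `.env`.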
