Serve the AI Singapore SEA-LION model ⚛ with vLLM

aisingapore/sealion-vllm

AI Singapore SEA-LION model served by vLLM inference server with Docker Compose

Requirements

SEA-LION

This section describes the setup of the SEA-LION models.

SEA-LION v2.1

  • Download LLaMA3 8B CPT SEA-LIONv2.1 Instruct.
  • Copy the model into the models directory, or add a symbolic link to it there, so that it is available at ./models/llama3-8b-cpt-sea-lionv2.1-instruct. For example, if the model was downloaded to ~/downloads/llama3-8b-cpt-sea-lionv2.1-instruct, add the symbolic link with:
    ln -s ~/downloads/llama3-8b-cpt-sea-lionv2.1-instruct models/

SEA-LION v3

  • Download Gemma2 9B CPT SEA-LIONv3 Instruct.
  • Set these values to null in config.json in the model directory.
    "attn_logit_softcapping": null,
    "final_logit_softcapping": null,
  • Copy the model into the models directory, or add a symbolic link to it there, so that it is available at ./models/gemma2-9b-cpt-sea-lionv3-instruct. For example, if the model was downloaded to ~/downloads/gemma2-9b-cpt-sea-lionv3-instruct, add the symbolic link with:
    ln -s ~/downloads/gemma2-9b-cpt-sea-lionv3-instruct models/
  • Set the MODEL_NAME environment variable to the name of the model directory.
    export MODEL_NAME=gemma2-9b-cpt-sea-lionv3-instruct
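
The config.json change above can also be scripted. The following is a minimal sketch, assuming the two keys appear in config.json in the form "key": value; the function name null_softcapping is hypothetical and not part of this repository. It rewrites both values to null with sed and leaves every other setting untouched.

```shell
#!/bin/sh
# Hypothetical helper: set the two softcapping values to null in a model's config.json.
# Assumes the keys appear as "key": value, as in the snippet above.
# Usage: null_softcapping <model_dir>
null_softcapping() {
    config="$1/config.json"
    [ -f "$config" ] || { echo "config.json not found in $1" >&2; return 1; }
    # Write to a temporary file first so a failed sed does not truncate the original.
    sed -E 's/("(attn|final)_logit_softcapping"[[:space:]]*:[[:space:]]*)[^,}[:space:]]+/\1null/g' \
        "$config" > "$config.tmp" && mv "$config.tmp" "$config"
}
```

For the model above, this would be called as null_softcapping models/gemma2-9b-cpt-sea-lionv3-instruct. Keeping a backup copy of config.json before running it is a sensible precaution.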

Start vLLM

  • Start the service.
    docker compose up
  • vLLM is deployed as a server that implements the OpenAI API protocol. By default, the server listens at http://localhost:8000 and can be queried in the same format as the OpenAI API. For example, list the models:
    curl http://localhost:8000/v1/models
  • Test the service with a completion request. Replace the model name with the one being served.
    curl http://localhost:8000/v1/completions \
      -H "Content-Type: application/json" \
      -d '{
          "model": "llama3-8b-cpt-sea-lionv2.1-instruct",
          "prompt": "Artificial Intelligence is",
          "max_tokens": 20,
          "temperature": 0.8,
          "repetition_penalty": 1.2
      }'
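
Loading a model can take some time after docker compose up, and requests sent before the server is listening may fail. The following is a minimal sketch of a readiness check that polls the /v1/models endpoint shown above until it answers. The function name wait_for_vllm, the retry count, and the delay are illustrative assumptions, not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper: poll the models endpoint until the server answers.
# Usage: wait_for_vllm [base_url] [max_attempts] [delay_seconds]
wait_for_vllm() {
    base_url="${1:-http://localhost:8000}"
    max_attempts="${2:-30}"
    delay="${3:-5}"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # -s: silent, -f: fail on HTTP errors, -m 5: give up on one request after 5 seconds
        if curl -sf -m 5 -o /dev/null "$base_url/v1/models"; then
            echo "vLLM is ready at $base_url"
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    echo "vLLM did not respond at $base_url after $max_attempts attempts" >&2
    return 1
}
```

Running wait_for_vllm before the curl test above avoids guessing when the model has finished loading. The defaults (30 attempts, 5 seconds apart) are arbitrary and may need to be raised for larger models.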

Customisation

  • To use another model:
    • Download the model to the models directory.
    • Update the $MODEL_NAME environment variable. For example, if the model is downloaded to ./models/foo-model-30b:
      export MODEL_NAME=foo-model-30b
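
A mismatch between $MODEL_NAME and the directory name under ./models is an easy mistake to make. The following is a minimal sketch of a pre-flight check, assuming the model is a Hugging Face-format checkpoint that ships a config.json; check_model is a hypothetical helper name, not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper: verify that <models_dir>/<model_name> resolves to a model directory.
# Usage: check_model <models_dir> <model_name>
check_model() {
    model_path="$1/$2"
    # -d follows symbolic links, so a link to a downloaded model also passes.
    if [ ! -d "$model_path" ]; then
        echo "missing model directory: $model_path" >&2
        return 1
    fi
    # Assumption: the checkpoint is in Hugging Face format and includes a config.json.
    if [ ! -f "$model_path/config.json" ]; then
        echo "no config.json in $model_path" >&2
        return 1
    fi
    echo "found model: $model_path"
}
```

One way to use it is check_model ./models "$MODEL_NAME" && docker compose up, so the service only starts when the model directory is in place.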
