Skip to content

Commit

Permalink
Update README.md
Browse files Browse the repository at this point in the history
  • Loading branch information
mgoin authored Nov 12, 2023
1 parent 3800d7f commit 42d02a4
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions demos/windows-text-generation-server/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ Here is a guide for running a large language model (LLM) for text generation on
**Run the DeepSparse Server**:
- Execute: `deepsparse.server --task text-generation --integration openai --model_path hf:neuralmagic/mpt-7b-chat-pruned50-quant`
- This command downloads and starts a server hosting the model as a RESTful endpoint with an OpenAI API compatible endpoint.
- If you want to run other models, explore the [other optimized models on SparseZoo](https://sparsezoo.neuralmagic.com/?task=text_generation).
- If you would like to learn about non-server inference, [check out the text generation pipeline documentation](https://github.com/neuralmagic/deepsparse/blob/main/docs/llms/text-generation-pipeline.md).
- Keep this terminal open. The server must remain running to handle requests.

Expand Down

0 comments on commit 42d02a4

Please sign in to comment.