server : add a REST Whisper server example with OAI-like API (#1380)
* Add first draft of server
* Added json support and base funcs for server.cpp
* Add more user input via api-request; also some clean up
* Add request params and load post function; also some general clean up
* Remove unused function
* Add readme
* Add exception handlers
* Update examples/server/server.cpp
* make : add server target
* Add magic curl syntax

Co-authored-by: Georgi Gerganov <[email protected]>
Showing 9 changed files with 34,631 additions and 4 deletions.
@@ -31,6 +31,7 @@ build-sanitize-thread/
/talk-llama
/bench
/quantize
/server
/lsp

arm_neon.h
@@ -0,0 +1,6 @@
set(TARGET server)
add_executable(${TARGET} server.cpp httplib.h json.hpp)

include(DefaultTargetOptions)

target_link_libraries(${TARGET} PRIVATE common whisper ${CMAKE_THREAD_LIBS_INIT})
@@ -0,0 +1,59 @@
# whisper.cpp HTTP server

A simple HTTP server. WAV files are passed to the inference model via HTTP requests.

```
./server -h
usage: ./bin/server [options]
options:
-h, --help [default] show this help message and exit
-t N, --threads N [4 ] number of threads to use during computation
-p N, --processors N [1 ] number of processors to use during computation
-ot N, --offset-t N [0 ] time offset in milliseconds
-on N, --offset-n N [0 ] segment index offset
-d N, --duration N [0 ] duration of audio to process in milliseconds
-mc N, --max-context N [-1 ] maximum number of text context tokens to store
-ml N, --max-len N [0 ] maximum segment length in characters
-sow, --split-on-word [false ] split on word rather than on token
-bo N, --best-of N [2 ] number of best candidates to keep
-bs N, --beam-size N [-1 ] beam size for beam search
-wt N, --word-thold N [0.01 ] word timestamp probability threshold
-et N, --entropy-thold N [2.40 ] entropy threshold for decoder fail
-lpt N, --logprob-thold N [-1.00 ] log probability threshold for decoder fail
-debug, --debug-mode [false ] enable debug mode (eg. dump log_mel)
-tr, --translate [false ] translate from source language to english
-di, --diarize [false ] stereo audio diarization
-tdrz, --tinydiarize [false ] enable tinydiarize (requires a tdrz model)
-nf, --no-fallback [false ] do not use temperature fallback while decoding
-ps, --print-special [false ] print special tokens
-pc, --print-colors [false ] print colors
-pp, --print-progress [false ] print progress
-nt, --no-timestamps [false ] do not print timestamps
-l LANG, --language LANG [en ] spoken language ('auto' for auto-detect)
-dl, --detect-language [false ] exit after automatically detecting language
--prompt PROMPT [ ] initial prompt
-m FNAME, --model FNAME [models/ggml-base.en.bin] model path
-oved D, --ov-e-device DNAME [CPU ] the OpenVINO device used for encode inference
--host HOST, [127.0.0.1] Hostname/ip-adress for the server
--port PORT, [8080 ] Port number for the server
```
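
As a rough illustration of how the options above combine, the following Python sketch assembles a launch command from a handful of them. The helper name `server_command` is hypothetical and not part of this commit; the flags and default values are taken from the help text, and the binary path `./server` is assumed to match your build output.

```python
import shlex


def server_command(binary="./server",
                   model="models/ggml-base.en.bin",
                   host="127.0.0.1",
                   port=8080,
                   threads=4):
    """Return an argv list for launching the server (hypothetical helper).

    Only a few of the documented flags are covered; the defaults mirror
    the values shown in the help output.
    """
    return [
        binary,
        "-m", model,           # model path
        "--host", host,        # address the server listens on
        "--port", str(port),   # port the server listens on
        "-t", str(threads),    # threads used during computation
    ]


# Show the command as a shell-safe string; pass the list itself to
# subprocess.run() to actually start the server.
print(shlex.join(server_command()))
```

Building the command as a list rather than a single string avoids shell quoting problems when the model path contains spaces.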

## request examples

**/inference**
```
curl 127.0.0.1:8080/inference \
-H "Content-Type: multipart/form-data" \
-F file="@<file-path>" \
-F temperature="0.2" \
-F response-format="json"
```
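
For callers that prefer not to shell out to curl, the same `/inference` request can be sketched with the Python standard library. This is a minimal sketch, not the server's official client: the function names `encode_multipart`, `build_inference_request`, and `transcribe` are hypothetical, and the field names (`file`, `temperature`, `response-format`), host, and port are copied from the curl example. The shape of the response body is not specified here, so it is returned as raw text.

```python
import urllib.request
import uuid


def encode_multipart(fields, files):
    """Encode form fields and file parts as a multipart/form-data body.

    fields: dict mapping field name -> str value
    files:  dict mapping field name -> (filename, bytes)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        chunks.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n'
                "\r\n"
                f"{value}\r\n"
            ).encode("utf-8")
        )
    for name, (filename, data) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n"
            "\r\n"
        ).encode("utf-8")
        chunks.append(header + data + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_inference_request(audio, filename="audio.wav",
                            url="http://127.0.0.1:8080/inference",
                            temperature="0.2", response_format="json"):
    """Build (but do not send) a POST carrying the curl example's three fields."""
    body, content_type = encode_multipart(
        {"temperature": temperature, "response-format": response_format},
        {"file": (filename, audio)},
    )
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


def transcribe(wav_path, **kwargs):
    """Send a WAV file to a running server and return the raw response text."""
    with open(wav_path, "rb") as f:
        request = build_inference_request(f.read(), **kwargs)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With a server running, `transcribe("<file-path>")` would send the file. Keeping request construction separate from sending makes the encoding easy to inspect without a live server.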

**/load**
```
curl 127.0.0.1:8080/load \
-H "Content-Type: multipart/form-data" \
-F model="<path-to-model-file>"
```
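
The `/load` request carries a single text field, so an equivalent Python sketch is short. The function name `build_load_request` is hypothetical; the `model` field name, host, and port come from the curl example. What the server returns after loading is not documented here, so the sketch only builds the request.

```python
import urllib.request
import uuid


def build_load_request(model_path, url="http://127.0.0.1:8080/load"):
    """Build (but do not send) a multipart POST asking the server to load a model."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n'
        "\r\n"
        f"{model_path}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen()` sends it. The path is interpreted by the server process, so it presumably has to be valid on the machine running the server rather than on the client.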