Name		Name	Last commit message	Last commit date
parent directory ..
server		server
README.md		README.md

README.md

Guarded Inference

This is an example server that runs inference with a model gated behind WildGuard to filter unsafe inputs. Before running inference, it sends the user message to WildGuard, and if the prompt is judged to be harmful the system appends an instruction to refuse to answer the prompt. Then, it forwards the potentially modified prompt to the inference model and returns the result to the user.

The API follows the OpenAI chat completions API specification, except for the n parameter.

The inference model can be provided by an external API or by running a vLLM inference server. Before running guarded_inference.py, you should start a vLLM server for the WildGuard model by running python -m vllm.entrypoints.openai.api_server --model "allenai/wildguard" --chat-template=empty_template.jinja --port {port}. Finally, you can run the forwarding server with python guarded_inference.py --generative-model wildguard-models/olmo_17_wildjailbreak python guarded_inference.py --generative-model-url {inference_api_base_url} --wildguard-url http://0.0.0.0:{port}.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

wildguard_filter

wildguard_filter

README.md

Guarded Inference

Files

wildguard_filter

Directory actions

More options

Directory actions

More options

Latest commit

History

wildguard_filter

Folders and files

parent directory

README.md

Guarded Inference