@neuralmagic

Neural Magic

Neural Magic helps developers accelerate machine learning performance using automated model sparsification techniques and inference technologies.

Pinned

  1. nm-vllm-certs Public

    General information, model certifications, and benchmarks for nm-vllm enterprise distributions

    7 1

  2. deepsparse Public

    Sparsity-aware deep learning inference runtime for CPUs

    Python 3k 173

  3. sparseml Public

    Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

    Python 2.1k 148

  4. docs Public

    Top-level directory for documentation and general content

    MDX 120 7

  5. sparsezoo Public

    Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

    Python 370 25

  6. guidellm Public

    Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

    Python 163 12

Repositories

Showing 10 of 59 repositories
  • nm-vllm-certs Public

    General information, model certifications, and benchmarks for nm-vllm enterprise distributions

    7 1 1 0 Updated Nov 17, 2024
  • vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 6 Apache-2.0 4,668 0 15 Updated Nov 16, 2024
  • nm-actions Public

    Neural Magic GitHub Actions (GHA)

    Python 0 Apache-2.0 0 0 3 Updated Nov 16, 2024
  • graphs Public
    0 Apache-2.0 0 0 0 Updated Nov 16, 2024
  • temp-llm-compressor Public Forked from vllm-project/llm-compressor

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python 0 Apache-2.0 62 0 0 Updated Nov 15, 2024
  • upstream-transformers Public Forked from huggingface/transformers

    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

    Python 1 Apache-2.0 27,404 0 0 Updated Nov 13, 2024
  • alpaca_eval Public Forked from tatsu-lab/alpaca_eval

    An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

    Jupyter Notebook 0 Apache-2.0 253 0 0 Updated Nov 12, 2024
  • compressed-tensors Public

    A safetensors extension to efficiently store sparse quantized tensors on disk

    Python 49 Apache-2.0 2 1 8 Updated Nov 12, 2024
  • evalplus Public Forked from evalplus/evalplus

    NeuralMagic fork of EvalPlus (Rigorous evaluation of LLM-synthesized code - NeurIPS 2023)

    Python 0 Apache-2.0 108 0 0 Updated Nov 5, 2024
  • guidellm Public

    Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

    Python 163 Apache-2.0 12 9 9 Updated Nov 5, 2024