v2.21.0
💡 Highlights!
LocalAI v2.21 release is out!
- Deprecation of the
exllama
backend - AIO images now have
gpt-4o
instead ofgpt-4-vision-preview
for Vision API - vLLM backend now supports embeddings
- New endpoint to list system information (
/system
) trust_remote_code
is now respected bysentencetransformers
- Auto warm-up and load models on start
coqui
backend switched to the community-maintained fork
What's Changed
Breaking Changes 🛠
- chore(exllama): drop exllama backend by @mudler in #3536
- chore(aio): rename gpt-4-vision-preview to gpt-4o by @mudler in #3597
Exciting New Features 🎉
- feat: elevenlabs
sound-generation
api by @dave-gray101 in #3355 - feat(vllm): add support for embeddings by @mudler in #3440
- feat: add endpoint to list system informations by @mudler in #3449
- feat: extract output with regexes from LLMs by @mudler in #3491
- feat: allow setting trust_remote_code for sentencetransformers backend by @Nyralei in #3552
- feat(api): allow to pass videos to backends by @mudler in #3601
- feat(api): allow to pass audios to backends by @mudler in #3603
- feat: auto load into memory on startup by @sozercan in #3627
- feat(coqui): switch to maintained community fork by @mudler in #3625
Bug fixes 🐛
- fix(p2p): correctly allow to pass extra args to llama.cpp by @mudler in #3368
- fix(model-loading): keep track of open GRPC Clients by @mudler in #3377
- fix(tts): check error before inspecting result by @mudler in #3415
- fix(shutdown): do not shutdown immediately busy backends by @mudler in #3543
- fix(parler-tts): fix install with sycl by @mudler in #3624
- fix(ci): fixup checksum scanning pipeline by @mudler in #3631
- fix(hipblas): do not push all variants to hipblas builds by @mudler in #3630
🧠 Models
- chore(model-gallery): add more quants for popular models by @mudler in #3365
- models(gallery): add phi-3.5 by @mudler in #3376
- models(gallery): add calme-2.1-phi3.5-4b-i1 by @mudler in #3383
- models(gallery): add magnum-v3-34b by @mudler in #3384
- models(gallery): add phi-3.5-vision by @mudler in #3421
- Revert "models(gallery): add phi-3.5-vision" by @mudler in #3422
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3425
- feat: Added Piper voice it-paola-medium by @fakezeta in #3434
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3442
- models(gallery): add hubble-4b-v1 by @mudler in #3444
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3446
- models(gallery): add yi-coder (and variants) by @mudler in #3482
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #3486
- models(gallery): add reflection-llama-3.1-70b by @mudler in #3487
- models(gallery): add athena-codegemma-2-2b-it by @mudler in #3490
- models(gallery): add azure_dusk-v0.2-iq-imatrix by @mudler in #3538
- models(gallery): add mn-12b-lyra-v4-iq-imatrix by @mudler in #3539
- models(gallery): add datagemma models by @mudler in #3540
- models(gallery): add l3.1-8b-niitama-v1.1-iq-imatrix by @mudler in #3550
- models(gallery): add llama-3.1-8b-stheno-v3.4-iq-imatrix by @mudler in #3551
- fix:
gallery/index.yaml
comment spacing by @dave-gray101 in #3585 - models(gallery): add qwen2.5-14b-instruct by @mudler in #3607
- models(gallery): add qwen2.5-math-7b-instruct by @mudler in #3609
- models(gallery): add qwen2.5-14b_uncencored by @mudler in #3610
- models(gallery): add qwen2.5-coder-7b-instruct by @mudler in #3611
- models(gallery): add qwen2.5-math-72b-instruct by @mudler in #3612
- models(gallery): add qwen2.5-0.5b-instruct, qwen2.5-1.5b-instruct by @mudler in #3613
- models(gallery): add qwen2.5 32B, 72B, 32B Instruct by @mudler in #3614
- models(gallery): add llama-3.1-supernova-lite-reflection-v1.0-i1 by @mudler in #3615
- models(gallery): add llama-3.1-supernova-lite by @mudler in #3616
- models(gallery): add llama3.1-8b-shiningvaliant2 by @mudler in #3617
- models(gallery): add buddy2 by @mudler in #3618
- models(gallery): add llama-3.1-8b-arliai-rpmax-v1.1 by @mudler in #3619
- Fix NeuralDaredevil URL by @nyx4ris in #3621
- models(gallery): add nightygurps-14b-v1.1 by @mudler in #3633
- models(gallery): add gemma-2-9b-arliai-rpmax-v1.1 by @mudler in #3634
- models(gallery): add gemma-2-2b-arliai-rpmax-v1.1 by @mudler in #3635
- models(gallery): add acolyte-22b-i1 by @mudler in #3636
📖 Documentation and examples
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3366
- chore(docs): add Vulkan images links by @mudler in #3620
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp to
3ba780e2a8f0ffe13f571b27f0bbf2ca5a199efc
by @localai-bot in #3361 - chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/functions by @dependabot in #3390
- chore(deps): Bump docs/themes/hugo-theme-relearn from
82a5e98
to3a0ae52
by @dependabot in #3391 - chore(deps): Bump idna from 3.7 to 3.8 in /examples/langchain/langchainpy-localai-example by @dependabot in #3399
- chore(deps): Bump llama-index from 0.10.65 to 0.11.1 in /examples/chainlit by @dependabot in #3404
- chore(deps): Bump llama-index from 0.10.67.post1 to 0.11.1 in /examples/langchain-chroma by @dependabot in #3406
- chore(deps): Bump marshmallow from 3.21.3 to 3.22.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3400
- chore(deps): Bump openai from 1.40.5 to 1.42.0 in /examples/langchain-chroma by @dependabot in #3405
- chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3401
- chore(deps): update edgevpn to v0.28 by @mudler in #3412
- chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/functions by @dependabot in #3453
- chore(deps): Bump certifi from 2024.7.4 to 2024.8.30 in /examples/langchain/langchainpy-localai-example by @dependabot in #3457
- chore(deps): Bump yarl from 1.9.4 to 1.9.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #3459
- chore(deps): Bump langchain-community from 0.2.12 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3461
- chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/chainlit by @dependabot in #3462
- chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/langchain-chroma by @dependabot in #3467
- chore(deps): Bump docs/themes/hugo-theme-relearn from
3a0ae52
to550a6ee
by @dependabot in #3472 - chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/functions by @dependabot in #3452
- chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3460
- chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/langchain-chroma by @dependabot in #3468
- chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain-chroma by @dependabot in #3466
- chore(deps): Bump streamlit from 1.37.1 to 1.38.0 in /examples/streamlit-bot by @dependabot in #3465
- chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3456
- chore(deps): Bump langchain-community from 0.2.15 to 0.2.16 in /examples/langchain/langchainpy-localai-example by @dependabot in #3500
- chore(deps): Bump openai from 1.43.0 to 1.44.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3504
- chore(deps): Bump docs/themes/hugo-theme-relearn from
550a6ee
tof696f60
by @dependabot in #3505 - chore(deps): Bump langchain from 0.2.15 to 0.2.16 in /examples/langchain-chroma by @dependabot in #3507
- chore(deps): Bump peter-evans/create-pull-request from 6 to 7 by @dependabot in #3518
- chore(deps): Bump openai from 1.43.0 to 1.44.0 in /examples/functions by @dependabot in #3522
- chore(deps): Bump langchain from 0.2.15 to 0.2.16 in /examples/langchain/langchainpy-localai-example by @dependabot in #3502
- chore(deps): Bump numpy from 2.1.0 to 2.1.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3503
- chore(deps): Bump llama-index from 0.11.4 to 0.11.7 in /examples/langchain-chroma by @dependabot in #3508
- chore(deps): Bump langchain from 0.2.15 to 0.2.16 in /examples/functions by @dependabot in #3521
- chore(deps): Bump openai from 1.43.0 to 1.44.1 in /examples/langchain-chroma by @dependabot in #3532
- chore(deps): Bump yarl from 1.9.7 to 1.11.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3501
- chore(deps): Bump llama-index from 0.11.4 to 0.11.7 in /examples/chainlit by @dependabot in #3516
- chore(deps): update llama.cpp to 6262d13e0b2da91f230129a93a996609a2fa2f2 by @mudler in #3549
- chore(deps): Bump docs/themes/hugo-theme-relearn from
f696f60
tod5a0ee0
by @dependabot in #3558 - chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/coqui by @dependabot in #3554
- chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/functions by @dependabot in #3559
- chore(deps): Bump openai from 1.44.1 to 1.45.1 in /examples/langchain-chroma by @dependabot in #3556
- chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/autogptq by @dependabot in #3553
- chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2 by @dependabot in #3561
- chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers-musicgen by @dependabot in #3564
- chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/parler-tts by @dependabot in #3565
- chore(deps): Bump sentence-transformers from 3.0.1 to 3.1.0 in /backend/python/sentencetransformers by @dependabot in #3566
- chore(deps): Bump llama-index from 0.11.7 to 0.11.9 in /examples/chainlit by @dependabot in #3567
- chore(deps): Bump weaviate-client from 4.6.7 to 4.8.1 in /examples/chainlit by @dependabot in #3568
- chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/vall-e-x by @dependabot in #3570
- chore(deps): Bump greenlet from 3.0.3 to 3.1.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3571
- chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/diffusers by @dependabot in #3575
- chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/bark by @dependabot in #3574
- chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/rerankers by @dependabot in #3578
- chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers by @dependabot in #3579
- chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/vllm by @dependabot in #3580
- chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/langchain-chroma by @dependabot in #3557
- chore(deps): Bump openai from 1.44.0 to 1.45.1 in /examples/functions by @dependabot in #3560
- chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3577
- chore(deps): Bump openai from 1.44.0 to 1.45.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3573
- chore(deps): Bump pypinyin from 0.50.0 to 0.53.0 in /backend/python/openvoice by @dependabot in #3562
- chore(deps): Bump yarl from 1.11.0 to 1.11.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3643
- chore(deps): Bump urllib3 from 2.2.2 to 2.2.3 in /examples/langchain/langchainpy-localai-example by @dependabot in #3646
- chore(deps): Bump idna from 3.8 to 3.10 in /examples/langchain/langchainpy-localai-example by @dependabot in #3644
- chore(deps): Bump sqlalchemy from 2.0.32 to 2.0.35 in /examples/langchain/langchainpy-localai-example by @dependabot in #3649
Other Changes
- feat: external backend launching log improvements and relative path support by @dave-gray101 in #3348
- Update quickstart.md by @grant-wilson in #3373
- feat(swagger): update swagger by @localai-bot in #3370
- fix: devcontainer
utils.sh
ssh copy improvements by @dave-gray101 in #3372 - chore(cuda): reduce binary size by @mudler in #3379
- chore(deps): update edgevpn by @mudler in #3385
- chore: ⬆️ Update ggerganov/llama.cpp to
7d787ed96c32be18603c158ab0276992cf0dc346
by @localai-bot in #3409 - chore: ⬆️ Update ggerganov/llama.cpp to
20f1789dfb4e535d64ba2f523c64929e7891f428
by @localai-bot in #3417 - chore: ⬆️ Update ggerganov/llama.cpp to
9fe94ccac92693d4ae1bc283ff0574e8b3f4e765
by @localai-bot in #3424 - chore(cli): be consistent between workers and expose ExtraLLamaCPPArgs to both by @mudler in #3428
- chore(tests): replace runaway models for tests by @mudler in #3432
- chore(model-loader): increase test coverage of model loader by @mudler in #3433
- chore(deps): update llama.cpp by @mudler in #3438
- chore: ⬆️ Update ggerganov/llama.cpp to
a47667cff41f5a198eb791974e0afcc1cddd3229
by @localai-bot in #3441 - chore: ⬆️ Update ggerganov/llama.cpp to
8f1d81a0b6f50b9bad72db0b6fcd299ad9ecd48c
by @localai-bot in #3445 - fix: untangle pkg/grpc and core/schema for Transcription by @dave-gray101 in #3419
- chore(deps): update whisper.cpp by @mudler in #3443
- chore: ⬆️ Update ggerganov/llama.cpp to
48baa61eccdca9205daf8d620ba28055c2347b64
by @localai-bot in #3474 - chore: ⬆️ Update ggerganov/whisper.cpp to
5236f0278420ab776d1787c4330678d80219b4b6
by @localai-bot in #3475 - chore: ⬆️ Update ggerganov/llama.cpp to
8962422b1c6f9b8b15f5aeaea42600bcc2d44177
by @localai-bot in #3478 - fix: purge a few remaining runway model references by @dave-gray101 in #3480
- chore: ⬆️ Update ggerganov/llama.cpp to
581c305186a0ff93f360346c57e21fe16e967bb7
by @localai-bot in #3481 - chore: ⬆️ Update ggerganov/llama.cpp to
4db04784f96757d74f74c8c110c2a00d55e33514
by @localai-bot in #3485 - feat(swagger): update swagger by @localai-bot in #3484
- chore: ⬆️ Update ggerganov/llama.cpp to
815b1fb20a53e439882171757825bacb1350de04
by @localai-bot in #3489 - chore: ⬆️ Update ggerganov/whisper.cpp to
5caa19240d55bfd6ee316d50fbad32c6e9c39528
by @localai-bot in #3494 - fix: speedup and improve cachability of docker build of
builder-sd
by @dave-gray101 in #3430 - chore: ⬆️ Update ggerganov/whisper.cpp to
a551933542d956ae84634937acd2942eb40efaaf
by @localai-bot in #3534 - chore(deps): update llama.cpp by @mudler in #3497
- chore(gosec): fix CI by @mudler in #3537
- chore: ⬆️ Update ggerganov/llama.cpp to
feff4aa8461da7c432d144c11da4802e41fef3cf
by @localai-bot in #3542 - chore: ⬆️ Update ggerganov/whisper.cpp to
049b3a0e53c8a8e4c4576c06a1a4fccf0063a73f
by @localai-bot in #3548 - feat: auth v2 - supersedes #2894 by @dave-gray101 in #3476
- chore: ⬆️ Update ggerganov/llama.cpp to
23e0d70bacaaca1429d365a44aa9e7434f17823b
by @localai-bot in #3581 - Revert "chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers" by @mudler in #3586
- chore(refactor): drop duplicated shutdown logics by @mudler in #3589
- Revert "chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2" by @mudler in #3590
- chore: ⬆️ Update ggerganov/llama.cpp to
8b836ae731bbb2c5640bc47df5b0a78ffcb129cb
by @localai-bot in #3591 - chore: ⬆️ Update ggerganov/whisper.cpp to
5b1ce40fa882e9cb8630b48032067a1ed2f1534f
by @localai-bot in #3592 - chore: ⬆️ Update ggerganov/llama.cpp to
64c6af3195c3cd4aa3328a1282d29cd2635c34c9
by @localai-bot in #3598 - feat(swagger): update swagger by @localai-bot in #3604
- chore: ⬆️ Update ggerganov/llama.cpp to
6026da52d6942b253df835070619775d849d0258
by @localai-bot in #3605 - chore: ⬆️ Update ggerganov/whisper.cpp to
34972dbe221709323714fc8402f2e24041d48213
by @localai-bot in #3623 - chore: ⬆️ Update ggerganov/llama.cpp to
63351143b2ea5efe9f8b9c61f553af8a51f1deff
by @localai-bot in #3622 - chore: ⬆️ Update ggerganov/llama.cpp to
d09770cae71b416c032ec143dda530f7413c4038
by @localai-bot in #3626 - chore: ⬆️ Update ggerganov/llama.cpp to
c35e586ea57221844442c65a1172498c54971cb0
by @localai-bot in #3629 - chore: ⬆️ Update ggerganov/llama.cpp to
f0c7b5edf82aa200656fd88c11ae3a805d7130bf
by @localai-bot in #3653 - test: preliminary tests and merge fix for authv2 by @dave-gray101 in #3584
New Contributors
- @grant-wilson made their first contribution in #3373
- @Nyralei made their first contribution in #3552
- @nyx4ris made their first contribution in #3621
Full Changelog: v2.20.1...v2.21.0