Bump transformers from 4.44.2 to 4.46.2 #2182

dependabot · 2024-11-06T10:19:49Z

Bumps transformers from 4.44.2 to 4.46.2.

Release notes

Patch release v4.46.2

Mostly finishes the gradient accumulation fixes! Thanks to @techkang and @Ryukijano 🤗

VLMs: fix number of image tokens (#34332) by @zucchini-nlp

fix pixtral processor (#34486) by @molbap

enable average tokens across devices (#34373) by @techkang and @muellerzr

Update trainer for easier handling of accumulate, compile fixes, and … by @muellerzr and @Ryukijano

MPS: isin_mps_friendly can support 0D tensors (#34538) by @gante
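The gradient accumulation and token-averaging items above concern how a loss is normalized when one optimizer step spans several micro-batches. The following is a minimal sketch of that idea in plain Python, assuming per-token losses and micro-batches of unequal size. The function names and numbers are hypothetical illustrations, not the Trainer's actual code.

```python
# Hypothetical per-token losses for two micro-batches of unequal length.
micro_batches = [[2.0, 2.0, 2.0, 2.0], [4.0]]


def mean_of_means(batches):
    """Average each micro-batch's mean loss, then average those means."""
    per_batch = [sum(batch) / len(batch) for batch in batches]
    return sum(per_batch) / len(per_batch)


def token_weighted_mean(batches):
    """Divide the summed loss by the total token count across all micro-batches."""
    total_loss = sum(sum(batch) for batch in batches)
    total_tokens = sum(len(batch) for batch in batches)
    return total_loss / total_tokens


naive = mean_of_means(micro_batches)
exact = token_weighted_mean(micro_batches)
print(naive, exact)
```

The two results differ whenever micro-batches hold different numbers of tokens, because the naive version gives the short batch the same weight as the long one.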

Patch release v4.46.1

This is mostly for torch.fx and ONNX issues!

Fix regression loading dtype (#34409) by @SunMarc

LLaVa: latency issues (#34460) by @zucchini-nlp

Fix pix2struct (#34374) by @IlyasMoutawwakil

Fix onnx non-exposable inplace aten op (#34376) by @IlyasMoutawwakil

Fix torch.fx issue related to the new loss_kwargs keyword argument (#34380) by @michaelbenayoun

Release v4.46.0

New model additions

Moshi

The Moshi model was proposed in Moshi: a speech-text foundation model for real-time dialogue by Alexandre Défossez, Laurent Mazaré, Manu Orsini, Amélie Royer, Patrick Pérez, Hervé Jégou, Edouard Grave and Neil Zeghidour.

Moshi is a speech-text foundation model that casts spoken dialogue as speech-to-speech generation. Starting from a text language model backbone, Moshi generates speech as tokens from the residual quantizer of a neural audio codec, while modeling separately its own speech and that of the user into parallel streams. This allows for the removal of explicit speaker turns, and the modeling of arbitrary conversational dynamics. Moshi also predicts time-aligned text tokens as a prefix to audio tokens. This “Inner Monologue” method significantly improves the linguistic quality of generated speech and provides streaming speech recognition and text-to-speech. As a result, Moshi is the first real-time full-duplex spoken large language model, with a theoretical latency of 160ms, 200ms in practice.

Moshi integration by @ylacombe in #33624

Zamba

Zamba-7B-v1 is a hybrid between state-space models (specifically Mamba) and transformers, and was trained using next-token prediction. Zamba uses a shared transformer layer after every 6 Mamba blocks. It uses the Mistral v0.1 tokenizer. We came to this architecture after a series of ablations at small scales. Zamba-7B-v1 was pre-trained on 1T tokens of text and code data.
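The "shared transformer layer after every 6 Mamba blocks" pattern can be pictured as a layer schedule. The sketch below only builds a list of layer labels. The function name, the labels, and the block count of 12 are hypothetical and are not taken from the Zamba implementation.

```python
def build_layout(num_mamba_blocks, period=6):
    """Return layer labels with one shared-layer call after every `period` Mamba blocks."""
    layout = []
    for index in range(1, num_mamba_blocks + 1):
        layout.append("mamba")
        if index % period == 0:
            # The same (shared) transformer layer is reused at each of these positions.
            layout.append("shared_transformer")
    return layout


layout = build_layout(12)  # 12 is an illustrative block count, not Zamba's real depth
print(layout)
```

Because the transformer layer is shared, each "shared_transformer" entry would refer to one set of weights rather than a fresh layer per position.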

... (truncated)

Commits

ccbd57a MPS: isin_mps_friendly can support 0D tensors (#34538)
e66224b v4.46.2
8c62a92 Update trainer for easier handling of accumulate, compile fixes, and proper r...
5b36cda enable average tokens across devices (#34373)
f784d95 fix pixtral processor (#34486)
7da0eef VLMs: fix number of image tokens (#34332)
bc598c0 v4.46.1
94ed13c Fix regression loading dtype (#34409)
72c716d LLaVA: latency issues (#34460)
97bb929 Fix pix2struct (#34374)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot merge will merge this PR after your CI passes on it
@dependabot squash and merge will squash and merge this PR after your CI passes on it
@dependabot cancel merge will cancel a previously requested merge and block automerging
@dependabot reopen will reopen this PR if it is closed
@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
@dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.44.2 to 4.46.2.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.44.2...v4.46.2)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
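The commit metadata labels this bump as `version-update:semver-minor`. As a rough sketch of what that label means, the snippet below classifies a change between two `major.minor.patch` strings. The helper names are hypothetical and this is not Dependabot's own logic. It assumes plain three-part numeric versions with no pre-release suffixes.

```python
def parse_version(text):
    """Split a 'major.minor.patch' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def classify_update(old, new):
    """Name the most significant version component that increased."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        return "none"
    if new_v[0] != old_v[0]:
        return "major"
    if new_v[1] != old_v[1]:
        return "minor"
    return "patch"


update_type = classify_update("4.44.2", "4.46.2")
print(update_type)
```

For the versions in this pull request the minor component changes (44 to 46) while the major component stays at 4, which matches the `semver-minor` label.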

JasonLovesDoggo · 2024-11-13T17:59:54Z

shouldn't cause any issues

dependabot bot added the dependencies label (Pull requests that update a dependency file) Nov 6, 2024

dependabot bot mentioned this pull request Nov 6, 2024

Bump transformers from 4.44.2 to 4.46.1 #2177

Closed

JasonLovesDoggo merged commit deb1c95 into develop Nov 13, 2024
4 checks passed

JasonLovesDoggo deleted the dependabot/pip/develop/transformers-4.46.2 branch November 13, 2024 17:59
