The timestamp issue of Whisper #27

Open
huimoran opened this issue Nov 15, 2024 · 0 comments


Hello, while using Whisper and pyannote for a speech-to-text task, I got the following warning at runtime:

"Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation."

Also, during recognition, speech from the same speaker is sometimes split into several segments. Should I use pyannote's segmentation instead, or can I still rely on Whisper's timestamps? If Whisper's timestamps are usable, do you have any suggestions for working with them? Any other advice on the issues I'm encountering would be appreciated.
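
To make the fragmentation problem concrete, here is a minimal sketch of one possible workaround: merging consecutive segments that share a speaker label when the gap between them is small. It uses plain Python only. The `Segment` type, the `merge_same_speaker` helper, and the `max_gap` threshold are all hypothetical names chosen for illustration; they are not part of Whisper, pyannote, or this project.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical container for one diarized, transcribed chunk.
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


def merge_same_speaker(segments, max_gap=0.5):
    """Merge adjacent segments from the same speaker.

    Two segments are merged when they carry the same speaker label and
    the silence between them is at most `max_gap` seconds (an assumed
    threshold; tune it for your audio).
    """
    merged = []
    for seg in sorted(segments, key=lambda s: s.start):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.speaker == seg.speaker
            and seg.start - prev.end <= max_gap
        ):
            # Extend the previous segment instead of starting a new one.
            merged[-1] = Segment(
                speaker=prev.speaker,
                start=prev.start,
                end=max(prev.end, seg.end),
                text=f"{prev.text} {seg.text}".strip(),
            )
        else:
            merged.append(seg)
    return merged


example = [
    Segment("SPEAKER_00", 0.0, 1.0, "hello"),
    Segment("SPEAKER_00", 1.2, 2.0, "world"),
    Segment("SPEAKER_01", 2.1, 3.0, "hi"),
    Segment("SPEAKER_00", 5.0, 6.0, "again"),
]
result = merge_same_speaker(example)
```

With this input, the first two `SPEAKER_00` segments are joined because they are 0.2 s apart, while the last one stays separate because another speaker talks in between.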
