Hello, while using Whisper together with pyannote for a speech-to-text task, part of the runtime output was:
"Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation."
In addition, when transcribing my audio, speech from a single speaker is sometimes split into several segments. Should I rely on pyannote's segmentation, or can I still use Whisper's timestamps? If Whisper's timestamps are usable, do you have any recommendations for doing so, or any other advice for the issues I'm running into?
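Regarding the warning: in Hugging Face transformers, passing `return_timestamps=True` to the ASR pipeline (or to `generate`) enables the timestamp logits processor the message refers to. As for a single speaker being split into several segments, one common workaround is to post-process the diarization output and merge consecutive segments that share a speaker and are separated by only a short gap. Below is a minimal sketch; the segment dict shape (`speaker`, `start`, `end`) and the `max_gap` threshold are assumptions you would adapt to your actual pyannote output, not part of either library's API.

```python
def merge_segments(segments, max_gap=1.0):
    """Merge consecutive segments from the same speaker.

    `segments` is assumed to be a time-ordered list of dicts like
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 2.5}
    (hypothetical shape; adapt to your diarization output).
    Segments from the same speaker separated by at most `max_gap`
    seconds are fused into one.
    """
    merged = []
    for seg in segments:
        if (merged
                and seg["speaker"] == merged[-1]["speaker"]
                and seg["start"] - merged[-1]["end"] <= max_gap):
            # Same speaker, small gap: extend the previous segment.
            merged[-1]["end"] = max(merged[-1]["end"], seg["end"])
        else:
            # New speaker (or gap too large): start a fresh segment.
            merged.append(dict(seg))
    return merged
```

You can then align Whisper's word- or chunk-level timestamps against these merged speaker turns, which tends to be more robust than trusting either source alone.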