No longer diarizes #22

Open
tristan-mcinnis opened this issue Feb 18, 2024 · 4 comments

@tristan-mcinnis
It seems that it only performs transcription and no longer performs diarization. The output below is based on the shared example file (for which the repo is still using yinruiqing's HF token, as pointed out by Jordi in another thread). So scary~

[Screenshot 2024-02-18 at 11 35 25]

@yinruiqing
Owner

This token is deactivated. You can use your own token.
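
For readers following along, one way to supply your own token without editing it into a source file is to read it from an environment variable. This is only a sketch: the helper name `read_hf_token` and the variable name `HF_TOKEN` are assumptions for illustration and are not part of this repo.

```python
import os


def read_hf_token(env=None, var="HF_TOKEN"):
    """Return the token stored in `var`, or raise if it is missing or blank.

    Hypothetical helper: neither the function nor the variable name comes
    from pyannote-whisper; both are placeholders for this sketch.
    """
    env = os.environ if env is None else env
    token = (env.get(var) or "").strip()
    if not token:
        raise RuntimeError(f"Set {var} to your own Hugging Face access token")
    return token
```

The returned string could then be passed wherever the code currently hardcodes a token, for example as the `use_auth_token` argument shown later in this thread.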

@tristan-mcinnis
Author

I have changed the HF token to my own in the /cli/transcribe.py file...

And used the example command:
python -m pyannote_whisper.cli.transcribe data/afjiv.wav --model tiny --diarization True

It still doesn't work. Am I missing something?

@tristan-mcinnis
Author

tristan-mcinnis commented Feb 19, 2024

import whisper
from pyannote.audio import Pipeline
from pyannote_whisper.utils import diarize_text
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="hf_xxxxx")  # replaced with my own token
model = whisper.load_model("tiny.en")
asr_result = model.transcribe("data/afjiv.wav")
diarization_result = pipeline("data/afjiv.wav")
final_result = diarize_text(asr_result, diarization_result)

for seg, spk, sent in final_result:
    line = f'{seg.start:.2f} {seg.end:.2f} {spk} {sent}'
    print(line)
The code in the readme also doesn't work.

@wagesj45
@nexuslux Have you confirmed your access to the Hugging Face repositories? You'll need to agree to the terms for each of the repositories pyannote uses. That would be pyannote/segmentation and pyannote/speaker-diarization.
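
A possible way to surface this problem early is sketched below. It assumes that, in the pyannote.audio versions I am aware of, `Pipeline.from_pretrained` prints a notice and returns `None` (rather than raising) when a gated pipeline cannot be downloaded; that behavior may differ in other versions. The wrapper name `load_diarization_pipeline` is hypothetical, and the loader is passed in as an argument so the check can be exercised without downloading anything.

```python
# Repositories named in this thread whose terms must be accepted.
GATED_REPOS = ("pyannote/segmentation", "pyannote/speaker-diarization")


def load_diarization_pipeline(loader, token, repo="pyannote/speaker-diarization"):
    """Call `loader` and fail fast if it hands back None.

    Hypothetical wrapper, not part of pyannote or pyannote-whisper.
    `loader` is expected to behave like pyannote.audio's
    Pipeline.from_pretrained(repo, use_auth_token=token).
    """
    pipeline = loader(repo, use_auth_token=token)
    if pipeline is None:
        pages = ", ".join(f"https://hf.co/{name}" for name in GATED_REPOS)
        raise RuntimeError(
            f"Could not load {repo!r}. Check that the token is valid and that "
            f"you have accepted the terms on: {pages}"
        )
    return pipeline


# Intended use (requires pyannote.audio and network access):
#   from pyannote.audio import Pipeline
#   pipeline = load_diarization_pipeline(Pipeline.from_pretrained, "hf_xxxxx")
```

With a check like this, a missing agreement should show up as a clear error at load time instead of a later failure when the pipeline is called on the audio file.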
