Can I Fine-Tune the Diarization Model to Recognize a Specific Individual's Voice? #234

shivamtawari · 2024-10-04T08:48:52Z

I'm curious to know if it's possible to customize the diarization output. Specifically, can we assign a custom name, such as 'Mr. XYZ', to dialogues spoken by a particular person, while the rest are labeled as 'Person 0', 'Person 1', etc.?

Thanks!

MahmoudAshraf97 · 2024-10-05T09:41:55Z

Maintainer

It's doable, but not through fine-tuning. Instead, take the intermediate speaker embeddings generated by the MSDD model and compare them to reference embeddings that you generate yourself for the target person; the diarized speaker whose embedding matches the reference is XYZ.
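
To make the comparison step concrete, here is a minimal sketch rather than the project's actual implementation. It assumes you already have one embedding vector per diarized speaker and one reference embedding for XYZ, all produced by the same speaker-embedding model and stored as plain lists of floats. The function names, the cosine-similarity metric, and the `0.7` threshold are illustrative assumptions, not part of the repository; how to pull the embeddings out of MSDD is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def label_speakers(speaker_embeddings, reference_embedding,
                   name="Mr. XYZ", threshold=0.7):
    """Map diarized speaker ids to display labels.

    speaker_embeddings: dict of speaker id -> embedding (hypothetical input;
        one averaged embedding per diarized speaker).
    reference_embedding: embedding of the known person.
    threshold: assumed cutoff; tune it on your own data.
    """
    scores = {
        spk: cosine_similarity(emb, reference_embedding)
        for spk, emb in speaker_embeddings.items()
    }
    # Only the single best-scoring speaker can receive the custom name,
    # and only if its score clears the threshold.
    best = max(scores, key=scores.get, default=None)
    matched = best if best is not None and scores[best] >= threshold else None

    labels = {}
    next_index = 0
    for spk in sorted(speaker_embeddings):
        if spk == matched:
            labels[spk] = name
        else:
            labels[spk] = f"Person {next_index}"
            next_index += 1
    return labels


# Toy 3-dimensional vectors standing in for real speaker embeddings.
speaker_embeddings = {
    0: [0.9, 0.1, 0.0],
    1: [0.0, 1.0, 0.2],
    2: [0.1, 0.0, 1.0],
}
reference = [1.0, 0.0, 0.0]

labels = label_speakers(speaker_embeddings, reference)
print(labels)
```

With these toy vectors, speaker `0` is the only one close to the reference, so it receives the custom name and the remaining speakers are numbered from `Person 0`. If no speaker clears the threshold, everyone keeps a generic label.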

1 reply

shivamtawari Oct 8, 2024
Author

Thanks for getting back to me!

I get the idea of using intermediate embeddings from the MSDD model, but I'm not too sure how to actually create reference embeddings for someone like XYZ. Could you point me in the right direction on how to set that up? Also, what would be the best way to compare those embeddings to figure out who's speaking?

Any advice would be really helpful!
