Yeah, there are some models I've played with that can do this. They currently only work for 2 or 3 speakers, though. The term for this is "diarization".

https://huggingface.co/pyannote/speaker-diarization
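For anyone wondering how diarization output gets used with a transcript: a rough sketch, assuming the diarizer gives you speaker turns as (start, end, speaker) and your transcriber gives you segments as (start, end, text). Those tuple shapes and the `label_segments` helper are my own illustration, not the pyannote API. Each transcript segment simply takes the speaker whose turn overlaps it the most.

```python
from typing import List, Tuple

Turn = Tuple[float, float, str]      # (start_s, end_s, speaker) -- assumed shape
Segment = Tuple[float, float, str]   # (start_s, end_s, text)    -- assumed shape


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(turns: List[Turn], segments: List[Segment]) -> List[Tuple[str, str]]:
    """Assign each transcript segment to the speaker turn it overlaps most.

    Hypothetical helper: segments with no overlapping turn get "UNKNOWN".
    """
    labeled = []
    for seg_start, seg_end, text in segments:
        best_speaker = "UNKNOWN"
        best_overlap = 0.0
        for turn_start, turn_end, speaker in turns:
            ov = overlap(seg_start, seg_end, turn_start, turn_end)
            if ov > best_overlap:
                best_speaker, best_overlap = speaker, ov
        labeled.append((best_speaker, text))
    return labeled


if __name__ == "__main__":
    turns = [(0.0, 4.0, "SPEAKER_00"), (4.0, 9.0, "SPEAKER_01")]
    segments = [(0.5, 3.5, "hello"), (3.8, 8.0, "hi there")]
    for speaker, text in label_segments(turns, segments):
        print(f"{speaker}: {text}")
```

This is quadratic in the worst case, which is fine for a single meeting; for long recordings you would probably sort both lists and sweep through them once.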



I wonder, do any conference call services (Zoom, GMeet, etc.) offer the ability to record each participant's audio stream separately, in a way that would make it easy to transcribe them separately and then combine the results?
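The "combine" step would be the easy part, something like the sketch below. It assumes each participant's recording was transcribed into (start_seconds, text) pairs and that all recordings share the same start time, so timestamps are directly comparable. The data shape and the `merge_transcripts` name are made up for illustration.

```python
import heapq
from typing import Dict, List, Tuple


def merge_transcripts(
    per_speaker: Dict[str, List[Tuple[float, str]]]
) -> List[Tuple[float, str, str]]:
    """Interleave per-participant transcripts into one timeline.

    Input:  {speaker: [(start_s, text), ...]}  (assumed shape)
    Output: [(start_s, speaker, text), ...] ordered by start time.
    """
    # One sorted stream per speaker; heapq.merge then interleaves them lazily.
    streams = (
        [(start, speaker, text) for start, text in sorted(segments)]
        for speaker, segments in per_speaker.items()
    )
    return list(heapq.merge(*streams))


if __name__ == "__main__":
    per_speaker = {
        "alice": [(0.0, "Hi"), (5.2, "Sounds good")],
        "bob": [(2.1, "Hello")],
    }
    for start, speaker, text in merge_transcripts(per_speaker):
        print(f"[{start:6.1f}s] {speaker}: {text}")
```

If the recordings don't start at the same moment, you would need to add a per-participant offset to each timestamp before merging.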


FWIW, GMeet supports meeting transcription natively.

https://support.google.com/meet/answer/12849897?hl=en


Zoom has this option.


Thanks



