(Created page with "[https://github.com/m-bain/whisperX whisperX] is an enhanced version of OpenAI's Whisper, offering fast automatic speech recognition with word-level timestamps and speaker diarization. It uses the faster-whisper backend and can run the large-v2 model on less than 8GB of GPU memory. whisperX also includes voice activity detection (VAD) preprocessing, reducing hallucinations and supporting batch processing. == whisperX Troubleshooting Guide == === Error: HF_TOKEN environ...")
<pre>whisperx input.mp3 --model large-v3 --language zh --diarize --batch_size 24 --no_align --chunk_size 10</pre>
=== ImportError: libcudnn.so.9: cannot open shared object file: No such file or directory ===
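Before changing anything, it can help to confirm that the dynamic loader itself cannot find the library, independently of whisperX. The following is a minimal diagnostic sketch, not part of whisperX: the helper name <code>can_load</code> is hypothetical, and it relies only on Python's standard <code>ctypes</code> module.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library.

    Hypothetical helper for illustration; it is not a whisperX function.
    """
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # The loader raises OSError when the file cannot be found or opened.
        return False
    return True


if __name__ == "__main__":
    # Library name taken from the error message in the heading above.
    name = "libcudnn.so.9"
    status = "found" if can_load(name) else "NOT found"
    print(f"{name}: {status} on the loader search path")
```

If the script reports NOT found, the library is either not installed or its directory is not on the loader search path (on Linux, for example, a directory listed in <code>LD_LIBRARY_PATH</code>).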