== Speech to text software ==

[https://notebooklm.google.com/ NotebookLM] {{access | date = 2026-06-06}}
* Input file format: Audio file (max 200 MB)<ref>[https://support.google.com/notebooklm/answer/16269187?hl=en#zippy=%2Cfile-size-limit-for-sources-in-notebooklm Frequently asked questions - NotebookLM Help]: "The current limit is 500,000 words per source or up to 200MB for local uploads. There's no page limit."</ref>
* Supported languages: 80+ languages<ref>[https://support.google.com/notebooklm/answer/16261963?hl=en&co=GENIE.Platform%3DDesktop#zippy= Change output language in NotebookLM - Computer - NotebookLM Help]</ref>
* Speaker identification: Prompt-based
* Price: Free and paid tiers available
* Output file format: TXT (Prompt-based)
* Notes: (1) Timestamp formatting is not supported {{exclaim}} (2) Mixed-language audio (e.g. Mandarin Chinese/English): automatically translated into the target language specified in the user prompt

[https://asr.yating.tw/ 雅婷逐字稿]
* Input file format: Audio file or video file
* Supported languages: (1) Mandarin Chinese & English; (2) Mandarin Chinese, English & Taiwanese; (3) English
* Speaker identification: Yes {{Gd}}
* Price: Free and paid tiers available
* Output file format: PDF, TXT, ODT, DOCX, SRT, CSV
* Notes:

[https://gemini.google.com/app Gemini]
* Input file format: Audio file or video file<ref>[https://support.google.com/gemini/answer/14903178?hl=en&co=GENIE.Platform%3DDesktop&sjid=5876216419430700379-NC Upload & analyze files in Gemini Apps - Computer - Gemini Apps Help]</ref>
** The Gemini app does not support direct audio file uploads larger than 20 MB. For larger files, either use the File API or upload the file to Google Drive first and then link it from within the Gemini app.<ref>[https://ai.google.dev/gemini-api/docs/audio Audio understanding - generateContent API | Google AI for Developers]</ref>
* Supported languages:
* Speaker identification: Prompt-based
* Price: Free and paid tiers available
* Output file format: TXT (Prompt-based)
* Notes:

[https://app.clipchamp.com/ Clipchamp] {{access | date = 2026-06-06}}
* Input file format: Audio or video file
* Supported languages: 80+ languages<ref>[https://support.microsoft.com/en-us/topic/how-to-use-autocaptions-in-clipchamp-ccb0520b-38f6-4fa9-aca8-872c2964946a How to use autocaptions in Clipchamp - Microsoft Support]</ref><ref>[https://learn.microsoft.com/zh-tw/azure/ai-services/speech-service/language-support?tabs=stt Language support - Speech service - Azure AI services | Microsoft Learn]</ref>
* Speaker identification:
* Output file format: SRT
* Comments: The free version appears to have no limit on video duration, and its AI transcription of video or audio is also free. In testing, however, the subtitle shown for each timecode was not a complete sentence.

[https://ink.dwave.cc/en-US/pricing Meeting Ink - AI notetaker to transcribe and summarize your meetings and recordings.]
* Input file: Audio files
* Supported languages:
* Speaker identification: Yes {{Gd}}
* Real-time subtitles or translation: Pro plan only ''$''
* Free limit: 30 minutes max

[https://huggingface.co/spaces/Xenova/whisper-web Whisper Web - a Hugging Face Space by Xenova]
* Input file: Audio files
* Supported languages: English
* Speaker identification: No {{exclaim}}
* Output file format: TXT or JSON (contains timestamp information)

To generate text from a video, you can use YouTube's [https://support.google.com/youtube/answer/6373554?hl=en Use automatic captioning - YouTube Help]; it takes about half a day {{access | date = 2018-09-04}}. Tutorial: [https://www.techbang.com/posts/2107 YouTube is generous: it adds captions for you automatically! | T客邦]
* Input: Video
* Language:
* Sample code:
* Related:

[https://pulipulichen.github.io/HTML5-Speech-to-Text/ Web Speech to Text] Tutorial: [https://www.playpcesor.com/2019/12/Web-Speech-to-Text.html Free! Convert speech in Chinese videos to text subtitles, with support for very large videos and long recordings]
* Input: Video or audio on the computer, YouTube URL
* Language: Chinese, English, Japanese, Korean

[https://app.voicetapp.com/ Voicetapp - AI Voice to Text Transcription]
* Language: Chinese, English, and other languages
* Sample code:
* Related:
* Free limit: 5 minutes

[https://www.mygoodtape.com/ Good Tape]
* Supported languages:
* Input file: Audio files
* Speaker identification: Available {{Gd}}
* Real-time subtitles or translation: Not available
* Free limit: 20 minutes max

[https://www.larksuite.com/ Lark | Business Chat & Collaboration Tool] ([https://zh.wikipedia.org/wiki/%E9%A3%9E%E4%B9%A6 飞书 - Wikipedia, the free encyclopedia])
* Language:
* Sample code:
* Related:
* Free limit:

[https://web.itranscribe.co/#/homepage iTranscribe: Transcribe Audio & Video to Text]
* Language:
* Sample code:
* Related:
* Free limit:

[https://www.capcut.cn/ 剪映 official site - a versatile, easy-to-use desktop video editing application] Chinese software {{exclaim}}
* Language:
* Sample code:
* Related:
* Free limit:

''$'' [https://goodsnooze.gumroad.com/l/macwhisper MacWhisper] on {{Mac}}
* Input file format: Audio file or video file
* Supported languages:
* Speaker identification: Yes {{Gd}}
* Price: Free or Pro plan
* Output file format: TXT, DOCX, SRT, VTT, JSON and more
* Notes:
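
Regarding the Gemini entry above: the 20 MB limit on direct audio uploads can be checked before deciding how to hand a file over. The following is only a minimal Python sketch under stated assumptions and does not call any Gemini API. The function names <code>choose_upload_route</code> and <code>route_for_file</code>, the returned labels, and the reading of "20 MB" as 20 × 1024 × 1024 bytes are all hypothetical choices made for illustration.

```python
import os

# Assumption: "20 MB" is read as 20 * 1024 * 1024 bytes; the source only says "20 MB".
DIRECT_UPLOAD_LIMIT_BYTES = 20 * 1024 * 1024


def choose_upload_route(size_bytes: int) -> str:
    """Return a hypothetical label describing how to hand an audio file to Gemini.

    Files larger than the limit go through the File API or Google Drive,
    as the note in the Gemini entry describes; anything else can be uploaded directly.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes > DIRECT_UPLOAD_LIMIT_BYTES:
        return "file_api_or_google_drive"
    return "direct_upload"


def route_for_file(path: str) -> str:
    """Look up the on-disk size of a local file and pick a route for it."""
    return choose_upload_route(os.path.getsize(path))
```

For example, <code>route_for_file("meeting.mp3")</code> (a hypothetical file name) would return one of the two labels depending on the file's size on disk.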