Troubleshooting of Google cloud speech to text
{{Template:Speech to text}}

Troubleshooting of Google [https://cloud.google.com/speech-to-text/ Cloud Speech-to-Text (speech recognition)]

== Brief instruction ==
* If the audio file's duration is '''longer''' than 1 minute:
*# Use the endpoint {{kbd | key=<nowiki>speech:longrunningrecognize</nowiki>}}, NOT {{kbd | key=<nowiki>speech:recognize</nowiki>}}.
*# Upload the file to [https://console.cloud.google.com/storage/ Google Cloud Storage] (GCS).
* If the audio file's duration is '''shorter''' than 1 minute:
*# Use either {{kbd | key=<nowiki>speech:longrunningrecognize</nowiki>}} or {{kbd | key=<nowiki>speech:recognize</nowiki>}}.
*# Use a file stored on the local computer, or upload the file to [https://console.cloud.google.com/storage/ Google Cloud Storage] (GCS).

== ERROR: (gcloud.auth.application-default.print-access-token) The Application Default Credentials are not available ==
Input & output
<pre>
$ gcloud auth application-default print-access-token
ERROR: (gcloud.auth.application-default.print-access-token) The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
</pre>

Solution<ref>[https://cloud.google.com/docs/authentication/production Setting Up Authentication for Server to Server Production Applications | Authentication | Google Cloud]</ref><ref>[https://cloud.google.com/speech-to-text/docs/quickstart-protocol Quickstart: Using the Command Line | Cloud Speech API Documentation | Google Cloud]</ref>: Enter the following command. A browser window opens automatically; follow the steps on the web page.
<pre>
$ gcloud auth application-default login
</pre>

== Invalid audio channel count ==
Error output
<pre>
{
  "error": {
    "code": 400,
    "message": "Invalid audio channel count",
    "status": "INVALID_ARGUMENT"
  }
}
</pre>

Solution: Convert the audio file from stereo to mono.

== Invalid recognition 'config': bad encoding ==
Error output
<pre>
{
  "error": {
    "code": 400,
    "message": "Invalid recognition 'config': bad encoding..",
    "status": "INVALID_ARGUMENT"
  }
}
</pre>

Solution: Specify the encoding of the audio file in the request. For details, see [https://cloud.google.com/speech-to-text/docs/encoding Introduction to Audio Encoding | Cloud Speech-to-Text API | Google Cloud] and [https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig#AudioEncoding RecognitionConfig | Cloud Speech-to-Text API | Google Cloud]. You can use VLC player to view the encoding of the audio file<ref>[https://forum.videolan.org/viewtopic.php?t=95136#p315198 How to view audio bitrate in VLC - The VideoLAN Forums]</ref>. If the file's codec (encoding) is not in the [https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig#AudioEncoding list of supported encodings], convert the file with an [[Audio converter | audio converter]].
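The channel count and sample rate behind these errors can be checked locally before a request is sent. The following is a minimal sketch, assuming an uncompressed 16-bit PCM WAV file and Python's standard <code>wave</code> module. The function names <code>describe_wav</code> and <code>stereo_to_mono</code> are illustrative only and are not part of any Google tool. Files in other formats (FLAC, MP3, and so on) need an audio converter instead.

```python
import array
import sys
import wave


def describe_wav(path):
    """Read channel count, sample rate (Hz) and duration (s) from a WAV header."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return {
        "channels": channels,
        "sample_rate_hertz": rate,
        "duration_seconds": duration,
    }


def stereo_to_mono(src_path, dst_path):
    """Average the left and right channels of a 16-bit stereo WAV into a mono WAV."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM input")
        rate = src.getframerate()
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))

    # WAV data is little-endian; swap on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()

    # Samples are interleaved: left, right, left, right, ...
    mono = array.array(
        "h",
        ((samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)),
    )

    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(mono.tobytes())
```

If <code>describe_wav</code> reports two channels, running <code>stereo_to_mono</code> first may avoid the "Invalid audio channel count" error. The reported sample rate is the value to compare against <code>sampleRateHertz</code> in the request, and the reported duration indicates whether the file is over the 1-minute limit.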
== If the audio file's duration is longer than 1 minute use LongRunningRecognize with a 'uri' parameter ==
Input
<pre>
$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1p1beta1/speech:recognize \
    -d @sync-request.json
</pre>

File content of sync-request.json<ref>[https://cloud.google.com/speech-to-text/docs/languages Language Support | Cloud Speech-to-Text API | Google Cloud]</ref>
<pre>
{
  "config": {
    "encoding": "FLAC",
    "sampleRateHertz": 44100,
    "languageCode": "cmn-Hant-TW",
    "alternativeLanguageCodes": ["en-US"],
    "enableWordTimeOffsets": false
  },
  "audio": {
    "uri": "gs://<bucket_name>/<audio file name>"
  }
}
</pre>

Error message<ref>[http://volkanpaksoy.com/archive/2017/12/12/Playing-with-Google-Speech-API/ Playing with Google Speech API - Playground for the mind]</ref>
<pre>
{
  "error": {
    "code": 400,
    "message": "Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.",
    "status": "INVALID_ARGUMENT"
  }
}
</pre>

Solution:
# If the audio file's duration is shorter than 1 minute, use the endpoint {{kbd | key=<nowiki>speech:recognize</nowiki>}}.
# If the audio file's duration is longer than 1 minute, upload the file to [https://console.cloud.google.com/storage/ Google Cloud Storage] (GCS) and change the endpoint from {{kbd | key=<nowiki>speech:recognize</nowiki>}} to {{kbd | key=<nowiki>speech:longrunningrecognize</nowiki>}}:
<pre>
$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1p1beta1/speech:longrunningrecognize \
    -d @sync-request.json
</pre>

== Request payload size exceeds the limit: 10485760 bytes ==
Error output
<pre>
{
  "error": {
    "code": 400,
    "message": "Request payload size exceeds the limit: 10485760 bytes.",
    "status": "INVALID_ARGUMENT"
  }
}
</pre>

Solution:
# Use a short audio file (shorter than 1 minute), or
# For a long audio file (longer than 1 minute), change the endpoint from {{kbd | key=<nowiki>speech:recognize</nowiki>}} to {{kbd | key=<nowiki>speech:longrunningrecognize</nowiki>}}.

== sample_rate_hertz (16000) in RecognitionConfig must either be unspecified or match the value in the FLAC header ==
Input & output
<pre>
$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1/speech:longrunningrecognize \
    -d @sync-request.json
{
  "error": {
    "code": 400,
    "message": "sample_rate_hertz (16000) in RecognitionConfig must either be unspecified or match the value in the FLAC header (44100).",
    "status": "INVALID_ARGUMENT"
  }
}
</pre>

Solution: Check the actual sample rate of the audio file. Then either set {{kbd | key=<nowiki>sampleRateHertz</nowiki>}} to that value (44100 in this example) or leave it unspecified.

== Invalid JSON payload received. Unknown name \"alternative_language_codes\" at 'config': Cannot find field ==
Input & output
<pre>
$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1/speech:longrunningrecognize \
    -d @sync-request.json
{
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"alternative_language_codes\" at 'config': Cannot find field.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "config",
            "description": "Invalid JSON payload received. Unknown name \"alternative_language_codes\" at 'config': Cannot find field."
          }
        ]
      }
    ]
  }
}
</pre>

Solution: Change the URL from {{kbd | key=<nowiki>https://speech.googleapis.com/v1/speech:longrunningrecognize</nowiki>}} to {{kbd | key=<nowiki>https://speech.googleapis.com/v1p1beta1/speech:longrunningrecognize</nowiki>}}.

== Related ==
* {{Gd}} [https://cloud.google.com/speech-to-text/docs/best-practices Best Practices | Cloud Speech API Documentation | Google Cloud]
* Official documentation: [https://cloud.google.com/speech-to-text/docs/troubleshooting Troubleshooting of Google Speech-to-text API]
* [https://groups.google.com/forum/#!forum/cloud-speech-discuss cloud-speech-discuss - Google Group]
* [https://github.com/GoogleCloudPlatform/php-docs-samples/tree/master/speech/ php-docs-samples/speech at master · GoogleCloudPlatform/php-docs-samples]
* [[Text to speech]]

== References ==
<references/>

{{Template:Troubleshooting}}

[[Category:Google]]
[[Category:NLP]]
[[Category:Tool]]