Troubleshooting of Google Cloud Speech-to-Text
Brief instructions
- For audio files longer than 1 minute: (1) call the speech:longrunningrecognize method, NOT speech:recognize; (2) upload the file to Google Cloud Storage (GCS).
- For audio files shorter than 1 minute: (1) call either speech:recognize or speech:longrunningrecognize; (2) use a file stored on the local computer, or upload it to Google Cloud Storage (GCS).
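The rule above can be sketched as a small shell helper. This is only an illustration: `choose_method` is a hypothetical name, and the duration is assumed to be known already as a whole number of seconds.

```shell
# Hypothetical helper: pick the REST method from the audio duration in seconds.
# The 60-second threshold follows the 1-minute rule described above.
choose_method() {
  duration_s=$1
  if [ "$duration_s" -gt 60 ]; then
    echo "speech:longrunningrecognize"
  else
    echo "speech:recognize"
  fi
}
```

For example, `choose_method 90` prints speech:longrunningrecognize, which can be appended to https://speech.googleapis.com/v1p1beta1/ when building the request URL.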
ERROR: (gcloud.auth.application-default.print-access-token) The Application Default Credentials are not available
input & output
$ gcloud auth application-default print-access-token
ERROR: (gcloud.auth.application-default.print-access-token) The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
Solution[1][2]: Run the following command. A browser window opens automatically; follow the steps on the web page to sign in.
$ gcloud auth application-default login
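The error text also mentions the GOOGLE_APPLICATION_CREDENTIALS environment variable as an alternative to the browser login. The sketch below checks whether that variable points to a readable file before calling the API. `check_adc_env` is a hypothetical helper name, and the check does not validate the contents of the file.

```shell
# Hypothetical helper: report whether GOOGLE_APPLICATION_CREDENTIALS
# is set and points to a readable file. It does not validate the key itself.
check_adc_env() {
  if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
    return 1
  fi
  if [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
    echo "Credentials file not readable: $GOOGLE_APPLICATION_CREDENTIALS" >&2
    return 1
  fi
  echo "Using credentials file: $GOOGLE_APPLICATION_CREDENTIALS"
}
```

If the check fails and no credentials file is available, the gcloud auth application-default login command shown above remains the simpler route.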
Invalid audio channel count
error output
{ "error": { "code": 400, "message": "Invalid audio channel count", "status": "INVALID_ARGUMENT" } }
Solution: Convert the audio file from stereo to mono.
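One possible way to do the conversion is with ffmpeg, which this page does not otherwise cover. The sketch assumes ffmpeg is installed; `to_mono` is a hypothetical wrapper name.

```shell
# Sketch: downmix an audio file to one channel with ffmpeg (assumed to be installed).
# The -ac 1 option sets the number of output audio channels to 1.
to_mono() {
  in_file=$1
  out_file=$2
  if [ ! -f "$in_file" ]; then
    echo "Input file not found: $in_file" >&2
    return 1
  fi
  ffmpeg -i "$in_file" -ac 1 "$out_file"
}
```

A call such as `to_mono stereo.flac mono.flac` (both file names are placeholders) writes a single-channel copy that can then be sent to the API.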
Invalid recognition 'config': bad encoding
error output
{ "error": { "code": 400, "message": "Invalid recognition 'config': bad encoding..", "status": "INVALID_ARGUMENT" } }
Solution: Specify the encoding of the audio file in the request config. For details, see Introduction to Audio Encoding | Cloud Speech-to-Text API | Google Cloud and RecognitionConfig | Cloud Speech-to-Text API | Google Cloud. You can use VLC media player to check the codec of the audio file[3]. If the codec is not in the list of supported encodings on those pages, convert the file to a supported codec with an audio converter.
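As a rough aid for filling in the encoding field, the sketch below maps a codec name to an encoding value. It assumes the codec names are written the way ffprobe prints them, and `encoding_for_codec` is a hypothetical helper. The mapping is not exhaustive, so the documentation pages above remain the authority.

```shell
# Hypothetical helper: map a codec name (as ffprobe would print it)
# to a value for the "encoding" field. The list is deliberately incomplete.
encoding_for_codec() {
  case "$1" in
    flac)      echo "FLAC" ;;
    pcm_s16le) echo "LINEAR16" ;;
    pcm_mulaw) echo "MULAW" ;;
    amr_nb)    echo "AMR" ;;
    amr_wb)    echo "AMR_WB" ;;
    opus)      echo "OGG_OPUS" ;;  # only applies when the Opus stream is in an Ogg container
    *)
      echo "No mapping known for codec: $1 (the file may need converting)" >&2
      return 1
      ;;
  esac
}
```

If ffprobe is available, the codec name of the first audio stream can be read with ffprobe -v error -select_streams a:0 -show_entries stream=codec_name -of default=noprint_wrappers=1:nokey=1 followed by the file name, and the result passed to the helper.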
For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter
input
$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1p1beta1/speech:recognize \
    -d @sync-request.json
file content of sync-request.json[4]
{
  "config": {
    "encoding": "FLAC",
    "sampleRateHertz": 44100,
    "languageCode": "cmn-Hant-TW",
    "alternativeLanguageCodes": ["en-US"],
    "enableWordTimeOffsets": false
  },
  "audio": {
    "uri": "gs://<bucket_name>/<audio file name>"
  }
}
error message[5]
{ "error": { "code": 400, "message": "Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.", "status": "INVALID_ARGUMENT" } }
Solution: (1) For audio files shorter than 1 minute, use speech:recognize. (2) For audio files longer than 1 minute, upload the file to Google Cloud Storage (GCS) and change the method in the URL from speech:recognize to speech:longrunningrecognize.
$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1p1beta1/speech:longrunningrecognize \
    -d @sync-request.json
Request payload size exceeds the limit: 10485760 bytes
error output
{ "error": { "code": 400, "message": "Request payload size exceeds the limit: 10485760 bytes.", "status": "INVALID_ARGUMENT" } }
Solution: (1) Use a short audio file (shorter than 1 minute), or (2) for audio longer than 1 minute, upload the file to Google Cloud Storage (GCS) so that the request carries only its gs:// URI, and change the method in the URL from speech:recognize to speech:longrunningrecognize.
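A quick pre-flight check can tell whether a local file is likely to hit this limit. The sketch assumes the audio would be embedded in the request body as Base64 text, which grows the data by about one third. `fits_inline` is a hypothetical helper, and it ignores the few extra bytes of the surrounding JSON.

```shell
# Limit quoted in the error message above.
PAYLOAD_LIMIT_BYTES=10485760

# Hypothetical helper: succeed if the file, once Base64-encoded, stays under the limit.
# Base64 turns every 3 input bytes into 4 output characters.
fits_inline() {
  size=$(wc -c < "$1") || return 2
  encoded=$(( (size + 2) / 3 * 4 ))
  [ "$encoded" -lt "$PAYLOAD_LIMIT_BYTES" ]
}
```

When `fits_inline` fails for a file, uploading it to Google Cloud Storage and sending the gs:// URI instead keeps the request small.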
sample_rate_hertz (16000) in RecognitionConfig must either be unspecified or match the value in the FLAC header
input & output
$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1/speech:longrunningrecognize \
    -d @sync-request.json
{ "error": { "code": 400, "message": "sample_rate_hertz (16000) in RecognitionConfig must either be unspecified or match the value in the FLAC header (44100).", "status": "INVALID_ARGUMENT" } }
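Solution: Check the sample rate of the audio file and set sampleRateHertz in the request config to the same value (44100 in this example). As the error message states, the field may also be left unspecified for FLAC files.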
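The two options in the error message (match the header or leave the field unspecified) can be sketched with a small generator for the request file. `make_request` is a hypothetical helper, and the encoding and language code are simply copied from the sync-request.json example on this page.

```shell
# Hypothetical helper: print a request body for a FLAC file in Cloud Storage.
# sampleRateHertz is written only when a rate is passed as the second argument.
make_request() {
  gcs_uri=$1
  sample_rate=${2:-}
  rate_line=""
  if [ -n "$sample_rate" ]; then
    rate_line="\"sampleRateHertz\": $sample_rate,"
  fi
  cat <<EOF
{
  "config": {
    "encoding": "FLAC",
    $rate_line
    "languageCode": "cmn-Hant-TW"
  },
  "audio": { "uri": "$gcs_uri" }
}
EOF
}
```

Redirecting the output of `make_request "gs://<bucket_name>/<audio file name>"` to sync-request.json gives a request without sampleRateHertz; adding 44100 as a second argument writes the explicit value instead.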
Invalid JSON payload received. Unknown name \"alternative_language_codes\" at 'config': Cannot find field
input & output
$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1/speech:longrunningrecognize \
    -d @sync-request.json
{
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"alternative_language_codes\" at 'config': Cannot find field.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "config",
            "description": "Invalid JSON payload received. Unknown name \"alternative_language_codes\" at 'config': Cannot find field."
          }
        ]
      }
    ]
  }
}
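Solution: Change the URL from https://speech.googleapis.com/v1/speech:longrunningrecognize to https://speech.googleapis.com/v1p1beta1/speech:longrunningrecognize. In this case the v1 endpoint rejected the alternative language codes field, while the v1p1beta1 endpoint accepted it.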
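The fix above can be automated with a check on the request file. This sketch only reflects the behaviour reported on this page, and the fields accepted by each API version may have changed since. `api_version_for` is a hypothetical helper name.

```shell
# Hypothetical helper: choose the API version for a request file.
# Requests that use alternative language codes go to v1p1beta1, as in the fix above.
api_version_for() {
  if grep -Eq '"(alternativeLanguageCodes|alternative_language_codes)"' "$1"; then
    echo "v1p1beta1"
  else
    echo "v1"
  fi
}
```

The result can be placed into the URL, for example https://speech.googleapis.com/$(api_version_for sync-request.json)/speech:longrunningrecognize.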
Related
- Best Practices | Cloud Speech API Documentation | Google Cloud
- cloud-speech-discuss - Google Group
- php-docs-samples/speech at master · GoogleCloudPlatform/php-docs-samples
- Text to speech
References
- [1] Setting Up Authentication for Server to Server Production Applications | Authentication | Google Cloud
- [2] Quickstart: Using the Command Line | Cloud Speech API Documentation | Google Cloud
- [3] How to view audio bitrate in VLC - The VideoLAN Forums
- [4] Language Support | Cloud Speech-to-Text API | Google Cloud
- [5] Playing with Google Speech API - Playground for the mind