Latest revision as of 17:13, 17 August 2020

Troubleshooting Google Cloud Speech-to-Text (speech recognition)

Brief instructions

  • If the audio is longer than 1 minute: (1) use the speech:longrunningrecognize endpoint, NOT speech:recognize, and (2) upload the file to Google Cloud Storage (GCS).
  • If the audio is shorter than 1 minute: (1) use either speech:recognize or speech:longrunningrecognize, and (2) either send the file from the local computer or upload it to Google Cloud Storage (GCS).
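
The rule above can be sketched as a small shell helper. This is only an illustration: the function name choose_method and passing the duration in whole seconds are assumptions, and treating exactly 60 seconds as "short" is a guess about the boundary.

```shell
# Hypothetical helper: print the REST method suffix for a given audio
# duration in whole seconds, following the 1-minute rule above.
choose_method() {
  duration_seconds=$1
  if [ "$duration_seconds" -gt 60 ]; then
    # Longer than 1 minute: asynchronous method, audio should be in GCS.
    echo "speech:longrunningrecognize"
  else
    # Up to 1 minute: the synchronous method is enough.
    echo "speech:recognize"
  fi
}
```

For example, choose_method 95 selects the long-running method, while choose_method 30 selects the synchronous one.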

ERROR: (gcloud.auth.application-default.print-access-token) The Application Default Credentials are not available

input & output

$ gcloud auth application-default print-access-token
ERROR: (gcloud.auth.application-default.print-access-token) The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.

Solution[1][2]: Run the following command. A browser window opens automatically; follow the steps on the web page.

$ gcloud auth application-default login
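
Before retrying, it can help to see whether any credentials file is in place. The sketch below checks the environment variable named in the error message and then the file that gcloud auth application-default login is assumed to write on Linux and macOS. The function name adc_source and that default path are assumptions, not something stated on this page.

```shell
# Hypothetical helper: report where Application Default Credentials
# would be found, or "none" if neither location has a readable file.
adc_source() {
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ] && [ -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
    echo "env"      # key file named by the environment variable
  elif [ -r "$HOME/.config/gcloud/application_default_credentials.json" ]; then
    echo "gcloud"   # assumed location of the file written by the login command
  else
    echo "none"     # expect the error shown above
    return 1
  fi
}
```

If adc_source prints none, running the login command above (or exporting GOOGLE_APPLICATION_CREDENTIALS) is the likely fix.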

Invalid audio channel count

error output

  {
    "error": {
      "code": 400,
      "message": "Invalid audio channel count",
      "status": "INVALID_ARGUMENT"
    }
  }

Solution: convert the audio file from stereo to mono
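
One way to do the conversion is with ffmpeg, which this page does not mention, so treat the tool choice as an assumption. The helper name to_mono and the DRY_RUN switch are hypothetical and exist only so the command can be previewed without running it.

```shell
# Hypothetical helper: downmix an audio file to one channel with ffmpeg.
# With DRY_RUN=1 it only prints the command instead of running it.
to_mono() {
  input=$1
  output=$2
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "ffmpeg -i $input -ac 1 $output"
  else
    # -ac 1 sets the number of output audio channels to 1 (mono).
    ffmpeg -i "$input" -ac 1 "$output"
  fi
}
```

For example, DRY_RUN=1 to_mono stereo.flac mono.flac shows the command that would be run.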

Invalid recognition 'config': bad encoding

error output

  {
    "error": {
      "code": 400,
      "message": "Invalid recognition 'config': bad encoding..",
      "status": "INVALID_ARGUMENT"
    }
  }

Solution: Specify the encoding of the audio file in the request config. For details, see "Introduction to Audio Encoding" and the RecognitionConfig (AudioEncoding) reference in the Cloud Speech-to-Text documentation. You can use VLC player to check the encoding of an audio file[3]. If the codec (encoding) of the audio file is not in the list of supported encodings on the RecognitionConfig page, convert the file with an audio converter.

If the audio file's duration is longer than 1 minute, use LongRunningRecognize with a 'uri' parameter

input

$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    https://speech.googleapis.com/v1p1beta1/speech:recognize \
    -d @sync-request.json

file content of sync-request.json[4]

{
  "config": {
      "encoding":"FLAC",
      "sampleRateHertz": 44100,
      "languageCode": "cmn-Hant-TW",
      "alternativeLanguageCodes": ["en-US"],
      "enableWordTimeOffsets": false
  },
  "audio": {
      "uri":"gs://<bucket_name>/<audio file name>"
  }
}

error message[5]

{
  "error": {
    "code": 400,
    "message": "Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.",
    "status": "INVALID_ARGUMENT"
  }
}

Solution: (1) If the audio is shorter than 1 min, use the speech:recognize endpoint. (2) If the audio is longer than 1 min, upload the file to Google Cloud Storage (GCS) and change the endpoint from speech:recognize to speech:longrunningrecognize:

$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    https://speech.googleapis.com/v1p1beta1/speech:longrunningrecognize \
    -d @sync-request.json
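
Unlike speech:recognize, the long-running call is expected to return an operation name rather than the transcript, and the result has to be fetched afterwards. The sketch below assumes the reply is a JSON object with a "name" field and that the operation can be read from the operations resource of the same API version. Both helper names are hypothetical.

```shell
# Hypothetical helper: read the JSON reply of speech:longrunningrecognize
# on stdin and print the operation name it contains.
operation_name() {
  sed -n 's/.*"name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: fetch the current state of an operation by name.
# The reply is expected to contain "done": true once transcription ends.
get_operation() {
  curl -s -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      "https://speech.googleapis.com/v1p1beta1/operations/$1"
}
```

A typical flow would pipe the curl output above into operation_name, then call get_operation with that name until the reply reports that the operation is done.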

Request payload size exceeds the limit: 10485760 bytes

error output

  {
    "error": {
      "code": 400,
      "message": "Request payload size exceeds the limit: 10485760 bytes.",
      "status": "INVALID_ARGUMENT"
    }
  }

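Solution: (1) Use a shorter audio file, so that the request stays under the 10485760-byte limit, or (2) upload the file to Google Cloud Storage (GCS) and reference it with a gs:// uri instead of embedding the audio in the request. If the audio is longer than 1 min, also change the endpoint from speech:recognize to speech:longrunningrecognize.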
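
The limit in the error message (10485760 bytes, that is 10 MiB) applies to the whole request, and audio embedded in the JSON is base64-encoded, which makes it about one third larger. The following sketch estimates whether a local file would still fit. The helper name needs_gcs is hypothetical, and the estimate ignores the small overhead of the surrounding JSON.

```shell
# Request size limit quoted in the error message above.
LIMIT_BYTES=10485760

# Hypothetical helper: succeed (exit 0) if the file is too large to embed
# as base64 content and should be uploaded to GCS instead.
needs_gcs() {
  size=$(( $(wc -c < "$1") ))
  # base64 turns every 3 input bytes into 4 output characters.
  encoded=$(( (size + 2) / 3 * 4 ))
  [ "$encoded" -gt "$LIMIT_BYTES" ]
}
```

If needs_gcs audio.flac succeeds, a command such as gsutil cp audio.flac gs://<bucket_name>/ can upload the file, after which the request refers to it through the "uri" field.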

sample_rate_hertz (16000) in RecognitionConfig must either be unspecified or match the value in the FLAC header

input & output

$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    https://speech.googleapis.com/v1/speech:longrunningrecognize \
    -d @sync-request.json

{
  "error": {
    "code": 400,
    "message": "sample_rate_hertz (16000) in RecognitionConfig must either be unspecified or match the value in the FLAC header (44100).",
    "status": "INVALID_ARGUMENT"
  }
}

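Solution: Check the actual sample rate of the audio file, then either set sampleRateHertz in the request config to that value (44100 in this example) or leave it unspecified.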
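
Once the real sample rate is known, the request file can be corrected by hand or with a one-line substitution. The helper below is a minimal sketch. The name set_sample_rate is hypothetical, and it assumes the field is written as "sampleRateHertz": <number> on a single line, as in sync-request.json above.

```shell
# Hypothetical helper: rewrite the sampleRateHertz value in a request
# file read from stdin, leaving every other line untouched.
set_sample_rate() {
  sed "s/\"sampleRateHertz\"[[:space:]]*:[[:space:]]*[0-9][0-9]*/\"sampleRateHertz\": $1/"
}
```

For example, set_sample_rate 44100 < sync-request.json > fixed-request.json writes a corrected copy. If ffprobe is installed, a command such as ffprobe -v error -select_streams a:0 -show_entries stream=sample_rate -of default=noprint_wrappers=1:nokey=1 audio.flac should print the rate stored in the file.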

Invalid JSON payload received. Unknown name \"alternative_language_codes\" at 'config': Cannot find field

input & output

$ curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    https://speech.googleapis.com/v1/speech:longrunningrecognize \
    -d @sync-request.json

{
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"alternative_language_codes\" at 'config': Cannot find field.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "config",
            "description": "Invalid JSON payload received. Unknown name \"alternative_language_codes\" at 'config': Cannot find field."
          }
        ]
      }
    ]
  }
}

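Solution: Change the endpoint from https://speech.googleapis.com/v1/speech:longrunningrecognize to https://speech.googleapis.com/v1p1beta1/speech:longrunningrecognize. As the error shows, the v1 endpoint does not recognize the alternativeLanguageCodes field used in sync-request.json.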

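Related

  • Best Practices | Cloud Speech API Documentation | Google Cloud: https://cloud.google.com/speech-to-text/docs/best-practices
  • Official document: Troubleshooting of Google Speech-to-text API: https://cloud.google.com/speech-to-text/docs/troubleshooting
  • cloud-speech-discuss - Google Group: https://groups.google.com/forum/#!forum/cloud-speech-discuss
  • php-docs-samples/speech at master · GoogleCloudPlatform/php-docs-samples: https://github.com/GoogleCloudPlatform/php-docs-samples/tree/master/speech/
  • Text to speech

References

  [1] Setting Up Authentication for Server to Server Production Applications | Authentication | Google Cloud: https://cloud.google.com/docs/authentication/production
  [2] Quickstart: Using the Command Line | Cloud Speech API Documentation | Google Cloud: https://cloud.google.com/speech-to-text/docs/quickstart-protocol
  [3] How to view audio bitrate in VLC - The VideoLAN Forums: https://forum.videolan.org/viewtopic.php?t=95136#p315198
  [4] Language Support | Cloud Speech-to-Text API | Google Cloud: https://cloud.google.com/speech-to-text/docs/languages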