Troubleshooting of Google cloud speech to text: Difference between revisions
Troubleshooting of Google cloud speech to text (edit)
Revision as of 17:13, 17 August 2020
m (→Related) |
|||
| (13 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
{{Template:Speech to text}} | |||
Troubleshooting of Google [https://cloud.google.com/speech-to-text/ Cloud Speech-to-Text - Speech Recognition] | Troubleshooting of Google [https://cloud.google.com/speech-to-text/ Cloud Speech-to-Text - Speech Recognition] | ||
== Brief instruction == | |||
* If the audio file's duration is '''longer''' than 1 minute: (1) use the URI {{kbd | key=<nowiki>speech:longrunningrecognize</nowiki>}}, NOT {{kbd | key=<nowiki>speech:recognize</nowiki>}}; (2) upload the file to [https://console.cloud.google.com/storage/ Google Cloud Storage] (GCS). | |||
* If the audio file's duration is '''shorter''' than 1 minute: (1) use either the URI {{kbd | key=<nowiki>speech:recognize</nowiki>}} or {{kbd | key=<nowiki>speech:longrunningrecognize</nowiki>}}; (2) use a file stored on the local computer, or upload it to [https://console.cloud.google.com/storage/ Google Cloud Storage] (GCS). | |||
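The rule of thumb above can be sketched as a small shell helper. This is only an illustration: the function names <code>choose_method</code> and <code>request_url</code> are hypothetical, the duration is assumed to be a whole number of seconds that you have already measured, and the URL layout is taken from the URIs quoted elsewhere on this page.

```shell
# Hypothetical helper: pick the Speech-to-Text method from the audio
# duration in whole seconds. The 60-second threshold follows the
# "longer than 1 minute" rule stated above.
choose_method() {
  duration_seconds=$1
  if [ "$duration_seconds" -gt 60 ]; then
    echo "speech:longrunningrecognize"
  else
    echo "speech:recognize"
  fi
}

# Hypothetical helper: build the full request URL for a given API
# version (for example v1) and audio duration in seconds.
request_url() {
  api_version=$1
  duration_seconds=$2
  echo "https://speech.googleapis.com/${api_version}/$(choose_method "$duration_seconds")"
}
```

For example, <code>request_url v1 90</code> prints the <code>speech:longrunningrecognize</code> URL, while <code>request_url v1 30</code> prints the <code>speech:recognize</code> URL.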
== ERROR: (gcloud.auth.application-default.print-access-token) The Application Default Credentials are not available == | == ERROR: (gcloud.auth.application-default.print-access-token) The Application Default Credentials are not available == | ||
| Line 9: | Line 14: | ||
</pre> | </pre> | ||
Solution<ref>[https://cloud.google.com/docs/authentication/production Setting Up Authentication for Server to Server Production Applications | Authentication | Google Cloud]</ref><ref>[https://cloud.google.com/speech-to-text/docs/quickstart-protocol Quickstart: Using the Command Line | Cloud Speech API Documentation | Google Cloud]</ref>: Enter the following command. The browser will then open automatically. | Solution<ref>[https://cloud.google.com/docs/authentication/production Setting Up Authentication for Server to Server Production Applications | Authentication | Google Cloud]</ref><ref>[https://cloud.google.com/speech-to-text/docs/quickstart-protocol Quickstart: Using the Command Line | Cloud Speech API Documentation | Google Cloud]</ref>: Enter the following command. The browser will then open automatically. Follow the steps on the web page. | ||
<pre> | <pre> | ||
$ gcloud auth application-default login | $ gcloud auth application-default login | ||
</pre> | </pre> | ||
== Invalid audio channel count == | == Invalid audio channel count == | ||
| Line 41: | Line 45: | ||
</pre> | </pre> | ||
Solution: Specify the encoding of the audio file. For details, see [https://cloud.google.com/speech-to-text/docs/encoding Introduction to Audio Encoding | Cloud Speech-to-Text API | Google Cloud] and [https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig#AudioEncoding RecognitionConfig | Cloud Speech-to-Text API | Google Cloud]. You can use VLC player to check the encoding of the audio file<ref>[https://forum.videolan.org/viewtopic.php?t=95136#p315198 How to view audio bitrate in VLC - The VideoLAN Forums]</ref>. If the codec (encoding) of the audio file is not listed on the [https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig#AudioEncoding AudioEncoding reference page], convert the audio file to a supported codec. | Solution: Specify the encoding of the audio file. For details, see [https://cloud.google.com/speech-to-text/docs/encoding Introduction to Audio Encoding | Cloud Speech-to-Text API | Google Cloud] and [https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig#AudioEncoding RecognitionConfig | Cloud Speech-to-Text API | Google Cloud]. You can use VLC player to check the encoding of the audio file<ref>[https://forum.videolan.org/viewtopic.php?t=95136#p315198 How to view audio bitrate in VLC - The VideoLAN Forums]</ref>. If the codec (encoding) of the audio file is not listed on the [https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig#AudioEncoding AudioEncoding reference page], convert the audio file to a supported codec with an [[Audio converter | audio converter]]. | ||
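As an illustration of specifying the encoding explicitly, the sketch below prints a <code>RecognitionConfig</code> JSON object. The function name <code>make_config</code> is hypothetical, and the sample values (FLAC, 16000, en-US) are assumptions for the example only; use the values that match your own audio file.

```shell
# Hypothetical helper: print a RecognitionConfig JSON object with an
# explicit encoding. Arguments: encoding (for example FLAC or LINEAR16),
# sample rate in Hz, and language code.
make_config() {
  encoding=$1
  sample_rate_hertz=$2
  language_code=$3
  printf '{"encoding": "%s", "sampleRateHertz": %s, "languageCode": "%s"}\n' \
    "$encoding" "$sample_rate_hertz" "$language_code"
}
```

Calling <code>make_config FLAC 16000 en-US</code> prints a config object that can be placed under the <code>config</code> key of the request body.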
== | == If the audio file's duration is longer than 1 minute use LongRunningRecognize with a 'uri' parameter == | ||
input | input | ||
<pre> | <pre> | ||
| Line 80: | Line 84: | ||
</pre> | </pre> | ||
Solution: (1) | Solution: (1) If the audio file's duration is shorter than 1 min, use the URI {{kbd | key=<nowiki>speech:recognize</nowiki>}}. (2) If the audio file's duration is longer than 1 min, upload the file to [https://console.cloud.google.com/storage/ Google Cloud Storage] (GCS) and change the URI from {{kbd | key=<nowiki>speech:recognize</nowiki>}} to {{kbd | key=<nowiki>speech:longrunningrecognize</nowiki>}}. | ||
<pre> | <pre> | ||
$ curl -s -H "Content-Type: application/json" \ | $ curl -s -H "Content-Type: application/json" \ | ||
| Line 103: | Line 107: | ||
Solution: (1) use a short audio file (shorter than 1 min), or (2) for a long audio file (longer than 1 min), change the URI from {{kbd | key=<nowiki>speech:recognize</nowiki>}} to {{kbd | key=<nowiki>speech:longrunningrecognize</nowiki>}}. | Solution: (1) use a short audio file (shorter than 1 min), or (2) for a long audio file (longer than 1 min), change the URI from {{kbd | key=<nowiki>speech:recognize</nowiki>}} to {{kbd | key=<nowiki>speech:longrunningrecognize</nowiki>}}. | ||
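For long audio, the request body refers to the uploaded file through the <code>uri</code> field of <code>audio</code> instead of embedding the audio content. The following is a minimal sketch under that assumption: the function name <code>make_gcs_request</code> and the bucket path in the usage note are hypothetical, and the config is reduced to a language code for brevity.

```shell
# Hypothetical helper: print a request body that points at an audio file
# already uploaded to Google Cloud Storage.
# Arguments: gs:// URI of the audio file, language code.
make_gcs_request() {
  gcs_uri=$1
  language_code=$2
  # Reject anything that is not a gs:// URI, such as a local path.
  case "$gcs_uri" in
    gs://*) ;;
    *) echo "expected a gs:// URI, got: $gcs_uri" >&2; return 1 ;;
  esac
  printf '{"config": {"languageCode": "%s"}, "audio": {"uri": "%s"}}\n' \
    "$language_code" "$gcs_uri"
}
```

For example, <code>make_gcs_request gs://my-bucket/audio.flac en-US</code> prints a body that could be saved to a file and sent with <code>curl</code> to the <code>speech:longrunningrecognize</code> URI.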
== sample_rate_hertz (16000) in RecognitionConfig must either be unspecified or match the value in the FLAC header == | == sample_rate_hertz (16000) in RecognitionConfig must either be unspecified or match the value in the FLAC header == | ||
| Line 154: | Line 157: | ||
Solution: modify the uri from {{kbd | key=<nowiki>https://speech.googleapis.com/v1/speech:longrunningrecognize</nowiki>}} to {{kbd | key=<nowiki>https://speech.googleapis.com/v1p1beta1/speech:longrunningrecognize</nowiki>}} | Solution: modify the uri from {{kbd | key=<nowiki>https://speech.googleapis.com/v1/speech:longrunningrecognize</nowiki>}} to {{kbd | key=<nowiki>https://speech.googleapis.com/v1p1beta1/speech:longrunningrecognize</nowiki>}} | ||
== Related | == Related == | ||
* {{Gd}} [https://cloud.google.com/speech-to-text/docs/best-practices Best Practices | Cloud Speech API Documentation | Google Cloud] | |||
* Official documentation: [https://cloud.google.com/speech-to-text/docs/troubleshooting Troubleshooting of Google Speech-to-text API] | |||
* [https://groups.google.com/forum/#!forum/cloud-speech-discuss cloud-speech-discuss - Google Group] | * [https://groups.google.com/forum/#!forum/cloud-speech-discuss cloud-speech-discuss - Google Group] | ||
* [https://github.com/GoogleCloudPlatform/php-docs-samples/tree/master/speech/ php-docs-samples/speech at master · GoogleCloudPlatform/php-docs-samples] | |||
* [[Text to speech]] | |||
== References == | == References == | ||
| Line 163: | Line 171: | ||
{{Template:Troubleshooting}} | {{Template:Troubleshooting}} | ||
[[Category:Google]] | [[Category:Google]] [[Category:NLP]] [[Category:Tool]] | ||