Fix garbled message text

Other functions
* [https://www.w3schools.com/js/js_json_parse.asp JSON.parse()] or [http://api.jquery.com/jquery.parsejson/ jQuery.parseJSON() | jQuery API Documentation]
== List of garbled text patterns and possible root causes ==
<table border="1" style="width: 100%; table-layout: fixed;" class="wikitable sortable">
<tr>
<th>Feature</th>
<th>Example</th>
<th>Meaning</th>
<th>Restore to human readable ↔ encode text</th>
</tr>
<tr>
<td>Website address contains {{kbd | key=<nowiki>%2</nowiki>}} symbols</td>
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2F</nowiki>}}</td>
<td>"converts characters into a format that can be transmitted over the Internet ... " Cited from [http://www.w3schools.com/tags/ref_urlencode.asp w3schools]</td>
<td>URL decode ↔ URL encode</td>
</tr>
<tr>
<td>Downloaded JSON or JavaScript file whose content contains {{kbd | key=<nowiki>\u</nowiki>}} symbols</td>
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>\u4f5c</nowiki>}}</td>
<td>"JSON representation of the supplied value"<ref>[http://php.net/manual/en/function.json-encode.php PHP: json_encode - Manual]</ref><ref>[http://www.faqs.org/rfcs/rfc7159.html RFC 7159: The JavaScript Object Notation (JSON) Data Interchange Format]</ref></td>
<td>JSON decode ↔ JSON encode</td>
</tr>
<tr>
<td>String contains {{kbd | key=<nowiki>\x</nowiki>}} symbols</td>
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>b'\xe8\xa8\xb1'</nowiki>}}</td>
<td>"\x is a string escape code, which happens to use hex notation" (hexadecimal notation)<ref>[https://stackoverflow.com/questions/13123877/difference-between-different-hex-types-representations-in-python Difference between different hex types/representations in Python - Stack Overflow]</ref></td>
<td>hexadecimal to text ↔ text to hexadecimal</td>
</tr>
</table>
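The URL row of the table can be reproduced with a short Python sketch. It assumes Python 3 and the standard library module <code>urllib.parse</code>; the helper names <code>url_decode</code> and <code>url_encode</code> are hypothetical and chosen only for illustration.

```python
from urllib.parse import quote, unquote


def url_decode(text: str) -> str:
    """Turn percent-encoded text back into readable characters (UTF-8 assumed)."""
    return unquote(text, encoding="utf-8")


def url_encode(text: str) -> str:
    """Percent-encode reserved and non-ASCII characters, including ':' and '/'."""
    return quote(text, safe="", encoding="utf-8")


encoded = "http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2F"
decoded = url_decode(encoded)
print(decoded)              # http://www.中文網址.tw/
print(url_encode(decoded))  # same text as `encoded`
```

Passing <code>safe=""</code> is a deliberate choice here so that the round trip matches the table example; by default <code>quote</code> leaves <code>/</code> unescaped.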
=== Text starting with the \u symbol ===
Using PHP, where the value is a string:
<pre>
$encoded = json_encode("象");
echo "encoded string: " . $encoded . PHP_EOL; // print "\u8c61"
echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 象
$encoded = json_encode("🐘");
echo "encoded string: " . $encoded . PHP_EOL; // print "\ud83d\udc18"
echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 🐘
</pre>
Using Python, where the value is a string:
<pre>
x = u'象'
x.encode('ascii', 'backslashreplace')  # returns b'\\u8c61'
x = u'🐘'
x.encode('ascii', 'backslashreplace')  # returns b'\\U0001f418'
</pre>
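The Python example above only shows the encoding direction. A minimal sketch of the reverse direction follows, assuming Python 3 and input that contains only ASCII characters plus escape sequences. The helper names <code>decode_json_escapes</code> and <code>decode_python_escapes</code> are hypothetical.

```python
import json


def decode_json_escapes(escaped: str) -> str:
    """Decode JSON-style escapes such as \\u8c61.

    Assumes `escaped` is the body of a JSON string without the surrounding
    double quotes and without unescaped double quotes inside it.
    """
    return json.loads('"' + escaped + '"')


def decode_python_escapes(escaped: bytes) -> str:
    """Decode Python-style escapes such as b'\\\\u8c61' or b'\\\\U0001f418'.

    Assumes the bytes are pure ASCII apart from the escape sequences.
    """
    return escaped.decode("unicode_escape")


print(decode_json_escapes(r"\u8c61"))          # 象
print(decode_json_escapes(r"\ud83d\udc18"))    # 🐘
print(decode_python_escapes(b"\\u8c61"))       # 象
print(decode_python_escapes(b"\\U0001f418"))   # 🐘
```

For JSON output that encodes characters outside the Basic Multilingual Plane as a surrogate pair (for example <code>\ud83d\udc18</code>), <code>json.loads</code> is the better fit because it joins the pair into a single character.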
Using PHP, where the value is an array:
<pre>
$input = <<<EOT
["\u4f5c"]
EOT;
$input = trim($input);
var_dump(json_decode($input, true)); // print array("作")
var_dump(json_encode(array("作"))); // print ["\u4f5c"]
</pre>
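For comparison, the same array round trip can be sketched in Python with the standard <code>json</code> module (Python 3 assumed). The variable names are illustrative only.

```python
import json

# The raw text as it would appear in a downloaded JSON file.
raw = '["\\u4f5c"]'

items = json.loads(raw)
print(items)                                  # ['作']

# By default json.dumps escapes non-ASCII characters again.
print(json.dumps(items))                      # ["\u4f5c"]

# ensure_ascii=False keeps the characters human readable.
print(json.dumps(items, ensure_ascii=False))  # ["作"]
```

Writing with <code>ensure_ascii=False</code> avoids producing <code>\u</code> sequences in the first place; PHP offers a comparable option through the <code>JSON_UNESCAPED_UNICODE</code> flag of <code>json_encode</code>.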
=== Text starting with the \x symbol ===
Using Python<ref>[https://docs.python.org/3/library/stdtypes.html#bytes.decode bytes.decode()]</ref><ref>[https://docs.python.org/3/library/stdtypes.html#str.encode str.encode()]</ref><ref>[https://stackoverflow.com/questions/33294213/how-to-decode-unicode-in-a-chinese-text python - How to decode unicode in a Chinese text - Stack Overflow]</ref>
<pre>
data = u"許"
data  # shows '許' in the interactive interpreter
hex_notation = data.encode('utf-8')
hex_notation  # shows b'\xe8\xa8\xb1'
for each_unicode_character in hex_notation.decode('utf-8'):
    print(each_unicode_character)
data = u"だいじょうぶ"
data  # shows 'だいじょうぶ' in the interactive interpreter
hex_notation = data.encode('utf-8')
hex_notation  # shows b'\xe3\x81\xa0\xe3\x81\x84\xe3\x81\x98\xe3\x82\x87\xe3\x81\x86\xe3\x81\xb6'
for each_unicode_character in hex_notation.decode('utf-8'):
    print(each_unicode_character)
</pre>
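Garbled text often arrives as a plain string that contains literal <code>\x</code> sequences or bare hexadecimal digits, rather than as a real <code>bytes</code> object. The following sketch covers both cases. It assumes Python 3, that the underlying bytes are UTF-8, and that the input holds only ASCII characters plus the escapes. All three helper names are hypothetical.

```python
def text_to_hex(text: str) -> str:
    """Encode text as UTF-8 and return the bytes as hexadecimal digits."""
    return text.encode("utf-8").hex()


def hex_to_text(hex_digits: str) -> str:
    """Turn hexadecimal digits back into text (UTF-8 assumed)."""
    return bytes.fromhex(hex_digits).decode("utf-8")


def unescape_backslash_x(escaped: str) -> str:
    """Decode a string holding literal backslash-x escapes, e.g. r'\\xe8\\xa8\\xb1'."""
    # unicode_escape maps each \xNN to the code point U+00NN;
    # latin-1 then maps those code points back to the raw byte values.
    raw_bytes = escaped.encode("latin-1").decode("unicode_escape").encode("latin-1")
    return raw_bytes.decode("utf-8")


print(text_to_hex("許"))                       # e8a8b1
print(hex_to_text("e8a8b1"))                   # 許
print(unescape_backslash_x(r"\xe8\xa8\xb1"))   # 許
```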


== Ways to fix garbled message text ==
