(18 intermediate revisions by the same user not shown) | |||
Line 75: | Line 75: | ||
* [https://www.w3schools.com/js/js_json_parse.asp JSON.parse()] or [http://api.jquery.com/jquery.parsejson/ jQuery.parseJSON() | jQuery API Documentation] | * [https://www.w3schools.com/js/js_json_parse.asp JSON.parse()] or [http://api.jquery.com/jquery.parsejson/ jQuery.parseJSON() | jQuery API Documentation] | ||
== List of the garbled text and possible cause == | == List of text that looks garbled (but is not) and possible causes == | ||
<table border="1" style="width: 100%; table-layout: fixed;" class="wikitable sortable"> | <table border="1" style="width: 100%; table-layout: fixed;" class="wikitable sortable"> | ||
<tr> | <tr> | ||
Line 84: | Line 84: | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td>Website address contains {{kbd | key=<nowiki>%2</nowiki>}} symbols</td> | <td>Website address contains {{kbd | key=<nowiki>%2</nowiki>}} or {{kbd | key=<nowiki>%20</nowiki>}} symbols</td> | ||
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2F</nowiki>}}</td> | <td style="word-wrap: break-word;">{{kbd | key=<nowiki>http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2F</nowiki>}}</td> | ||
<td>"converts characters into a format that can be transmitted over the Internet ... " Cited from [http://www.w3schools.com/tags/ref_urlencode.asp w3schools]</td> | <td>"converts characters into a format that can be transmitted over the Internet ... " Cited from [http://www.w3schools.com/tags/ref_urlencode.asp w3schools]</td> | ||
Line 90: | Line 90: | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td> | <td>String starting from {{kbd | key=<nowiki>\u</nowiki>}}, {{kbd | key=<nowiki>\U</nowiki>}} or {{kbd | key=<nowiki>U+</nowiki>}} symbols</td> | ||
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>\ | <td style="word-wrap: break-word;">{{kbd | key=<nowiki>\u8c61</nowiki>}}, {{kbd | key=<nowiki>\U0001f418</nowiki>}} or {{kbd | key=<nowiki>U+1F418</nowiki>}}</td> | ||
<td>(1) 16-bit or 32-bit hex value (2) "JSON representation of the supplied value"<ref>[http://php.net/manual/en/function.json-encode.php PHP: json_encode - Manual]</ref><ref>[http://www.faqs.org/rfcs/rfc7159.html RFC 7159: The JavaScript Object Notation (JSON) Data Interchange Format]</ref></td> | <td>Unicode number: "Unicode code point is referred to by writing "U+" followed by its hexadecimal number.<ref>[https://en.wikipedia.org/wiki/Unicode Unicode - Wikipedia]</ref>" (1) 16-bit or 32-bit hex value (2) "JSON representation of the supplied value"<ref>[http://php.net/manual/en/function.json-encode.php PHP: json_encode - Manual]</ref><ref>[http://www.faqs.org/rfcs/rfc7159.html RFC 7159: The JavaScript Object Notation (JSON) Data Interchange Format]</ref></td> | ||
<td>JSON decode ↔ JSON encode</td> | <td>JSON decode ↔ JSON encode</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td>String | <td>String starting from {{kbd | key=<nowiki>0x</nowiki>}} symbols</td> | ||
<td style="word-wrap: break-word;">{{kbd | key=<nowiki> | <td style="word-wrap: break-word;">{{kbd | key=<nowiki>0x8c61</nowiki>}}</td> | ||
<td>hexadecimal string<ref>[https://www.programiz.com/python-programming/methods/built-in/hex Python hex() - Python Standard Library]</ref></td> | |||
<td></td> | |||
</tr> | |||
<tr> | |||
<td>String starting from {{kbd | key=<nowiki>\x</nowiki>}} symbols</td> | |||
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>\xe8\xa8\xb1</nowiki>}}</td> | |||
<td>"\x is a string escape code, which happens to use hex notation" (hexadecimal notation)<ref>[https://stackoverflow.com/questions/13123877/difference-between-different-hex-types-representations-in-python Difference between different hex types/representations in Python - Stack Overflow]</ref></td> | <td>"\x is a string escape code, which happens to use hex notation" (hexadecimal notation)<ref>[https://stackoverflow.com/questions/13123877/difference-between-different-hex-types-representations-in-python Difference between different hex types/representations in Python - Stack Overflow]</ref></td> | ||
<td>hexadecimal to text ↔ text to hexadecimal</td> | <td>hexadecimal to text ↔ text to hexadecimal</td> | ||
</tr> | |||
<tr> | |||
<td>String starting from {{kbd | key=<nowiki>&#</nowiki>}} symbols</td> | |||
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>&#35937;</nowiki>}}</td> | |||
<td>Unicode HTML code. "Unicode number in decimal, hex or octal"<ref>[http://www.amp-what.com/help.html &what Help]</ref></td> | |||
<td></td> | |||
</tr> | </tr> | ||
</table> | </table> | ||
=== | === String contains {{kbd | key=<nowiki>%2</nowiki>}} or {{kbd | key=<nowiki>%20</nowiki>}} symbols === | ||
Using [http://php.net/manual/en/function.urlencode.php PHP: urlencode - Manual] or [https://www.w3schools.com/jsref/jsref_encodeuri.asp JavaScript encodeURI() Function] | |||
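The same round trip can be sketched in Python with the standard library's urllib.parse module. This is a minimal illustration rather than part of the original page: the variable names are made up for the example, and the sample string is the one from the table above.

```python
from urllib.parse import quote, unquote

# Percent-encoded URL taken from the table above.
encoded = "http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2F"

# Decode: each %XX is a byte, and the byte sequence is read as UTF-8.
decoded = unquote(encoded)
print(decoded)  # http://www.中文網址.tw/

# Encode again. safe="" also escapes ":" and "/", which quote() would
# otherwise leave untouched ("/" is in its default safe set).
re_encoded = quote(decoded, safe="")
print(re_encoded == encoded)  # True

# %20 is simply an encoded space.
with_space = unquote("hello%20world")
print(with_space)  # hello world
```

Note that quote() with its default arguments keeps "/" unescaped, so it behaves more like JavaScript's encodeURI() than like PHP's urlencode().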
=== String starting from \u, \U or U+ symbol === | |||
Using PHP. Type is string | Using PHP. Type is string | ||
<pre> | <pre> | ||
$encoded = | $encoded = <<<EOT | ||
"\u8c61" | |||
EOT; | |||
echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 象 | echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 象 | ||
echo "encoded string: " . json_encode("象") . PHP_EOL; // print "\u8c61" | |||
$encoded = <<<EOT | |||
"\ud83d\udc18" | |||
EOT; | |||
echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 🐘 | echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 🐘 | ||
echo "encoded string: " . json_encode("🐘") . PHP_EOL; // print "\ud83d\udc18" | |||
</pre> | |||
Using the PHP 7.0+ [https://wiki.php.net/rfc/unicode_escape Unicode Codepoint Escape Syntax]<ref>[https://secure.php.net/manual/en/migration70.new-features.php#migration70.new-features.unicode-codepoint-escape-syntax PHP: New features - Manual]</ref> | |||
<pre> | |||
echo "\u{8c61}" . PHP_EOL; // print 象 | |||
echo "\u{0001f418}" . PHP_EOL; // print 🐘 | |||
</pre> | </pre> | ||
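For comparison, a Python sketch of the same conversions, assuming the escaped text arrives as a plain string containing literal backslashes. The variable names are illustrative only; json and the unicode_escape codec are part of the standard library.

```python
import json

# \uXXXX inside a JSON string literal: json.loads decodes it, including
# surrogate pairs such as \ud83d\udc18.
bmp_char = json.loads(r'"\u8c61"')
astral_char = json.loads(r'"\ud83d\udc18"')
print(bmp_char, astral_char)  # 象 🐘

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True).
bmp_json = json.dumps("象")
astral_json = json.dumps("🐘")
print(bmp_json, astral_json)  # "\u8c61" "\ud83d\udc18"

# \UXXXXXXXX (8 hex digits) is a Python-style escape, not JSON, so the
# unicode_escape codec is used here instead.
from_big_u = r"\U0001f418".encode("ascii").decode("unicode_escape")
print(from_big_u)  # 🐘

# U+XXXX notation: drop the "U+" prefix and treat the rest as a hex code point.
from_u_plus = chr(int("U+1F418"[2:], 16))
print(from_u_plus)  # 🐘
```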
Line 139: | Line 168: | ||
</pre> | </pre> | ||
=== | === String starting from 0x symbol === | ||
Using Python [https://www.w3schools.com/python/ref_func_chr.asp chr() Function] ↔ [https://www.programiz.com/python-programming/methods/built-in/hex hex() function] | |||
<pre> | |||
int('0x8c61', 16) | |||
# print 35937 -- "An integer representing a valid Unicode code point" cited from w3schools | |||
chr(int('0x8c61', 16)) | |||
# print '象' -- "returns the character that represents the specified unicode." cited from w3schools | |||
hex(ord('象')) | |||
# print '0x8c61' -- "converts an integer number to the corresponding hexadecimal string." cited from programiz.com | |||
chr(int('0x1f418', 16)) | |||
# print '🐘' | |||
hex(ord('🐘')) | |||
# print '0x1f418' | |||
</pre> | |||
=== String starting from \x symbol === | |||
Using Python<ref>[https://docs.python.org/3/library/stdtypes.html#bytes.decode bytes.decode()]</ref><ref>[https://docs.python.org/3/library/stdtypes.html#str.encode str.encode()]</ref><ref>[https://stackoverflow.com/questions/33294213/how-to-decode-unicode-in-a-chinese-text python - How to decode unicode in a Chinese text - Stack Overflow]</ref> | Using Python<ref>[https://docs.python.org/3/library/stdtypes.html#bytes.decode bytes.decode()]</ref><ref>[https://docs.python.org/3/library/stdtypes.html#str.encode str.encode()]</ref><ref>[https://stackoverflow.com/questions/33294213/how-to-decode-unicode-in-a-chinese-text python - How to decode unicode in a Chinese text - Stack Overflow]</ref> | ||
<pre> | <pre> | ||
Line 147: | Line 192: | ||
hex_notation | hex_notation | ||
# print b'\xe8\xb1\xa1' | # print b'\xe8\xb1\xa1' | ||
for each_unicode_character in hex_notation.decode('utf-8'): | |||
print(each_unicode_character) | |||
data = u"🐘" | |||
data | |||
hex_notation = data.encode('utf-8') | |||
hex_notation | |||
# print b'\xf0\x9f\x90\x98' | |||
for each_unicode_character in hex_notation.decode('utf-8'): | for each_unicode_character in hex_notation.decode('utf-8'): | ||
print(each_unicode_character) | print(each_unicode_character) | ||
Line 158: | Line 212: | ||
for each_unicode_character in hex_notation.decode('utf-8'): | for each_unicode_character in hex_notation.decode('utf-8'): | ||
print(each_unicode_character) | print(each_unicode_character) | ||
</pre> | |||
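When the \x sequence arrives as literal text (backslash, "x", two hex digits) rather than as a Python bytes object, it first has to be turned back into bytes. The following is one possible sketch, assuming the escaped bytes are UTF-8; the variable names are hypothetical and the sample value is the one from the table above.

```python
# Literal text as it might appear in a log file or a copied string:
# twelve characters, not three bytes.
raw_text = r"\xe8\xa8\xb1"

# Step 1: interpret the \x escapes. unicode_escape maps \xe8 to the code
# point U+00E8, so the result is a str of three Latin-1 characters.
as_latin1_chars = raw_text.encode("ascii").decode("unicode_escape")

# Step 2: Latin-1 maps code points 0-255 to the same byte values,
# which recovers the original bytes.
raw_bytes = as_latin1_chars.encode("latin-1")
print(raw_bytes)  # b'\xe8\xa8\xb1'

# Step 3: decode the bytes as UTF-8 (an assumption about the source data).
decoded = raw_bytes.decode("utf-8")
print(decoded)  # 許
```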
=== String starting from &# symbols === | |||
Using PHP [https://www.w3schools.com/php/func_string_html_entity_decode.asp html_entity_decode() Function]<ref>[https://blog.longwin.com.tw/2011/06/php-html-unicode-convert-2011/ PHP 將 文字 轉換成 &#xxxxx; UNICODE 碼 | Tsung's Blog]</ref><ref>[http://hinablue.blogspot.com/2008/01/php-tech-unicode-html-convert.html [php tech.] unicode html convert | HINA::工程幼稚園] The Unicode HTML code is obtained by converting the text from its original encoding to UCS-2, taking its binary representation, converting that value from base 16 to base 10, and then prefixing it with &#.</ref> | |||
<pre> | |||
$unicode_html = '&#128024;'; | |||
echo html_entity_decode($unicode_html) . PHP_EOL; // print 🐘 | |||
$unicode_html = '&#128024;'; | |||
echo mb_convert_encoding($unicode_html, 'UTF-8', 'HTML-ENTITIES') . PHP_EOL; // print 🐘 (the 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2) | |||
$input = "🐘"; | |||
$unicode_html = base_convert(bin2hex(mb_convert_encoding($input, 'UTF-32', 'utf-8')), 16, 10); | |||
$unicode_html = '&#' . $unicode_html . ';'; | |||
echo 'unicode_html: ' . $unicode_html . PHP_EOL; // print unicode_html: &#128024; | |||
</pre> | </pre> | ||
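A comparable Python sketch uses html.unescape() from the standard library for decoding and ord() for encoding. The variable names are illustrative, not taken from the original page.

```python
import html

# Decimal and hexadecimal numeric character references both decode
# to the corresponding character.
from_decimal = html.unescape("&#35937;")
from_hex = html.unescape("&#x1F418;")
print(from_decimal, from_hex)  # 象 🐘

# Encoding: Python iterates over a str by code point, so each character
# (including those outside the BMP) becomes exactly one reference.
text = "象🐘"
as_entities = "".join(f"&#{ord(ch)};" for ch in text)
print(as_entities)  # &#35937;&#128024;

# Round trip back to the original text.
print(html.unescape(as_entities) == text)  # True
```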
Line 226: | Line 295: | ||
* [[Batch Process#簡繁體文件轉換 | 簡繁體文件轉換]] | * [[Batch Process#簡繁體文件轉換 | 簡繁體文件轉換]] | ||
* [http://en.wikipedia.org/wiki/Character_encoding Character encoding - Wikipedia, the free encyclopedia] | * [http://en.wikipedia.org/wiki/Character_encoding Character encoding - Wikipedia, the free encyclopedia] | ||
* [https://pjchender.blogspot.com/2018/06/guide-unicode-javascript.html (Guide) 瞭解網頁中看不懂的編碼:Unicode 在 JavaScript 中的使用 ~ PJCHENder 那些沒告訴你的小細節] | |||
* [[Regular extract url from text]] | * [[Regular extract url from text]] | ||
* [[URL Encoding]] | * [[URL Encoding]] | ||
Unicode table | |||
* [https://unicode-table.com/en/ Unicode® Character Table] | |||
* [http://www.amp-what.com/ &what: Discover Unicode & HTML Character Entities] | |||
* [https://www.toptal.com/designers/htmlarrows/ HTML Symbols, Entities, Characters and Codes — HTML Arrows] | |||
* [https://www.utf8-chartable.de/unicode-utf8-table.pl?utf8=0x Unicode/UTF-8-character table] | |||
== References == | == References == |