Fix garbled message text: Difference between revisions

From LemonWiki共筆
(40 intermediate revisions by one other user not shown)
Line 7: Line 7:




<table border="1">
Possible approaches to encode the message text:
<table border="1" style="width: 100%; table-layout: fixed;" class="wikitable sortable">
<tr>
<tr>
<th width="40%"> Case
<th style="width: 20%;"> Approach
</th>
</th>
<th width="40%"> Is Chinese text garbled/encoded?
<th style="width: 25%"> Goal
</th>
</th>
<th width="20%"> Sample text
<th style="width: 20%"> Is Chinese text garbled/encoded?
</th>
<th style="width: 35%;"> Sample text before and after encoding
</th>
</th>
</tr>
</tr>
<tr>
<tr>
<th> [http://www.w3schools.com/jsref/jsref_decodeuricomponent.asp JavaScript decodeURIComponent() Function]<ref>[http://stackoverflow.com/questions/9901027/how-to-encode-url-contains-unicode-characters-with-php urlencode - How to Encode URL Contains Unicode Characters with PHP - Stack Overflow]</ref>
<th> [https://www.w3schools.com/jsref/jsref_encodeURIComponent.asp JavaScript encodeURIComponent()] <br />↔<br /> [http://www.w3schools.com/jsref/jsref_decodeuricomponent.asp JavaScript decodeURIComponent()]<ref>[http://stackoverflow.com/questions/9901027/how-to-encode-url-contains-unicode-characters-with-php urlencode - How to Encode URL Contains Unicode Characters with PHP - Stack Overflow]</ref>
</th>
</th>
<td>  "converts characters into a format that can be transmitted over the Internet ... " Cited from [http://www.w3schools.com/tags/ref_urlencode.asp w3schools]
</td>
<td> TRUE
<td> TRUE
</td>
</td>
<td> <ul><li>before: http://www.中文網址.tw/my test.asp?name=ståle&car=saab <li>after: http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2Fmy%20test.asp%3Fname%3Dst%C3%A5le%26car%3Dsaab </ul>
<td style="word-wrap: break-word;"> <ul><li>before: {{kbd | key=<nowiki>http://www.中文網址.tw/my test.asp?name=ståle&car=saab</nowiki>}} <li>after: {{kbd | key=<nowiki>http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2Fmy%20test.asp%3Fname%3Dst%C3%A5le%26car%3Dsaab</nowiki>}} </ul>
 
</td>
</td>
</tr>
</tr>
Line 28: Line 32:
<th> [http://meyerweb.com/eric/tools/dencoder/ URL Decoder/Encoder]<ref>PHP [http://php.net/manual/en/function.urlencode.php urlencode()]</ref>
<th> [http://meyerweb.com/eric/tools/dencoder/ URL Decoder/Encoder]<ref>PHP [http://php.net/manual/en/function.urlencode.php urlencode()]</ref>
</th>
</th>
<td> (same as above)
</td>
<td> TRUE
<td> TRUE
</td>
</td>
<td> <ul><li>before: http://www.中文網址.tw <li>after: http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw</ul>
<td style="word-wrap: break-word;"> (same as above)
</td>
</td>
</tr>
</tr>
<tr>
<tr>
<th> [http://php.net/serialize PHP: serialize - Manual]
<th> [http://php.net/manual/en/function.json-encode.php PHP: json_encode]<br />↔<br />[http://php.net/manual/en/function.json-decode.php PHP: json_decode]
</th>
</th>
<td> FALSE
<td>Save array in mysql database
</td>
<td> TRUE
</td>
</td>
<td> <ul><li>before: array("作者" => "馬克吐溫"); <li>after: a:1:{s:6:"作者";s:12:"馬克吐溫";}</ul>
<td style="word-wrap: break-word;"> <ul><li>before: {{kbd | key=<nowiki>array("作者" => "馬克吐溫", "名言" => "\"To a man with a hammer, everything looks like a nail.\" He said.");</nowiki>}} <li>after: {{kbd | key=<nowiki>{"\u4f5c\u8005":"\u99ac\u514b\u5410\u6eab","\u540d\u8a00":"\"To a man with a hammer, everything looks like a nail.\" He said."}</nowiki>}}</ul>
</td>
</tr>
<tr>
<th> [http://php.net/serialize PHP: serialize] <br />↔<br /> [http://php.net/manual/en/function.unserialize.php PHP: unserialize]
</th>
<td>[http://stackoverflow.com/questions/10686333/save-array-in-mysql-database Save array in mysql database]
</td>
<td> <span style="color: #999">FALSE</span>
</td>
<td style="word-wrap: break-word;"> <ul><li>before: {{kbd | key=<nowiki>array("作者" => "馬克吐溫", "名言" => "\"To a man with a hammer, everything looks like a nail.\" He said.");</nowiki>}} <li>after: {{kbd | key=<nowiki>a:2:{s:6:"作者";s:12:"馬克吐溫";s:6:"名言";s:64:""To a man with a hammer, everything looks like a nail." He said.";}</nowiki>}}</ul>
</td>
</tr>
<tr>
<th> [http://php.net/manual/en/function.htmlentities.php PHP: htmlentities][http://www.w3schools.com/html/html_entities.asp] <br />↔<br /> [http://php.net/manual/en/function.html-entity-decode.php PHP:  html_entity_decode]
</th>
<td> Replace reserved characters e.g. double quote symbol
</td>
<td> <span style="color: #999">FALSE</span>
</td>
<td style="word-wrap: break-word;"> <ul><li>before: {{kbd | key=<nowiki>馬克吐溫名言 "To a man with a hammer, everything looks like a nail."</nowiki>}} <li>after: {{kbd | key=<nowiki>
馬克吐溫名言 &amp;quot;To a man with a hammer, everything looks like a nail.&amp;quot;</nowiki>}}</ul>
</td>
</td>
</tr>
</tr>
</table>
</table>
Other functions
* [https://www.w3schools.com/js/js_json_parse.asp JSON.parse()] or [http://api.jquery.com/jquery.parsejson/ jQuery.parseJSON() | jQuery API Documentation]
== List of text that looks garbled (but is not) and possible causes ==
<table border="1" style="width: 100%; table-layout: fixed;" class="wikitable sortable">
<tr>
<th>Feature</th>
<th>Example</th>
<th>Meaning</th>
<th>Restore to human readable ↔ encode text</th>
</tr>
<tr>
<td>Website address contains {{kbd | key=<nowiki>%2</nowiki>}} or {{kbd | key=<nowiki>%20</nowiki>}} symbols</td>
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2F</nowiki>}}</td>
<td>"converts characters into a format that can be transmitted over the Internet ... " Cited from [http://www.w3schools.com/tags/ref_urlencode.asp w3schools]</td>
<td>URL decode ↔ URL encode</td>
</tr>
<tr>
<td>String starting from {{kbd | key=<nowiki>\u</nowiki>}}, {{kbd | key=<nowiki>\U</nowiki>}} or {{kbd | key=<nowiki>U+</nowiki>}} symbols</td>
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>\u8c61</nowiki>}}, {{kbd | key=<nowiki>\U0001f418</nowiki>}} or {{kbd | key=<nowiki>U+1F418</nowiki>}}</td>
<td>Unicode number: "Unicode code point is referred to by writing "U+" followed by its hexadecimal number.<ref>[https://en.wikipedia.org/wiki/Unicode Unicode - Wikipedia]</ref>" (1) 16-bit or 32-bit hex value (2) "JSON representation of the supplied value"<ref>[http://php.net/manual/en/function.json-encode.php PHP: json_encode - Manual]</ref><ref>[http://www.faqs.org/rfcs/rfc7159.html RFC 7159: The JavaScript Object Notation (JSON) Data Interchange Format]</ref></td>
<td>JSON decode ↔ JSON encode</td>
</tr>
<tr>
<td>String starting from {{kbd | key=<nowiki>0x</nowiki>}} symbols</td>
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>0x8c61</nowiki>}}</td>
<td>hexadecimal string<ref>[https://www.programiz.com/python-programming/methods/built-in/hex Python hex() - Python Standard Library]</ref></td>
<td></td>
</tr>
<tr>
<td>String starting from {{kbd | key=<nowiki>\x</nowiki>}} symbols</td>
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>\xe8\xa8\xb1</nowiki>}}</td>
<td>"\x is a string escape code, which happens to use hex notation" (hexadecimal notation)<ref>[https://stackoverflow.com/questions/13123877/difference-between-different-hex-types-representations-in-python Difference between different hex types/representations in Python - Stack Overflow]</ref></td>
<td>hexadecimal to text ↔ text to hexadecimal</td>
</tr>
<tr>
<td>String starting from {{kbd | key=<nowiki>&#</nowiki>}} symbols</td>
<td style="word-wrap: break-word;">{{kbd | key=<nowiki>&amp;#35937;</nowiki>}}</td>
<td>Unicode HTML code. "Unicode number in decimal, hex or octal"<ref>[http://www.amp-what.com/help.html &what Help]</ref></td>
<td></td>
</tr>
</table>
=== String contains {{kbd | key=<nowiki>%2</nowiki>}} or {{kbd | key=<nowiki>%20</nowiki>}} symbols ===
Using [http://php.net/manual/en/function.urlencode.php PHP: urlencode - Manual] or [https://www.w3schools.com/jsref/jsref_encodeuri.asp JavaScript encodeURI() Function]
=== String starting from \u, \U or U+ symbol ===
Using PHP. Type is string
<pre>
$encoded = <<<EOT
"\u8c61"
EOT;
echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 象
echo "encoded string: " . json_encode("象") . PHP_EOL; // print "\u8c61"
$encoded = <<<EOT
"\ud83d\udc18"
EOT;
echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 🐘
echo "encoded string: " . json_encode("🐘") . PHP_EOL; // print "\ud83d\udc18"
</pre>
Using PHP v. 7.0 [https://wiki.php.net/rfc/unicode_escape Unicode Codepoint Escape Syntax]<ref>[https://secure.php.net/manual/en/migration70.new-features.php#migration70.new-features.unicode-codepoint-escape-syntax PHP: New features - Manual]</ref>
<pre>
echo "\u{8c61}" . PHP_EOL; // print 象
echo "\u{0001f418}" . PHP_EOL; // print 🐘
</pre>
Using Python. Type is string
<pre>
x = u'象'
x.encode('ascii', 'backslashreplace')
# print b'\\u8c61'
x = u'🐘'
x.encode('ascii', 'backslashreplace')
# print b'\\U0001f418'
</pre>
Using PHP. Type is array
<pre>
$input = <<<EOT
["\u8c61"]
EOT;
$input = trim($input);
var_dump(json_decode($input, true)); // print array("象")
var_dump(json_encode(array("象"))); // print ["\u8c61"]
</pre>
=== String starting from 0x symbol ===
Using Python [https://www.w3schools.com/python/ref_func_chr.asp chr() Function] ↔ [https://www.programiz.com/python-programming/methods/built-in/hex hex() function]
<pre>
int('0x8c61', 16)
# print 35937 -- "An integer representing a valid Unicode code point" cited from w3schools
chr(int('0x8c61', 16))
# print '象' -- "returns the character that represents the specified unicode." cited from w3schools
hex(ord('象'))
# print '0x8c61' -- "converts an integer number to the corresponding hexadecimal string." cited from programiz.com
chr(int('0x1f418', 16))
# print '🐘'
hex(ord('🐘'))
# print '0x1f418'
</pre>
=== String starting from \x symbol ===
Using Python<ref>[https://docs.python.org/3/library/stdtypes.html#bytes.decode bytes.decode()]</ref><ref>[https://docs.python.org/3/library/stdtypes.html#str.encode str.encode()]</ref><ref>[https://stackoverflow.com/questions/33294213/how-to-decode-unicode-in-a-chinese-text python - How to decode unicode in a Chinese text - Stack Overflow]</ref>
<pre>
data = u"象"
data
hex_notation = data.encode('utf-8')
hex_notation
# print b'\xe8\xb1\xa1'
for each_unicode_character in hex_notation.decode('utf-8'):
    print(each_unicode_character)
data = u"🐘"
data
hex_notation = data.encode('utf-8')
hex_notation
# print b'\xf0\x9f\x90\x98'
for each_unicode_character in hex_notation.decode('utf-8'):
    print(each_unicode_character)
data = u"だいじょうぶ"
data
hex_notation = data.encode('utf-8')
hex_notation
# print b'\xe3\x81\xa0\xe3\x81\x84\xe3\x81\x98\xe3\x82\x87\xe3\x81\x86\xe3\x81\xb6'
for each_unicode_character in hex_notation.decode('utf-8'):
    print(each_unicode_character)
</pre>
=== String starting from &# symbols ===
Using PHP [https://www.w3schools.com/php/func_string_html_entity_decode.asp html_entity_decode() Function]<ref>[https://blog.longwin.com.tw/2011/06/php-html-unicode-convert-2011/ PHP 將 文字 轉換成 &#xxxxx; UNICODE 碼 | Tsung's Blog]</ref><ref>[http://hinablue.blogspot.com/2008/01/php-tech-unicode-html-convert.html [php tech.] unicode html convert | HINA::工程幼稚園] The Unicode HTML code is obtained by converting the text from its original encoding to UCS-2, taking the binary value, converting it from hexadecimal to decimal, and then prefixing it with &#.</ref>
<pre>
$unicode_html = '&amp;#128024;';
echo html_entity_decode($unicode_html) . PHP_EOL; // print 🐘
$unicode_html = '&amp;#128024;';
echo mb_convert_encoding($unicode_html, 'UTF-8', 'HTML-ENTITIES') . PHP_EOL; // print 🐘
$input = "🐘";
$unicode_html = base_convert(bin2hex(mb_convert_encoding($input, 'UTF-32', 'utf-8')), 16, 10);
$unicode_html = '&#' . $unicode_html . ';';
echo 'unicode_html: ' . $unicode_html . PHP_EOL; // print &amp;#128024;
</pre>


== Ways to fix garbled message text ==
== Ways to fix garbled message text ==
Line 70: Line 256:


=== Microsoft notepad (記事本) for Windows ===
=== Microsoft notepad (記事本) for Windows ===
method 1: [http://errerrors.blogspot.com/2010/11/notepadtxt.html Err: 解決用記事本(notepad)開啟簡體字txt檔,出現亂碼的問題](2010):  notepad + [http://notepad-plus-plus.org/ Notepad++ ]
method 1: [https://errerrors.blogspot.com/2010/11/notepadtxt.html Err: 解決用記事本(notepad)開啟簡體字txt檔,出現亂碼的問題](2010):  notepad + [http://notepad-plus-plus.org/ Notepad++ ]
* choose encode: manually
* choose encode: manually
* convert to UTF-8: available by Notepad++
* convert to UTF-8: available by Notepad++
Line 106: Line 292:
* [http://www.openoffice.org/ OpenOffice.org] 3.3.0 - Writer is not supported but OpenOffice.org Calc is supported.
* [http://www.openoffice.org/ OpenOffice.org] 3.3.0 - Writer is not supported but OpenOffice.org Calc is supported.


== further reading ==
== Further reading ==
* [[Batch Process#簡繁體文件轉換 | 簡繁體文件轉換]]
* [[Batch Process#簡繁體文件轉換 | 簡繁體文件轉換]]
* [http://en.wikipedia.org/wiki/Character_encoding Character encoding - Wikipedia, the free encyclopedia]
* [http://en.wikipedia.org/wiki/Character_encoding Character encoding - Wikipedia, the free encyclopedia]
* [https://pjchender.blogspot.com/2018/06/guide-unicode-javascript.html (Guide) 瞭解網頁中看不懂的編碼:Unicode 在 JavaScript 中的使用 ~ PJCHENder 那些沒告訴你的小細節]
* [[URL Encoding]]
Unicode table
* [https://unicode-table.com/en/ Unicode® Character Table]
* [http://www.amp-what.com/ &what: Discover Unicode & HTML Character Entities]
* [https://www.toptal.com/designers/htmlarrows/ HTML Symbols, Entities, Characters and Codes — HTML Arrows]
* [https://www.utf8-chartable.de/unicode-utf8-table.pl?utf8=0x Unicode/UTF-8-character table]
== References ==


<references />


[[Category:Software]]
[[Category:Software]]
[[Category:Data Science]]
[[Category:Data Science]]
[[Category:Text file processing]]
[[Category:Text file processing]]
[[Category:Programming]]

Revision as of 10:49, 5 June 2020

Ideas on how to fix garbled message text

  1. Possible cause
    • Encoding issue: choose the correct language/encoding for the message text, or auto-detect the encoding with a tool
    • PHP utf8_encode() & utf8_decode()
  2. (optional) convert the current encoding to UTF-8
  3. (optional) wrap the text to the window size
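
As a rough illustration of steps 1 and 2 above, the following Python sketch tries a few candidate encodings on raw bytes and then re-encodes the result as UTF-8. The helper name `decode_with_candidates` and the candidate list are assumptions made for this example and do not come from any tool listed on this page. A wrong candidate can sometimes decode without raising an error and still produce the wrong text, so the order of the candidates matters. The last lines show one common kind of garbling (UTF-8 bytes misread as ISO-8859-1) and how it might be reversed.

```python
def decode_with_candidates(raw, candidates=("utf-8", "big5", "gbk")):
    """Return (encoding, text) for the first candidate that decodes cleanly.

    Hypothetical helper: the candidate list is only an example.
    """
    for name in candidates:
        try:
            return name, raw.decode(name)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


# Bytes as they might arrive from a legacy Big5 text file.
raw = "中文網址".encode("big5")
encoding, text = decode_with_candidates(raw)

# Step 2: convert the text to UTF-8 bytes.
utf8_bytes = text.encode("utf-8")
print(encoding, text)

# UTF-8 bytes misread as ISO-8859-1 (Latin-1), and the reverse repair.
garbled = "ståle".encode("utf-8").decode("latin-1")
repaired = garbled.encode("latin-1").decode("utf-8")
print(garbled, "->", repaired)
```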


Possible approaches to encode the message text:

Approach Goal Is Chinese text garbled/encoded? Sample text before and after encoding
JavaScript encodeURIComponent()

JavaScript decodeURIComponent()[1]
"converts characters into a format that can be transmitted over the Internet ... " Cited from w3schools TRUE
  • before: http://www.中文網址.tw/my test.asp?name=ståle&car=saab
  • after: http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2Fmy%20test.asp%3Fname%3Dst%C3%A5le%26car%3Dsaab
URL Decoder/Encoder[2] (same as above) TRUE (same as above)
PHP: json_encode

PHP: json_decode
Save array in mysql database TRUE
  • before: array("作者" => "馬克吐溫", "名言" => "\"To a man with a hammer, everything looks like a nail.\" He said.");
  • after: {"\u4f5c\u8005":"\u99ac\u514b\u5410\u6eab","\u540d\u8a00":"\"To a man with a hammer, everything looks like a nail.\" He said."}
PHP: serialize

PHP: unserialize
Save array in mysql database FALSE
  • before: array("作者" => "馬克吐溫", "名言" => "\"To a man with a hammer, everything looks like a nail.\" He said.");
  • after: a:2:{s:6:"作者";s:12:"馬克吐溫";s:6:"名言";s:64:""To a man with a hammer, everything looks like a nail." He said.";}
PHP: htmlentities[1]

PHP: html_entity_decode
Replace reserved characters e.g. double quote symbol FALSE
  • before: 馬克吐溫名言 "To a man with a hammer, everything looks like a nail."
  • after: 馬克吐溫名言 &quot;To a man with a hammer, everything looks like a nail.&quot;
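
The difference between the TRUE and FALSE rows can be reproduced with a short Python sketch. It uses Python's standard `json` and `html` modules as stand-ins for the PHP functions in the table, which is an assumption of this example; PHP's serialize format has no direct counterpart here and is left out.

```python
import html
import json

record = {"作者": "馬克吐溫"}

# Default behaviour: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(record)
# With ensure_ascii=False the Chinese text stays readable.
readable = json.dumps(record, ensure_ascii=False)
print(escaped)
print(readable)

# Both forms decode back to the same data.
print(json.loads(escaped) == json.loads(readable) == record)

# HTML escaping replaces reserved characters only; Chinese text is untouched.
quote = '馬克吐溫名言 "To a man with a hammer, everything looks like a nail."'
escaped_quote = html.escape(quote)
print(escaped_quote)
```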

Other functions

List of text that looks garbled (but is not) and possible causes

Feature Example Meaning Restore to human readable ↔ encode text
Website address contains %2 or %20 symbols http%3A%2F%2Fwww.%E4%B8%AD%E6%96%87%E7%B6%B2%E5%9D%80.tw%2F "converts characters into a format that can be transmitted over the Internet ... " Cited from w3schools URL decode ↔ URL encode
String starting from \u, \U or U+ symbols \u8c61, \U0001f418 or U+1F418 Unicode number: "Unicode code point is referred to by writing "U+" followed by its hexadecimal number.[3]" (1) 16-bit or 32-bit hex value (2) "JSON representation of the supplied value"[4][5] JSON decode ↔ JSON encode
String starting from 0x symbols 0x8c61 hexadecimal string[6]
String starting from \x symbols \xe8\xa8\xb1 "\x is a string escape code, which happens to use hex notation" (hexadecimal notation)[7] hexadecimal to text ↔ text to hexadecimal
String starting from &# symbols &#35937; Unicode HTML code. "Unicode number in decimal, hex or octal"[8]

String contains %2 or %20 symbols

Using PHP: urlencode - Manual or JavaScript encodeURI() Function
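
For comparison, a minimal Python sketch of the same round trip, using `urllib.parse` from the standard library. Passing `safe=""` is a choice made for this example so that `/`, `:` and `?` are escaped too, which roughly matches the encodeURIComponent() sample in the table above; with the default `safe="/"` the slashes would be left alone.

```python
from urllib.parse import quote, unquote

url = "http://www.中文網址.tw/my test.asp?name=ståle&car=saab"

# Percent-encode every character that is not an unreserved URL character.
encoded = quote(url, safe="")
# Restore the human-readable form.
decoded = unquote(encoded)

print(encoded)
print(decoded == url)
```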

String starting from \u, \U or U+ symbol

Using PHP. Type is string

$encoded = <<<EOT

"\u8c61"

EOT;
echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 象
echo "encoded string: " . json_encode("象") . PHP_EOL; // print "\u8c61"

$encoded = <<<EOT

"\ud83d\udc18"

EOT;
echo "decoded string: " . json_decode($encoded, true) . PHP_EOL; // print 🐘
echo "encoded string: " . json_encode("🐘") . PHP_EOL; // print "\ud83d\udc18"

Using PHP v. 7.0 Unicode Codepoint Escape Syntax[9]

echo "\u{8c61}" . PHP_EOL; // print 象
echo "\u{0001f418}" . PHP_EOL; // print 🐘

Using Python. Type is string

x = u'象'
x.encode('ascii', 'backslashreplace') 
# print b'\\u8c61'

x = u'🐘'
x.encode('ascii', 'backslashreplace') 
# print b'\\U0001f418'
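
The Python example above only shows the encoding direction. Going the other way might look like the sketch below, which assumes Python 3 and uses only the standard library; the variable names are illustrative.

```python
import json

# Backslash-escaped ASCII bytes back to text.
han = b"\\u8c61".decode("unicode_escape")
emoji = b"\\U0001f418".decode("unicode_escape")
print(han, emoji)

# JSON writes characters outside the Basic Multilingual Plane as a
# surrogate pair; json.loads joins the pair into one character.
from_json = json.loads('"\\ud83d\\udc18"')
print(from_json)

# json.dumps escapes non-ASCII characters by default.
to_json = json.dumps("🐘")
print(to_json)
```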

Using PHP. Type is array

$input = <<<EOT

["\u8c61"]

EOT;

$input = trim($input);
var_dump(json_decode($input, true)); // print array("象")
var_dump(json_encode(array("象"))); // print ["\u8c61"]
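
The examples in this section cover the \u and \U forms but not the U+ form. A small Python sketch for that notation follows; the function names `from_u_plus` and `to_u_plus` are made up for this example.

```python
def from_u_plus(notation):
    """Convert a string such as 'U+1F418' to the character it names."""
    if not notation.upper().startswith("U+"):
        raise ValueError("expected a string starting with 'U+'")
    return chr(int(notation[2:], 16))


def to_u_plus(char):
    """Convert a single character to 'U+' notation with at least 4 hex digits."""
    return "U+{:04X}".format(ord(char))


print(from_u_plus("U+8C61"), from_u_plus("U+1F418"))
print(to_u_plus("象"), to_u_plus("🐘"))
```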

String starting from 0x symbol

Using Python chr() Function ↔ hex() function

int('0x8c61', 16)
# print 35937 -- "An integer representing a valid Unicode code point" cited from w3schools
chr(int('0x8c61', 16))
# print '象' -- "returns the character that represents the specified unicode." cited from w3schools
hex(ord('象'))
# print '0x8c61' -- "converts an integer number to the corresponding hexadecimal string." cited from programiz.com

chr(int('0x1f418', 16))
# print '🐘'
hex(ord('🐘'))
# print '0x1f418'

String starting from \x symbol

Using Python[10][11][12]

data = u"象"
data
hex_notation = data.encode('utf-8')
hex_notation
# print b'\xe8\xb1\xa1'
for each_unicode_character in hex_notation.decode('utf-8'):
    print(each_unicode_character)


data = u"🐘"
data
hex_notation = data.encode('utf-8')
hex_notation
# print b'\xf0\x9f\x90\x98'
for each_unicode_character in hex_notation.decode('utf-8'):
    print(each_unicode_character)


data = u"だいじょうぶ"
data
hex_notation = data.encode('utf-8')
hex_notation 
# print b'\xe3\x81\xa0\xe3\x81\x84\xe3\x81\x98\xe3\x82\x87\xe3\x81\x86\xe3\x81\xb6'
for each_unicode_character in hex_notation.decode('utf-8'):
    print(each_unicode_character)
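
The examples above start from a Python bytes literal. When the \x sequence arrives as ordinary text instead (for example copied from a log file), one possible approach is sketched below. It assumes the escaped bytes are UTF-8, which is an assumption of this example rather than something the text itself can tell you.

```python
# Twelve characters of text (backslash, x, two hex digits, three times),
# not three bytes.
literal = r"\xe8\xb1\xa1"

# Interpret the escapes, then map each resulting character back to one byte.
raw = literal.encode("ascii").decode("unicode_escape").encode("latin-1")
text = raw.decode("utf-8")
print(raw, text)

# The example from the table decodes the same way.
table_example = (
    r"\xe8\xa8\xb1".encode("ascii")
    .decode("unicode_escape")
    .encode("latin-1")
    .decode("utf-8")
)
print(table_example)

# Plain hexadecimal text without the \x prefixes.
hex_text = "象".encode("utf-8").hex()
back = bytes.fromhex(hex_text).decode("utf-8")
print(hex_text, back)
```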

String starting from &# symbols

Using PHP html_entity_decode() Function[13][14]

$unicode_html = '&#128024;';
echo html_entity_decode($unicode_html) . PHP_EOL; // print 🐘

$unicode_html = '&#128024;';
echo mb_convert_encoding($unicode_html, 'UTF-8', 'HTML-ENTITIES') . PHP_EOL; // print 🐘

$input = "🐘";
$unicode_html = base_convert(bin2hex(mb_convert_encoding($input, 'UTF-32', 'utf-8')), 16, 10);
$unicode_html = '&#' . $unicode_html . ';';
echo 'unicode_html: ' . $unicode_html . PHP_EOL; // print &#128024;
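
A Python counterpart, as a minimal sketch using only the standard library: `html.unescape` handles both decimal and hexadecimal numeric character references, and the `xmlcharrefreplace` error handler produces the decimal form when encoding to ASCII.

```python
import html

# Decimal and hexadecimal numeric character references to text.
emoji = html.unescape("&#128024;")
han = html.unescape("&#x8c61;")
print(emoji, han)

# Text to decimal numeric character references.
emoji_ref = "🐘".encode("ascii", "xmlcharrefreplace").decode("ascii")
han_ref = "象".encode("ascii", "xmlcharrefreplace").decode("ascii")
print(emoji_ref, han_ref)
```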

Ways to fix garbled message text

ConvertZ v.8.02

  • choose encode: manually (mainly Asian languages)
  • convert to UTF-8: available
  • convert to Big5 from UTF-8: available; note that the software may change the wording, e.g. 余美人 -> 於美人
  • allow to wrap long text: available

EmEditor v.14.3.1 ($)

Google Chrome v.10 (viewer)

  • choose encode: manually and auto-detect
  • allow to wrap long text: available (auto)

MadEdit v.0.2.9.1

  • choose encode: manually and auto-detect
  • convert to UTF-8: available
  • allow to wrap long text: available

Microsoft Internet Explorer v.8 (viewer)

  • choose encode: manually and auto-detect
  • allow to wrap long text:

Microsoft notepad (記事本) for Windows

method 1: Err: 解決用記事本(notepad)開啟簡體字txt檔,出現亂碼的問題(2010): notepad + Notepad++

  • choose encode: manually
  • convert to UTF-8: available by Notepad++
  • allow to wrap long text: available


method 2: Microsoft AppLocale 公用程式(patched: piaip pAppLocale) + notepad

  • choose encode: manually
  • convert to UTF-8: not available
  • allow to wrap long text: available

Microsoft Office Word 2003 ($)

  • choose encode: manually
  • convert to UTF-8: available
  • allow to wrap long text: available

Mozilla Firefox v.3.6 (viewer)

javascript:(function() { var D = document; F(D.body); function F(n) { var u, r, c, x; if (n.nodeType == 3) { u = n.data.search(/\S{45}/); if (u >= 0) { r = n.splitText(u + 45); n.parentNode.insertBefore(D.createElement('wbr'), r); } } else if ((n.tagName != 'STYLE') && (n.tagName != 'SCRIPT')) { for (c = 0; x = n.childNodes[c]; ++c) { F(x); } } } D.body.innerHTML += ' '; })();


Notepad++ v.5.8

  • choose encode: manually
  • convert to UTF-8: available
  • allow to wrap long text: available


not supported at this moment

Further reading

Unicode table

References