Byte order mark
Byte order mark (BOM; Chinese: 位元組順序記號; some editors call it a "signature")
How to see the byte order mark
MySQL way
The MySQL HEX() function "returns a string representation of a hexadecimal value of a decimal or string value specified as an argument", so a UTF-8 BOM hidden at the start of a value shows up as the leading bytes EFBBBF.
Run the SQL on sqlfiddle.com or download the SQL file directly.
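CREATE TABLE `articles` (
  `id` varchar(50) NOT NULL,
  `notes` text NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- The second id starts with the UTF-8 BOM bytes EF BB BF. They are written
-- out explicitly here because the character is invisible in most editors.
INSERT INTO `articles` (`id`, `notes`) VALUES
  ('1234567890', 'UTF-8 without BOM'),
  (CONCAT(CONVERT(UNHEX('EFBBBF') USING utf8), '1234567890'), 'UTF-8 with BOM');

ALTER TABLE `articles` ADD UNIQUE KEY `id` (`id`) USING BTREE;

-- Compare the raw bytes of each id with the value as displayed.
SELECT HEX(`id`), `id`, `notes` FROM `articles`;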
HEX(id) | id | notes |
---|---|---|
31323334353637383930 | 1234567890 | UTF-8 without BOM |
EFBBBF31323334353637383930 | 1234567890 | UTF-8 with BOM |
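If the `id` column is supposed to contain only digits, the following SQL query finds the records that contain a BOM. It works because the ASCII digits 0-9 encode as hex 30-39, whereas the BOM bytes EF BB BF put letters into the HEX() output: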
SELECT * FROM `articles` WHERE HEX(`id`) REGEXP '[^0-9]+'
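The same check can be made outside the database. The following is a minimal Python sketch, assuming the ids have already been fetched as strings; the `rows` list and the helper names `to_hex` and `find_bom_ids` are hypothetical and not part of the original page. It hex-encodes each value roughly the way HEX() does and flags values that start with the UTF-8 BOM bytes.

```python
import codecs


def to_hex(value: str) -> str:
    """Return the uppercase hex of the UTF-8 bytes, similar to MySQL HEX()."""
    return value.encode("utf-8").hex().upper()


def find_bom_ids(rows):
    """Return the ids whose UTF-8 encoding starts with the bytes EF BB BF."""
    return [r for r in rows if r.encode("utf-8").startswith(codecs.BOM_UTF8)]


# Hypothetical sample data mirroring the two rows in the table above.
rows = ["1234567890", "\ufeff1234567890"]

for r in rows:
    print(to_hex(r))

print(len(find_bom_ids(rows)), "id(s) start with a BOM")
```

Unlike the REGEXP heuristic, `startswith` only matches a BOM at the very beginning of the value, so ids that legitimately contain other non-digit characters are not flagged.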
PHP way
// Convert a string to the hex representation of its bytes.
function str2hex($string) {
    $hexstr = unpack('H*', $string);
    return array_shift($hexstr);
}

$string = "1234567890";
echo $string . " NOT contains BOM --> after str2hex: " . str2hex($string) . PHP_EOL;

// Prepend the UTF-8 BOM bytes EF BB BF.
$string = "\xEF\xBB\xBF" . "1234567890";
echo $string . " contains BOM --> after str2hex: " . str2hex($string) . PHP_EOL;
Result:
1234567890 NOT contains BOM --> after str2hex: 31323334353637383930
1234567890 contains BOM --> after str2hex: efbbbf31323334353637383930
Excel / Google sheet way
The CODE function returns the "numeric code for the first character in a text string". If cell A1 begins with a BOM:
- =CODE(A1) returns 63 in Excel 2016 for Windows [4]
- =CODE(A1) returns 95 in Excel 2016 for Mac
- =CODE(A1) returns 65279, or another numeric value such as 28201, in Google Sheets
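Two of these numbers can be traced back to the BOM itself. 65279 is the decimal form of the BOM code point U+FEFF, and 63 is the code of "?", which is what a lossy conversion to a legacy code page typically substitutes for a character it cannot represent. That second explanation is an assumption about Excel's behavior, not something the source states; the Python sketch below only imitates it with the `cp1252` codec.

```python
bom = "\ufeff"

# Code point of the BOM character, matching the Google Sheets result.
print(ord(bom), hex(ord(bom)))

# Assumption: Windows-1252 has no BOM character, so a lossy conversion
# substitutes "?", whose code is 63.
replaced = bom.encode("cp1252", errors="replace")
print(replaced, replaced[0])

# 95, the value seen on Mac, is the code of the underscore character.
print(chr(95))
```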
File command
Run the file command, e.g. file filename.txt, on Linux, Mac [5], or Cygwin on Windows. See Text file encoding for details.
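Where the file command is not available, the first bytes of the file can be compared against the known BOM signatures. This is a minimal Python sketch using the standard `codecs` constants; the function name `detect_bom` and the `BOMS` table are illustrative assumptions, and the result is not meant to reproduce the exact output of the file command.

```python
import codecs

# Longer signatures come first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(path):
    """Return the name of the BOM at the start of the file, or None."""
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM is 4 bytes
    for signature, name in BOMS:
        if head.startswith(signature):
            return name
    return None
```

Calling `detect_bom("filename.txt")` returns "UTF-8" for a file that starts with EF BB BF and `None` for a file without a BOM.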
Hex editor
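Open the text file in a hex editor: a UTF-8 BOM appears as the bytes EF BB BF at the very start of the file. See also 位元組順序記號 (Byte order mark) on Chinese Wikipedia, the free encyclopedia.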