Data cleaning: Difference between revisions

88 bytes added, 26 April 2018
Line 257: Line 257:
* If the data was imported from Excel, watch for the 15-digit precision issue.
* If the data was imported from Excel, watch for the 15-digit precision issue.


=== Numeric only ===
=== Numeric ===
List of the possible abnormal values:
List of the possible abnormal values:
* All numeric values are odd, or all are even, although the data were generated naturally by users.
* All numeric values are odd, or all are even, although the data were generated naturally by users.
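
The parity check in the item above can be sketched in Python. This is a minimal illustration only: the function name <code>all_same_parity</code> and the sample values are hypothetical, and it assumes the column holds integers.

```python
def all_same_parity(values):
    """Return True when every integer in values is odd, or every one is even."""
    values = list(values)
    # Collect the distinct remainders modulo 2; a single remainder means one parity.
    parities = {v % 2 for v in values}
    return len(values) > 0 and len(parities) == 1


# Hypothetical sample: naturally generated data would normally mix odd and even values.
readings = [12, 48, 30, 76, 94]
if all_same_parity(readings):
    print("Suspicious: all values share the same parity")
```

A hit is only a hint to inspect the data source; a short column can share one parity by chance.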
Line 322: Line 322:


Using the [https://www.w3resource.com/mysql/date-and-time-functions/mysql-unix_timestamp-function.php UNIX_TIMESTAMP() function] to check birthday data for abnormal values is not appropriate, because every birthday earlier than {{kbd | key=<nowiki>1970-01-01 00:00:00 UTC</nowiki>}} becomes zero.
Using the [https://www.w3resource.com/mysql/date-and-time-functions/mysql-unix_timestamp-function.php UNIX_TIMESTAMP() function] to check birthday data for abnormal values is not appropriate, because every birthday earlier than {{kbd | key=<nowiki>1970-01-01 00:00:00 UTC</nowiki>}} becomes zero.
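
One way to avoid the epoch limit is to compare calendar dates directly instead of converting them to timestamps. The sketch below uses Python rather than MySQL. The function name <code>is_plausible_birthday</code> and the 120-year cutoff are assumptions made for illustration, not part of the original article.

```python
from datetime import date


def is_plausible_birthday(birthday, today, max_age_years=120):
    """Check a birthday by comparing dates directly, with no epoch conversion."""
    # A birthday in the future is always abnormal.
    if birthday > today:
        return False
    # Assumed cutoff: nobody in the data set is older than max_age_years.
    earliest = date(today.year - max_age_years, 1, 1)
    return birthday >= earliest


today = date(2018, 4, 26)
print(is_plausible_birthday(date(1950, 6, 15), today))  # pre-1970 date is handled normally
print(is_plausible_birthday(date(2030, 1, 1), today))   # future date is rejected
```

Because the comparison never leaves the date type, birthdays before 1970 stay distinguishable from each other.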
=== String contains special characters ===
* [[Byte order mark]] (BOM)
* [[Return symbol]]
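
The two characters listed above can be detected with a simple scan. This is a minimal Python sketch under two assumptions: the text has already been decoded to a Unicode string, so a BOM appears as U+FEFF, and "return symbol" means a carriage return or line feed. The function name <code>find_special_characters</code> is hypothetical.

```python
BOM = "\ufeff"  # U+FEFF, which is how a decoded BOM appears inside a Python str


def find_special_characters(text):
    """Return the names of the special characters found in text."""
    found = []
    if BOM in text:
        found.append("BOM")
    if "\r" in text:
        found.append("carriage return")
    if "\n" in text:
        found.append("line feed")
    return found


# Hypothetical field value with a leading BOM and a trailing Windows line ending.
print(find_special_characters("\ufeffname\r\n"))
```

Reading a file with the <code>utf-8-sig</code> codec strips a leading BOM during decoding, which removes that case before the scan is needed.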


== Duplicate data ==
== Duplicate data ==
