Data cleaning: Difference between revisions
Jump to navigation
Jump to search
m
→Data Validation
Tags: Mobile edit Mobile web edit |
Tags: Mobile edit Mobile web edit |
||
| Line 264: | Line 264: | ||
=== Check if the column value is numeric === | === Check if the column value is numeric === | ||
Possibile values | |||
<pre> | |||
test data: | |||
3.141592654 | |||
1.36184E+14 | |||
123,456.789 | |||
20740199601 | |||
346183773390240 | |||
="5" | |||
</pre> | |||
MySQL: | MySQL: | ||
Check if a value is integer | * Check if a value is integer e.g. 1234567 | ||
* Find the records which the value of `my_column` is numeric values entirely {{code | code = SELECT * FROM `my_table` WHERE `my_column` REGEXP '^[0-9]+$'}}<ref>[http://stackoverflow.com/questions/14343767/mysql-regexp-with-and-numbers-only regex - Mysql REGEXP with . and numbers only - Stack Overflow]</ref> | ** Find the records which the value of `my_column` is numeric values entirely {{code | code = SELECT * FROM `my_table` WHERE `my_column` REGEXP '^[0-9]+$'}}<ref>[http://stackoverflow.com/questions/14343767/mysql-regexp-with-and-numbers-only regex - Mysql REGEXP with . and numbers only - Stack Overflow]</ref><ref>[https://stackoverflow.com/questions/75704/how-do-i-check-to-see-if-a-value-is-an-integer-in-mysql How do I check to see if a value is an integer in MySQL? - Stack Overflow]</ref> | ||
Check if a value is integer which may contains comma and dot symbols e.g. 1,234.567 or 3.414 | * Check if a value is integer which may contains comma and dot symbols e.g. 1,234.567 or 3.414 | ||
* | ** {{code | code = SELECT * FROM `my_table` WHERE `my_column` REGEXP '^[0-9,\.]+$'}}<ref>[https://community.denodo.com/answers/question/details?questionId=9060g000000XelhAAC&title=How+to+identify+if+values+in+a+column+is+numeric+%28+Function+similar+to+Isnumeric+is+SQL%29 How to identify if values in a column is numeric ( Function similar to Isnumeric is SQL)]</ref> | ||
Check if a value is NOT integer | * Check if a value is NOT integer | ||
* Find the records which the value of `my_column` is '''NOT''' numeric values entirely {{code | code = SELECT * FROM `my_table` WHERE `my_column` NOT REGEXP '^[0-9]+$'}} | ** Find the records which the value of `my_column` is '''NOT''' numeric values entirely {{code | code = SELECT * FROM `my_table` WHERE `my_column` NOT REGEXP '^[0-9]+$'}} | ||
| Line 280: | Line 291: | ||
* The {{kbd | key=tax_id}} column is 8 digits only. Find the well-formatted {{kbd | key=tax_id}} records by using {{code | code = SELECT * FROM `tax_id` WHERE `tax_id` REGEXP '^[0-9]{8}$'}} | * The {{kbd | key=tax_id}} column is 8 digits only. Find the well-formatted {{kbd | key=tax_id}} records by using {{code | code = SELECT * FROM `tax_id` WHERE `tax_id` REGEXP '^[0-9]{8}$'}} | ||
PHP: | PHP: | ||
* [http://php.net/manual/en/function.is-numeric.php is_numeric] | * [http://php.net/manual/en/function.is-numeric.php is_numeric] | ||
| Line 300: | Line 306: | ||
** Return 0 if the cell value is (1) Text (2) Numbers in scientific (exponential) notation e.g. {{code | code = <nowiki>1.23E+16</nowiki>}} (3) Decimal numbers e.g. {{code | code = <nowiki>3.141592654</nowiki>}} (4) Negative numbers | ** Return 0 if the cell value is (1) Text (2) Numbers in scientific (exponential) notation e.g. {{code | code = <nowiki>1.23E+16</nowiki>}} (3) Decimal numbers e.g. {{code | code = <nowiki>3.141592654</nowiki>}} (4) Negative numbers | ||
=== Time data: Validate the data format === | === Time data: Validate the data format === | ||