Data cleaning: Difference between revisions
Revision as of 17:57, 22 September 2014
is null
Finds whether a variable is NULL
- PHP is_null
- Google spreadsheet / Excel:
- ISERR(value): "value - The value to be verified as an error type other than #N/A." Example: #NULL!
- If the cell contains the literal text NULL rather than the error #NULL!, you may use COUNTIF(value, "NULL") or EXACT(value, "NULL"). COUNTIF ignores case; EXACT is case-sensitive.
- MySQL SQL syntax: SELECT * FROM table WHERE column IS NULL;[1] demo
Finds whether a variable is NOT NULL
- MySQL SQL syntax: SELECT * FROM table WHERE column IS NOT NULL; demo
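The two queries above can be tried without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL; the table name contacts, the column phone, and the sample rows are hypothetical and only serve the illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
# The table "contacts" and column "phone" are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts (id, phone) VALUES (?, ?)",
    [(1, "555-0100"), (2, None), (3, "")],  # None is stored as SQL NULL
)

# Rows whose phone is NULL.
null_ids = [row[0] for row in conn.execute(
    "SELECT id FROM contacts WHERE phone IS NULL ORDER BY id")]

# Rows whose phone is not NULL; the empty string in row 3 still counts.
not_null_ids = [row[0] for row in conn.execute(
    "SELECT id FROM contacts WHERE phone IS NOT NULL ORDER BY id")]

print(null_ids)      # [2]
print(not_null_ids)  # [1, 3]
```

Row 3 holds an empty string, not NULL, so it is returned by the IS NOT NULL query.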
check if field value was not filled in: NULL, empty value
This does NOT cover rows whose field was filled in automatically with a default value
- NULL value:
- solution: SELECT * FROM table_name WHERE column_name IS NULL;
- empty value:
- solution1: SELECT * FROM table_name WHERE LENGTH(TRIM( column_name )) = 0;
- solution2 (caveat): the condition SELECT * FROM table_name WHERE column_name IS NOT NULL excludes NULL rows but still returns rows whose value is empty, so it cannot find filled-in values on its own

- Excel starting date: 1900/1/0 (the number 0 formatted as a date), 1900/1/1 (the number 1 formatted as a date), 1900/1/2 ...
- solution: step 1: in Excel, replace any date more than 100 years before the current year with an empty value: =IF(ISERR(YEAR(A2)), "", IF(YEAR(A2)<1914, "", A2)) (1914 is 2014 minus 100; this formula also handles empty values and malformed values such as 0000-12-31); step 2: change the cell format to a date format
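The same rule as the Excel formula can be sketched in Python for data that has already been exported as text. This is only an illustration under stated assumptions: the function name clean_date, its parameters, and the year/month/day text layout are hypothetical, and the cutoff defaults to 2014 minus 100 to match the formula.

```python
from datetime import date

def clean_date(text, this_year=2014, max_age_years=100):
    """Return text unchanged if it is a plausible date, else an empty string.

    Hypothetical helper: expects year/month/day separated by "/" or "-".
    """
    try:
        year, month, day = (int(part) for part in text.replace("-", "/").split("/"))
        # date() raises ValueError for impossible dates such as 1900/1/0 or year 0.
        parsed = date(year, month, day)
    except (ValueError, AttributeError):
        # Covers empty strings, malformed text, and None.
        return ""
    if parsed.year < this_year - max_age_years:
        # Same cutoff as YEAR(A2) < 1914 in the formula.
        return ""
    return text

print(repr(clean_date("1900/1/0")))    # ''
print(repr(clean_date("1900/1/1")))    # ''
print(repr(clean_date("0000-12-31")))  # ''
print(repr(clean_date("2014/9/22")))   # '2014/9/22'
```

Like the formula, the sketch blanks out both values that cannot be read as a date and dates that are implausibly old.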
check if field contains value
- MySQL: SELECT * FROM table_name WHERE LENGTH(TRIM( column_name )) != 0; demo
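To see how the LENGTH(TRIM(...)) conditions treat NULL, empty, and whitespace-only values, here is a minimal sketch that again uses Python's sqlite3 module as a stand-in for MySQL. The table survey, the column answer, the helper ids, and the sample rows are hypothetical; the combined "IS NULL OR ..." condition is an addition for illustration and is not taken from the page.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (id INTEGER PRIMARY KEY, answer TEXT)")
conn.executemany(
    "INSERT INTO survey (id, answer) VALUES (?, ?)",
    [(1, "yes"), (2, None), (3, ""), (4, "   ")],
)

def ids(where):
    """Return the ids of the rows that match a fixed WHERE clause."""
    query = f"SELECT id FROM survey WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(query)]

# Empty or whitespace-only values; NULL does not match because LENGTH(NULL) is NULL.
empty_ids = ids("LENGTH(TRIM(answer)) = 0")

# Values that contain something other than spaces; NULL does not match here either.
filled_ids = ids("LENGTH(TRIM(answer)) != 0")

# Both kinds of "not filled in" in a single query.
missing_ids = ids("answer IS NULL OR LENGTH(TRIM(answer)) = 0")

print(empty_ids)    # [3, 4]
print(filled_ids)   # [1]
print(missing_ids)  # [2, 3, 4]
```

The NULL row appears in neither of the first two results, which is why NULL has to be tested separately with IS NULL.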
outlier
(left blank intentionally)
data handling
remove first, last or certain characters from text
- Excel: use the RIGHT[2] + LEN[3] functions [4]
- Excel: if the length of the text to be removed is fixed, you may use the REPLACE[5] + LEN functions (demo)
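Outside of Excel, the same trimming can be sketched with Python string slicing. The function names and sample values below are hypothetical, and the Excel formulas in the comments are the usual equivalents rather than formulas quoted from the page (LEFT in particular is not mentioned above).

```python
def remove_first(text, n):
    """Drop the first n characters, similar to =RIGHT(A1, LEN(A1) - n)."""
    return text[n:]

def remove_last(text, n):
    """Drop the last n characters, similar to =LEFT(A1, LEN(A1) - n)."""
    return text[:max(len(text) - n, 0)]

def remove_middle(text, start, n):
    """Drop n characters beginning at 1-based position start,
    similar to =REPLACE(A1, start, n, "")."""
    return text[:start - 1] + text[start - 1 + n:]

print(remove_first("ID-12345", 3))       # 12345
print(remove_last("12345kg", 2))         # 12345
print(remove_middle("2014-09-22", 5, 1)) # 201409-22
```

Unlike the Excel formulas, these helpers return an empty string instead of an error when n is larger than the text length.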
Data modeling: Data type