Data cleaning
Jump to navigation
Jump to search
is null
Finds whether a variable is NULL. online demo
- PHP is_null
- Google spreadsheet / Excel:
- ISERR(value) " value - The value to be verified as an error type other than #N/A." ex: #NULL!
- If the cell value is exactly NULL not #NULL!, You may use COUNTIF(value, "NULL") or EXACT(value, "NULL")
- MySQL SQL syntax: SELECT * FROM table WHERE column IS NULL;[1]
Finds whether a variable is NOT NULL
- MySQL SQL syntax: SELECT * FROM table WHERE column IS NOT NULL;
check if field value was not fulfilled: NULL, empty value
NOT include those data which its field value fulfilled with default value automatically
- find records with NULL value: (note: not #NULL!)
- MySQL solution: SELECT * FROM table_name WHERE column_name IS NULL;
- EXCEL: =EXACT(A2, "NULL")
- find records with empty value: (not contains NULL value)
- MySQL: SELECT * FROM table_name WHERE LENGTH(TRIM( column_name )) = 0;
SQL query SELECT * FROM table_name WHERE column_name IS NOT NULL includes empty value
- MySQL: SELECT * FROM table_name WHERE LENGTH(TRIM( column_name )) = 0;
- find NOT empty records means records without NULL or empty value:
- MySQL: SELECT * FROM table_name WHERE LENGTH(TRIM( column_name )) != 0;
- MySQL: SELECT * FROM table_name WHERE column_name != '' AND column_name IS NOT NULL;
- Excel starting date: 1900/1/0 (converted time formatted value from 0), 1900/1/1 (converted time formatted value from 1), 1900/1/2 ...
- solution: step1: Replace the year > 100 from this year with empty value at EXCEL: =IF(ISERR(YEAR(A2)), "", IF(YEAR(A2)<1914, "", A2)) (this formula also handle empty value and non well-formatted column value ex: 0000-12-31 ) ; step2: change the format of cell to time format
- trivial approach : EXCEL: =IF(ISERR(YEAR(A2)), "", IF(YEAR(A2)-YEAR(NOW())>100, "", A2))
this formula could not handle empty value because it return 0. If I change the format of cell to time format, 0 will become 1900/1/0.
- Using PHP empty() function to find 0, null, false, empty string, empty array values.
check if field contains value
length of string > 0
- MySQL: SELECT * FROM table_name WHERE LENGTH(TRIM( column_name )) != 0; demo
column value is not null or 0
- Excel: COUNTIFS(criteria_range1, "<>NULL", criteria_range1, "<>0")[2]
find if number or cell value is positive integer
- EXCEL: =IFERROR(IF(AND(INT( value )= value, value>0), TRUE, FALSE), FALSE)[3] online demo
verify the format of field value
email:
- EXCEL: =IF(ISERR(FIND("@", A2, 1)), FALSE, TRUE) only check the field if contains @ symbol or not
- result: (1) normal condition: return TRUE; (2) exceptional condition: return FALSE if @ symbol was not found
- EXCEL: =FIND("@", A2, 2) only check the field if contains @ symbol or not
- syntax: FIND(find_text, with_text, [start_num]) the start_num is 2 because the position of @ symbol should be larger than 1 (position of first char is 1)
- result: (1) normal condition: return the number larger than 1; (2) exceptional condition: return #VALUE! if @ symbol was not found
- PHP: PHP FILTER_VALIDATE_EMAIL Filter
- "Returns the filtered data, or FALSE if the filter fails." quoted from PHP.net
duplicate data
outlier
(left blank intentionally)
data handling
remove first, last or certain characters from text
Data modeling: Data type
references
- ↑ MySQL :: MySQL 5.0 Reference Manual :: 3.3.4.6 Working with NULL Values
- ↑ Excel COUNTIFS and COUNTIF with multiple criteria – examples of usage
- ↑ Check if number is an Integer
- ↑ RIGHT、RIGHTB 函數 - Excel - Office.com
- ↑ LEN、LENB 函數 - Excel - Office.com
- ↑ How to remove first, last or certain characters from text in Excel?
- ↑ REPLACE、REPLACEB 函數 - Excel - Office.com