Data cleaning: Difference between revisions

From LemonWiki共筆

Revision as of 17:49, 22 September 2014

is null

Finds whether a variable is NULL

  • PHP is_null
  • Google spreadsheet / Excel:
    • ISERR(value): "value - The value to be verified as an error type other than #N/A." Example of an error it matches: #NULL!
    • If the cell contains the literal text NULL rather than the error #NULL!, you can use COUNTIF(value, "NULL") or EXACT(value, "NULL")
  • MySQL SQL syntax: SELECT * FROM table WHERE column IS NULL;[1] demo

Finds whether a variable is NOT NULL

  • MySQL SQL syntax: SELECT * FROM table WHERE column IS NOT NULL; demo
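
The two conditions above can be tried without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; both engines accept IS NULL and IS NOT NULL). The table name people and the column nickname are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, nickname TEXT)")
conn.executemany(
    "INSERT INTO people (id, nickname) VALUES (?, ?)",
    [(1, "amy"), (2, None), (3, "bob")],  # Python None is stored as SQL NULL
)

# Rows whose column is NULL
null_ids = [row[0] for row in conn.execute(
    "SELECT id FROM people WHERE nickname IS NULL ORDER BY id")]

# Rows whose column is NOT NULL
not_null_ids = [row[0] for row in conn.execute(
    "SELECT id FROM people WHERE nickname IS NOT NULL ORDER BY id")]

print(null_ids)      # [2]
print(not_null_ids)  # [1, 3]
```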

check if field value was not filled in: NULL, empty value

Note: this does NOT include records whose field value was filled in automatically with a default value

(demo on sqlfiddle)

  1. NULL value: SELECT * FROM table_name WHERE column_name IS NULL;
  2. empty value: SELECT * FROM table_name WHERE LENGTH(TRIM( column_name )) = 0; Note: a query with the condition SELECT * FROM table_name WHERE column_name IS NOT NULL still returns rows that hold an empty value
  3. Excel starting date: 1900/1/0 (the value 0 formatted as a date), 1900/1/1 (the value 1 formatted as a date), 1900/1/2 ... Note: in Excel, replace any date more than 100 years before the current year with an empty value: =IF(ISERR(YEAR(A2)), "", IF(YEAR(A2)<1914, "", A2)) (this formula also handles empty values and malformed column values, e.g. 0000-12-31)
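
For data that has already left Excel, the same rule as the formula in item 3 can be sketched in Python. This is only an illustration under stated assumptions: the function name blank_implausible_date, the today and max_age_years parameters, and the YYYY-MM-DD input format are all hypothetical and not part of the original page.

```python
from datetime import date, datetime


def blank_implausible_date(value, today=None, max_age_years=100):
    """Return the date as a YYYY-MM-DD string, or "" if it looks unfilled.

    Mirrors =IF(ISERR(YEAR(A2)), "", IF(YEAR(A2)<1914, "", A2)),
    where 1914 is the current year (2014 on the page) minus 100.
    """
    if today is None:
        today = date.today()
    # Empty or whitespace-only input counts as "not filled in".
    if value is None or not str(value).strip():
        return ""
    try:
        # Assumption: dates arrive as YYYY-MM-DD text.
        parsed = datetime.strptime(str(value).strip(), "%Y-%m-%d").date()
    except ValueError:
        # Malformed values such as 0000-12-31 (year 0 is not a valid date).
        return ""
    # Dates near the Excel epoch (1900) are treated as converted 0/1 values.
    if parsed.year < today.year - max_age_years:
        return ""
    return parsed.isoformat()


reference_day = date(2014, 9, 22)
print(blank_implausible_date("2014-09-22", today=reference_day))
print(blank_implausible_date("1900-01-01", today=reference_day))
```

With the reference day fixed at 22 September 2014, the first call keeps the date and the second returns an empty string, which matches the 1914 cutoff in the Excel formula.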

check if field contains value

  1. MySQL: SELECT * FROM table_name WHERE LENGTH(TRIM( column_name )) != 0; demo
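
The difference between IS NOT NULL and the LENGTH(TRIM( column_name )) checks is easy to see on a small table. The sketch below again uses Python's sqlite3 module in place of MySQL (an assumption; LENGTH and TRIM behave the same way for these inputs), with a hypothetical table people and column nickname.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, nickname TEXT)")
conn.executemany(
    "INSERT INTO people (id, nickname) VALUES (?, ?)",
    [
        (1, "amy"),   # real value
        (2, None),    # NULL
        (3, ""),      # empty string
        (4, "   "),   # whitespace only
    ],
)


def ids_where(condition):
    """Return the ids of rows matching a WHERE condition (helper for this demo)."""
    sql = "SELECT id FROM people WHERE " + condition + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# IS NOT NULL keeps empty and whitespace-only values.
not_null_ids = ids_where("nickname IS NOT NULL")

# Empty values only (NULL is not matched: LENGTH(NULL) is NULL, not 0).
empty_ids = ids_where("LENGTH(TRIM(nickname)) = 0")

# Rows that really contain a value.
has_value_ids = ids_where("LENGTH(TRIM(nickname)) != 0")

print(not_null_ids)   # [1, 3, 4]
print(empty_ids)      # [3, 4]
print(has_value_ids)  # [1]
```

Because any comparison with NULL yields NULL, the != 0 condition excludes NULL rows and empty rows in a single test.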

outlier

(left blank intentionally)

data handling

remove first, last or certain characters from text

  • Excel: using RIGHT[2] + LEN[3] functions [4]
  • Excel: if the number of characters to remove is fixed, you can use the REPLACE[5] + LEN functions (demo)
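
The same trimming can be sketched in Python with string slicing. The Excel formulas quoted in the comments are a guess at how RIGHT, REPLACE and LEN would be combined, since the page links to them without spelling them out, and the function names are hypothetical.

```python
def remove_first(text, count):
    """Drop the first `count` characters.

    Assumed Excel equivalent: =RIGHT(A1, LEN(A1) - count)
    """
    return text[count:]


def remove_last(text, count):
    """Drop the last `count` characters.

    Assumed Excel equivalent: =REPLACE(A1, LEN(A1) - count + 1, count, "")
    """
    return text[:max(len(text) - count, 0)]


def remove_span(text, start, count):
    """Drop `count` characters beginning at 1-based position `start`.

    Assumed Excel equivalent: =REPLACE(A1, start, count, "")
    """
    return text[:start - 1] + text[start - 1 + count:]


print(remove_first("ID-12345", 3))      # 12345
print(remove_last("12345kg", 2))        # 12345
print(remove_span("2014-09-22", 5, 1))  # 201409-22
```

Note that Excel positions are 1-based while Python slices are 0-based, which is why remove_span subtracts 1 from start.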

related pages

Data modeling: Data type

references