Anomaly detection

From LemonWiki共筆

Latest revision as of 15:55, 3 October 2022

Outlier / Anomaly detection

Anomaly detection of numeric data

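  • Median: compare each value with the column median
  • Range checks: values should fall within the expected minimum and maximum
  • All values are even, or all values are odd
  • Values are identical across columns that should be completely different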
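
The numeric checks above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, and the modified z-score constants (0.6745 and a threshold of 3.5) are a common convention assumed here rather than anything stated on this page.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return values far from the median, using the modified z-score.

    The constants 0.6745 and 3.5 are conventional defaults (an assumption).
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread around the median, so the score is undefined
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

def out_of_range(values, low, high):
    """Range check: return values outside the expected [low, high] interval."""
    return [v for v in values if not (low <= v <= high)]

def same_parity(values):
    """True if every integer value is even, or every value is odd."""
    return len({v % 2 for v in values}) == 1

def identical_columns(col_a, col_b):
    """True if two columns that should differ hold exactly the same values."""
    return list(col_a) == list(col_b)
```

For example, `mad_outliers([10, 11, 9, 10, 12, 10, 95])` flags only `95`, and `same_parity([2, 4, 6])` is `True`, which may hint at a generated or truncated column.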

Anomaly detection of categorical data (qualitative variable)

  • Expected distribution of categories, e.g. the interests of an audience should vary widely, NOT all be the same
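
One possible way to test this is to measure how much of the data falls into the single most common category. The sketch below is only an assumption about how the check could be done: the function names and the 0.9 cutoff are hypothetical and should be tuned per dataset.

```python
from collections import Counter

def dominant_share(labels):
    """Return the most common category and the fraction of rows it covers."""
    if not labels:
        raise ValueError("labels must not be empty")
    top_label, top_count = Counter(labels).most_common(1)[0]
    return top_label, top_count / len(labels)

def is_concentrated(labels, max_share=0.9):
    """True if one category covers more than max_share of the rows.

    max_share=0.9 is an arbitrary illustrative cutoff (an assumption).
    """
    _, share = dominant_share(labels)
    return share > max_share
```

If 19 of 20 audience members are tagged `"sports"`, `is_concentrated` returns `True`, which suggests the interest column may have been filled with a default value.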

Anomaly detection for time series data

  • Trend
  • Dramatic increase or decrease in the row count for each time period
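
The row-count check could be sketched as a comparison of each period with the one before it. The function name and the factor of 2 are assumptions chosen for illustration. Data with a strong trend would need a less naive baseline.

```python
def row_count_jumps(counts, max_ratio=2.0):
    """Flag periods whose row count changed sharply versus the previous one.

    counts: list of (period, row_count) tuples in chronological order.
    max_ratio: hypothetical cutoff; a period is flagged when its count is
    more than max_ratio times, or less than 1 / max_ratio times, the
    previous count.
    """
    flagged = []
    for (_, prev), (period, cur) in zip(counts, counts[1:]):
        if prev == 0 or cur == 0:
            # A ratio is undefined here; flag any change to or from zero.
            if prev != cur:
                flagged.append(period)
            continue
        ratio = cur / prev
        if ratio > max_ratio or ratio < 1 / max_ratio:
            flagged.append(period)
    return flagged
```

With monthly counts of 1000, 1100, 300 and 320 rows, only the third month is flagged, because it dropped to less than half of the month before.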

Anomaly detection for consumer data

  • Seasonality: sales of coats should increase in cold weather
  • Holidays: sales of certain gifts, e.g. mooncakes, should increase around specific holidays, e.g. the Mid-Autumn Festival
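
A seasonal expectation like these can be turned into a simple check: average sales in the expected peak months should be clearly higher than in the remaining months. The sketch below assumes monthly totals keyed by month number. The function names and the 1.2 minimum lift are hypothetical.

```python
from statistics import mean

def seasonal_lift(monthly_sales, peak_months):
    """Ratio of average sales in peak months to average sales in other months.

    monthly_sales: dict mapping month number (1-12) to units sold.
    peak_months: set of month numbers where sales are expected to rise.
    Assumes both groups are non-empty and off-peak sales are not all zero.
    """
    peak = [v for m, v in monthly_sales.items() if m in peak_months]
    rest = [v for m, v in monthly_sales.items() if m not in peak_months]
    return mean(peak) / mean(rest)

def missing_seasonal_peak(monthly_sales, peak_months, min_lift=1.2):
    """True if the expected seasonal increase is absent (lift below min_lift)."""
    return seasonal_lift(monthly_sales, peak_months) < min_lift
```

For coats, `peak_months` might be `{12, 1, 2}` in the Northern Hemisphere. For mooncakes, it would be the month in which the Mid-Autumn Festival falls that year. A flat series across all months would be reported as anomalous.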

Anomaly detection for string data

  • Creation time of the text message
  • Frequency of text messages over time
  • Length of the text message
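
The three signals above could be combined into one pass over the messages, as in the following sketch. The function name and all thresholds (quiet hours from 02:00 to 04:59, more than 5 messages per minute, more than 500 characters) are assumptions for illustration only.

```python
from collections import Counter
from datetime import datetime

def message_flags(messages, max_len=500, max_per_minute=5,
                  quiet_hours=range(2, 5)):
    """Return, for each message, the list of checks it fails.

    messages: list of (created_at, text) tuples, created_at a datetime.
    All thresholds are hypothetical defaults.
    """
    def minute_of(ts):
        return ts.replace(second=0, microsecond=0)

    # Count how many messages were created within each calendar minute.
    per_minute = Counter(minute_of(ts) for ts, _ in messages)
    flags = []
    for ts, text in messages:
        reasons = []
        if ts.hour in quiet_hours:
            reasons.append("created_time")  # sent at an unusual hour
        if per_minute[minute_of(ts)] > max_per_minute:
            reasons.append("frequency")     # part of a burst
        if len(text) == 0 or len(text) > max_len:
            reasons.append("length")        # empty or unusually long
        flags.append(reasons)
    return flags
```

A message that fails no check gets an empty list, so the result lines up index by index with the input.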

More on: Outlier - Wikipedia