Outlier Detection in Big Data

J. Wang (Editor), Victoria J. Hodge

Research output: Chapter in Book/Report/Conference proceeding, Chapter (peer-reviewed)

Abstract

Outlier detection (or anomaly detection) is a fundamental task in data mining. Outliers are data that deviate from the norm, and outlier detection is often compared to “finding a needle in a haystack”. However, outliers can generate high value when they are found: cost savings, improved efficiency, compute time savings, fraud reduction and failure prevention. Detection can identify faults before they escalate with potentially catastrophic consequences. Big Data refers to large, dynamic collections of data. These vast and complex data appear problematic for traditional outlier detection methods to process, but Big Data also provides considerable opportunity to uncover new outliers and data relationships. This chapter highlights some of the research issues for outlier detection in Big Data, covers the solutions used and research directions taken, and analyses some current outlier detection approaches for Big Data applications.
Original language: English
Title of host publication: Encyclopedia of Business Analytics and Optimization
Editors: J. Wang
Publisher: Hershey, PA: IGI Global
Pages: 1762-1771
Number of pages: 10
Publication status: Published - 1 Apr 2014

Publication series

Name: Encyclopedia of Business Analytics and Optimization

Bibliographical note

I have been given permission to publish this version of the chapter on the University of York research database. I have a signed authorisation form from IGI, in PDF format, giving authorisation.
