Comparison of Different Methods of Outlier Detection in Univariate Time Series Data

Egbo Mary Nkechinyere; Iheagwara Andrew I.; Okenwe Idochi

doi:10.53555/ms.v1i1.912

Authors

Egbo Mary Nkechinyere, Department of Statistics, Federal University of Technology, Owerri, Nigeria
Iheagwara Andrew I., Procurement Officer/Director Planning, Research & Statistics, Nigeria Erosion & Watershed Management Project (World Bank-Assisted), Ministry of Petroleum & Environment, Plot 36, Chief Executive Quarters, Area “B”, New Owerri, Imo State, Nigeria
Okenwe Idochi, Department of Statistics, School of Applied Sciences, Rivers State Polytechnic, PMB 20, Bori, Rivers State, Nigeria

Keywords:

Univariate time series data, outlier detection, MADe Rule, Modified Z-Score, 2SD method, 3SD method

Abstract

Over time, many methods of detecting outliers have been developed. Some detect single outliers while others detect multiple outliers; some apply to univariate models while others are limited to multivariate models; some use simple measures while many others use robust measures. This abundance of methods raises the question of which method is best for a particular data set. The best method depends on the kind of data under consideration in a given study. In this study, we confined our attention to univariate time series data, applied different univariate outlier detection methods to it, identified the outliers, and then assessed the efficiency of these methods. We also outlined the procedures for detecting univariate outliers in some common statistical software packages. The evidence from this study suggests that the 3SD method and the Z-score method are not good methods for detecting outliers in univariate data. This can be attributed to the parameters they use to estimate outliers in these data sets.
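The labeling rules named in the keywords can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The formulas are the commonly stated textbook forms and are assumed here, since the abstract does not give them: mean ± k·SD for the 2SD and 3SD methods, |z| > 3 for the Z-score method, median ± k·MADe with MADe = 1.483 × MAD, and a modified Z-score of 0.6745·(x − median)/MAD with a cutoff of 3.5. The function names, the cutoffs, and the toy series are hypothetical and are not taken from the study.

```python
from statistics import mean, median, stdev


def sd_rule(x, k):
    """Flag values outside mean ± k * sample SD (k=2 for 2SD, k=3 for 3SD)."""
    m, s = mean(x), stdev(x)
    return [abs(v - m) > k * s for v in x]


def z_score_rule(x, cutoff=3.0):
    """Flag values whose absolute Z-score exceeds the cutoff (assumed 3)."""
    m, s = mean(x), stdev(x)
    return [abs((v - m) / s) > cutoff for v in x]


def mad(x):
    """Median absolute deviation from the median."""
    med = median(x)
    return median([abs(v - med) for v in x])


def made_rule(x, k=3.0):
    """Flag values outside median ± k * MADe, with MADe = 1.483 * MAD."""
    med = median(x)
    made = 1.483 * mad(x)
    return [abs(v - med) > k * made for v in x]


def modified_z_rule(x, cutoff=3.5):
    """Flag values whose absolute modified Z-score exceeds the cutoff.

    Assumes MAD > 0; a series where more than half the values are
    identical would need separate handling.
    """
    med, m = median(x), mad(x)
    return [abs(0.6745 * (v - med) / m) > cutoff for v in x]


def flagged(mask):
    """Return the positions of the flagged observations."""
    return [i for i, is_outlier in enumerate(mask) if is_outlier]


# Hypothetical toy series (not data from the study): one obvious spike at the end.
series = [10, 12, 11, 13, 12, 11, 10, 12, 11, 50]

for name, mask in [
    ("2SD", sd_rule(series, 2)),
    ("3SD", sd_rule(series, 3)),
    ("Z-score", z_score_rule(series)),
    ("3MADe", made_rule(series, 3)),
    ("Modified Z-score", modified_z_rule(series)),
]:
    print(f"{name}: flagged positions {flagged(mask)}")
```

On this toy series the spike inflates the sample SD to about 12.3, so the value 50 sits only about 2.84 SDs from the mean and is missed by the 3SD and Z-score rules, while the median-based rules flag it. This is one plausible reading of why the abstract attributes the weakness of those two methods to the parameters they use; the study itself should be consulted for the actual comparison.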
