Article contents
Large-scale data analysis on aviation accident database using different data mining techniques
Published online by Cambridge University Press: 22 November 2016
Abstract
Data mining is an iterative process in which progress is defined by discovery through either automatic or manual methods. A data cleaning procedure is proposed to improve the quality of classification tasks in the knowledge discovery process by taking into account both redundant and conflicting data. The redundancy check is performed on the original dataset and the resultant dataset is preserved. This resultant dataset is then checked for conflicting data and, if any are found, they are corrected and updated on the original aircraft dataset. This updated dataset is then classified using a variety of classifiers such as Bayes, functions, lazy, MISC, rules and decision trees. The performance of the updated datasets on these classifiers is examine, and the result shows a significant improvement in the classification accuracy after redundancy and conflicts are removed. The conflicts after correction are updated in the original dataset, and when the performance of the classifier is evaluated, great improvement is observed. This paper aims to address how data mining techniques can be used to understand complex system accidents in the aviation domain. Decision trees are considered to be the one of the most powerful and popular approaches in knowledge discovery and data mining. The objective is to develop a classification model for aviation risk investigation and reduction using a decision tree induction method that enhances the ability to form decision trees and thereby proves that the classification accuracy of decision trees is greater. Different feature selectors are used in this study in order to reduce the number of initial attributes.
- Type
- Research Article
- Information
- Copyright
- Copyright © Royal Aeronautical Society 2016
References
REFERENCES
- 6
- Cited by