Volume 31, Issue 4, January 2021, Pages 763–775
Kone Dramane1, Goore Bi Tra2, and Kimou Kouadio Prosper3
1 UMRI 78- Electronics and Electricity, LARIT: Computer and Telecommunications Research Laboratory, EDP/INP-HB, Yamoussoukro, Cote d’Ivoire
2 UMRI 78- Electronics and Electricity, LARIT: Computer and Telecommunications Research Laboratory, EDP/INP-HB, Yamoussoukro, Cote d’Ivoire
3 UMRI 78- Electronics and Electricity, LARIT: Computer and Telecommunications Research Laboratory, EDP/INP-HB, Yamoussoukro, Cote d’Ivoire
Original language: English
Copyright © 2021 ISSR Journals. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
In this paper, we present a hybrid method for efficiently estimating missing discrete attributes encountered during data manipulation or processing. The method first determines the segment to which the missing value belongs and then estimates the value by majority vote when possible. Otherwise, the average of the missing attribute is determined from the complete data of the segment. Several cases may arise. In the first case, the non-missing attributes have the same modality (they lie in the same interval); we compute the centre M of the class of the missing attribute and the average m of the non-missing attributes. If m is less than M, the value e of the missing attribute is estimated by the value of the non-missing attribute that lies in the interval [a, M[, where a is the lower bound of the modality; otherwise, the value of the other non-missing attribute is used. In the second case, the non-missing attributes have different modalities; we compute the average m of the non-missing attributes and estimate the missing value e by the non-missing attribute that has the same modality as m. Finally, an error test based on the root mean square error (RMSE) demonstrates the effectiveness of our method.
Author Keywords: Cleaning, Estimation, Segmentation, Classification, MAR, Data Mining.
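The decision rule summarised in the abstract can be illustrated with a minimal Python sketch. It is one possible reading of the abstract, not the authors' implementation. Several points are assumptions made here for illustration: the function names (`majority_vote`, `modality`, `estimate_from_pair`, `rmse`), the representation of modalities as consecutive intervals given by a sorted list `bounds`, the restriction to exactly two non-missing attributes, the use of the smaller (respectively larger) value when both lie on the same side of M, and returning `None` when no attribute shares the modality of m. How the segment is determined is not described in the abstract and is left out.

```python
from bisect import bisect_right
from collections import Counter
from math import sqrt


def majority_vote(values):
    """Return the unique most frequent value, or None on a tie or empty input.

    Assumption: "when possible" in the abstract is read as "a unique mode exists".
    """
    top = Counter(values).most_common(2)
    if not top:
        return None
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None
    return top[0][0]


def modality(x, bounds):
    """Index i of the interval [bounds[i], bounds[i+1][ that contains x.

    Assumption: bounds is sorted and bounds[0] <= x < bounds[-1].
    """
    return bisect_right(bounds, x) - 1


def estimate_from_pair(x1, x2, bounds):
    """Estimate the missing value e from two non-missing attributes x1, x2."""
    m = (x1 + x2) / 2  # average of the non-missing attributes
    k1, k2 = modality(x1, bounds), modality(x2, bounds)
    if k1 == k2:
        # Case 1: same modality [a, b[; M is the centre of that class.
        a, b = bounds[k1], bounds[k1 + 1]
        M = (a + b) / 2
        # If m < M, the smaller value necessarily lies in [a, M[;
        # otherwise the larger value lies in [M, b[ (the "other" attribute).
        return min(x1, x2) if m < M else max(x1, x2)
    # Case 2: different modalities; keep the attribute in the modality of m.
    km = modality(m, bounds)
    for x, k in ((x1, k1), (x2, k2)):
        if k == km:
            return x
    return None  # assumption: the abstract does not cover this situation


def rmse(true_values, estimates):
    """Root mean square error between true values and their estimates."""
    n = len(true_values)
    return sqrt(sum((t - e) ** 2 for t, e in zip(true_values, estimates)) / n)
```

With the hypothetical intervals `bounds = [0, 10, 20, 30]`, `estimate_from_pair(2, 6, bounds)` falls in the first case (m = 4 is below the centre M = 5) and returns 2, while `estimate_from_pair(8, 14, bounds)` falls in the second case (m = 11 shares the modality of 14) and returns 14.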
How to Cite this Article
Kone Dramane, Goore Bi Tra, and Kimou Kouadio Prosper, “New Hybrid Method for Efficient Imputation of Discrete Missing Attributes,” International Journal of Innovation and Applied Studies, vol. 31, no. 4, pp. 763–775, January 2021.