TY - GEN
T1 - Comparison of Heuristics for Inhibitory Rule Optimization
AU - Alsolami, Fawaz
AU - Chikalov, Igor
AU - Moshkov, Mikhail
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2014/9/13
Y1 - 2014/9/13
N2 - Knowledge representation and extraction are very important tasks in data mining. In this work, we proposed a variety of rule-based greedy algorithms that able to obtain knowledge contained in a given dataset as a series of inhibitory rules containing an expression “attribute ≠ value” on the right-hand side. The main goal of this paper is to determine based on rule characteristics, rule length and coverage, whether the proposed rule heuristics are statistically significantly different or not; if so, we aim to identify the best performing rule heuristics for minimization of rule length and maximization of rule coverage.
Friedman test with Nemenyi post-hoc are used to compare the greedy algorithms statistically against each other for length and coverage. The experiments are carried out on real datasets from UCI Machine Learning Repository. For leading heuristics, the constructed rules are compared with optimal ones obtained based on dynamic programming approach. The results seem to be promising for the best heuristics: the average relative difference between length (coverage) of constructed and optimal rules is at most 2.27% (7%, respectively). Furthermore, the quality of classifiers based on sets of inhibitory rules constructed by the considered heuristics are compared against each other, and the results show that the three best heuristics from the point of view classification accuracy coincides with the three well-performed heuristics from the point of view of rule length minimization.
AB - Knowledge representation and extraction are very important tasks in data mining. In this work, we proposed a variety of rule-based greedy algorithms that able to obtain knowledge contained in a given dataset as a series of inhibitory rules containing an expression “attribute ≠ value” on the right-hand side. The main goal of this paper is to determine based on rule characteristics, rule length and coverage, whether the proposed rule heuristics are statistically significantly different or not; if so, we aim to identify the best performing rule heuristics for minimization of rule length and maximization of rule coverage.
Friedman test with Nemenyi post-hoc are used to compare the greedy algorithms statistically against each other for length and coverage. The experiments are carried out on real datasets from UCI Machine Learning Repository. For leading heuristics, the constructed rules are compared with optimal ones obtained based on dynamic programming approach. The results seem to be promising for the best heuristics: the average relative difference between length (coverage) of constructed and optimal rules is at most 2.27% (7%, respectively). Furthermore, the quality of classifiers based on sets of inhibitory rules constructed by the considered heuristics are compared against each other, and the results show that the three best heuristics from the point of view classification accuracy coincides with the three well-performed heuristics from the point of view of rule length minimization.
UR - http://hdl.handle.net/10754/550701
UR - http://linkinghub.elsevier.com/retrieve/pii/S1877050914010837
UR - http://www.scopus.com/inward/record.url?scp=84924215742&partnerID=8YFLogxK
U2 - 10.1016/j.procs.2014.08.118
DO - 10.1016/j.procs.2014.08.118
M3 - Conference contribution
SP - 378
EP - 387
BT - Procedia Computer Science
PB - Elsevier BV
ER -