|
*NEWS* Check the program's new Usage Tutorial here.
*NEWS* New version of the program (version 2) that includes value-added
options and a module to precisely visualize the output decision tree.
Get it from here.
What is GATree?
This work is an attempt to move beyond greedy heuristics and search
the space of decision trees in a natural way.
More specifically, we use genetic algorithms (GAs) to directly
evolve binary decision trees in the quest for the one that most
closely matches the target concept. In doing so, we adopt a natural
representation of the search space, using
actual decision trees rather than binary strings. We couple this objective
with a preference for simplicity: we use GAs to robustly evolve
decision trees that are both accurate and simple.
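To make the idea of evolving actual trees (rather than binary strings) concrete, here is a minimal Python sketch. It is not GATree's code: the `Node` class, the equality tests on integer-coded attributes, and the `mutate` and `crossover` operators are all assumptions chosen for illustration. It only shows how genetic operators can act directly on a tree structure.

```python
import copy
import random
from dataclasses import dataclass
from typing import Iterator, Optional, Sequence, Tuple


@dataclass
class Node:
    """Binary decision tree node: either a test (attribute == value) or a leaf."""
    attribute: Optional[int] = None  # index of the tested attribute (hypothetical encoding)
    value: Optional[int] = None      # value the attribute is compared against
    label: Optional[int] = None      # class label, used only by leaves
    yes: Optional["Node"] = None     # subtree followed when the test holds
    no: Optional["Node"] = None      # subtree followed otherwise

    def is_leaf(self) -> bool:
        return self.yes is None


def classify(tree: Node, example: Sequence[int]) -> int:
    """Follow the tests from the root down to a leaf and return its label."""
    node = tree
    while not node.is_leaf():
        node = node.yes if example[node.attribute] == node.value else node.no
    return node.label


def walk(tree: Node) -> Iterator[Node]:
    """Yield every node of the tree in pre-order."""
    yield tree
    if not tree.is_leaf():
        yield from walk(tree.yes)
        yield from walk(tree.no)


def size(tree: Node) -> int:
    """Number of nodes (tests plus leaves)."""
    return sum(1 for _ in walk(tree))


def mutate(tree: Node, n_values: int, n_labels: int, rng: random.Random) -> Node:
    """Return a copy in which one random node has a new test value or leaf label."""
    child = copy.deepcopy(tree)
    node = rng.choice(list(walk(child)))
    if node.is_leaf():
        node.label = rng.randrange(n_labels)
    else:
        node.value = rng.randrange(n_values)
    return child


def crossover(a: Node, b: Node, rng: random.Random) -> Tuple[Node, Node]:
    """Return copies of both parents with one random subtree exchanged."""
    a, b = copy.deepcopy(a), copy.deepcopy(b)
    x = rng.choice(list(walk(a)))
    y = rng.choice(list(walk(b)))
    # Swapping the nodes' contents swaps the subtrees rooted at them.
    x.__dict__, y.__dict__ = y.__dict__, x.__dict__
    return a, b
```

Because both operators return complete, valid trees, every individual in the population can be evaluated or used as a classifier at any point during the search.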
Preference vs. Procedural Bias
A preference bias is based on the learner’s behavior
while a procedural bias is based on the learner’s design. For example,
C4.5 is biased towards accurate, small trees (preference bias) and
uses the gain-ratio metric and minimum-error pruning (different
procedural biases). A preference bias is most often desirable since it
determines the characteristics of the produced tree. On the other
hand, an inadequate procedural bias may severely affect the quality of
the output. The proposed search imposes a new, weak procedural bias,
one that allows the concept learner to consider a relatively large
number of hypotheses in a relatively efficient manner. The
proposed weak bias employs global metrics of tree quality. We thus
shift from “how to induce a tree” (standard, impurity-based
induction) to “what criteria an induced tree must satisfy”.
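A global metric of tree quality can be as simple as a score computed from a whole tree's accuracy and size. The sketch below assumes one plausible form of such a score. The function name `fitness`, the parameter `x`, and the example numbers are illustrative assumptions and are not taken from this page.

```python
def fitness(correct: int, size: int, x: float) -> float:
    """Score a complete tree from global properties only.

    correct: number of training examples the tree classifies correctly
    size:    number of nodes in the tree
    x:       assumed tuning constant; smaller values penalize size more strongly
    """
    return correct ** 2 * x / (size ** 2 + x)


# Two hypothetical candidates: a small tree and a larger, slightly more accurate one.
small = {"correct": 90, "size": 5}
large = {"correct": 95, "size": 25}

for x in (1_000, 1_000_000):
    s = fitness(small["correct"], small["size"], x)
    l = fitness(large["correct"], large["size"], x)
    print(f"x={x}: small={s:.1f} large={l:.1f}")
```

With the smaller `x` the compact tree scores higher, and with the larger `x` the more accurate tree wins, so a single constant expresses the preference between small and accurate trees without changing the search procedure.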
What is special about GATree?
(a) GATree can continue decision tree evolution for as long as needed.
With ample resources, we can expect an increasingly better-fitting
decision tree. We can also stop the evolution whenever the results are
satisfactory, since we evolve complete solutions to the problem.
(b) GATree allows the user to select the characteristics of the
resulting decision tree. It's easy to prefer smaller or more accurate
trees.
(c) GATree can provide a set of entirely different decision trees that
are all close matches to the target concept. Any of those trees can be
used as an alternative to the best-fit one.
(d) There are certain domains where statistical inducers cannot produce
optimal trees. GATree can overcome local minima. Please read
the papers that present this approach and its benefits. |