Random Forests® represents a major advance in data mining and knowledge discovery, offering high levels of predictive accuracy and an innovative set of graphical displays to reveal unexpected patterns in data.
Random Forests is best suited to the analysis of complex data structures embedded in small to moderate data sets, typically containing fewer than 10,000 rows but potentially more than 1 million columns.
Random Forests has therefore been enthusiastically endorsed by many biomedical and pharmaceutical researchers.
A Random Forest is a collection of CART trees, each grown at least partially at random, that predict via a consensus or voting mechanism. A large number of large trees are grown, and the results can be remarkably accurate. Much of the insight provided by Random Forests comes from methods applied after the trees are grown, including new technology for identifying clusters or segments in data as well as new methods for ranking the importance of variables. Ongoing research on Random Forests is being undertaken by Salford Systems in collaboration with Professor Adele Cutler, the surviving co-author (with Leo Breiman) of Random Forests.
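
The voting mechanism described above can be illustrated with a short sketch. This is not the Salford Systems implementation; it is a minimal pure-Python illustration that assumes a classification problem with numeric predictors, Gini impurity as the splitting criterion, a bootstrap sample per tree, and a random subset of candidate variables at each node. All function and parameter names (grow_forest, predict_forest, n_trees, n_try, seed) are hypothetical and chosen only for this example.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, features):
    """Return the (feature, threshold) pair with the lowest weighted Gini
    impurity among the candidate features, or None if no split improves purity."""
    best, best_score = None, gini(labels)
    for f in features:
        values = sorted({row[f] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow_tree(rows, labels, n_try, rng):
    """Grow one unpruned tree. Each node examines only n_try randomly
    chosen variables, which is the 'partially at random' ingredient."""
    if len(set(labels)) == 1:
        return labels[0]  # pure node: return the class label as a leaf
    features = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, features)
    if split is None:
        # Simplification: stop here and predict the majority class.
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left_idx = [i for i, row in enumerate(rows) if row[f] <= t]
    right_idx = [i for i, row in enumerate(rows) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow_tree([rows[i] for i in left_idx],
                          [labels[i] for i in left_idx], n_try, rng),
        "right": grow_tree([rows[i] for i in right_idx],
                           [labels[i] for i in right_idx], n_try, rng),
    }


def predict_tree(node, row):
    """Follow the splits from the root until a leaf (a class label) is reached."""
    while isinstance(node, dict):
        branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node


def grow_forest(rows, labels, n_trees=25, n_try=1, seed=0):
    """Grow n_trees trees, each on its own bootstrap sample of the rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], n_try, rng))
    return forest


def predict_forest(forest, row):
    """Each tree casts one vote; the most common class wins."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]
```

With rows given as a list of numeric tuples and labels as a parallel list of class values, grow_forest(rows, labels) returns the list of trees and predict_forest(forest, new_row) returns the consensus class. The sketch omits the post-processing steps mentioned above (clustering and variable importance ranking) and makes no attempt at efficiency.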