Genetic Algorithm for Random Tree Generation in Bioinformatics Data
Applied Information and Communication Technologies (AICT 2012): Proceedings of the 5th International Scientific Conference 2012
Inese Poļaka

Over the last few decades, progress in medical technology has made it possible to analyze biological material such as DNA, protein profiles and antibody displays. It has also created new challenges in data analysis, because the resulting data are high-dimensional. One significant task in bioinformatics is finding diagnostic rules in biomedical data; it can be solved with data mining techniques such as classification, which has to be highly accurate. Bioinformatics data, however, suffer from ‘the curse of dimensionality’: they have very high dimensionality and relatively few records, which not only affects classification accuracy but also makes the classification process computationally expensive. Many dimensionality reduction methods hinder the interpretation of classifiers, so there is a need for scalable methods such as decision trees, which have built-in dimensionality reduction because the resulting classifier needs only the variables it actually uses. The other issue, computational cost, can be tackled with random subspace methods. The proposed approach combines both solutions: it uses a genetic algorithm to implement random subspace selection and to search for the best solutions while building a random decision tree that can be used to find diagnostic signatures in biomedical data. This article describes the method and compares its performance with that of other classification methods on bioinformatics data (antibody phage display data with more than 1000 attributes and a few hundred records).
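
The abstract does not specify the chromosome encoding, genetic operators or fitness function, so the following is only a minimal sketch of the general idea and not the author's method. It assumes that each individual is a fixed-size subset of attribute indices (a random subspace), that fitness is the validation accuracy of a small depth-limited tree allowed to split only on that subset, and that the labels are binary. All names (`make_toy_data`, `build_tree`, `evolve_subspace`) and all parameter values are hypothetical, and the data are synthetic.

```python
import random

def make_toy_data(n=80, n_features=30, seed=1):
    """Synthetic stand-in for high-dimensional data: only attribute 0 is informative."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n_features)] for _ in range(n)]
    y = [int(row[0] > 0.5) for row in X]
    half = n // 2
    return X[:half], y[:half], X[half:], y[half:]

def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def build_tree(X, y, features, depth):
    """Greedy depth-limited tree that may split only on the given attribute subset."""
    majority = int(2 * sum(y) >= len(y))
    if depth == 0 or len(set(y)) < 2:
        return majority
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], features, depth - 1),
            build_tree([X[i] for i in right], [y[i] for i in right], features, depth - 1))

def predict(tree, row):
    """Walk from the root to a leaf; leaves are plain class labels."""
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree

def accuracy(tree, X, y):
    return sum(predict(tree, r) == c for r, c in zip(X, y)) / len(y)

def evolve_subspace(X_tr, y_tr, X_val, y_val, k=3, pop_size=12,
                    generations=10, depth=2, seed=0):
    """Assumed GA: individuals are k-attribute subsets, fitness is validation accuracy."""
    rng = random.Random(seed)
    n = len(X_tr[0])

    def fitness(ind):
        return accuracy(build_tree(X_tr, y_tr, ind, depth), X_val, y_val)

    pop = [tuple(sorted(rng.sample(range(n), k))) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection with elitism: the better half survives unchanged.
        parents = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw k attributes from the union of both parents.
            child = set(rng.sample(sorted(set(a) | set(b)), k))
            if rng.random() < 0.3:
                # Mutation: drop one attribute and refill with random ones.
                child.discard(rng.choice(sorted(child)))
                while len(child) < k:
                    child.add(rng.randrange(n))
            children.append(tuple(sorted(child)))
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    X_tr, y_tr, X_val, y_val = make_toy_data()
    subset, acc = evolve_subspace(X_tr, y_tr, X_val, y_val)
    print("selected attributes:", subset, "validation accuracy:", acc)
```

Because every tree sees only `k` attributes, each fitness evaluation stays cheap even when the full attribute count is large, which is the cost argument the abstract makes for random subspaces. A fuller treatment would replace the single validation split with cross-validation, since the number of records is small.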


Keywords
bioinformatics, classification, data mining, genetic algorithm, random tree
Hyperlink
http://aict.itf.llu.lv/files/rakstkraj/2012/polaka_aict2012.pdf

Poļaka, I. Genetic Algorithm for Random Tree Generation in Bioinformatics Data. In: Applied Information and Communication Technologies (AICT 2012): Proceedings of the 5th International Scientific Conference, Latvia, Jelgava, 26-27 April 2012. Jelgava: 2012, pp. 335-340. ISBN 978-9984-48-065-7. ISSN 2255-8586.

Publication language
English (en)