Scientific Activity Support System

Publication: Clustering Algorithm Specifics in Class Decomposition

Publication type: Publications in conference proceedings indexed in Web of Science and/or SCOPUS
Funding attributed to core activity: Not known
Publication language: English (en)
Title in original language: Clustering Algorithm Specifics in Class Decomposition
Field of research: 2. Engineering and technology
Research subfield: 2.2. Electrical engineering, electronics, information and communication technologies
Authors: Inese Poļaka
Keywords: bioinformatics, class decomposition, clustering
Abstract: The aim of the presented study is to find distinct disease phenotypes of cancer (breast cancer, carcinoma, gastric cancer, melanoma, prostate cancer) and gastrointestinal inflammatory disease using clustering algorithms. The article analyzes the performance of two approaches to clustering data for class decomposition. The first uses agglomerative hierarchical clustering and analyzes the resulting dendrogram to determine the number of disease subtypes; the second uses the k-means algorithm and determines the number of disease subtypes by analyzing cluster compactness over several runs with different numbers of clusters (cluster centers). After clustering, the clusters are analyzed to assess their specifics and their potential to represent different phenotypes or disease subtypes. The initial analysis of the clustering results examined the records belonging to each cluster, the cluster sizes, and cluster specifics. The secondary analysis evaluated cluster quality using classification algorithms (C4.5, Random Forest and SVM). The hypothesis is that well-formed clusters would create disease subtypes that are easily separated by classification algorithms. The main results show that the secondary analysis gives very similar results for both clustering approaches and that classification results improve compared to the initial classification of the full data. The results also point to the sensitivity of the k-means algorithm to noise and outliers: the initial analysis showed that its clusters formed one main group of records and several clusters of very few records. Although hierarchical clustering requires expert opinion to determine the number of clusters, it formed several large clusters that could point to phenotypical subtypes of the diseases.
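The k-means part of the procedure described in the abstract (several runs with different numbers of clusters, then a comparison of cluster compactness and cluster sizes) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the toy two-dimensional data, the function names, and the deterministic farthest-first seeding are all hypothetical choices made here so that the example is reproducible; the study's disease datasets are not reproduced.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Assumed seeding: start at the first record, then repeatedly add
    the record that is farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    centers = farthest_first(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: each record goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

def compactness(points, centers, labels):
    """Within-cluster sum of squared distances (lower means more compact)."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))

def toy_data(seed=1):
    """Hypothetical stand-in data: three well-separated groups of 20 records."""
    rng = random.Random(seed)
    blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
            for cx, cy in blobs for _ in range(20)]

points = toy_data()
wcss_by_k, sizes_by_k = {}, {}
for k in range(1, 6):
    centers, labels = kmeans(points, k)
    wcss_by_k[k] = compactness(points, centers, labels)
    sizes_by_k[k] = [labels.count(c) for c in range(k)]
    print(k, round(wcss_by_k[k], 2), sizes_by_k[k])
```

Reading the printed compactness values against k shows where adding another cluster stops paying off, and the size lists make it easy to spot the pattern the abstract mentions, where one main group is accompanied by several clusters of very few records.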
Hyperlink: http://aict.itf.llu.lv/proceedings/2013
Reference: Poļaka, I. Clustering Algorithm Specifics in Class Decomposition. In: Applied Information and Communication Technology (AICT2013): Proceedings of the 6th International Scientific Conference, Latvia, Jelgava, 25-26 April 2013. Jelgava: Latvia University of Agriculture, 2013, pp. 29-35. ISSN 2255-8586.
Additional information: Citation count:
  • Scopus  0
ID 16079