Full metadata record
DC Field | Value | Language
dc.contributor.author | Isarankura-Na-Ayudhya, Chartchalerm | de
dc.contributor.author | Naenna, Thanakorn | de
dc.contributor.author | Nantasenamat, Chanin | de
dc.contributor.author | Prachayasittikul, Virapong | de
dc.date.accessioned | 2008-06-17T13:59:11Z | -
dc.date.available | 2008-06-17T13:59:11Z | -
dc.date.issued | 2005-10-14 | de
dc.identifier.issn | 1611-2156 | de
dc.identifier.uri | http://hdl.handle.net/2003/25654 | -
dc.identifier.uri | http://dx.doi.org/10.17877/DE290R-15804 | -
dc.description.abstract | Successful recognition of splice junction sites in human DNA sequences was achieved via three machine learning approaches. Both unsupervised (Kohonen's Self-Organizing Map, KSOM) and supervised (Back-propagation Neural Network, BNN; and Support Vector Machine, SVM) machine learning techniques were used to classify sequences from the testing set into one of three categories: transition from exon to intron, transition from intron to exon, and no transition. The dataset used in this study comprises 1,424 DNA sequences obtained from the National Center for Biotechnology Information (NCBI). The performance of the machine learning approaches was assessed by constructing learning models from the 1,000 sequences of the training set and evaluating them on the 424 sequences of the testing set, which are unknown to the learning model. Each sequence is a window 32 nucleotides long, spanning -15 to +15 nucleotides around the dinucleotide splice site. Since the nucleotides (A, C, G, and T) are represented by a four-digit binary code (e.g. 0001, 0010, 0100, and 1000), the number of descriptors increases from 32 to 128. The performance of the machine learning techniques, in order of decreasing accuracy, is SVM > BNN > KSOM, suggesting that SVM is a robust method for identifying unknown splice sites. Although KSOM gave lower prediction accuracy than the two supervised methods, it is notable that it was able to make such predictions based only on knowledge of the input, whereas the supervised methods require that the output be known during training. It is expected that the Support Vector Machine method can provide a powerful computational tool for predicting the splice junction sites of uncharacterized DNA. | en
dc.language.iso | en | de
dc.relation.ispartofseries | EXCLI Journal ; Vol. 4, 2005 | en
dc.subject | Back-propagation Neural Network | en
dc.subject | DNA splice junction | en
dc.subject | Kohonen's Self-Organizing Map | en
dc.subject | machine learning | en
dc.subject | Support Vector Machine | en
dc.subject.ddc | 610 | -
dc.title | Recognition of DNA Splice Junction via Machine Learning Approaches | en
dc.type | Text | de
dc.type.publicationtype | article | de
dcterms.accessRights | open access | -
eldorado.dnb.zdberstkatid | 2132560-1 | -
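
The descriptor count given in the abstract (32 nucleotides, each written as a four-digit binary code, giving 128 descriptors) can be illustrated with a minimal Python sketch. The pairing of each nucleotide with a particular code is an assumption: the abstract lists the four codes only as an example. The function name `encode_window` and the sample window are likewise hypothetical and are not taken from the article or its dataset.

```python
# Assumed mapping: the abstract gives the four codes only as an example,
# so which nucleotide receives which code is illustrative.
CODES = {
    "A": (0, 0, 0, 1),
    "C": (0, 0, 1, 0),
    "G": (0, 1, 0, 0),
    "T": (1, 0, 0, 0),
}


def encode_window(sequence):
    """Flatten a nucleotide string into binary descriptors, four per base."""
    descriptors = []
    for base in sequence.upper():
        if base not in CODES:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        descriptors.extend(CODES[base])
    return descriptors


# Hypothetical 32-nucleotide window (not a sequence from the dataset).
window = "ACGT" * 8
features = encode_window(window)
print(len(features))  # 32 bases x 4 digits = 128 descriptors
```

A vector of this form is the kind of fixed-length input that classifiers such as those named in the abstract would consume; the article itself may have prepared its inputs differently.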
Appears in Collections: Original Articles

Files in This Item:
File | Description | Size | Format
Prachayasittikul13-05proof.pdf | DNB | 517.15 kB | Adobe PDF


This item is protected by original copyright


