Authors: Isarankura-Na-Ayudhya, Chartchalerm
Naenna, Thanakorn
Nantasenamat, Chanin
Prachayasittikul, Virapong
Title: Recognition of DNA Splice Junction via Machine Learning Approaches
Language (ISO): en
Abstract: Splice junction sites in human DNA sequences were successfully recognized using three machine learning approaches. One unsupervised technique (Kohonen's Self-Organizing Map, KSOM) and two supervised techniques (Back-propagation Neural Network, BNN; and Support Vector Machine, SVM) were used to classify sequences from the testing set into one of three categories: transition from exon to intron, transition from intron to exon, and no transition. The dataset used in this study comprises 1,424 DNA sequences obtained from the National Center for Biotechnology Information (NCBI). The performance of the machine learning approaches was assessed by constructing learning models from the 1,000 sequences of the training set and evaluating them on the 424 sequences of the testing set, which were not seen by the learning models during training. Each sequence is a window 32 nucleotides long, spanning positions -15 to +15 relative to the dinucleotide splice site (15 + 2 + 15 = 32). Because each nucleotide (A, C, G, or T) is represented by a four-digit binary code (e.g. 0001, 0010, 0100, and 1000), the number of descriptors increased from 32 to 128. In order of decreasing accuracy, the techniques rank as SVM > BNN > KSOM, suggesting that SVM is a robust method for identifying unknown splice sites. Although KSOM gave lower prediction accuracy than the two supervised methods, it is notable that it made its predictions based only on knowledge of the input, whereas the supervised methods require that the output be known during training. The Support Vector Machine method is expected to provide a powerful computational tool for predicting the splice junction sites of uncharacterized DNA.
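
The descriptor count in the abstract (32 nucleotides becoming 128 descriptors) follows from encoding each base as four binary digits. A minimal Python sketch of that encoding is given here. The abstract lists the four codes but does not say which base receives which code, so the mapping in `CODES`, the function name `encode`, and the example window are assumptions made for illustration only.

```python
# Assumed base-to-code mapping: the abstract gives the four codes
# (0001, 0010, 0100, 1000) but not which nucleotide gets which one.
CODES = {"A": "0001", "C": "0010", "G": "0100", "T": "1000"}


def encode(seq):
    """Return a flat list of 0/1 descriptors, four per nucleotide."""
    return [int(bit) for base in seq.upper() for bit in CODES[base]]


# Hypothetical 32-nucleotide window, not taken from the study's dataset.
window = "ACGT" * 8
descriptors = encode(window)
print(len(descriptors))  # 128
```

Any one-to-one assignment of the four codes to the four bases yields the same descriptor count; only the bit positions within each group of four change.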
Subject Headings: Back-propagation Neural Network
DNA splice junction
Kohonen's Self-Organizing Map
machine learning
Support Vector Machine
URI: http://hdl.handle.net/2003/25654
http://dx.doi.org/10.17877/DE290R-15804
Issue Date: 2005-10-14
Appears in Collections: Original Articles

Files in This Item:
File | Description | Size | Format
Prachayasittikul13-05proof.pdf | DNB | 517.15 kB | Adobe PDF


This item is protected by original copyright