Accurate microRNA Target Prediction Using Detailed Binding Site Accessibility and Machine Learning on Proteomics Data

Front Genet. 2012 Jan 18:2:103. doi: 10.3389/fgene.2011.00103. eCollection 2011.

Abstract

MicroRNAs (miRNAs) are a class of small regulatory RNAs that control gene expression by targeting messenger RNA. Although computational target prediction is the prevailing means of analyzing miRNA function, current methods still miss a large fraction of the targeted genes and predict a large number of false positives. Here we introduce DIANA-microT-ANN, a novel algorithm that combines multiple novel target site features through an artificial neural network (ANN). It is trained on recently published high-throughput data measuring the change in protein levels after miRNA overexpression, which provide positive and negative targeting examples. The features characterizing each miRNA recognition element include binding structure, conservation level, and a specific profile of structural accessibility. The ANN is trained to integrate the features of each recognition element along the 3' untranslated region (3'UTR) into a targeting score that reproduces the relative repression fold change of the protein. Tested on two different sets, the algorithm outperforms other widely used algorithms and also predicts a significant number of unique and reliable targets not predicted by the other methods. For 542 human miRNAs, DIANA-microT-ANN predicts 120,000 targets not provided by TargetScan 5.0. The algorithm is freely available at http://microrna.gr/microT-ANN.

Keywords: binding site structure; microRNAs; target prediction.
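
Illustrative sketch (not the published implementation): the abstract describes an ANN that turns the features of each miRNA recognition element into a targeting score for the whole 3' untranslated region. The Python fragment below shows one way such a scheme could look. Everything in it is an assumption made for illustration: the three features are collapsed to single numbers in [0, 1], the network has one small hidden layer, the weights are hand-picked placeholders rather than trained values, and the per-site scores are combined by simple summation, which the abstract does not specify.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RecognitionElement:
    """One candidate miRNA recognition element (hypothetical encoding)."""
    binding: float        # assumed binding-structure score in [0, 1]
    conservation: float   # assumed conservation level in [0, 1]
    accessibility: float  # assumed structural accessibility in [0, 1]


# Placeholder weights for a 3-input, 2-hidden-unit, 1-output network.
# These are NOT trained values; a real model would fit them to
# proteomics fold-change data.
HIDDEN_W: List[List[float]] = [[1.0, 0.5, 0.8], [0.6, 1.2, 0.4]]
HIDDEN_B: List[float] = [-1.0, -0.5]
OUT_W: List[float] = [1.5, 1.0]
OUT_B: float = -1.0


def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def site_score(site: RecognitionElement) -> float:
    """Forward pass of the toy network for a single recognition element."""
    x = (site.binding, site.conservation, site.accessibility)
    hidden = [
        _sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
        for weights, bias in zip(HIDDEN_W, HIDDEN_B)
    ]
    return _sigmoid(sum(w * h for w, h in zip(OUT_W, hidden)) + OUT_B)


def utr_score(sites: Sequence[RecognitionElement]) -> float:
    """Combine per-site scores along one 3'UTR (summation is an assumption)."""
    return sum(site_score(s) for s in sites)


if __name__ == "__main__":
    strong = RecognitionElement(binding=0.9, conservation=0.8, accessibility=0.7)
    weak = RecognitionElement(binding=0.2, conservation=0.1, accessibility=0.3)
    print("single strong site:", round(utr_score([strong]), 3))
    print("strong + weak site:", round(utr_score([strong, weak]), 3))
```

In a trained model the weights would be fitted so that the combined score tracks the measured repression fold change; the positive placeholder weights here only guarantee that better-bound, more conserved, more accessible sites score higher.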