YJMCC-07960; No. of pages: 8; 4C: 3 Journal of Molecular and Cellular Cardiology xxx (2014) xxx–xxx

Contents lists available at ScienceDirect

Journal of Molecular and Cellular Cardiology journal homepage: www.elsevier.com/locate/yjmcc

Review article

Bioinformatics of cardiovascular miRNA biology

Meik Kunz a,b, Ke Xiao b,c, Chunguang Liang a, Janika Viereck b, Christina Pachel d, Stefan Frantz d, Thomas Thum b,e,f, Thomas Dandekar a,g

a Functional Genomics and Systems Biology Group, Department of Bioinformatics, Biocenter, Würzburg, Germany
b Institute for Molecular and Translational Therapeutic Strategies (IMTTS), Hannover Medical School, Hannover, Germany
c Plant Breeding Institute, Christian-Albrechts-University of Kiel, Olshausenstr. 40, 24098 Kiel, Germany
d Department of Internal Medicine I, University Hospital Würzburg, Germany and Comprehensive Heart Failure Center, University of Würzburg, Germany
e Excellence Cluster REBIRTH, Hannover Medical School, Hannover, Germany
f National Heart and Lung Institute, Imperial College London, London, UK
g EMBL Heidelberg, BioComputing Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany

Article info

Article history:
Received 3 September 2014
Received in revised form 5 November 2014
Accepted 29 November 2014
Available online xxxx

Keywords: MiRNAs; Multiple sequence alignment; RNA secondary structure; Interaction; Database; Omics; Heart

Abstract

MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled ‘Non-coding RNAs’. © 2014 Published by Elsevier Ltd.

Contents

1. Introduction
2. Biogenesis, structure and miRNA biology
3. Novel miRNA discovery through NGS platforms and experimental identification of miRNA targets
3.1. Novel miRNA discovery
3.2. Experimental miRNA target identification
4. Bioinformatics approaches and tools
4.1. Genomic localization and sequence–structure analysis
4.2. MiRNA target prediction programs
4.3. Cardiovascular pathway and biological function analysis
5. Conclusion and perspectives
Acknowledgments
References

1. Introduction

MicroRNAs (miRNAs) are highly conserved among different species. They are small, ~22 nucleotide non-coding RNAs [9,18,30,46,79,81].


They have been found to regulate gene expression of a large number of human genes by binding to the 3′-untranslated region (3′-UTR) of messenger RNAs (mRNAs) and also influence protein synthesis through interacting with the protein translation machinery. MiRNAs

http://dx.doi.org/10.1016/j.yjmcc.2014.11.027 0022-2828/© 2014 Published by Elsevier Ltd.

Please cite this article as: Kunz M, et al, Bioinformatics of cardiovascular miRNA biology, J Mol Cell Cardiol (2014), http://dx.doi.org/10.1016/j.yjmcc.2014.11.027


MiRNAs are located either in intronic regions of coding genes, in non-coding genes or in intergenic regions of the genome. They are transcribed by RNA-Polymerase II (RNA-Pol II) as a primary-miRNA transcript (pri-miRNA) [31,50,53,55,65]. The pri-miRNA contains a characteristic hairpin structure, which is recognized by the RNase III enzyme Drosha. Through binding of Drosha and its cofactor DiGeorge Syndrome Critical Region 8 (DGCR8), a dsRNA-binding protein, the ~70 nucleotide long hairpin precursor-miRNA (pre-miRNA) is processed and then transported into the cytoplasm by the nucleocytoplasmic shuttle protein Exportin 5 [15,31,65]. In the cytoplasm, another RNase III enzyme, Dicer, cleaves and unwinds the pre-miRNA to form the ~22 bp double-stranded miRNA [31,65]. Finally, this miRNA duplex gives rise to a single-stranded guide RNA (mature miRNA), which associates with the RNA-induced silencing complex (RISC) to regulate its mRNA targets, whereas the second single-stranded passenger RNA strand is mostly degraded [15,31,32,56,65]. In general, miRNAs regulate gene expression by binding to the 3′-UTR of mRNAs, although a few studies have reported that they also bind to the coding region or the 5′-UTR [31,54,65]. Because binding can be completely or incompletely complementary, a single miRNA can target multiple mRNAs and a single mRNA may be regulated by multiple miRNAs [14,31,57,65]. This explains their complex regulatory effect and high targeting potential (about 30% to 60% of mammalian genes) [30,65], and at the same time points to their potentially important therapeutic role. However, the experimental identification of miRNA targets is


3. Novel miRNA discovery through NGS platforms and experimental identification of miRNA targets

2. Biogenesis, structure and miRNA biology

3.1. Novel miRNA discovery

Microarrays have been widely used as an efficient method for genome-wide miRNA expression profiling. However, the discovery of novel miRNAs remains an inherent weakness of this hybridization-based technology. The short length of miRNAs and the high similarity between miRNA family members make specific probe design for microarrays challenging. The development of next-generation sequencing technologies and the drop in costs in recent years have opened an efficient route for the rapid discovery of novel or low-expressed miRNAs. The miRDeep algorithm [29] was introduced in 2008 and is now widely used to detect and quantify miRNAs from small RNA sequencing data. This tool has been further developed into an integrated program named miRDeep* [4], which is freely available and has a user-friendly interface. Sequence reads in FASTQ format and alignment profiles in BAM/SAM format can be used directly for further analysis such as miRNA detection and expression profiling. The UEA sRNA workbench [74] is a Java-based sRNA analysis tool. It provides biologists with an easy-to-use graphical user interface for handling their RNA-seq data, from quality filtering of the reads to target prediction. A major advantage of high-throughput RNA sequencing in cardiovascular disease is the unambiguous and sensitive detection of novel miRNAs. On the other hand, deep sequencing is a comparatively new approach and no standard data analysis strategy has been established. Furthermore, substantial computational support is necessary for more precise prediction and expression quantification.
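As an illustration of the first processing step on small RNA sequencing data, the following minimal sketch collapses identical reads from a FASTQ file and keeps only those in a miRNA-like length range. It is not the miRDeep or miRDeep* algorithm; the function names, the 18 to 25 nucleotide window and the toy reads are assumptions made for this example.

```python
from collections import Counter
from io import StringIO


def collapse_reads(fastq_handle, min_len=18, max_len=25):
    """Count identical read sequences whose length lies in a miRNA-like range.

    The length window is an assumed default, not a value taken from any tool.
    """
    counts = Counter()
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each four-line FASTQ record is the sequence
            seq = line.strip().upper()
            if min_len <= len(seq) <= max_len:
                counts[seq] += 1
    return counts


def make_record(name, seq):
    """Build one FASTQ record with a dummy quality string (toy data only)."""
    return f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n"


# Invented reads for illustration; they are not real miRNA sequences.
READ_A = "ACGTACGTACGTACGTACGTAC"  # 22 nt, present twice
READ_B = "TTGCATGCATGCATGCATGC"    # 20 nt, present once
toy_fastq = StringIO(
    make_record("r1", READ_A)
    + make_record("r2", READ_A)
    + make_record("r3", "ACGT")    # too short, filtered out
    + make_record("r4", READ_B)
)

counts = collapse_reads(toy_fastq)
for seq, n in counts.most_common():
    print(seq, n)
```

Collapsed read counts of this kind are typically the input for genome mapping and precursor hairpin evaluation, which dedicated tools perform with far more elaborate scoring.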

3.2. Experimental miRNA target identification

The experimental identification of miRNA targets can be done by expression profiling (e.g. microarray analysis after miRNA overexpression or knockdown, proteomics) or by biochemical isolation of the miRISC complex using immunoprecipitation (different experimental methods are extensively reviewed in [75,87]). Generally, miRNAs regulate mRNA levels, and miRNA and mRNA expression are often negatively correlated [28,31,33]. Therefore, microarray profiling experiments that identify deregulated mRNAs after ectopic miRNA expression or antagonism are useful for the experimental identification of putative miRNA targets and can be further combined with bioinformatics [75]. However, results from Matkovich et al. show for cardiovascular research that, compared to mRNAs, cardiac miRNAs are more sensitive to the acute functional status of end-stage heart failure [61], exemplifying that changes in mRNA level are not always a reliable basis for miRNA target prediction [87]. Therefore, additional techniques for miRNA target identification in cardiac tissues are available, such as RISC-IP and proteomics. RISC-IP is a new biochemical method for


very complex and elaborate, indicating the necessity of computational prediction tools. As a result, several computational tools have been developed, mainly using miRNA length, sequence and structural information (e.g. hairpin structure and minimal folding free energy; [31,58]), which are very efficient in the identification and analysis of miRNAs. Algorithms such as RNAfold and Mfold quickly and accurately predict the putative secondary structure of an miRNA based on the principle of minimum free energy and are used in different computational tools [31,37,90]. MiRNA detection tools can be divided into comparative and non-comparative methods [10,31,36]. Comparative algorithms use sequence conservation for miRNA prediction and help to identify miRNAs conserved among species, whereas non-comparative algorithms use only the intrinsic miRNA structure without any sequence conservation and are therefore able to identify miRNAs from evolutionarily distant species as well as species-specific miRNAs (reviewed in [31]) (Fig. 1).
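To make the dynamic programming idea behind secondary structure prediction concrete, the sketch below uses the classic Nussinov recursion, which maximizes the number of nested base pairs. This is a deliberately simplified stand-in: RNAfold and Mfold minimize a thermodynamic free energy model rather than counting pairs, and the function name and minimum loop length of three are assumptions of this example.

```python
def nussinov_pairs(seq, min_loop=3):
    """Return the maximum number of nested canonical base pairs in `seq`.

    Simplified Nussinov recursion: counts pairs only and ignores stacking
    and loop energies, so it is not a minimum free energy prediction.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            # case 2: j pairs with k, leaving at least `min_loop` bases between them
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# Toy hairpin: three G-C pairs closing a four-nucleotide loop.
print(nussinov_pairs("GGGAAAUCCC"))  # 3
```

The same table-filling scheme underlies energy-based folding; only the scoring of each case changes.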


are associated with many biological processes and diseases, including aging, cardiac function, metabolism and cancer [8,9,15,18,31,46,65,79]. Moreover, a single miRNA can target different mRNAs and a single mRNA can be regulated by different miRNAs [14,31,57,65], pointing to a complex regulatory network. MiRNAs influence different signaling pathways and are useful as diagnostic markers as well as potential new therapeutic targets for cardiovascular diseases [15]. Taken together, cardiovascular diseases are the leading cause of death [15]. Several miRNAs are known to be involved in cardiovascular diseases and to play a potential therapeutic role in cardiac remodeling (139 cardiac-related miRNAs and their role in cardiovascular diseases are extensively reviewed in [22,52]). Specific miRNAs are not only deregulated in various cardiovascular cell types of diseased hearts [77], but are also directly involved in pathologic reactions of the heart. For example, miRNA-1 is associated with myocardial infarction, and miRNA-21 and miRNA-212/132 with cardiac fibrosis and hypertrophy, respectively (extensively reviewed in [15,24,78,80]). Importantly, some miRNAs are transcribed as part of a cistron and regulated by cardiac transcription factors (TFs), e.g. miR-1/miR-133 by myogenic transcription factor (MyoD) and serum response factor (SRF) or miR-143/145 by cardiac NK-2 transcription factor (Nkx2–5) and SRF [65,76]. On the other hand, miRNAs can directly regulate cardiac-associated TFs and signaling pathways, e.g. miR-212/132 regulates the anti-hypertrophic TF forkhead box O3 and the CN–NFAT signaling pathway [80]. To understand the complex effects of cardiovascular miRNAs, their genomic localization including promoter analysis as well as their interaction partners all have to be taken into account.
It is thus of high interest to understand the complex role and function of cardiovascular miRNAs in order to better understand their regulatory effects as a basis for future therapeutic approaches. For this purpose, different bioinformatics methods and search programs are useful. Owing to the small length of miRNAs as well as their specific cardiovascular expression profiles (cell type and development dependent), experimental methods alone cannot accomplish the detection and analysis of these miRNAs, e.g. for cardiac miRNAs with low expression levels or for detecting sequence–structure conservation [3,31,49,63]. For this reason, the combined use of experimental and computational approaches has revolutionized the identification and analysis of miRNAs and, in particular, of their selective function in the heart.


M. Kunz et al. / Journal of Molecular and Cellular Cardiology xxx (2014) xxx–xxx


miRNA target identification using isolation and profiling of mRNAs bound by RISC [26]. Studies by Karginov et al. using RISC-IP with myc-tagged Ago followed by microarray profiling of bound mRNAs demonstrated that this is an effective method for direct miRNA binding site identification [26,42,87]. In addition, Matkovich et al. used this method in cardiomyocytes for cardiac-specific miRNAs. By applying an optimized programmed RISC-Seq approach (cardiac RISCome), they identified 209 targets for miR-133a and 81 targets for miR-499 with few overlapping targets, showing that this technique is highly specific in miRNA target identification and a powerful tool for cardiovascular research [26,60]. However, the regulatory effect of miRNAs is often indirect, and here proteomics is useful [87]. Using proteomics, the direct and indirect effects of miRNAs on protein expression can be detected, e.g. by pulsed stable isotope labeling with amino acids in cell culture (pSILAC) or by labeling with different fluorescent probes and separation by two-dimensional gel electrophoresis (DIGE) followed by


Fig. 1. Biogenesis and regulatory effects of cardiovascular miRNAs. Top: miRNAs (e.g. mouse miRNA-132; genomic localization from the Ensembl genome browser; conserved in mammals; implicated in inflammation, proliferation of endothelial cells and angiogenesis, viral infection and immunity of monocytes and, in particular, inhibition of hypertrophy and cardiomyocyte autophagy; [80]) are transcribed by RNA-Polymerase II (RNA-Pol II) as a primary-miRNA transcript (pri-miRNA) [31,50,53,55,65]. Middle: through binding of the RNase III enzyme Drosha (which recognizes the characteristic miRNA hairpin structure) and the dsRNA-binding protein DGCR8, the precursor-miRNA (pre-miRNA) is processed and then transported into the cytoplasm by the nucleocytoplasmic shuttle protein Exportin 5 [15,31,65]. Bottom: here, the RNase III enzyme Dicer cleaves and unwinds the pre-miRNA to form the double-stranded miRNA, which then gives rise to the single-stranded guide RNA (mature miRNA-132; the passenger RNA strand is mostly degraded and not shown here), which associates with the RNA-induced silencing complex (RISC) to regulate its mRNA targets [15,31,32,56,65]. Different bioinformatics approaches allow analysis of the genomic location including folding patterns (middle: the conserved secondary structure from 145 sequences from 46 species using the Rfam database; shown as pre-miRNA) as well as of cardiovascular interaction partners (bottom: using miRBase and combining predicted targets from the TargetScan and DIANA-microT-CDS databases; here the mmu-miR-132-3p target FoxO3 is shown), which can further be tested by targeted experiments.

mass spectrometry, contributing also to more realistic cardiovascular disease phenotypes and the identification of relevant targets [62,87]. For instance, Abonnenc et al. used proteomics to identify indirect and direct targets of miR-29b and miR-30c implicated in cardiac fibrosis, providing a comprehensive view of the mouse cardiac fibrosis secretome [1]. In addition, the integration of different cellular levels and time scales, e.g. microarray data for transcriptional effects, proteomic data for translational and post-translational effects as well as metabolomics for metabolite changes, is critical for understanding the regulatory effects of miRNAs in cardiovascular diseases. The integration of data from such different levels of complexity is a challenge and needs further analysis steps, e.g. statistical analysis and comprehensive databases including gene ontology information, pathways as well as network biology [51,87]. However, this is required to yield important information on acute and chronic effects in cardiac disease and on which effects are mediated by which miRNA. On the other hand, experimental


4. Bioinformatics approaches and tools


Bioinformatic approaches to reveal miRNAs in cardiovascular disease should thus combine sequence and phylogenetic comparisons, including a detailed analysis of RNA folding patterns as well as target gene analysis including signaling pathway and Gene Ontology (GO) analysis. Several general software and database resources are summarized in Table 1.

4.1. Genomic localization and sequence–structure analysis

Rfam is a database and collection of different RNA families, which are divided into three groups: non-coding RNA genes, structured cis-regulatory elements and self-splicing RNAs. Each of them is represented by multiple sequence alignments, consensus secondary structures and covariance models (which model RNA sequence and structure simultaneously; [16]). Another database is miRBase (28,645 entries; release 21, June 2014; [47]), comprising all published miRNAs including information on their genomic context, sequence and predicted hairpin structure. Both databases can easily be searched by keyword or sequence. Additionally, miRBase provides miRNA clusters and information about experimental tissue expression data as well as predicted and experimentally validated targets through cross-links to different databases, e.g. TargetScan, miRTarBase and TarBase [30,39,47,81]. miR2Disease focuses on manually curated miRNAs in cardiovascular and other diseases from the literature. Its information includes miRNA–disease relationships, expression patterns and experimentally validated miRNA targets or predicted targets from TargetScan or TarBase [30,40,81]. Starting with these databases, users obtain a fast overview of the genomic location of miRNAs including relevant information, which can also be downloaded, e.g. all miRNA sequences across species including conserved structures or targets. The aforementioned databases consist of already known miRNAs. The use of new, improved high-throughput methods and experimental identification strategies leads to the detection of previously unknown cardiovascular miRNAs, which subsequently have to be analyzed with regard to sequence–structure conservation and their closely associated function. In

Table 1. Software and database resources for miRNA analysis.

Tool | Website | Publication
Sequencing
miRDeep* | http://www.australianprostatecentre.org/research/software/mirdeep-star | [4]
UEA sRNA workbench | http://srna-workbench.cmp.uea.ac.uk | [74]
Folding
RNAfold | http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi | [37]
Mfold | http://mfold.rna.albany.edu/?q=mfold | [90]
FOLDALIGN | http://foldalign.ku.dk/ | [35]
LocARNA server | http://rna.informatik.uni-freiburg.de/LocARNA/Input.jsp | [72]
RNAalifold | http://rna.tbi.univie.ac.at/cgi-bin/RNAalifold.cgi | [11]
StrAl | http://www.biophys.uni-duesseldorf.de/stral/ | [25]
RNAshapes | http://bibiserv.techfak.uni-bielefeld.de/rnashapes/ | [66]
RNA databases
Rfam | http://rfam.xfam.org/ | [16]
miRBase | http://mirbase.org/ | [47]
Interactions/Pathway
miR2Disease | http://www.mir2disease.org/ | [40]
KEGG | http://www.genome.jp/kegg/ | [41]
Reactome | http://www.reactome.org/ | [23]
Bioinformatics resource manager v2.3 | http://www.sysbio.org/dataresources/brm.stm | [79]
miRPath v2.0 | http://www.microrna.gr/miRPathv2 | [82]
miRGator | http://mirgator.kobic.re.kr/ | [21]
CopraRNA | http://rna.informatik.uni-freiburg.de/CopraRNA/Input.jsp | [84]

4.2. MiRNA target prediction programs

Cardiovascular miRNAs may have various mRNA targets involved in different biological functions and signaling pathways, and the experimental characterization of miRNA targets is often complex, indicating the necessity of computational prediction tools. MiRNA target prediction algorithms are typically based on seed region matching and subsequently improve their accuracy by using evolutionary conservation or thermodynamic parameters, whereas newer methods focus more on thermodynamic parameters or target site accessibility than on perfect seed pairing [34,75]. In general, bioinformatic prediction algorithms are


this regard, consensus secondary structure prediction is an essential bioinformatics task. Dynamic programming algorithms such as RNAfold and Mfold (Table 1) predict thermodynamically optimal secondary structures for single sequences based on the principle of minimum free energy and are used by many computational tools; however, they are limited in large-scale applications [11,25,31,37,66,90]. Therefore, algorithms capable of integrating sequence alignment and folding of a complete set of miRNA sequences improve consensus structure prediction and are the most effective approach [11,25,38,66]. The Sankoff algorithm [68] is one of these approaches and is implemented, e.g., in FOLDALIGN [35] or the LocARNA server [72]. Both tools generate pairwise sequence alignments (local or global), which are then used for consensus structure prediction. Users upload miRNA sequences, e.g. identified by BLAST (Basic Local Alignment Search Tool; http://blast.ncbi.nlm.nih.gov/Blast.cgi). These are then simultaneously folded and aligned, and the output includes the sequence alignment and the consensus structure [35,72]. One of the oldest tools is RNAalifold, which predicts the consensus structure based on a sequence alignment that has to be uploaded by the user [11]. With respect to large-scale application, the Sankoff algorithm is impractical due to its exponential complexity [11,25,66]. An alternative is the StrAl algorithm, a fast sequence-based heuristic alignment method that combines structural probability vectors derived from RNAfold base pairing with sequence information (using only quadratic time; [11,25]). An alignment-free alternative to the Sankoff algorithm is the consensus shape algorithm [11], which is implemented in the RNAshapes tool on the Bielefeld Bioinformatics Server [66]. This algorithm predicts the consensus structure based on a tree-like domain of shapes and the thermodynamically best structure, which represents a non-heuristic and accurate folding approach in linear time [66].
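How an alignment can support a consensus structure may be illustrated with a small sketch that checks, for every base pair of a given dot-bracket structure, what fraction of the aligned sequences can actually form that pair. This is not the RNAalifold, Sankoff or consensus shape algorithm; it only illustrates the notion of structural conservation, and the function names and toy alignment are assumptions of this example.

```python
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def parse_dot_bracket(structure):
    """Return the sorted list of (i, j) base pairs encoded in a dot-bracket string."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.append((stack.pop(), pos))
    return sorted(pairs)


def pair_support(alignment, i, j):
    """Fraction of aligned sequences whose columns i and j form a canonical pair."""
    ok = sum(1 for seq in alignment if seq[i] + seq[j] in CANONICAL)
    return ok / len(alignment)


# Invented gap-free toy alignment and consensus structure.
alignment = [
    "GGCAAAAGCC",
    "GACAAAAGUC",  # compensatory change: A-U instead of G-C at columns 1 and 8
    "GGCAAAAGCA",  # columns 0 and 9 (G, A) cannot pair
]
consensus = "(((....)))"

support = {pair: pair_support(alignment, *pair) for pair in parse_dot_bracket(consensus)}
for pair, frac in support.items():
    print(pair, round(frac, 2))
```

Pairs that remain canonical despite sequence changes, such as the compensatory A-U above, are the kind of evidence that alignment-based folding methods exploit.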


miRNA identification in cardiovascular diseases is quite time consuming and needs multistep protocols, while bioinformatics target prediction algorithms and analysis tools dramatically speed up the identification of the best candidates [64].


Table 1a. MiRNA target prediction algorithms.

Tool | Seed matching/conservation | Folding energy | Target site accessibility
IntaRNA | Yes | Yes | Yes
microRNA.org/miRanda | Yes | Yes | –
miRTarBase | Manually curated experimentally tested miRNA targets
PicTar | Yes | Yes | –
PITA | Yes | Yes | Yes
TarBase/microT | Yes | – | –
TargetScan | Yes | – | –

These bioinformatics prediction strategies are very helpful in miRNA analysis, as they identify the most promising cardiovascular miRNA targets for further testing by gene expression and functional experiments. For instance, several studies in cardiovascular diseases combine different target prediction algorithms, e.g. Boštjančič et al., who identified 213 overlapping predicted miRNAs targeting SERCA2 (cardiac sarco(endo)plasmic reticulum calcium ATPase-2), of which 15 were deregulated in human myocardial infarction [13], or Yang et al., who identified EZH2 (enhancer of zeste homolog 2) as a direct target of cardiomyocyte-specific miR-214 playing a role in cardiac hypertrophy [86]. However, each computational prediction algorithm has specific drawbacks and limitations. For instance, the different parameter sets used in target prediction often result in very diverse prediction outcomes with little overlap between them or, on the other hand, in over-prediction resulting in large overlapping target lists [64]. Hence, in complex cardiovascular tissues, several algorithms (e.g. miRNA target prediction algorithms coupled to tissue-specific gene expression analysis) and levels of analysis (genome, transcriptome and proteome) have to be combined to filter out over-predictions. Furthermore, miRNA target prediction may incorporate the free energy of the secondary structure. This has been successful in cardiovascular diseases; for instance, Zhao et al., using an miRNA target algorithm based on RNA structure and target accessibility, showed that Hand2 is a target of miRNA-1 during cardiogenesis [89]. However, folding programs such as Mfold are inefficient in predicting large sequences or partitioned sequences, and there are further cardiovascular in vivo factors such as mRNA-binding proteins that influence the secondary structure [27,64,73,89].
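The combination of several prediction algorithms with tissue-specific expression data can be sketched as a simple voting and filtering step. The example below is only an illustration under stated assumptions: the tool names, gene identifiers, the threshold of two supporting tools and the function name are hypothetical placeholders, not output from any real program.

```python
from collections import Counter


def consensus_targets(predictions, min_tools=2, expressed=None):
    """Return genes predicted by at least `min_tools` tools.

    If `expressed` is given, only genes in that set are kept, e.g. genes
    detected in the tissue of interest.
    """
    votes = Counter(gene for genes in predictions.values() for gene in set(genes))
    hits = {gene for gene, n in votes.items() if n >= min_tools}
    if expressed is not None:
        hits &= set(expressed)
    return sorted(hits)


# Hypothetical prediction lists for one miRNA (placeholder names only).
predictions = {
    "tool_1": ["GENE_A", "GENE_B", "GENE_C"],
    "tool_2": ["GENE_B", "GENE_C", "GENE_D"],
    "tool_3": ["GENE_C", "GENE_E"],
}
cardiac_expressed = {"GENE_C", "GENE_D"}  # assumed tissue expression filter

overlap = consensus_targets(predictions)
filtered = consensus_targets(predictions, expressed=cardiac_expressed)
print(overlap)   # ['GENE_B', 'GENE_C']
print(filtered)  # ['GENE_C']
```

Raising the threshold reduces over-prediction at the cost of sensitivity, which mirrors the trade-off described above.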
Moreover, most algorithms do not use experimentally validated miRNA targets as a basis for their target predictions. It is challenging to determine which algorithm predicts the best and most trustworthy targets [64]. Several studies have tested the accuracy and comparability of different bioinformatics algorithms using experimentally validated miRNA targets; however, they were not specifically carried out for cardiovascular diseases. For instance, different studies found that target predictions based on stringent seed pairing had the highest sensitivity and specificity. Here, PicTar and TargetScan have a high overlap in their predicted targets (80–90% using the same dataset of 3′ UTRs; [64,71]). Similar results were obtained by Cohen et al. (~130 experimental miRNA–mRNA datasets from Drosophila melanogaster; without the TargetScan algorithm), in which the EMBL and PicTar algorithms showed the highest accuracy (~90%) and sensitivity (~70–80%), whereas predictions from other algorithms such as miRanda or RNAhybrid did not reach such high accuracy and sensitivity [14,64,73]. Moreover, different proteomics studies have shown that such stringent seed-based algorithms also have the highest predictive power for changes in protein levels; here, too, PicTar and TargetScan have a high overlap in their predicted targets, whereas the outcomes from miRanda and PITA overlap less with them [2,6,64,69,75]. However, only 33% of the predicted targets showed experimental changes in protein level, indicating a high false-positive rate for such algorithms [6]. Although experimental data show miRNA targets with a stringent seed matching or many targets


based on sequence and location characteristics of miRNAs [34,75], however, they differ between the weight of different parameters, e.g. matching of seed region, evolutionary conservation, heteroduplex free energy or target site accessibility [34,75]. Popular miRNA target prediction algorithms are summarized in Table 1a. The TargetScan database is one of the first algorithms predicting miRNA targets in mammalians based on the conserved 8mer and 7mer sites, matching the seed region of an miRNA, including conserved 3′-compensatory pairing site prediction [30,57]. The database also includes UTRs and their orthologs corresponding to miRNA families and further options, e.g. non-conserved site prediction and conserved targeting probability ranking [30]. Another example is the PicTar algorithm, which predicts miRNA targets by tolerating single seed mismatches [48]. Moreover, this algorithm also includes evolutionary conservation and a free energy cutoff for the allowed mismatches for improving their target prediction, which reflects a high correct target prediction rate by comparing this to experimental data [34,48]. The manually curated miRTarBase includes information on about 51,460 miRNA–target interactions, in which all of them are experimentally validated by reporter assay, western blot, microarray and next-generation sequencing [39]. The TarBase database is part of the DIANA-lab tools and also consists of more than 65,000 experimentally validated miRNA– target interactions for 21 species, also manually collected by data mining from literature [70,81]. The TarBase can be downloaded and also offers further software, e.g. microT.v4, a target prediction algorithm using seed region and conservation and high throughput experimental data [59,81]. The microRNA website also provides target predictions and expression profiles using the miRanda algorithm [12]. 
This algorithm predicts genomic targets using a regression model combining sequence (multiple mismatches are allowed), free energy and contextual features of the miRNA:mRNA duplex; the expression profiles cover mammalian tissues and cell lines from an up-to-date miRNA compendium [12]. One example that includes structural accessibility in the prediction of seed matching is the PITA algorithm [43]. To form an miRNA:mRNA duplex, the miRNA–RISC complex needs access to the target binding site, so accessibility is an important factor for miRNA function [34,43]. The PITA algorithm calculates the difference between the free energy of the miRNA:mRNA heteroduplex and the free energy cost of unpairing the target binding site to make it accessible [43]. A further target prediction example incorporating seed region and target site accessibility is the IntaRNA algorithm, which effectively and accurately predicts miRNA target sites for given mRNAs [17]. Computational miRNA target prediction is extensively reviewed in [34]. By combining different target prediction databases, users can extract cardiovascular targets shared between databases as well as targets that are both predicted and experimentally validated, so that in silico predictions can be checked against validated interactions. Additionally, users can download the aforementioned algorithms to work locally, e.g. the TargetScan algorithm [30] is implemented in Perl and the miRanda algorithm [12] in C. With this approach, users can load and analyze their own data files of experimentally identified miRNA and mRNA sequences, normally in FASTA format; the script scans the sequences from each file to identify potential miRNA targets.
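As a hedged illustration of the local scanning step described above, the following minimal Python sketch searches 3′ UTR sequences read from FASTA text for canonical seed sites. It is not the TargetScan or miRanda implementation. The site definitions used here (a match to miRNA nucleotides 2–8 and/or an adenosine opposite nucleotide 1, giving 8mer, 7mer-m8 and 7mer-A1 sites), all function names and the toy UTR are assumptions made for illustration; conservation, free energy and site accessibility are ignored.

```python
# Minimal sketch (illustrative only): scan 3' UTRs for canonical seed sites.
# Site definitions and all names below are assumptions, not TargetScan code.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def parse_fasta(text):
    """Parse FASTA text into {name: RNA sequence}; DNA 'T' is read as 'U'."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper().replace("T", "U")
    return records


def find_seed_sites(mirna, utr):
    """Return (0-based start of the nt 2-7 match, site type) for each site."""
    match_2_to_8 = reverse_complement(mirna[1:8])  # pairs with miRNA nt 2-8
    match_2_to_7 = match_2_to_8[1:]                # pairs with miRNA nt 2-7
    sites = []
    for i in range(len(utr) - 5):
        if utr[i:i + 6] != match_2_to_7:
            continue
        # Upstream base pairing with miRNA nt 8, downstream A opposite nt 1.
        pairs_nt8 = i > 0 and utr[i - 1] == match_2_to_8[0]
        has_a1 = i + 6 < len(utr) and utr[i + 6] == "A"
        if pairs_nt8 and has_a1:
            sites.append((i, "8mer"))
        elif pairs_nt8:
            sites.append((i, "7mer-m8"))
        elif has_a1:
            sites.append((i, "7mer-A1"))
    return sites


# Toy input: the let-7a sequence is real, the UTR is invented for this sketch.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
TOY_FASTA = ">toy_utr\nAAACTACCTCAGGGTACCTCAGG\n"

for utr_name, utr_seq in parse_fasta(TOY_FASTA).items():
    print(utr_name, find_seed_sites(LET7A, utr_seq))
```

A seed-only scan of this kind reports every sequence match, which is one reason the filtering criteria discussed in this section (conservation, free energy, accessibility, experimental validation) matter in practice.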


Table 1a. Popular miRNA target prediction algorithms (algorithm | website | publication).
IntaRNA | http://rna.informatik.uni-freiburg.de/IntaRNA/Input.jsp | [17]
miRanda (microRNA.org) | http://www.microrna.org/microrna/home.do | [12]
miRTarBase | http://mirtarbase.mbc.nctu.edu.tw/ | [39]
PicTar | http://pictar.mdc-berlin.de/ | [48]
PITA | http://genie.weizmann.ac.il/pubs/mir07/mir07_prediction.html | [43]
TarBase | http://www.microrna.gr/tarbase | [70,81]
TargetScan | http://www.targetscan.org/ | [30]

Please cite this article as: Kunz M, et al, Bioinformatics of cardiovascular miRNA biology, J Mol Cell Cardiol (2014), http://dx.doi.org/10.1016/j.yjmcc.2014.11.027


M. Kunz et al. / Journal of Molecular and Cellular Cardiology xxx (2014) xxx–xxx


5. Conclusion and perspectives


Advanced high-throughput methods have revolutionized the discovery of new non-coding regulatory elements and significantly increased the number of previously unknown miRNAs in the cardiovascular system. The relaxed specificity of miRNA opens a therapeutic perspective for large-scale regulation, for instance in the heart. However, improved approaches are critical to first understanding the complex interactions of miRNAs with the genome as well as in depth analyses considering sequence–structure conservation and their closely associated functions in cardiovascular tissues. Bioinformatics can capture this unprecedented level of complexity and systemic effects of miRNAs, pointing at the same time to the strong therapeutic potential of miRNAs and their targets. Bioinformatics approaches should combine sequence


functional class scoring (FCS) analysis. They statistically group an input gene list into significant pathways based on GO terms. They do not take into account that genes can interact in different pathways, nor do they use additional information about the gene product, whereas the third-generation pathway topology (PT)-based method extends FCS by considering the functional genome annotation, for instance the type of gene interaction [44]. The third-generation approach includes databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome [41,44]. All mentioned pathway analysis methods are extensively reviewed in [44]. The GO database [5] classifies gene products across all species according to the biological processes they are associated with and summarizes them into specific terms, helping in the functional characterization and better interpretation of experimental data, e.g. from microarrays and proteomics [45]. For cardiovascular diseases, a consortium focuses on the annotation of cardiovascular-relevant genes (to date over 4000), e.g. GO:0007507 heart development and GO:0035198 miRNA binding (MEF2C), and is also currently working on the GO annotation of microRNAs, proteins and protein–protein interactions (http://www.ebi.ac.uk/QuickGO/GProteinSet?id=BHF-UCL; http://www.ucl.ac.uk/functional-gene-annotation/cardiovascular). The KEGG pathway database provides manually drawn molecular and structure pathway maps, including drug development and specific human disease pathways, among them specific information on cardiovascular diseases [41]. The Pathway mapping tool allows mapping of large-scale cardiovascular datasets to find corresponding pathways, helping to better understand their biological function [41]. 
The Reactome pathway database (current release version from July 15, 2014) also provides manually curated pathway maps, such as transcriptional regulation and disease pathways including reactions, as well as additional information through cross-references to different databases, e.g. GO, KEGG or PubMed [23]. It also provides different analysis tools, e.g. a pathway analysis tool for experimental datasets or a species comparison tool for mapping identified pathways and reactions to other species. MiRPath v2.0 [82] is another pathway analysis software that includes advanced tools for miRNA analysis. As part of the DIANA-lab tools, it provides the corresponding KEGG pathways for the predicted and experimentally validated miRNA targets from microT and TarBase, as well as additional information, e.g. the associated pathway targets of an miRNA and further miRNAs targeting the same target. Moreover, it also includes analysis pipelines, e.g. for hierarchical miRNA and pathway clustering and miRNA–pathway interaction heat maps, as well as single nucleotide polymorphism (SNP) detection in miRNA binding sites or the identification of overrepresented miRNAs for a pathway of interest [82]. Another helpful program to verify cardiovascular miRNAs is CopraRNA (Comparative prediction algorithm for small RNA targets), which predicts targets based on the IntaRNA tool using at least three miRNA sequences from three different organisms (comparing for instance human, murine and rat cardiomyocytes) and offers extensions such as functional enrichment analysis, interaction domains and regulatory networks [17,84].
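To make the over-representation (ORA) idea from this section concrete, the following minimal Python sketch scores a list of predicted targets against annotated gene sets. It assumes the one-sided hypergeometric test that is commonly used for ORA; it is not the procedure of any specific tool named above. Gene identifiers, term names and function names are hypothetical placeholders, and in practice the resulting p-values would still need multiple-testing correction.

```python
# Minimal sketch (illustrative only): over-representation analysis (ORA)
# using a one-sided hypergeometric test. All names and data are hypothetical.
from math import comb


def ora_p_value(n_universe, n_term, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, of which n_term carry the annotation."""
    total = comb(n_universe, n_selected)
    upper = min(n_term, n_selected)
    favourable = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / total


def enrich(selected, universe, gene_sets):
    """Return (term, overlap, p-value) tuples sorted by ascending p-value."""
    universe = set(universe)
    selected = set(selected) & universe
    results = []
    for term, members in gene_sets.items():
        members = set(members) & universe
        overlap = len(selected & members)
        p = ora_p_value(len(universe), len(members), len(selected), overlap)
        results.append((term, overlap, p))
    return sorted(results, key=lambda row: row[2])


# Toy data: gene identifiers and term names are placeholders.
UNIVERSE = [f"g{i}" for i in range(1, 11)]
GENE_SETS = {"term_A": ["g1", "g2", "g3"], "term_B": ["g7", "g8", "g9", "g10"]}
PREDICTED_TARGETS = ["g1", "g2", "g4", "g7"]

for row in enrich(PREDICTED_TARGETS, UNIVERSE, GENE_SETS):
    print(row)
```

The sketch treats every gene independently, which is exactly the limitation of the first-generation methods noted above: interactions between genes and pathway topology are not considered.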


from experiments were validated by seed-based algorithms, several experiments identified an imperfect seed match in ~25%–45% of miRNA targets [75]. In addition, Zampetaki et al. describe in their review that bioinformatics miRNA target algorithms have a false-positive rate of at least 40% and that several miRNA targets also show different interaction sites, e.g. miR-34a targets AXIN2 simultaneously at the 5′-UTR and 3′-UTR [54], which cannot be detected by bioinformatics algorithms [87]. Therefore, target filtering based only on a stringent seed match does not represent the most effective strategy, suggesting further criteria such as gene ontology and pathway analysis for the identification of biologically interesting targets [75]. In addition, because different parameter settings result in different target prediction outcomes, one important prediction criterion is the sequence database used for the prediction (often the Ensembl or UCSC (University of California Santa Cruz) databases), as these use different annotation criteria (e.g. for the 3′ UTR; [67]). For instance, Ritchie et al. investigated for the algorithms miRanda, TargetScan, RNA22 and PITA whether sequences from Ensembl or UCSC can influence the target prediction outcome and found that the TargetScan (normally based on UCSC) and miRanda (normally based on Ensembl) algorithms show an overlap of 39.5% when based on the same sequence database, whereas the overlap is only 11.5% when using two different databases. However, the overlap for each tested algorithm also differs when using different databases (e.g. for TargetScan a target overlap of 47% between Ensembl and UCSC; for miRanda a 65% target overlap between Ensembl and UCSC; [67]). On this basis, the authors suggest, for a comparison between algorithms and an optimal overlap, the usage of both databases for each algorithm [67]. 
Another important point is that most miRNA target prediction algorithms cannot discriminate between alternative tissue-, development- or cell type-specific mRNA isoforms [75,83]; e.g. miR-21 has cell type-specific targets (Sprouty1 and PTEN in fibroblasts; PDCD4 in cardiomyocytes; [19,20]), and different target prediction algorithms cannot differentiate between the isoforms SERCA2a and SERCA2b [13]. Moreover, they also do not include cellular concentrations of mRNAs and miRNAs or physiological conditions in a disease-specific context, e.g. the effect of hypoxia in cardiovascular disease. Hence, combined mRNA and miRNA profiling experiments are indispensable for identifying whether the cells under study possess the predicted target site [87]. Besides this, a user-based collection of specific cardiovascular data including different sequence databases as well as cell type- and tissue-specific isoforms is highly recommended and beneficial for the usage and accuracy of miRNA target prediction. This is complemented by systematic experiments, e.g. cardiovascular tissue- or disease-specific mRNA sequencing and miRNA profiling (e.g. RISC-IP or proteomics; [83]).
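The overlap comparisons discussed above can be reproduced on user data with very little code. The following minimal Python sketch compares target lists from several sources and extracts targets supported by more than one of them. The source does not state how the cited studies define overlap, so the measures shown here (shared count, Jaccard index and the shared fraction of each list) are assumptions, and the tool names and miRNA–gene pairs are toy placeholders and not real predictions.

```python
# Minimal sketch (illustrative only): compare target lists from several tools.
# Tool names and miRNA-gene pairs are toy placeholders, not real predictions.
from collections import Counter


def overlap_stats(targets_a, targets_b):
    """Return shared count, Jaccard index and per-list overlap fractions."""
    a, b = set(targets_a), set(targets_b)
    shared, union = a & b, a | b
    return {
        "shared": len(shared),
        "jaccard": len(shared) / len(union) if union else 0.0,
        "fraction_of_a": len(shared) / len(a) if a else 0.0,
        "fraction_of_b": len(shared) / len(b) if b else 0.0,
    }


def consensus_targets(predictions, min_tools=2):
    """Return targets reported by at least min_tools of the given sources."""
    counts = Counter(t for targets in predictions.values() for t in set(targets))
    return sorted(t for t, n in counts.items() if n >= min_tools)


PREDICTIONS = {
    "tool_A": [("miR-x", "GENE1"), ("miR-x", "GENE2"), ("miR-x", "GENE3")],
    "tool_B": [("miR-x", "GENE2"), ("miR-x", "GENE3"), ("miR-x", "GENE4")],
    "validated": [("miR-x", "GENE3"), ("miR-x", "GENE5")],
}

print(overlap_stats(PREDICTIONS["tool_A"], PREDICTIONS["tool_B"]))
print(consensus_targets(PREDICTIONS, min_tools=2))
```

Running each tool on both sequence databases and passing the four resulting lists to `consensus_targets` would be one way to follow the recommendation to use both databases for each algorithm.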


4.3. Cardiovascular pathway and biological function analysis


As mentioned before, cardiovascular miRNAs are associated with several biological processes and signaling pathways. Therefore, GO term enrichment and pathway analysis are essential steps in analyzing large high-throughput miRNA datasets [5,44,85,88]. Moreover, identifying the signaling pathways involved also helps to reduce the complexity of large high-throughput datasets, resulting in a better understanding of the underlying biological function [44,85]. For this, several research groups are focusing on integrated bioinformatics tools for the functional analysis of miRNAs, e.g. Bioinformatics resource manager v2.3, DIANA miRPath or miRGator [21,79,82]. These pipelines were developed specifically for the automatic integration and functional interpretation of user-generated experimental data, combining different workflows in one software package, e.g. miRNA–mRNA expression correlation analysis, target prediction and functional enrichment analyses (pathway or disease; [21,79,82]). In the last two decades, knowledge base-driven pathway analysis has evolved through three generations, which together allow complex tissues, including the cardiovascular system, to be analyzed as a large, single network of interactions [44]. The first two analysis methods are over-representation (ORA) and


Acknowledgments

We thank Jennifer Heilig for stylistic and native speaker corrections. We also acknowledge funding from the Land Bavaria and German Research Foundation (grants SFB688/A2 (main), and TH903/10-1, Gr1243/8-1, TR34/A8) and Fondation Leducq (MIRVAD project).

Disclosures

None.

References

[1] Abonnenc M, Nabeebaccus AA, Mayr U, Barallobre-Barreiro J, Dong X, Cuello F, et al. Extracellular matrix secretion by cardiac fibroblasts: role of microRNA-29b and microRNA-30c. Circ Res 2013;113(10):1138–47.
[2] Alexiou P, Maragkakis M, Papadopoulos GL, Reczko M, Hatzigeorgiou AG. Lost in translation: an assessment and perspective for computational microRNA target identification. Bioinformatics 2009;25(23):3049–55.
[3] Ambros V, Bartel B, Bartel DP, Burge CB, Carrington JC, Chen X, et al. A uniform system for microRNA annotation. RNA 2003;9(3):277–9.
[4] An J, Lai J, Lehman ML, Nelson CC. miRDeep*: an integrated application tool for miRNA identification from RNA sequencing data. Nucleic Acids Res 2012.
[5] Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000;25(1):25–9.
[6] Baek D, Villen J, Shin C, Camargo FD, Gygi SP, Bartel DP. The impact of microRNAs on protein output. Nature 2008;455(7209):64–71.
[7] Bao MH, Feng X, Zhang YW, Lou XY, Cheng Y, Zhou HH. Let-7 in cardiovascular diseases, heart development and cardiovascular differentiation from stem cells. Int J Mol Sci 2013;14(11):23086–102.
[8] Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 2004;116(2):281–97.
[9] Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell 2009;136(2):215–33.
[10] Batuwita R, Palade V. microPred: effective classification of pre-miRNAs for human miRNA gene prediction. Bioinformatics 2009;25(8):989–95.
[11] Bernhart S, Hofacker I, Will S, Gruber A, Stadler P. RNAalifold: improved consensus structure prediction for RNA alignments. BMC Bioinformatics 2008;9(1):474.
[12] Betel D, Wilson M, Gabow A, Marks DS, Sander C. The microRNA.org resource: targets and expression. Nucleic Acids Res 2008;36(Suppl. 1):D149–53.
[13] Bostjancic E, Zidar N, Glavac D. MicroRNAs and cardiac sarcoplasmic reticulum calcium ATPase-2 in human myocardial infarction: expression and bioinformatic analysis. BMC Genomics 2012;13:552.
[14] Brennecke J, Stark A, Russell RB, Cohen SM. Principles of microRNA-target recognition. PLoS Biol 2005;3(3):e85.
[15] Bronze-da-Rocha E. MicroRNAs expression profiles in cardiovascular diseases. Biomed Res Int 2014;2014:985408.
[16] Burge SW, Daub J, Eberhardt R, Tate J, Barquist L, Nawrocki EP, et al. Rfam 11.0: 10 years of RNA families. Nucleic Acids Res 2012.
[17] Busch A, Richter AS, Backofen R. IntaRNA: efficient prediction of bacterial sRNA targets incorporating target site accessibility and seed regions. Bioinformatics 2008;24(24):2849–56.
[18] Chen J, Wang M-B. The roles of miRNA-143 in colon cancer and therapeutic implications. Transl Gastrointest Cancer 2012;1(2):169–74.
[19] Cheng Y, Zhang C. MicroRNA-21 in cardiovascular disease. J Cardiovasc Transl Res 2010;3(3):251–5.
[20] Cheng Y, Zhu P, Yang J, Liu X, Dong S, Wang X, et al. Ischaemic preconditioning-regulated miR-21 protects heart against ischaemia/reperfusion injury via anti-apoptosis through its target PDCD4; 2010.
[21] Cho S, Jun Y, Lee S, Choi HS, Jung S, Jang Y, et al. miRGator v2.0: an integrated system for functional investigation of microRNAs. Nucleic Acids Res 2011;39(Database issue):D158–62.
[22] Condorelli G, Latronico MVG, Dorn GW. MicroRNAs in heart disease: putative novel therapeutic targets?; 2010.
[23] Croft D, Mundo AF, Haw R, Milacic M, Weiser J, Wu G, et al. The reactome pathway knowledgebase. Nucleic Acids Res 2014;42(Database issue):D472–7.
[24] Da Costa Martins PA, De Windt LJ. MicroRNAs in control of cardiac hypertrophy. Cardiovasc Res 2012;93(4):563–72.
[25] Dalli D, Wilm A, Mainz I, Steger G. StrAl: progressive alignment of non-coding RNA using base pairing probability vectors in quadratic time. Bioinformatics 2006;22(13):1593–9.
[26] Ding J, Wang D-Z. “RISCing” the heart: in vivo identification of cardiac microRNA targets by RISCome. Circ Res 2011;108(1):3–5.
[27] Farh KK, Grimson A, Jan C, Lewis BP, Johnston WK, Lim LP, et al. The widespread impact of mammalian microRNAs on mRNA repression and evolution. Science 2005;310(5755):1817–21.
[28] Filipowicz W, Bhattacharyya SN, Sonenberg N. Mechanisms of post-transcriptional regulation by microRNAs: are the answers in sight? Nat Rev Genet 2008;9(2):102–14.
[29] Friedlander MR, Chen W, Adamidi C, Maaskola J, Einspanier R, Knespel S, et al. Discovering microRNAs from deep sequencing data using miRDeep. Nat Biotechnol 2008;26(4):407–15.
[30] Friedman RC, Farh KK, Burge CB, Bartel DP. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res 2009;19(1):92–105.
[31] Gomes CPDC, Cho J-H, Hood LE, Franco OL, Pereira RW, Wang K. A review of computational tools in microRNA discovery. Front Genet 2013;4.
[32] Gregory RI, Yan KP, Amuthan G, Chendrimada T, Doratotaj B, Cooch N, et al. The microprocessor complex mediates the genesis of microRNAs. Nature 2004;432(7014):235–40.
[33] Guo H, Ingolia NT, Weissman JS, Bartel DP. Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 2010;466(7308):835–40.
[34] Hammell M. Computational methods to identify miRNA targets. Semin Cell Dev Biol 2010;21(7):738–44.
[35] Havgaard JH, Lyngsø RB, Gorodkin J. The foldalign web server for pairwise structural RNA alignment and mutual motif search. Nucleic Acids Res 2005;33(Suppl. 2):W650–3.
[36] Hertel J, Stadler PF. Hairpins in a haystack: recognizing microRNA precursors in comparative genomics data. Bioinformatics 2006;22(14):e197–202.
[37] Hofacker IL. Vienna RNA secondary structure server. Nucleic Acids Res 2003;31(13):3429–31.
[38] Hofacker IL, Fekete M, Stadler PF. Secondary structure prediction for aligned RNA sequences. J Mol Biol 2002;319(5):1059–66.
[39] Hsu S-D, Tseng Y-T, Shrestha S, Lin Y-L, Khaleel A, Chou C-H, et al. miRTarBase update 2014: an information resource for experimentally validated miRNA–target interactions. Nucleic Acids Res 2014;42(D1):D78–85.
[40] Jiang Q, Wang Y, Hao Y, Juan L, Teng M, Zhang X, et al. miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic Acids Res 2009;37(Database issue):D98–D104.


and detailed RNA folding pattern analysis as well as phylogenetic comparisons with miRNA target scans and signaling pathway analysis. Notably, genome and promoter analysis has to be coupled to this, as there is a large variety in how the regulatory effect of miRNAs is mediated in the cardiovascular system (as antisense; indirectly by regulation of transcription factors; acting as a specific or more general suppressor transcript). Several algorithms and strategies, as presented in this review, have been developed to improve the identification and functional analysis of cardiovascular miRNAs, and different steps are essential in understanding their complex regulatory effects, combining cardiovascular tissue-specific information, different omics datasets and miRNA as well as mRNA target structure analysis. However, depending on parameter choices and analysis strategies, the accuracy of the various programs differs, and the number of remaining false-positive hits is often high, making some cardiovascular comparisons quite laborious, e.g. for predicted targets. The selection of bioinformatics analysis methods depends on the specific cardiovascular question of the researchers. Rigid parameter and tool selection criteria, such as stringent filtering, may for instance lead to the loss of important biological targets. As a striking example, lin-41 and let-60/RAS are conserved, biologically important targets of let-7 (a potential therapeutic target for cardiovascular diseases) that do not appear on most target prediction lists [7,34]. Moreover, the use of bioinformatics approaches in this area should always be supported by experimental data on predicted cardiovascular miRNAs and their targets, generated with the biological function of these interactions in mind. For this, ectopic miRNA expression followed by microarray analysis provides the easiest route to cardiovascular miRNA target identification, which should then be further validated by proteomics [75]. 
Sophisticated high-throughput experimental methods are very effective in cardiovascular miRNA target identification. They identify large lists of hundreds of candidates or more, which require additional bioinformatics analysis to select the best candidates involved in cardiovascular diseases, e.g. GO and pathway analysis, as well as further experimental validation, e.g. mRNA and protein knockdown [75]. miRNA interactions point to potential novel medical interventions in molecular medicine; however, the challenge of targeted nucleotide delivery still pushes therapeutic applications into the distant future. In contrast, diagnostic options are already a realistic outcome of cardiovascular miRNA research. Taken together, bioinformatics approaches can significantly improve the functional characterization of cardiovascular miRNAs and optimize the selection of miRNA candidates for testing.


[68] Sankoff D. Simultaneous solution of the RNA folding, alignment and protosequence problems. SIAM J Appl Math 1985;45(5):810–25.
[69] Selbach M, Schwanhausser B, Thierfelder N, Fang Z, Khanin R, Rajewsky N. Widespread changes in protein synthesis induced by microRNAs. Nature 2008;455(7209):58–63.
[70] Sethupathy P, Corda B, Hatzigeorgiou AG. TarBase: a comprehensive database of experimentally supported animal microRNA targets. RNA 2006;12(2):192–7.
[71] Sethupathy P, Megraw M, Hatzigeorgiou AG. A guide through present computational approaches for the identification of mammalian microRNA targets. Nat Methods 2006;3(11):881–6.
[72] Smith C, Heyne S, Richter AS, Will S, Backofen R. Freiburg RNA tools: a web server integrating INTARNA, EXPARNA and LOCARNA. Nucleic Acids Res 2010;38(Web Server issue):W373–7.
[73] Stark A, Brennecke J, Bushati N, Russell RB, Cohen SM. Animal microRNAs confer robustness to gene expression and have a significant impact on 3′UTR evolution. Cell 2005;123(6):1133–46.
[74] Stocks MB, Moxon S, Mapleson D, Woolfenden HC, Mohorianu I, Folkes L, et al. The UEA sRNA workbench: a suite of tools for analysing and visualizing next generation sequencing microRNA and small RNA datasets. Bioinformatics 2012;28(15):2059–61.
[75] Thomas M, Lieberman J, Lal A. Desperately seeking microRNA targets. Nat Struct Mol Biol 2010;17(10):1169–74.
[76] Thum T, Catalucci D, Bauersachs J. MicroRNAs: novel regulators in cardiac development and disease. Cardiovasc Res 2008;79(4):562–70.
[77] Thum T, Galuppo P, Wolf C, Fiedler J, Kneitz S, van Laake LW, et al. MicroRNAs in the human heart: a clue to fetal gene reprogramming in heart failure. Circulation 2007;116(3):258–67.
[78] Thum T, Gross C, Fiedler J, Fischer T, Kissler S, Bussen M, et al. MicroRNA-21 contributes to myocardial disease by stimulating MAP kinase signalling in fibroblasts. Nature 2008;456(7224):980–4.
[79] Tilton S, Tal T, Scroggins S, Franzosa J, Peterson E, Tanguay R, et al. Bioinformatics resource manager v2.3: an integrated software environment for systems biology with microRNA and cross-species analysis tools. BMC Bioinformatics 2012;13(1):1–9.
[80] Ucar A, Gupta SK, Fiedler J, Erikci E, Kardasinski M, Batkai S, et al. The miRNA-212/132 family regulates both cardiac hypertrophy and cardiomyocyte autophagy. Nat Commun 2012;3:1078.
[81] Vergoulis T, Vlachos IS, Alexiou P, Georgakilas G, Maragkakis M, Reczko M, et al. TarBase 6.0: capturing the exponential growth of miRNA targets with experimental support. Nucleic Acids Res 2012;40(Database issue):D222–9.
[82] Vlachos IS, Kostoulas N, Vergoulis T, Georgakilas G, Reczko M, Maragkakis M, et al. DIANA miRPath v. 2.0: investigating the combinatorial effect of microRNAs in pathways. Nucleic Acids Res 2012;40(Web Server issue):W498–504.
[83] Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, et al. Alternative isoform regulation in human tissue transcriptomes. Nature 2008;456(7221):470–6.
[84] Wright PR, Georg J, Mann M, Sorescu DA, Richter AS, Lott S, et al. CopraRNA and IntaRNA: predicting small RNA targets, networks and interaction domains. Nucleic Acids Res 2014;42(Web Server issue):W119–23.
[85] Yang JH, Saucerman JJ. Computational models reduce complexity and accelerate insight into cardiac signaling networks. Circ Res 2011;108(1):85–97.
[86] Yang T, Gu H, Chen X, Fu S, Wang C, Xu H, et al. Cardiac hypertrophy and dysfunction induced by overexpression of miR-214 in vivo. J Surg Res 2014.
[87] Zampetaki A, Mayr M. MicroRNAs in vascular and metabolic disease. Circ Res 2012;110(3):508–22.
[88] Zhao H, Wang L, Dong L, Sun H, Gao Z. Discovery and comparative profiling of microRNAs in representative monopodial bamboo (Phyllostachys edulis) and sympodial bamboo (Dendrocalamus latiflorus). PLoS One 2014;9(7):e102375.
[89] Zhao Y, Samal E, Srivastava D. Serum response factor regulates a muscle-specific microRNA that targets Hand2 during cardiogenesis. Nature 2005;436(7048):214–20.
[90] Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 2003;31(13):3406–15.


[41] Kanehisa M, Goto S, Furumichi M, Tanabe M, Hirakawa M. KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Res 2010;38(Suppl. 1):D355–60.
[42] Karginov FV, Conaco C, Xuan Z, Schmidt BH, Parker JS, Mandel G, et al. A biochemical approach to identifying microRNA targets. Proc Natl Acad Sci 2007;104(49):19291–6.
[43] Kertesz M, Iovino N, Unnerstall U, Gaul U, Segal E. The role of site accessibility in microRNA target recognition. Nat Genet 2007;39(10):1278–84.
[44] Khatri P, Sirota M, Butte AJ. Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Comput Biol 2012;8(2):e1002375.
[45] Khodiyar VK, Hill DP, Howe D, Berardini TZ, Tweedie S, Talmud PJ, et al. The representation of heart development in the gene ontology. Dev Biol 2011;354(1):9–17.
[46] Kim VN. MicroRNA biogenesis: coordinated cropping and dicing. Nat Rev Mol Cell Biol 2005;6(5):376–85.
[47] Kozomara A, Griffiths-Jones S. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res 2014;42(D1):D68–73.
[48] Krek A, Grun D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, et al. Combinatorial microRNA target predictions. Nat Genet 2005;37(5):495–500.
[49] Kuo C-H, Goldberg M, Lin S-L, Ying S-Y, Zhong J. Identify intronic microRNA with bioinformatics. In: Ying S-Y, editor. MicroRNA protocols. Humana Press; 2013. p. 77–82.
[50] Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. Identification of novel genes coding for small expressed RNAs. Science 2001;294(5543):853–8.
[51] Langley SR, Dwyer J, Drozdov I, Yin X, Mayr M. Proteomics: from single molecules to biological pathways; 2013.
[52] Latronico MVG, Condorelli G. MicroRNAs and cardiac pathology. Nat Rev Cardiol 2009;6(6):418–29.
[53] Lau NC, Lim LP, Weinstein EG, Bartel DP. An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science 2001;294(5543):858–62.
[54] Lee I, Ajay SS, Yook JI, Kim HS, Hong SH, Kim NH, et al. New class of microRNA targets containing simultaneous 5′-UTR and 3′-UTR interaction sites. Genome Res 2009;19(7):1175–83.
[55] Lee RC, Ambros V. An extensive class of small RNAs in Caenorhabditis elegans. Science 2001;294(5543):862–4.
[56] Lee Y, Jeon K, Lee JT, Kim S, Kim VN. MicroRNA maturation: stepwise processing and subcellular localization. EMBO J 2002;21(17):4663–70.
[57] Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 2005;120(1):15–20.
[58] Li L, Xu J, Yang D, Tan X, Wang H. Computational approaches for microRNA studies: a review. Mamm Genome 2010;21(1–2):1–12.
[59] Maragkakis M, Vergoulis T, Alexiou P, Reczko M, Plomaritou K, Gousis M, et al. DIANA-microT Web server upgrade supports Fly and Worm miRNA target prediction and bibliographic miRNA to disease association. Nucleic Acids Res 2011;39(Web Server issue):W145–8.
[60] Matkovich SJ, Van Booven DJ, Eschenbacher WH, Dorn GW. RISC RNA sequencing for context-specific identification of in vivo microRNA targets. Circ Res 2011;108(1):18–26.
[61] Matkovich SJ, Van Booven DJ, Youker KA, Torre-Amione G, Diwan A, Eschenbacher WH, et al. Reciprocal regulation of myocardial microRNAs and messenger RNA in human cardiomyopathy and reversal of the microRNA signature by biomechanical support. Circulation 2009;119(9):1263–71.
[62] Mayr M, Madhu B, Xu Q. Proteomics and metabolomics combined in cardiovascular research. Trends Cardiovasc Med 2007;17(2):43–8.
[63] Mendes ND, Freitas AT, Sagot M-F. Current tools for the identification of miRNA genes and their targets. Nucleic Acids Res 2009;37(8):2419–33.
[64] Rajewsky N. MicroRNA target predictions in animals. Nat Genet 2006.
[65] Rangrez AY, Massy ZA, Metzinger-Le Meuth V, Metzinger L. MiR-143 and miR-145: molecular keys to switch the phenotype of vascular smooth muscle cells. Circ Cardiovasc Genet 2011;4(2):197–205.
[66] Reeder J, Giegerich R. Consensus shapes: an alternative to the Sankoff algorithm for RNA consensus structure prediction. Bioinformatics 2005;21(17):3516–23.
[67] Ritchie W, Flamant S, Rasko JEJ. Predicting microRNA targets and functions: traps for the unwary. Nat Methods 2009;6(6):397–8.
