YGENO-08720; No. of pages: 8; 4C: Genomics xxx (2015) xxx–xxx

Contents lists available at ScienceDirect

Genomics journal homepage: www.elsevier.com/locate/ygeno

2

Genome-wide SNP discovery and identification of QTL associated with agronomic traits in oil palm using genotyping-by-sequencing (GBS)

F

1Q1

Wirulda Pootakham, Nukoon Jomchai, Panthita Ruang-areerate, Jeremy R. Shearman, Chutima Sonthirod, Duangjai Sangsrakru, Somvong Tragoonrung, Sithichoke Tangphatsornruang ⁎

5

National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Pathum Thani 12120, Thailand

6

a r t i c l e

7 8 9 10

Article history: Received 22 September 2014 Accepted 12 February 2015 Available online xxxx

11 12 13 14 15 16 26

Keywords: Single nucleotide polymorphisms Oil palm Genetic linkage mapping Quantitative trait loci Genotyping-by-sequencing

R O

i n f o

a b s t r a c t

E

D

P

Oil palm has become one of the most important oil crops in the world. Marker-assisted selections have played a pivotal role in oil palm breeding programs. Here, we report the use of genotyping-by-sequencing (GBS) approach for a large-scale SNP discovery and genotyping of a mapping population. Reduced representation libraries of 108 F2 progeny were sequenced and a total of 524 million reads were obtained. We detected 21,471 single nucleotide substitutions, most of which (62.6%) represented transition events. Of 3417 fully informative SNP markers, we were able to place 1085 on a linkage map, which spanned 1429.6 cM and had an average of one marker every 1.26 cM. Three QTL affecting trunk height were detected on LG 10, 14 and 15, whereas a single QTL associated with fruit bunch weight was identified on LG 3. The use of GBS approach proved to be rapid, cost-effective and highly reproducible in this species. © 2015 Published by Elsevier Inc.

C

29

1. Introduction

32 33

Oil palm (Elaeis guineensis) is one of the most important oil bearing crops in the world. Palm oil is used in various products ranging from cooking oil and margarine to animal feeds, soaps and cosmetics. A recent escalation in global demand for palm oil is largely driven by its emerging role as a feedstock in biodiesel production [1]. In terms of yield per unit of planting area, oil palm is the most efficient oil yielding crop, producing roughly four tons of oil per hectare per year, exceeding the production of soybean by nearly ten fold (http://faostat3.fao.org/ faostat-gateway/go/to/home/E). The rapid ascent of palm oil production in the past few years has eventually led to it overtaking soybean oil as the world's leading vegetable oil. Palm oil currently dominates the global vegetable oil economy, contributing about 36% of the total world oil production [2]. Oil palm is a diploid monocotyledon with a chromosome complement of 2n = 32 and belongs to the genus Elaeis and the family Palmae. There are two commonly known species within the genus Elaeis: an economically important E. guineensis originating from West Africa and E. oleifera native to tropical Central and South America [3]. Oil palm is

42 43 44 45 46 47 48 49

R

R

N C O

40 41

U

38 39

E

31

36 37

17 18 19 20 21 22 23 24 25

T

30 28 27

34 35

O

4

3Q2

Abbreviations: SNP, single nucleotide polymorphism; GBS, genotyping-by-sequencing; QTL, quantitative trait loci. ⁎ Corresponding author. E-mail addresses: [email protected] (W. Pootakham), [email protected] (N. Jomchai), [email protected] (P. Ruang-areerate), [email protected] (J.R. Shearman), [email protected] (C. Sonthirod), [email protected] (D. Sangsrakru), [email protected] (S. Tragoonrung), [email protected] (S. Tangphatsornruang).

an outcrossing species due to asynchronous maturation of male and female inflorescences. Based on a flow cytometric analysis, oil palm has an estimated haploid genome size of about 1800 Mb [4]. Commercial breeding in oil palm began shortly after the introduction of palm seeds to Southeast Asia. Since then, considerable improvements have been achieved in both yield and quality characters [5]. It has been estimated that 70% of the yield increases in oil palm plantation in the second half of the 19th century was attributed to cultivar improvement [6]. Conventional breeding through recurrent selection has played a crucial role in yield enhancement. Unfortunately, oil palm requires at least 8–10 years to complete one breeding cycle. Its long generation time is considered a major obstacle in new cultivar development. The ability to identify individuals possessing desirable traits at an early stage will help save time and resources needed to develop superior varieties. One of the quantitative traits in oil palm that contributes directly to increased oil yield is fruit bunch weight (BW). Billotte et al. [7] attempted to locate the quantitative trait loci (QTL) affecting BW using a multi-parent linkage map consisting of 251 microsatellite markers. Nine QTL affecting BW were detected using a within-family or an across-family model [7]. More recently, Jeennor and Volkaert [8] employed a genetic map consisting of 89 microsatellite and 101 single nucleotide polymorphism (SNP) markers constructed from a mapping population of 69 progeny to identify a QTL associated with BW. Another valuable, yet often neglected, characteristic in oil palm is vertical height. Shorter palms require less harvesting time and also increase the economic lifespan of the plantation [9]. Harvesting of fruit bunches from taller trees raises the risk of fruit damage (caused by the

http://dx.doi.org/10.1016/j.ygeno.2015.02.002 0888-7543/© 2015 Published by Elsevier Inc.

Please cite this article as: W. Pootakham, et al., Genomics (2015), http://dx.doi.org/10.1016/j.ygeno.2015.02.002

50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77

101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143

148

Genomic DNA from 110 individuals (108 F2 progeny, F1 progeny clone A43/9 and the female parent clone A) were used to prepare PstIMspI reduced representation libraries and sequenced in multiplex on eight Ion Proton PI™ chips. We obtained a total of 524,508,111 reads covering 66.82 Gb of sequence data, with an average of 4,768,256 reads per sample and the mean read length of 127 bp (Table S1). On average, 85% of the total bases sequenced had a quality score of at least 20, and approximately 88% of the reads were able to align to unique locations on the reference genome (Table S1). Total mapped regions covered by the 200-bp PstI-MspI fragments represented about 0.93% of the published genome sequence [4], corresponding to an estimate from the in silico digestion. Each sample had an average read depth of 32 at the PstI-MspI fragments. The distribution of the raw reads across the sequenced portion of the genome was highly uniform, with an average uniformity of base coverage of 91.69%. We identified a total of 23,134 genetic variations, with 21,471 single nucleotide substitutions and 1663 insertions/deletions (indels; Table 1). The average read depth of the SNP loci across all samples was 106 ± 66. The fraction of indels discovered here (7.2%) is lower than the previous estimate (10.4%) in Riju et al. [26], which utilized oil palm expressed sequence tags (ESTs) available in the NCBI database to identify putative SNP markers. We focused only on single nucleotide variations and excluded the indels and variants involving more than one nucleotide as they were more likely derived from sequencing or alignment errors. A small subset of SNPs discovered through GBS were validated using a gold standard Sanger sequencing (Fig. S1). The frequency of the single nucleotide substitution observed here was 1 SNP in 665 bp, which is noticeably lower than the previously reported frequency of 1 SNP in 74 bp [26]. The frequency observed here was likely to be underestimated since the SNPs were identified from the F2 population derived from self-fertilization of clone A43/9, while SNPs discovered in Riju et al. [26] were obtained from mining publicly available ESTs derived from several unrelated individuals. Moreover, with merely 576 ESTs analyzed, the rate of SNP discovery reported in Riju et al. [26] may not represent the actual frequency of SNP occurrences across the genome. Most of the nucleotide polymorphisms (62.6%) detected were transitions (A/G or C/T) while transversion events (A/C, A/T, C/G or G/T) accounted for 37.4% (Table 1). The most prevalent variation was C/T (29.2%) and the least common type of change was C/G, representing merely 7.5% of total polymorphisms. The observed transition:transversion ratio is 1.67, which was similar to the earlier estimates reported for oil palm (1.77 in [27] and 1.55 in [28]).

149 150

Table 1 Summary of polymorphisms identified in oil palm.

t1:1 t1:2

O

F

2.1. SNP discovery and genotyping using GBS

R O

99 100

147

P

97 98

2. Results and discussion

D

95 96

based linkage map in oil palm. SNP markers reported here will expand 144 the existing repertoire of available molecular markers in E. guineensis 145 and will be useful for future QTL identification and phylogenetic studies. 146

T

93 94

C

91 92

E

89 90

R

87 88

R

85 86

O

84

C

82 83

N

80 81

fall of the bunches to the ground), which often lowers the quality of the oil extracted [10]. Thus, plant height is an important yield-associated trait, and the benefits of cultivating short palms have led to an effort to introgress a reduced vertical growth trait into high-yielding varieties in current breeding schemes [11]. Gibberellin is one of the most important determinants of plant height, and malfunctions in its biosynthesis or signaling pathway have been linked to reduced stature in numerous species, including Arabidopsis, rice, barley, wheat and maize [12]. A mutation in the sd-1 gene encoding an enzyme in the gibberellin biosynthesis pathway (GA20 oxidase) caused a prominent semi-dwarf phenotype in rice. The discovery of this semi-dwarf allele was a major breakthrough in modern rice breeding (the “green revolution”) that led to a development of high-yielding, semi-dwarf cultivars [13]. Gibberellin is also implicated in green-revolution varieties of wheat; however, the reduced height of those crops is conferred by defects in the hormone's signaling pathway rather than the deficiency in gibberellin [14]. A gain-of-function mutation in the wheat Rht gene, encoding a member of the DELLA protein family, caused a gibberellin-insensitive phenotype with characteristic dwarfism. DELLA proteins are negative regulators of gibberellin signaling. They function as nuclear transcription factors and are subjected to gibberellin-dependent proteolysis via the ubiquitin–proteasome pathway [15]. The advent of DNA-based genetic markers enabled plant breeders to accelerate their breeding programs through the utilization of markerassisted selections. In the past two decades, several types of molecular markers have been developed in oil palm (restriction fragment length polymorphisms, random amplified polymorphic DNAs, simple sequence repeats) and used in the construction of genetic linkage maps and various phylogenetic studies [16,17]. Recently, attention has been geared toward the use of single nucleotide polymorphisms (SNPs) as genetic markers. The ubiquity of SNPs in eukaryotic genome and their usefulness as genetic markers has been well established over the last decade. SNP markers typically occur at frequencies of one per ~ 100–500 bp in plant genomes, depending on the species, e.g. 1 SNP/ 490 bp in soybean [18] and 1 SNP/540 bp in pea [19]. With rapid advancement in sequencing throughput together with an overall decrease in sequencing cost, next generation sequencing technologies have been applied to SNP identification in various plant species [20]. However, it remains costly to employ whole-genome sequencing to evaluate multiple individuals in a mapping population, especially for organisms with large genomes such as oil palm (1800 Mb [4]). Reduced representation methods are extremely useful, not only because of their cost-reducing aspects, but also because many research questions can be answered with a small set of markers and do not require every base of the genome to be sequenced. There are several techniques employed to reduce genome complexity and to capture only a fraction of the genome to be sequenced. Restrictionsite-associated DNA sequencing (RAD-seq) was introduced by Miller et al. [21] and adapted to incorporate barcoding for multiplexing by Baird et al. [22]. More recently, a less complicated method for constructing highly multiplexed reduced representation genotyping by sequencing (GBS) libraries was proposed by Elshire et al. [23]. GBS is an efficient strategy that can simultaneously detect and score tens of thousands of molecular markers. This technique has successfully been applied to high-density genetic map construction and QTL mapping in several plant species [24]. Currently available GBS protocols have also been customized to work with multiple sequencing platforms, including the Illumina GAII, Illumina HiSeq, Ion Torrent PGM and Proton [24,25]. Our goals in this study were to perform a genome-wide discovery of SNP markers and utilize them to genotype an oil palm mapping population. We constructed a high-density linkage map and carried out a QTL analysis to detect markers associated with valuable agronomic traits such as BW and trunk height. To our knowledge, this is the first attempt to apply the GBS technique to identify SNP markers and generate a SNP-

151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190

U

78 79

W. Pootakham et al. / Genomics xxx (2015) xxx–xxx

E

2

Total number of polymorphisms Indels SNPs Transition A/G C/T Transversion A/C A/T C/G G/T

Please cite this article as: W. Pootakham, et al., Genomics (2015), http://dx.doi.org/10.1016/j.ygeno.2015.02.002

23,134 1663 21,471

100% 7.19% 92.81%

6698 6744

28.95% 29.15%

1889 2395 1738 2007

8.17% 10.35% 7.51% 8.68%

t1:3 t1:4 t1:5 t1:6 t1:7 t1:8 t1:9 t1:10 t1:11 t1:12 t1:13

W. Pootakham et al. / Genomics xxx (2015) xxx–xxx

Among the 21,471 SNPs identified, we were able to genotype 13,081 loci from 108 F2 progeny with fewer than 10% missing data. Because the offspring was derived from a self-pollination, only markers that are heterozygous in the F1 progeny (clone A43/9) segregate in the F2 generation. Since we also lacked the genotypic data from the male parent, only SNPs that are homozygous in the female parent (clone A) are informative and can be used to infer the genotype of the male parent. We obtained a set of 3417 SNPs after applying the two above criteria to remove non-segregating and uninformative markers. Segregation distortion often leads to a significant under- or overestimation of recombination fractions, which consequently affects the genetic distance between markers as well as the order of the markers [33]. Even though the removal of distorted markers usually reduces the coverage of the genome, we decided to exclude markers exhibiting significant deviation from the expected Mendelian ratio in order to obtain an accurate linkage map. Additional markers with identical segregation patterns were also removed and a final set of 1191 markers was used for the linkage map construction. The genetic map of the F2 population consisted of 1085 markers distributed over 17 linkage groups (Fig. 1, Table 3). The map spanned a cumulative distance of 1429.6 cM, with each linkage group ranging

211 212

217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236

t2:1 t2:2 t2:3 t2:4 t2:5 t2:6 t2:7 t2:8 t2:9 t2:10 t2:11 t2:12 t2:13 t2:14 t2:15 t2:16

C

209 210

E

207 208

R

205 206

R

203 204

N C O

201 202

U

199 200

Table 2 Analysis of synonymous and non-synonymous changes in oil palm SNPs. Detailed information regarding the changes introduced by nucleotide substitutions and their distribution according to the nucleotide position within codons. Total number of exonic SNPs Position in the codon 1st base of the codon 2nd base of the codon 3rd base of the codon Synonymous mutations Non-synonymous mutations Missense Conservative Non-conservative Nonsens: Read-through

2835

100%

772 720 1343

27.23% 25.40% 47.37%

1234 1601 451 1146 – 4

43.53% 56.47% 15.91% 40.42% – 0.14%

2.4. QTL associated with palm height and fruit bunch weight

272

Both the stem height and BW traits in the F2 progeny used in this study appeared to follow the normal distributions (Fig. S2). Broad sense heritability estimates for this particular mapping population were 0.30 for the height character and 0.15 for the BW. QTL results for height and yield component trait detected using interval mapping (IM) function are presented in Fig. 1 and Table 4. Three QTL affecting trunk height were identified at the genome-wide significant threshold level calculated according to Van Ooijen [39]. The first QTL (detected in Sep 2012, Mar 2013, Sep 2013 and Sep 2014) was located on LG 15 (39.9–46.3 cM) and explained 17–19% of the phenotypic variation (Fig. 1, Table 4). This QTL had two underlying LOD peaks: one located in the marker interval EgSNPGBS17286–EgSNPGBS17301 (Sep 2012) and the other located in the intervals EgSNPGBS17395–EgSNPGBS17420 (Mar 2013, Sep 2013 and Sep 2014). These two neighboring peaks probably represented a single QTL since their 1-LOD support intervals were completely overlapping (Fig. 1, Table S2). The second and third QTL were located in adjacent regions on LG 14 at 17 cM (Sep 2012) and 29 cM (Mar 2013), and each of them accounted for ~ 20% of the phenotypic variation (Table 4). Subsequent multiple QTL mapping (MQM) analysis using forwards/backwards selection unmasked an additional minor QTL associated with Sep 2012 height at 55.8 cM on LG 10 (Table 5). This minor QTL accounted for 10.8% of the phenotypic variance. The MQM analysis for Sep 2013 and Sep 2014 trunk height revealed the QTL on LG 14 at the location (29 cM) previously detected for Mar 2013 height (Table 5). Plant height is a complex dynamic trait that is regulated through the interactions of many genes that may behave differently during growth stages and/or environmental conditions. There has been a report of

273

F

216

197 198

O

2.3. Construction of a linkage map

196

R O

215

194 195

P

213 214

The majority of SNPs uncovered through the GBS approach were distributed in non-coding portions of the genome. Thirty percent of the single nucleotide substitutions were located in coding regions and the remaining was in the intergenic regions. Among the intergenic SNPs discovered, 20% were within 5 kb immediately upstream and 23% were within 5 kb immediately downstream of an open reading frame. We employed the SNPEff software to identify nucleotide substitutions that result in changes in amino acid (i.e. non-synonymous) and those that do not (i.e. synonymous) [29]. Among 2835 biallelic SNPs identified within the exonic regions (the remaining were located in the introns and intergenic regions), 43.5% were synonymous (silent) mutations and the majority of non-synonymous substitutions were non-conservative mutations (Table 2). The percentage of nonsynonymous substitutions found in oil palm (56%) was similar to those reported in rice (54%) but higher than the numbers reported for Arabidopsis (45%) and cassava (47%) [30–32]. The coding SNPs were further classified based on the variant position within the codons. The majority of the genic SNPs were located in the third codon position, congruent with the hypothesis that deleterious mutations occurring at the first or second base, which likely result in non-synonymous substitutions, have been selected against and eventually eliminated during the course of evolution.

237 238

D

193

from 156.15 cM (LG 2) to 21.99 cM (LG 5b; Table S2). The number of SNP markers mapped in each linkage group varied from 16 markers in LG 5b to 137 markers in LG 2, with an average of 64 SNPs per linkage group (Table 3). The average distance between markers was 1.26 cM across 17 linkage groups, with 72% of the intervals smaller than 1.26 cM. Our SNP-based linkage map yielded a significantly smaller inter-marker distance compared to the previously published microsatellite-based maps derived from populations of similar size (6 cM in [7]; 4.7 cM in [34]). This is due primarily to a larger number of SNP markers placed on this genetic map. Given the estimated genome size of 1800 Mb for oil palm [4], the average recombination rate across all the linkage groups is 0.79 cM/Mb, comparable to the rates reported for maize (0.75 cM/Mb) and sorghum (1.52 cM/Mb) [35]. The availability of the oil palm genome sequence [4] allowed the comparison between the genetic and physical maps. The linkage groups were numbered according to the assigned chromosome number in oil palm. Two linkage groups corresponding to the same chromosome were designated ‘a’ and ‘b’ (Table 3). Our genetic map had an excess of linkage groups relative to the haploid chromosome number (n = 16) even though a significant number of markers were incorporated into the map. Failure to obtain the basic chromosome number is likely due to a relatively small F2 population size used in this work [36]. Other mapping studies that analyzed segregating populations of a similar size have also reported genetic maps with excess number of linkage groups [4,37]. Another difficulty associated with obtaining an equal number of linkage groups to chromosomes is that polymorphic markers detected may not be evenly distributed over the chromosomes [38]. Because our mapping population was derived from a selfpollination, only SNP loci that were heterozygous in the F1 parent were informative for the linkage map construction. This limitation caused a substantial number of SNP markers to be excluded from the analysis and plausibly led to a non-uniform distribution of markers along the chromosome. This could in turn result in the separation of physically linked markers (located on the same chromosome) into two linkage groups.

E

2.2. SNP location and analysis of synonymous and non-synonymous SNPs in oil palm coding regions

T

191 192

3

Please cite this article as: W. Pootakham, et al., Genomics (2015), http://dx.doi.org/10.1016/j.ygeno.2015.02.002

239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271

274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300

4

W. Pootakham et al. / Genomics xxx (2015) xxx–xxx

LG 2

LG 3

LG 4

LG 7

LG 8

LG 9

LG 5a

LG 6

R O

O

F

BW

LG 1

LG 11

LG 12

LG 13

LG 16

LG 15

Height Sep 2014

Height Mar 2013

Height Mar 2013

Height Sep 2012

N

Height Mar 2013

U

Height Sep 2012

C

O

LG 14

R

R

E

C

T

E

D

P

LG 10

Fig. 1. Oil palm linkage map and distribution of QTL associated with height and BW. Marker names are indicated to the right of the linkage group, and the map distances in cM are shown to the left of the linkage group. QTL for BW, Sep 2012 height, Mar 2013 height, Sep 2013 and Sep 2014 height are indicated (to the right of the linkage groups) by blue, red, magenta, orange and purple bars, respectively. For each QTL, the vertical bar and the line represent −1LOD and −2LOD confidence intervals, respectively. The circles indicate the positions of the QTL peaks for each trait. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

301 302 303 304

temporal changes of the genetic control underlying dynamic traits [40]. Some chromosome segments may be associated with plant height at different growth periods, and those unconditional QTL can be detected throughout various stages of plant development. On the other hand,

certain QTL are conditional and were found only at specific growth stages [41]. The QTL affecting trunk height located in the interval between EgSNPGBS26250 and EgSNPGBS17420 on LG 15 appeared to be unconditional and were detected by the MQM analyses in all four

Please cite this article as: W. Pootakham, et al., Genomics (2015), http://dx.doi.org/10.1016/j.ygeno.2015.02.002

305 306 307 308

W. Pootakham et al. / Genomics xxx (2015) xxx–xxx Table 3 Distribution of SNP markers on the linkage groups. Linkage group

Chromosome

Number of SNP markers

Length (cM)

Average distance between markers (cM)

t3:5 t3:6 t3:7 t3:8 t3:9 t3:10 t3:11 t3:12 t3:13 t3:14 t3:15 t3:16 t3:17 t3:18 t3:19 t3:20 t3:21 t3:22 t3:23

LG 1 LG 2 LG 3 LG 4 LG 5a LG 5b LG 6 LG 7 LG 8 LG 9 LG 10 LG 11 LG 12 LG 13 LG 14 LG 15 LG 16 Average Total

I II III IV V V VI VII VIII IX X XI XII XIII XIV XV XVI – –

73 137 93 134 25 16 106 55 71 56 50 45 64 47 54 38 21 63.82 1085

152.26 156.15 64.99 110.09 63.42 21.99 96.79 74.55 83.96 77.54 91.79 86.25 62.12 98.06 63.79 52.76 73.09 84.09 1429.60

2.11 1.15 0.71 0.83 2.64 1.47 0.92 1.38 1.20 1.41 1.87 1.96 0.99 2.13 1.20 1.43 3.65 1.26 –

The availability of the oil palm genome sequence [4] enabled us to anchor SNP markers associated with the QTL onto the physical map

t4:1 t4:2

Table 4 Interval mapping and Kruskal–Wallis analyses of QTL affecting agronomic traits.

326 327 328 329 330

t4:3

C

E

324 325

R

322 323

R

320 321

N C O

318 319

Trait

U

316 317

Linkage group

t4:4 t4:5 t4:6 t4:7 t4:8 t4:9 t4:10 t4:11 t4:12 t4:13

Height Sep 2012

t4:14 t4:15 t4:16

a

Mar 2013 Sep 2013 Mar 2014 Sep 2014 BW b c

14 15 14 15 15 NDd 15 3

We examined the association between the actual segregation of SNP markers closest to the QTL peaks and our traits of interest in the mapping population. The relationship between the genotypes of the linked markers and the average phenotypic values are displayed in Table 6. For EgSNPGBS16706, the individuals harboring homozygous AA genotype exhibited significantly shorter stature. The absence of the “B” allele in this case was strongly correlated with reduced vertical height. On the other hand, the presence of the “A” allele in markers EgSNPGBS17286, EgSNPGBS26250 and EgSNPGBS17400 was associated with decreased stem stature. Other markers (e.g. EgSNPGBS16599 and EgSNPGBS4453) appeared to exhibit a “co-dominant” effect, where the average phenotypes of the heterozygous “AB” individuals were intermediate between those of the two homozygous groups. SNP markers that are tightly linked to the QTL are potentially useful for marker-assisted breeding programs. Additional testing may be required to assess their transferability to different oil palm populations.

358 359

3. Conclusions

374

In this study, we demonstrated a successful application of the GBS approach to simultaneously discover and genotype SNP markers in oil palm. This technique proved to be rapid, cost-effective and highly reproducible in this species. Over 21,000 SNP markers were identified and 1085 markers were placed on the genetic map, spanning 1429.6 cM. This SNP-based linkage map was subsequently employed in QTL

375 376

O

333 334

314 315

357

R O

2.5. Candidate genes identified in QTL mapping

312 313

2.6. SNP markers associated with the QTL

P

332

310 311

T

331

measurements (Table 5). On the contrary, the minor QTL located between EgSNPGBS13134 and EgSNPGBS13000 (at 55.8 cM on LG 10) was found only at an earlier growth stage (Sep 2012). Using the genome-wide significant threshold value (α = 0.05), we were unable to detect a QTL from Mar 2014 height data. It is plausible that during this growth stage there are multiple QTL regulating plant stature, and each of those QTL exhibits such a small effect on the phenotypic variation that it cannot be detected with the population size used in this study. Additionally, we observed an unusually low level of rainfall between Jan 2014 and Mar 2014. This environmental stress may contribute significantly to the phenotypic variation observed in Mar 2014 and confound the effect of genetic factors on the phenotypic variance, making it more difficult to detect the QTL associated with the trait. A single QTL affecting BW was revealed on LG 3 at ~23 cM, flanked by EgSNPGBS4453 and EgSNPGBS4463. Both markers showed significant linkage to the BW trait, with LOD scores of 4.43–4.58. The proportion of phenotypic variance explained by the QTL was 18.6%. Subsequent MQM analysis did not uncover any additional QTL linked to BW. The results from the Kruskal–Wallis test recapitulated those from the interval-mapping based QTL analyses. For all traits, the markers flanking the QTL were also significant (P b 0.05) for the presence of a segregating QTL in the Kruskal–Wallis test (Tables 4, S2).

D

309

335 336

F

t3:3 t3:4

and explore a list of potential candidate genes that may be controlling our traits of interest (Table S2). We identified two candidate genes co-locating with SNP markers linked to the cluster of QTL affecting trunk height on LG14 (Fig. 2). Open reading frames encoding a putative DELLA protein GAI1 (p5_sc00033.V1.gene762) and a putative gibberellin 2-oxidase 2 (also known as 2-beta-dioxygenase 2; GA2OX2; p5_sc00033.V1.gene644) were located in the 0.5-LOD support interval between markers EgSNPGBS16582 and EgSNPGBS16523. The DELLA protein GAI1 and GA2OX2 have both been implicated in the regulation of plant height through the gibberellin signaling pathway [12]. DELLA proteins are key negative regulators in gibberellin signaling and are thought to function as transcription factors [42]. Gain-offunction mutations in the DELLA gene family result in dwarfism, whereas loss of function results in elongated stem and leaf [43]. A second candidate gene encodes GA2OX2, which is an enzyme that catalyzes the deactivation of metabolically active gibberellins. It plays a crucial role in the feedback regulation that modulates the concentration of bioactive hormone in plants [44]. While the major QTL associated with trunk height was positioned in the vicinity of the two genes implicated in plant height regulation, their involvement in controlling stem stature in oil palm is still highly speculative. Further studies are necessary to determine whether these two genes play a role in regulating trunk height in oil palm.

E

t3:1 t3:2

5

Genome-wide significant threshold (P b 0.05)

Peak LODa

4.14 4.14 4.23 4.23 4.15

5.15 4.68 5.17 4.40 4.96

4.16 4.20

4.27 4.67

Peak position (cM)

Flanking loci

PVEb (%)

Kruskal–Wallis test (Pc)

Left marker

Right marker

17.0 39.9 29.0 46.1 46.3

EgSNPGBS16710 EgSNPGBS17286 EgSNPGBS16599 EgSNPGBS17395 EgSNPGBS26250

EgSNPGBS16706 EgSNPGBS17301 EgSNPGBS16609 EgSNPGBS17400 EgSNPGBS17420

19.70 18.90 19.80 17.10 19.06

0.0001 0.0005 0.0005 0.001 0.0001

46.3 23.6

EgSNPGBS26250 EgSNPGBS4453

EgSNPGBS17420 EgSNPGBS4463

17.00 18.06

0.0005 0.0001

The LOD value at the position of peak likelihood of the QTL. Phenotypic variance explained by each QTL. P: significance level.

Please cite this article as: W. Pootakham, et al., Genomics (2015), http://dx.doi.org/10.1016/j.ygeno.2015.02.002

337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356

360 361 362 363 364 365 366 367 368 369 370 371 372 373

377 378 379 380

6 t5:1 t5:2 t5:3

W. Pootakham et al. / Genomics xxx (2015) xxx–xxx

Table 5 Multiple QTL mapping of QTL affecting height and BW. Trait

Linkage group

Peak LODa

Cofactor

Peak position (cM)

t5:4

Height Sep 2012

t5:16 t5:17 t5:18

a

Sep 2013 Sep 2014 BW

EgSNPGBS16599 EgSNPGBS17400 EgSNPGBS17400

55.8 17.8 46.3 29.0 46.3 29.0 46.3 29.0 46.3

393

4.1. Plant materials and field experiments

394

The mapping population consisted of 108 F2 progeny derived from a self-fertilization of clone A43/9 (F1 progeny), which was obtained from a cross between a dura from Deli (clone A; tall) and a Dumpy

397

4.2. DNA extraction

411

T C

E

R

Chromosome 14

(Mb) 11.9 EgSNPGBS16682

N

28 29

40

398 399 400 401 402 403 404 405 406 407 408 409 410

Leaflet samples were collected from 108 F2 progeny, the Deli female 412 parental strains (clone A) and clone A43/9 (F1 progeny) for DNA 413

C

12 17 20

O

(cM)

R

Linkage group 14

U

395 396

O

4. Materials and methods

388 389

AVROS pisifera (short). The F2 population segregated for BW and the height character (Fig. S2). The seedlings were germinated in 2006 and transferred to soil in 2007. Fourteen-month-old seedlings were soilplanted in a triangular system at spacing of 9 × 9 × 9 m. The population has been maintained at the experimental field at the Golden Tenera Limited Partnership (Krabi, Thailand). The environmental conditions (e.g., sunlight, temperature, soil quality, and rain distribution) appeared to be homogeneous throughout the plot. All progeny were treated identically in terms of fertilizer applications and pest control. Trunk height was recorded from the surface of the soil to the base of the 41st frond, using a long, extendable measuring pole. The measurement was taken every six months from September 2012 to September 2014. The weight of harvested fruit bunches were recorded in 12-month periods (January to December) from 2011 to 2013.

E

392

386 387

10.82 19.34 20.88 21.23 18.47 16.72 22.22 18.20 19.07

EgSNPGBS4453

390 391

384 385

EgSNPGBS13000 EgSNPGBS16706 EgSNPGBS17420 EgSNPGBS16609 EgSNPGBS17420 EgSNPGBS16609 EgSNPGBS17420 EgSNPGBS16609 EgSNPGBS17420

The LOD value at the position of peak likelihood of the QTL. Phenotypic variance explained by each QTL. A single QTL was detected. Please refer to Table 4 for details.

mapping of agronomic traits. Three QTL governing plant statures were detected on LG 10, 14 and 15 while a single QTL associated with BW was identified on LG 3. Two interesting candidate genes encoding proteins implicated in plant height regulation were co-located with the QTL affecting plant height on LG 14. We also identified SNP markers tightly linked to BW and the height character. These markers will be useful for selecting individual palms with desirable characteristics in breeding programs. Finally, our high-density map will contribute to a fundamental knowledge of genome structure and will be valuable for mapping other economically important genes for marker-assisted selections.

382 383

EgSNPGBS13134 EgSNPGBS16710 EgSNPGBS26250 EgSNPGBS16582 EgSNPGBS26250 EgSNPGBS16582 EgSNPGBS26250 EgSNPGBS16582 EgSNPGBS26250

R O

c

4.39 7.35 7.84 6.93 6.14 5.39 6.92 5.78 6.03

P

b

EgSNPGBS16076

Right marker

D

381

Mar 2013

10 14 15 14 15 14 15 14 15 Single QTLc

Left marker

F

t5:5 t5:6 t5:7 t5:8 t5:9 t5:10 t5:11 t5:12 t5:13 t5:14 t5:15

PVEb (%)

Flanking loci

7.5 EgSNPGBS16582 7.0

DELLA protein GAl1 (putative)

6.3

GA20X2 (putative)

5.9 EgSNPGBS16523

Fig. 2. Candidate genes linked to the QTL controlling vertical height. A comparison of corresponding regions between EgSNPGBS16582 and EgSNPGBS16523 from the genetic and physical maps. The genetic distance (in cM) is displayed to the left of the genetic map, whereas the physical distance (in Mb) is shown to the right of the physical map. Orange circles indicate the positions of QTL peaks while the highlighted vertical bar represents the 0.5-LOD support interval of the QTL located at 29 cM. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Please cite this article as: W. Pootakham, et al., Genomics (2015), http://dx.doi.org/10.1016/j.ygeno.2015.02.002

W. Pootakham et al. / Genomics xxx (2015) xxx–xxx

t6:26 t6:27

a

Marker

Genotype

Mean ± SD

EgSNPGBS16706 (LG 14)

AA AB BB AA AB BB AA AB BB AA AB BB AA AB BB AA AB BB AA AB BB

0.737 ± 0.035 0.906 ± 0.028 0.973 ± 0.031 0.841 ± 0.031 0.827 ± 0.028 1.044 ± 0.038 0.953 ± 0.039 1.083 ± 0.032 1.223 ± 0.035 1.060 ± 0.040 1.033 ± 0.029 1.273 ± 0.043 1.334 ± 0.052 1.370 ± 0.038 1.691 ± 0.059 1.823 ± 0.068 1.869 ± 0.050 2.247 ± 0.075 159.30 ± 5.94 137.50 ± 5.00 118.60 ± 5.74

EgSNPGBS17286 (LG 15) Mar 2013

EgSNPGBS16599 (LG 14) EgSNPGBS26250 (LG 15)

Sep 2013

EgSNPGBS17400 (LG 15)

Sep 2014

EgSNPGBS17400 (LG 15)

BWb

EgSNPGBS4453 (LG 3)

421

4.3. GBS library construction and Ion Proton sequencing

422

446 447

To prepare the reduced representation libraries for sequencing, we followed the modified GBS protocol using two enzymes (PstI/MspI) and a Y-adapter by Mascher et al. [25]. Two sets of adapters compatible with the primers of Ion Torrent Proton sequencing platform were synthesized by IDT (Singapore). To enable multiplex sequencing of the libraries, the forward adapters contained 9-bp unique barcodes in addition to 21 bp of the Ion Forward adapter and a PstI restriction site. The reverse adapter (Y-adapter) contained the Ion reverse priming site and was designed such that amplification of the more common MspI–MspI fragments was prevented [25]. The digestion of genomic DNA and the adapter ligation were performed as described in Mascher et al. [25]. We examined the size distribution of the final PCR-amplified library fragments using the BioAnalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA) and noticed the prevalence of products between 100 and 200 bp. In order to take full advantage of the Ion PI™ Template OT2 200 Kit (Life Technologies, Grand Island, NY, USA), which supports 200-base read libraries, we size-selected fragments of ~ 270 bp (combined length of the forward and reverse adapter sequences was ~ 70 bp) using the E-Gel® SizeSelect™ Agarose Gels (Life Technologies, Grand Island, NY, USA). The libraries were subsequently quantified using the 2100 Bioanalyzer High Sensitivity DNA kit (Agilent Technologies, Santa Clara, CA, USA) and diluted to 100 pM for emulsion PCR amplification. We multiplexed between 12 and 14 samples per run. The libraries were sequenced on the Ion Proton PI™ Chips according to the manufacturer's protocol (Life Technologies, Grand Island, NY, USA).

448

4.4. Sequence data analysis and SNP identification

449

Raw reads were de-multiplexed according to their barcodes and the adapter/barcode sequences were removed using the standard Ion

427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445

450

C

E

R

425 426

R

423 424

N C O

418 419

U

416 417

T

420

extraction. Unfortunately, the Dumpy AVROS plant (male parent) was no longer available for sample collection. Fresh samples were frozen in liquid nitrogen and preserved at − 80 °C until use. Frozen tissues were pulverized in liquid nitrogen and DNA was isolated using a DNeasy Plant Mini Kit (Qiagen, Carlsbad, CA, USA). DNA quantity was assessed by the NanoDrop ND-1000 Spectrophotometer and the samples were diluted to 20 ng/μL for library construction.

453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468

Since the mapping population was derived from a self-pollination of the F1 progeny, we first selected SNPs that were heterozygous in the F1 (clone A43/9) to ensure that they segregated in the F2 progeny. Moreover, we included only markers that were homozygous in the female parent (clone A) for mapping, so that the information on paternally inherited alleles could be inferred. The informative markers were expected to segregate in a 1:2:1 ratio and those that deviated significantly from the expected Mendelian segregation ratios (χ2 test P-value b 0.01) were removed from further analysis. Linkage analysis was performed with JoinMap v 3.0 [45] using the parameters set for the F2 population type. Initial assignment to linkage groups was based on the logarithm of the odds (LOD) threshold of 7.0 for each marker pair. We used linkages with a recombination rate (REC) b 0.4, a map LOD value of 0.05 and a goodness-of-fit jump threshold of 5 for inclusion into the map and for the calculation of the linear order of the markers within a linkage group. Loci that were completely linked (i.e. displayed identical segregation pattern) were excluded from the data set before the marker order within the groups was determined. Recombination frequencies were converted to map distances in centiMorgans (cM) using the Kosambi mapping function [46].

469 470

4.6. QTL mapping

489

QTL analyses were performed using R/qtl [47]. IM was initially performed to identify potential QTL, followed by MQM using the stepwiseqtl function, which performs forward and backward selection with a check for interaction to identify multiple QTL. Based on the results from the IM, the maximum number of QTL to search per trait was set to five. QTL model penalties were calculated using the calc.penalties function applied to the result of using the scantwo function with 1000 permutations. Analyses were carried out using a mapping step size of one. The empirical genome-wide LOD threshold values were determined by conducting a 1000 permutation test at α = 0.05 [48]. The QTL was determined as significant if its LOD score was higher than the genome-wide threshold. QTL peaks that were detected within 10 cM of each other and shared overlapping 1-LOD support intervals were considered as a single QTL. Finally, the nonparametric Kruskal–Wallis test was individually applied to each segregating locus. Five sets of height data (Sep 2012, Mar 2013, Sep 2013, Mar 2014 and Sep 2014) were analyzed separately. For BW (2011, 2012 and 2013), the averages of three replicates were used for QTL analyses. The broad sense heritabilities for height and BW were calculated using the lmer function in the lme4 R package [49]. Supplementary data to this article can be found online at http://dx. doi.org/10.1016/j.ygeno.2015.02.002.

490

P

Height was measured in meters. BW was measured in kilograms.

451 452

4.5. Construction of the linkage map

D

b

F

Heighta Sep 2012

O

Trait

t6:4 t6:5 t6:6 t6:7 t6:8 t6:9 t6:10 t6:11 t6:12 t6:13 t6:14 t6:15 t6:16 t6:17 t6:18 t6:19 t6:20 t6:21 t6:22 t6:23 t6:24 t6:25

R O

t6:3

414 415

Torrent™ Suite Software. Clean reads were mapped to the oil palm reference genome [4] using the Ion Torrent™ Suite Software Alignment Plugin (Torrent Mapping Alignment Program Version 4.0.6) and the variants were called using the Ion Torrent VariantCaller (GATK v1.4749-g8b996e2; Life Technologies, Grand Island, NY, USA). The following (default) parameter setting was applied: minimum sequence match on both sides of the variants — 5, minimum support for a variant to be evaluated — 6, minimum frequency of the variant to be reported — 0.15, and maximum relative strand bias — 0.8. The uniformity of base coverage was determined using the Ion Torrent Suite Software Coverage Analysis Plugin (Life Technologies, Grand Island, NY, USA), and it is defined as the percentage of bases in all targeted regions (or whole genome) covered by at least 0.2× the average base coverage depth. To analyze the types of mutations from the GBS data caused by single nucleotide variations in the coding regions, we used the program SNPEff [29] with oil palm reference genome sequence and GFF annotation input files [4].

Table 6 QTL effects expressed as differences in average phenotypes for each genotype group.

E

t6:1 t6:2

7

Please cite this article as: W. Pootakham, et al., Genomics (2015), http://dx.doi.org/10.1016/j.ygeno.2015.02.002

471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488

491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511

W. Pootakham et al. / Genomics xxx (2015) xxx–xxx

523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 660

[1] S. Mekhilef, S. Siga, R. Saidur, A review on palm oil biodiesel as a source of renewable fuel, Renew. Sust. Energ. Rev. 15 (4) (2011) 1937–1949. [2] S.K. Gupta, Technological Innovations In Major World Oil Crops, Volume 1: Breeding, Springer, 2011. [3] A.C. Soh, G. Wong, T.Y. Hor, C.C. Tan, P.S. Chew, Oil palm genetic improvement, Plant Breeding Reviews, John Wiley & Sons, Inc., 2010, pp. 165–219. [4] R. Singh, M. Ong-Abdullah, E.T. Low, M.A. Manaf, R. Rosli, R. Nookiah, L.C. Ooi, S.E. Ooi, K.L. Chan, M.A. Halim, et al., Oil palm genome sequence reveals divergence of interfertile species in old and new worlds, Nature 500 (7462) (2013) 335–339. [5] R.H.V. Corley, C.H. Lee, The physiological basis for genetic improvement of oil palm in Malaysia, Euphytica 60 (1992) 179–184. [6] L. Davidson, Management for efficient cost-effective and productive oil palm plantations, PORIM International Palm Oil Conference, Palm Oil Research Institute of Malaysia, Kuala Lumpur, 1993, pp. 153–167. [7] N. Billotte, M.F. Jourjon, N. Marseillac, A. Berger, A. Flori, H. Asmady, B. Adon, R. Singh, B. Nouy, F. Potier, et al., QTL detection by multi-parent linkage mapping in oil palm (Elaeis guineensis Jacq.), Theor. Appl. Genet. 120 (8) (2010) 1673–1687. [8] S. Jeennor, H. Volkaert, Mapping of quantitative trait loci (QTLs) for oil yield using SSRs and gene-based markers in African oil palm (Elaeis guineensis Jacq.), Tree Genet. Genomes 10 (1) (2014) 1–14. [9] P.D. Turner, R.A. Gillbanks, Oil Palm Cultivation And Management, 1st ed. The Incorporated Society of Planter, Kuala Lampur, 1974. [10] S. Hadi, D. Ahmad, F.B. Akande, Determination of the bruise indexes of oil palm fruits, J. Food Eng. 95 (2009) 322–326. [11] C. Bakoume, C. Louise, Breeding for oil yield and short oil palms in the second cycle of selection at La Dibamba (Cameroon), Euphytica 156 (2007) 195–202. [12] T.P. Sun, F. Gubler, Molecular mechanism of gibberellin signaling in plants, Annu. Rev. Plant Biol. 55 (2004) 197–223. [13] W. Spielmeyer, M.H. Ellis, P.M. Chandler, Semidwarf (sd-1), “green revolution” rice, contains a defective gibberellin 20-oxidase gene, Proc. Natl. Acad. Sci. U. S. A. 99 (13) (2002) 9043–9048. [14] S.G. Thomas, T.P. Sun, Update on gibberellin signaling. A tale of the tall and the short, Plant Physiol. 135 (2) (2004) 668–676. [15] S. Pearce, R. Saville, S.P. Vaughan, P.M. Chandler, E.P. Wilhelm, C.A. Sparks, N. Al-Kaff, A. Korolev, M.I. Boulton, A.L. Phillips, et al., Molecular characterization of Rht-1 dwarfing genes in hexaploid wheat, Plant Physiol. 157 (4) (2011) 1820–1831. [16] N. Billotte, A.M. Risterucci, E. Barcelos, J.L. Noyer, P. Amblard, F.C. Baurens, Development, characterisation, and across-taxa utility of oil palm (Elaeis guineensis Jacq.) microsatellite markers, Genome 44 (2001) 413–425. [17] N. Billotte, N. Marseillac, A.M. Risterucci, B. Adon, P. Brottier, F.C. Baurens, R. Singh, A. Herran, H. Asmady, C. Billot, et al., Microsatellite-based high density linkage map in oil palm (Elaeis guineensis Jacq.), Theor. Appl. Genet. 110 (4) (2005) 754–765. [18] I.Y. Choi, D.L. Hyten, L.K. Matukumalli, Q. Song, J.M. Chaky, C.V. Quigley, K. Chase, K.G. Lark, R.S. Reiter, M.S. Yoon, et al., A soybean transcript map: gene distribution, haplotype and single-nucleotide polymorphism analysis, Genetics 176 (1) (2007) 685–696. [19] A. Leonforte, S. Sudheesh, N. Cogan, P. Salisbury, M. Nicolas, M. Materne, J. Forster, S. Kaur, SNP marker discovery, linkage map construction and identification of QTLs for enhanced salinity tolerance in field pea (Pisum sativum L.), BMC Plant Biol. 13 (1) (2013) 1–14. [20] M. Ganal, T. Altmann, M. Roder, SNP identification in crop plants, Curr. Opin. Plant Biol. 12 (2009) 211–217. [21] M.R. Miller, J.P. Dunham, A. Amores, W.A. Cresko, E.A. Johnson, Rapid and costeffective polymorphism identification and genotyping using restriction site associated DNA (RAD) markers, Genome Res. 17 (2) (2007) 240–248. [22] N.A. Baird, P.D. Etter, T.S. Atwood, M.C. Currey, A.L. Shiver, Z.A. Lewis, E.U. Selker, W.A. Cresko, E.A. Johnson, Rapid SNP discovery and genetic mapping using sequenced RAD markers, PLoS ONE 3 (10) (2008) e3376. [23] R.J. Elshire, J.C. Glaubitz, Q. Sun, J.A. Poland, K. Kawamoto, E.S. Buckler, S.E. Mitchell, A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species, PLoS ONE 6 (5) (2011) e19379.

C

E

R

R

O

C

N

U

518 519

F

References

517

O

522

515 516

R O

520 521

We would like to thank Mr. Anek Limsrivilai (Golden Tenera Limited Partnership, Krabi, Thailand) for providing biological samples and useful discussions on oil palm. We would also like to thank Ms. Yuppadee Pathpo and Mr. Prapatsorn Pethpo for their assistance with phenotypic data and leaf sample collection. We gratefully thank Dr. Kristian Ridley and Dr. Suthee Benjaphokee for their valuable advice regarding genotyping-by-sequencing technique. This work was supported by National Science and Technology Development Agency (NSTDA), Thailand.

P

513 514

[24] J.A. Poland, P.J. Brown, M.E. Sorrells, J.L. Jannink, Development of high-density genetic maps for barley and wheat using a novel two-enzyme genotyping-by-sequencing approach, PLoS One 7 (2) (2013) e32253. [25] M. Mascher, S. Wu, P.S. Amand, N. Stein, J. Poland, Application of genotyping-bysequencing on semiconductor sequencing platforms: a comparison of genetic and reference-based marker ordering in barley, PLoS ONE 8 (10) (2013) e76925. [26] A. Riju, A. Chandrasekar, V. Arunachalam, Mining for single nucleotide polymorphisms and insertions/deletions in expressed sequence tag libraries of oil palm, Bioinformation 2 (4) (2007) 128–131. [27] W. Pootakham, P. Uthaipaisanwong, D. Sangsrakru, T. Yoocha, S. Tragoonrung, S. Tangphatsornruang, Development and characterization of single-nucleotide polymorphism markers from 454 transcriptome sequences in oil palm (Elaeis guineensis), Plant Breed. 132 (2013) 711–717. [28] N.C. Ting, J. Jansen, S. Mayes, F. Massawe, R. Sambanthamurthi, O.L. Cheng-Li, C.W. Chin, X. Arulandoo, T.Y. Seng, S.S. Alwee, et al., High density SNP and SSR-based genetic maps of two independent oil palm hybrids, BMC Genomics 15 (1) (2014) 309. [29] P. Cingolani, A. Platts, L. Wang le, M. Coon, T. Nguyen, L. Wang, S.J. Land, X. Lu, D.M. Ruden, A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3, Fly (Austin) 6 (2) (2012) 80–92. [30] R.M. Clark, G. Schweikert, C. Toomajian, S. Ossowski, G. Zeller, P. Shinn, N. Warthmann, T.T. Hu, G. Fu, D.A. Hinds, et al., Common sequence polymorphisms shaping genetic diversity in Arabidopsis thaliana, Science 317 (5836) (2007) 338–342. [31] W. Pootakham, J.R. Shearman, P. Ruang-Areerate, C. Sonthirod, D. Sangsrakru, N. Jomchai, T. Yoocha, K. Triwitayakorn, S. Tragoonrung, S. Tangphatsornruang, Large-scale SNP discovery through RNA sequencing and SNP genotyping by targeted enrichment sequencing in cassava (Manihot esculenta Crantz), PLoS One 9 (12) (2014) e116028. [32] I.-S. Jeong, U.-H. Yoon, G.-S. Lee, H.-S. Ji, H.-J. Lee, C.-D. Han, J.-H. Hahn, G. An, T.-H. Kim, SNP-based analysis of genetic diversity in anther-derived rice by whole genome sequencing, Rice 6 (1) (2013) 6. [33] M. Lorieux, X. Perrier, B. Goffinet, C. Lanaud, D.G. de León, Maximum-likelihood models for mapping genetic markers showing segregation distortion. 2. F2 populations, Theor. Appl. Genet. 90 (1) (1995) 81–89. [34] T.-Y. Seng, S.H. Mohamed Saad, C.-W. Chin, N.-C. Ting, R.S. Harminder Singh, F. Qamaruz Zaman, S.-G. Tan, S.S.R. Syed Alwee, Genetic linkage map of a high yielding FELDA deli × yangambi oil palm cross, PLoS ONE 6 (11) (2011) e26593. [35] I.R. Henderson, Control of meiotic recombination frequency in plant genomes, Curr. Opin. Plant Biol. 15 (5) (2012) 556–561. [36] LdCe Silva, C.D. Cruz, M.A. Moreira, EGd Barros, Simulation of population size and genome saturation level for genetic mapping of recombinant inbred lines (RILs), Genet. Mol. Biol. 30 (2007) 1101–1108. [37] R. Singh, S.G. Tan, J.M. Panandam, R.A. Rahman, L.C. Ooi, E.T. Low, M. Sharma, J. Jansen, S.C. Cheah, Mapping quantitative trait loci (QTLs) for fatty acid composition in an interspecific cross of oil palm, BMC Plant Biol. 9 (2009) 114. [38] A.H. Paterson, Making genetic maps, in: A.H. Paterson (Ed.), Genome Mapping In Plants, R. G. Landes Company, San Diego, California, Austin, Texas, 1996, pp. 23–29. [39] J. Van Ooijen, LOD significance thresholds for QTL analysis in experimental populations of diploid species, Heredity 83 (1999) 613–624. [40] L. Busemeyer, A. Ruckelshausen, K. Moller, A.E. Melchinger, K.V. Alheit, H.P. Maurer, V. Hahn, E.A. Weissmann, J.C. Reif, T. Wurschum, Precision phenotyping of biomass accumulation in triticale reveals temporal genetic patterns of regulation, Sci. Rep. 3 (2013) 2442. [41] G. Yang, Y. Xing, S. Li, J. Ding, B. Yue, K. Deng, Y. Li, Y. Zhu, Molecular dissection of developmental behavior of tiller number and plant height and their relationship in rice (Oryza sativa L.), Hereditas 143 (2006) 236–245. [42] A.L. Silverstone, C.N. Ciampaglio, T. Sun, The Arabidopsis RGA gene encodes a transcriptional regulator repressing the gibberellin signal transduction pathway, Plant Cell 10 (2) (1998) 155–169. [43] A. Ikeda, M. Ueguchi-Tanaka, Y. Sonoda, H. Kitano, M. Koshioka, Y. Futsuhara, M. Matsuoka, J. Yamaguchi, Slender rice, a constitutive gibberellin response mutant, is caused by a null mutation of the SLR1 gene, an ortholog of the heightregulating gene GAI/RGA/RHT/D8, Plant Cell 13 (5) (2001) 999–1010. [44] P. Hedden, A.L. Phillips, Gibberellin metabolism: new insights revealed by the genes, Trends Plant Sci. 5 (12) (2000) 523–530. [45] J.W. Van Ooijen, R.E. Voorrips, JoinMap® 3.0, Software For The Calculation Of Genetic Linkage Maps, Plant Research International, Wageningen, the Netherlands, 2001. [46] D.D. Kosambi, The estimation of map distance from recombination values, Ann. Eugen. 12 (1944) 172–175. [47] K.W. Broman, H. Wu, Ś. Sen, G.A. Churchill, R/qtl: QTL mapping in experimental crosses, Bioinformatics 19 (7) (2003) 889–890. [48] G.A. Churchill, R.W. Doerge, Empirical threshold values for quantitative trait mapping, Genetics 138 (3) (1994) 963–971. [49] D.M. Bates, lme4: Mixed-Effects Modeling With R, Springer, 2010.

D

Acknowledgments

T

512

E

8

Please cite this article as: W. Pootakham, et al., Genomics (2015), http://dx.doi.org/10.1016/j.ygeno.2015.02.002

584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 Q5 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659

Genome-wide SNP discovery and identification of QTL associated with agronomic traits in oil palm using genotyping-by-sequencing (GBS).

Oil palm has become one of the most important oil crops in the world. Marker-assisted selections have played a pivotal role in oil palm breeding progr...
2MB Sizes 3 Downloads 19 Views