School of Agriculture and Food Sciences, University of Queensland
Presentation Title
From small and simple to big and complex: what can short sequence reads tell us about genomes?
Abstract
The genome sequence of an organism provides the basis for gene discovery, the analysis of genetic variation and the association of genomic variation with heritable traits. Second generation sequencing technologies and applied bioinformatics tools can provide unprecedented insight into genome structure and variation. This technology is still in its infancy, yet it is already having a huge impact on our understanding of biological processes. We have developed and applied novel bioinformatics tools and approaches for the analysis of Illumina second generation sequence data, with the aim of understanding large and complex genomes.
The genome of bread wheat (Triticum aestivum) is greater than 16 Gbp in size and consists predominantly of repetitive elements. There has been some debate over whether second generation sequencing can be applied to such a large and complex genome. We have reduced genome sequence complexity by sequencing isolated chromosome arms, with the aim of assembling low copy and genic regions. Our approach enabled the assembly of all genes, as well as a substantial portion of the repetitive fraction, for these chromosomes. The syntenic relationship between wheat and a sequenced close relative, Brachypodium distachyon, has been used to produce annotated syntenic builds, in which the majority of genes have been placed in an approximate order and orientation. Our results suggest that the sequencing of isolated chromosome arms can provide valuable information on the gene content of wheat, and that these assemblies can be applied to genome-wide SNP discovery, the identification of candidate genes associated with genetically mapped traits and the investigation of genome evolution in this important crop.
Our research in canola (Brassica napus) is more advanced: we have identified more than 1 million SNPs across the polyploid genome, with a validation accuracy of 96%. This information has been integrated with mapped genetic marker and trait information within searchable databases. The resulting tools enable the association of candidate genes with trait-associated genetic markers and the study of Brassica genome evolution under selection. Our results demonstrate that the challenges of analysing very large and complex genomes using short read sequences have been at least partially overcome, and that this technology has the potential to revolutionise our understanding of crop genomes, with applications for future crop improvement.
Biographical
David Edwards gained an Honours degree in agriculture from the University of Nottingham and a PhD from the Department of Plant Science, University of Cambridge. He has held positions within academia (University of Adelaide, Australia; University of Cambridge, UK; and McGill University, Canada), government (Long Ashton Research Centre, UK; and Department of Primary Industries, Victoria, Australia) and industry (ICI Seeds, UK). David moved to The University of Queensland, Australia, in 2007 as an Associate Professor. He is a Principal Research Fellow and leads the bioinformatics focus group within the Australian Centre for Plant Functional Genomics. His research interests include applied agricultural biotechnology, the structure and expression of plant genomes, the discovery and application of molecular genetic markers, and applied bioinformatics, with a focus on crops and, more recently, metagenomic populations.
Further Information
The Applied Bioinformatics Group
Australian Centre for Plant Functional Genomics