THE FIRST SUGARCANE GENOME ASSEMBLY: HOW CAN WE USE IT?
By KAREN AITKEN; PAUL BERKMAN; ANNE RAE
THE HUMAN GENOME Project was started in 1990 and was expected to take 15 years at a cost of $3 billion, funded by the US Department of Energy. It brought together a consortium of scientists from the UK, France, Australia, China and USA, with contributions from many others. Owing to massive advances in sequencing and computing technology during the course of the project, the first rough draft was completed ahead of schedule in 2000. The essentially complete genome was announced in April 2003, two years earlier than planned, and the sequence of the last chromosome was published in Nature in 2006. The availability of this sequence has had a major impact on many fields of human biology, from molecular medicine to human evolution.
Since then, the genomes of many other animals and of over 100 plant species, ranging from algae to trees, have been sequenced. Arabidopsis thaliana (thale cress) was the first plant to be sequenced because of its small genome and its importance as a research tool. Within the grasses, the genome sequences of 15 species, mostly important crop plants, have been completed. Another important grass crop, wheat, is a polyploid and is being sequenced chromosome by chromosome, but its sequence has yet to be finished.
Sugarcane, like many other crop plants, is a polyploid, with its basic genome represented in more than 10 copies. This polyploidy poses a major challenge for sequencing, as the sugarcane genome is estimated to be around 10 gigabases (10 000 000 000 base pairs) long. This is more than 10× larger than the genome of sorghum, its closest relative, which has been sequenced, and roughly 3× larger than the human genome of about 3 gigabases. Despite the complexity created by numerous copies of each chromosome, we have generated a sequence assembly for sugarcane that is proving immensely useful to the sugarcane research community.
The assembly allows comparative analysis with other sequenced genomes, which gives access to information about genes that affect traits of agronomic importance. For example, the assembly has been used to identify genes linked to resistance to diseases including smut and rust. It has also been used by researchers working on yellow canopy syndrome to search for pathogens that may be involved in this disease. In this paper we present some of the early applications in which the sugarcane genome sequence has provided novel solutions, and describe a vision for the future use of this valuable resource.