Sequence Assembly

Current technology does not allow us to sequence a genome from end to end without first breaking the DNA into smaller, more manageable pieces for sequencing. Sequence assembly is the process of putting these smaller pieces of sequence back together in the correct order.
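To make the idea of putting pieces back together concrete, here is a minimal sketch of a greedy overlap-merge assembler. It is a toy illustration only, not the algorithm used by any production assembler: the function names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical, and the sketch assumes error-free reads, a single strand, and no repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Toy assumptions: reads are error-free, all from the same strand,
    and the target contains no repeats longer than `min_len`.
    """
    # Drop exact duplicates (keeping order) and reads contained in other reads.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != other and r in other for other in contigs)]

    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            # No remaining overlaps: what is left are separate contigs.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


if __name__ == "__main__":
    # Three overlapping reads drawn from the toy sequence AGCTTGACCGTA.
    print(greedy_assemble(["TTGACCG", "ACCGTA", "AGCTTGA"]))
```

Real assemblers must also handle sequencing errors, reverse-complemented reads, and repeats, which is why additional information such as read pairs and clone maps is needed.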

The two most common approaches to sequencing genomes are clone-based (hierarchical) sequencing and whole-genome shotgun (WGS) sequencing. In the clone-based method, genomic DNA is cut into large random segments (150-200 kb), and each segment is inserted into a Bacterial Artificial Chromosome (BAC). Each BAC is then sequenced using a shotgun approach in which the BAC DNA is sheared into smaller random pieces (2-4 kb). Two reads are generated from each piece, one from each end; these are referred to as read pairs. The reads (~4000 per BAC), along with their orientation and distance information, are used to assemble the BAC. The completed sequence of a genome is obtained by using a BAC clone map to order and orient the BAC clone sequences into a contiguous piece, chromosome, or genome.
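The orientation and distance information carried by read pairs can be used as a sanity check on an assembly. The sketch below is a hypothetical illustration, not the check performed by any specific assembler: the `PlacedRead` type and `pair_is_consistent` function are invented for this example. It assumes the two reads of a pair come from opposite strands and point toward each other, and it uses the 2-4 kb piece size mentioned above as the allowed span.

```python
from dataclasses import dataclass


@dataclass
class PlacedRead:
    """A read placed on an assembled contig (hypothetical structure)."""
    start: int      # leftmost contig coordinate, 0-based
    end: int        # one past the rightmost contig coordinate
    forward: bool   # True if the read matches the contig's forward strand


def pair_is_consistent(a: PlacedRead, b: PlacedRead,
                       min_insert: int = 2000,
                       max_insert: int = 4000) -> bool:
    """Check that a read pair's placement agrees with its source piece.

    Assumption: reads from the two ends of one piece face each other,
    so the left read is forward and the right read is reverse, and the
    outer distance between them falls within the expected piece size.
    """
    left, right = (a, b) if a.start <= b.start else (b, a)
    if not (left.forward and not right.forward):
        return False
    span = right.end - left.start
    return min_insert <= span <= max_insert


if __name__ == "__main__":
    good = pair_is_consistent(PlacedRead(1000, 1600, True),
                              PlacedRead(3400, 4000, False))
    stretched = pair_is_consistent(PlacedRead(1000, 1600, True),
                                   PlacedRead(8400, 9000, False))
    print(good, stretched)
```

A pair that fails this kind of check may indicate a misassembly, for example a collapsed repeat, although the thresholds here are illustrative only.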

In the whole-genome shotgun sequencing method, genomic DNA is sheared and sequenced as with BAC sequencing, but millions of reads are generated and used in the assembly of the genome. Mapping data and comparison to related genomes help in assembling the genome correctly. Both methods of assembly can be complicated by sequencing errors, clone biases, repetitive genome structures, and polymorphisms.
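In either method, the number of reads relative to the size of the target determines how many times, on average, each base is sequenced. The sketch below works through that arithmetic. The function names are hypothetical, and the 500 bp read length is an assumption chosen purely for illustration (this page does not state a read length). The covered-fraction estimate uses the standard idealization that reads land uniformly at random, which real data only approximate because of clone biases and repeats.

```python
import math


def sequencing_depth(num_reads: int, read_length: int, target_size: int) -> float:
    """Average number of times each base is sequenced (depth = N * L / G)."""
    return num_reads * read_length / target_size


def expected_covered_fraction(depth: float) -> float:
    """Expected fraction of bases covered by at least one read.

    Assumes reads are placed uniformly at random, so the chance that a
    given base is missed by every read is approximately exp(-depth).
    """
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    # ~4000 reads over a 200 kb BAC, with an ASSUMED read length of 500 bp.
    depth = sequencing_depth(num_reads=4000, read_length=500, target_size=200_000)
    print(f"depth: {depth:.1f}x")
    print(f"expected covered fraction: {expected_covered_fraction(depth):.6f}")
```

The same functions apply to a whole-genome shotgun project by substituting the genome size and total read count.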

At WUGSC, we select a sequencing strategy (hierarchical, WGS, or a combination of the two) based on genome complexity and the sequencing objective. For assembly of BAC-sized clones, we use the Phrap assembler developed by Phil Green. WGS assemblies of large genomes are generated with PCAP, developed in collaboration with Xiaoqiu.

 
Sequence Assembly Information
Sequence Annotation Projects
 
Sequence Assembly Links
EMBL-EBI
Ensembl Genome Browser
ExPASy Proteomics Server
Generic Model Organism Database Construction Set
National Center for Biotechnology Information
Pfam
The Wellcome Trust Sanger Institute
WormBase
Sequence Assembly Contact
Patrick Minx
Group Leader of Sequencing Assembly

Shiaw-Pyng Yang
Research Asst. Prof. of Genetics

Washington University School of Medicine
The Genome Center
4444 Forest Park Ave
St. Louis, Missouri 63108
USA
 
Selected Sequence Assembly Publications

Title
EAnnot: a genome annotation tool using experimental evidence.
Authors
Ding L, Sabo A, Berkowicz N, Meyer RR, Shotland Y, Johnson MR, Pepin KH, Wilson RK, Spieth J.
Journal
Genome Res. 2004 Dec;14(12):2503-9.

Title
Comparison of genome degradation in Paratyphi A and Typhi, human-restricted serovars of Salmonella enterica that cause typhoid.
Authors
McClelland M, Sanderson KE, Clifton SW, Latreille P, Porwollik S, Sabo A, Meyer R, Bieri T, Ozersky P, McLellan M, Harkins CR, Wang C, Nguyen C, Berghoff A, Elliott G, Kohlberg S, Strong C, Du F, Carter J, Kremizki C, ...
Journal
Nat Genet. 2004 Dec;36(12):1268-74. Epub 2004 Dec.

Title
WormBase: a multi-species resource for nematode biology and genomics.
Authors
Harris TW, Chen N, Cunningham F, Tello-Ruiz M, Antoshechkin I, Bastiani C, Bieri T, Blasiar D, Bradnam K, Chan J, Chen CK, Chen WJ, Davis P, Kenny E, Kishore R, Lawson D, Lee R, Muller HM, Nakamura C, Ozersky P, ...
Journal
Nucleic Acids Res. 2004 Jan 1;32 Database issue:D411-7.