The complete sequence of the 16,569-base pair human mitochondrial genome is presented. The genes for the 12S and 16S rRNAs, 22 tRNAs, cytochrome c oxidase subunits I, II and III, ATPase subunit 6, cytochrome b and eight other predicted protein coding genes have been located. The sequence shows extreme economy in that the genes have none or only a few noncoding bases between them, and in many cases the termination codons are not coded in the DNA but are created post-transcriptionally by polyadenylation of the mRNAs.
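The post-transcriptional completion of stop codons described above can be illustrated with a minimal sketch. It assumes a coding region written as RNA starting at its initiation codon, whose last one or two bases (U or UA) form an incomplete codon that the poly(A) tail fills in. The function name and the example sequences are hypothetical, not taken from the paper.

```python
def terminal_codon_after_polyadenylation(coding_rna: str) -> str:
    """Return the last codon of a coding region once a poly(A) tail is added.

    If the coding region ends on a complete codon, that codon is returned
    unchanged. If it ends with 1 or 2 leftover bases, the missing positions
    are filled with A residues from the poly(A) tail.
    """
    rna = coding_rna.upper()
    leftover = len(rna) % 3
    if leftover == 0:
        return rna[-3:]  # stop codon (if any) is fully encoded in the DNA
    # Partial codon plus as many tail A's as are needed to reach 3 bases.
    return rna[-leftover:] + "A" * (3 - leftover)


# Hypothetical transcripts: one ends in "U", one in "UA", one in a full "UAA".
for example in ("AUGGCUU", "AUGGCUUA", "AUGGCUUAA"):
    print(example, "->", terminal_codon_after_polyadenylation(example))
```

In all three hypothetical cases the terminal codon reads UAA, but only in the last one is it fully encoded in the DNA.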
The SWISS-PROT protein knowledgebase (http://www.expasy.org/sprot/ and http://www.ebi.ac.uk/swissprot/) connects amino acid sequences with the current knowledge in the Life Sciences. Each protein entry provides an interdisciplinary overview of relevant information by bringing together experimental results, computed features and sometimes even contradictory conclusions. Detailed expertise that goes beyond the scope of SWISS-PROT is made available via direct links to specialised databases. SWISS-PROT provides annotated entries for all species, but concentrates on the annotation of entries from human (the HPI project) and other model organisms to ensure the presence of high quality annotation for representative members of all protein families. Part of the annotation can be transferred to other family members, as is already done for microbes by the High-quality Automated and Manual Annotation of microbial Proteomes (HAMAP) project. Protein families and groups of proteins are regularly reviewed to keep up with current scientific findings. Complementarily, TrEMBL strives to comprise all protein sequences that are not yet represented in SWISS-PROT, by incorporating a perpetually increasing level of mostly automated annotation. Researchers are welcome to contribute their knowledge to the scientific community by submitting relevant findings to SWISS-PROT at email@example.com.
The sequence of the 16,019 nucleotide-pair mitochondrial DNA (mtDNA) molecule of Drosophila yakuba is presented. This molecule contains the genes for two rRNAs, 22 tRNAs, six identified proteins [cytochrome b, cytochrome c oxidase subunits I, II, and III (COI-III), and ATPase subunits 6 and 8] and seven presumptive proteins (URF1-6 and URF4L). Replication originates within a region of 1077 nucleotides that is 92.8% A + T and lacks any open reading frame larger than 123 nucleotides. An equivalent to the sequence found in all mammalian mtDNAs that is associated with initiation of second-strand DNA synthesis is not present in D. yakuba mtDNA. Introns are absent from D. yakuba mitochondrial genes and there are few (0-31) intergenic nucleotides. The genes found in D. yakuba and mammalian mtDNAs are the same, but there are differences in their arrangement and in the relative proportions of the complementary strands of the molecule that serve as templates for transcription. Although the D. yakuba small and large mitochondrial rRNA genes are exceptionally low in G and C and are shorter than any other metazoan rRNA genes reported, they can be folded into secondary structures remarkably similar to the secondary structures proposed for mammalian mitochondrial rRNAs. D. yakuba mitochondrial tRNA genes, like their mammalian counterparts, are more variable in sequence than nonorganelle tRNAs. In mitochondrial protein genes ATG, ATT, ATA, and in one case (COI) ATAA appear to be used as translation initiation codons. The only termination codon found in these genes is TAA. In the D. yakuba mitochondrial genetic code, AGA, ATA, and TGA specify serine, isoleucine, and tryptophan, respectively. Fifty-nine types of sense codon are used in the D. yakuba mitochondrial protein genes, but 93.8% of all codons end in A or T. Codon-anticodon interactions may include both G-A and C-A pairing in the wobble position. Evidence is summarized that supports the hypothesis that A and T nucleotides are favored at all locations in the D. yakuba mtDNA molecule where these nucleotides are compatible with function.
We have determined the complete sequence of the mitochondrial DNA in the model plant species Arabidopsis thaliana, affording access to the first of its three genomes. The 366,924 nucleotides code for 57 identified genes, which cover only 10% of the genome. Introns in these genes add about 8%, open reading frames larger than 100 amino acids represent 10% of the genome, duplications account for 7%, remnants of retrotransposons of nuclear origin contribute 4% and integrated plastid sequences amount to 1%, leaving 60% of the genome unaccounted for. With the significant contribution of duplications, imported foreign DNA and the extensive background of apparently functionless sequences, the mosaic structure of the Arabidopsis thaliana mitochondrial genome features many aspects of size-relaxed nuclear genomes.
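As a quick check, the composition figures quoted above can be tallied in a few lines of Python. The dictionary labels are illustrative shorthand, and the nucleotide counts are rough conversions of the rounded percentages, not values reported in the paper.

```python
# Genome size and rounded percentages as quoted in the abstract.
GENOME_SIZE_NT = 366_924

components_percent = {
    "identified genes": 10,
    "introns": 8,
    "ORFs > 100 amino acids": 10,
    "duplications": 7,
    "retrotransposon remnants": 4,
    "integrated plastid sequences": 1,
}

accounted_percent = sum(components_percent.values())
unaccounted_percent = 100 - accounted_percent

# Approximate sizes in nucleotides (derived from rounded percentages only).
approx_nt = {
    label: round(GENOME_SIZE_NT * pct / 100)
    for label, pct in components_percent.items()
}

for label, pct in components_percent.items():
    print(f"{label:30s} {pct:3d}%  ~{approx_nt[label]:,} nt")
print(f"{'accounted for':30s} {accounted_percent:3d}%")
print(f"{'unaccounted for':30s} {unaccounted_percent:3d}%")
```

The listed categories sum to 40%, which is consistent with the 60% the abstract describes as unaccounted for.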
The complete sequence of the 17,553-nucleotide Xenopus laevis mitochondrial genome has been determined. A comparison of this amphibian mitochondrial genomic sequence with those of the mammalian mitochondrial genomes reveals a similar gene order and compact genomic organization. The encoded genes for 22 tRNAs, two ribosomal RNAs, and 13 proteins (COI, COII, COIII, ATPase 6, cytochrome b, and eight additional unidentified reading frames) in the amphibian mitochondria are highly homologous to their mammalian counterparts. Although the amphibian mitochondrial genome contains a significantly larger displacement loop region than the mammalian mitochondrial genomes, there are several regions of sequence homology near the putative sites for heavy and light strand transcription initiation and heavy strand replication. The unique mitochondrial genetic code observed in the mammalian mitochondrial systems is similar to that of the X. laevis mitochondrial genome because of the presence of only 22 encoded tRNAs and the high degree of homology between the predicted protein sequences. However, the amphibian system exclusively utilizes AUG as the start codon in all 13 open reading frames and shows a preference for codons ending in U rather than ending in C. In addition, the X. laevis mitochondrial genome employs the encoded AGA stop codon once and the UAA stop codon three times and requires polyadenylation to provide the nine other UAA stop codons. These observations suggest that the mechanisms of replication, transcription, processing, and translation in mitochondria are highly conserved throughout higher vertebrates.
Brown algae (Phaeophyceae) are complex photosynthetic organisms with a very different evolutionary history to green plants, to which they are only distantly related. These seaweeds are the dominant species in rocky coastal ecosystems and they exhibit many interesting adaptations to these, often harsh, environments. Brown algae are also one of only a small number of eukaryotic lineages that have evolved complex multicellularity (Fig. 1). We report the 214 million base pair (Mbp) genome sequence of the filamentous seaweed Ectocarpus siliculosus (Dillwyn) Lyngbye, a model organism for brown algae, closely related to the kelps (Fig. 1). Genome features such as the presence of an extended set of light-harvesting and pigment biosynthesis genes and new metabolic processes such as halide metabolism help explain the ability of this organism to cope with the highly variable tidal environment. The evolution of multicellularity in this lineage is correlated with the presence of a rich array of signal transduction genes. Of particular interest is the presence of a family of receptor kinases, as the independent evolution of related molecules has been linked with the emergence of multicellularity in both the animal and green plant lineages. The Ectocarpus genome sequence represents an important step towards developing this organism as a model species, providing the possibility to combine genomic and genetic approaches to explore these and other aspects of brown algal biology further.
Analysis of the mitochondrial DNA of a liverwort Marchantia polymorpha by electron microscopy and restriction endonuclease mapping indicated that the liverwort mitochondrial genome was a single circular molecule of about 184,400 base-pairs. We have determined the complete sequence of the liverwort mitochondrial DNA and detected 94 possible genes in the sequence of 186,608 base-pairs. These included genes for three species of ribosomal RNA, 29 genes for 27 species of transfer RNA and 30 open reading frames (ORFs) for functionally known proteins (16 ribosomal proteins, 3 subunits of H(+)-ATPase, 3 subunits of cytochrome c oxidase, apocytochrome b protein and 7 subunits of NADH ubiquinone oxidoreductase). Three ORFs showed similarity to ORFs of unknown function in the mitochondrial genomes of other organisms. Furthermore, 29 ORFs were predicted as possible genes by using the index of G + C content in first, second and third letters of codons (42.0 +/- 10.9%, 37.0 +/- 13.2% and 26.4 +/- 9.4%, respectively) obtained from the codon usages of identified liverwort genes. To date, 32 introns belonging to either group I or group II have been found in the coding regions of 17 genes including ribosomal RNA genes (rrn18 and rrn26), a transfer RNA gene (trnS) and a pseudogene (psi nad7). RNA editing was apparently lacking in liverwort mitochondria since the nucleotide sequences of the liverwort mitochondrial DNA were well-conserved at the DNA level.
The 16,775 base-pair mitochondrial genome of the white Leghorn chicken has been cloned and sequenced. The avian genome encodes the same set of genes (13 proteins, 2 rRNAs and 22 tRNAs) as do other vertebrate mitochondrial DNAs and is organized in a very similar economical fashion. There are very few intergenic nucleotides and several instances of overlaps between protein or tRNA genes. The protein genes are highly similar to their mammalian and amphibian counterparts and are translated according to the same variant genetic code. Despite these highly conserved features, the chicken mitochondrial genome displays two distinctive characteristics. First, it exhibits a novel gene order: the contiguous tRNA(Glu) and ND6 genes are located immediately adjacent to the displacement loop region of the molecule, just ahead of the contiguous tRNA(Pro), tRNA(Thr) and cytochrome b genes, which border the displacement loop region in other vertebrate mitochondrial genomes. This unusual gene order is conserved among the galliform birds. Second, a light-strand replication origin, equivalent to the conserved sequence found between the tRNA(Cys) and tRNA(Asn) genes in all vertebrate mitochondrial genomes sequenced thus far, is absent in the chicken genome. These observations indicate that galliform mitochondrial genomes departed from their mammalian and amphibian counterparts during the course of evolution of vertebrate species. These unexpected characteristics represent useful markers for investigating phylogenetic relationships at a higher taxonomic level.
The entire mitochondrial genome of rice (Oryza sativa L.), a monocot plant, has been sequenced. It was found to comprise 490,520 bp, with an average G+C content of 43.8%. Three rRNA genes, 17 tRNA genes and five pseudo tRNA sequences were identified. In addition, eleven ribosomal protein genes and two pseudo ribosomal protein genes were found, which are homologous to 13 of the 16 genes for ribosomal proteins in the mitochondrial genome of the liverwort (Marchantia polymorpha). A greater degree of variation in terms of presence/absence and integrity of genes was observed among the ribosomal protein genes and tRNA genes of rice, Arabidopsis and sugar beet. Transcription and post-transcriptional modification (RNA editing) in the rice mitochondrial sequence were also examined. In all, 491 Cs in the genomic DNA were converted to Ts in cDNA. The frequency of RNA editing differed markedly depending upon the ORF considered. Sequences derived from plastid and nuclear genomes make up 6.3% and 13.4% of the mitochondrial genome, respectively. The degree of conservation of plastid sequences in the mitochondrial genome ranged from 61% to 100%, suggesting that sequence migration has occurred very frequently. Three plastid DNA fragments that were incorporated into the mitochondrial genome were subsequently transferred to the nuclear genome. Nineteen fragments that were similar to transposon or retrotransposon sequences, but different from those found in the mitochondrial genomes of dicots, were identified. The results indicate frequent and independent DNA sequence flow to and from the mitochondrial genome during the evolution of flowering plants, and this may account for the range of genetic variation observed between the mitochondrial genomes of higher plants.
Mitochondria, organelles specialized in energy conservation reactions in eukaryotic cells, have evolved from eubacteria-like endosymbionts whose closest known relatives are the rickettsial group of alpha-proteobacteria. Because characterized mitochondrial genomes vary markedly in structure, it has been impossible to infer from them the initial form of the proto-mitochondrial genome. This would require the identification of minimally derived mitochondrial DNAs that better reflect the ancestral state. Here we describe such a primitive mitochondrial genome, in the freshwater protozoon Reclinomonas americana. This protist displays ultrastructural characteristics that ally it with the retortamonads, a protozoan group that lacks mitochondria. R. americana mtDNA (69,034 base pairs) contains the largest collection of genes (97) so far identified in any mtDNA, including genes for 5S ribosomal RNA, the RNA component of RNase P, and at least 18 proteins not previously known to be encoded in mitochondria. Most surprising are four genes specifying a multisubunit, eubacterial-type RNA polymerase. Features of gene content together with eubacterial characteristics of genome organization and expression not found before in mitochondrial genomes indicate that R. americana mtDNA more closely resembles the ancestral proto-mitochondrial genome than any other mtDNA investigated to date.
Huntington's disease (HD) is a fatal neurodegenerative condition caused by expansion of the polyglutamine tract in the huntingtin (Htt) protein. Neuronal toxicity in HD is thought to be, at least in part, a consequence of protein interactions involving mutant Htt. We therefore hypothesized that genetic modifiers of HD neurodegeneration should be enriched among Htt protein interactors. To test this idea, we identified a comprehensive set of Htt interactors using two complementary approaches: high-throughput yeast two-hybrid screening and affinity pull down followed by mass spectrometry. This effort led to the identification of 234 high-confidence Htt-associated proteins, 104 of which were found with the yeast method and 130 with the pull downs. We then tested an arbitrary set of 60 genes encoding interacting proteins for their ability to behave as genetic modifiers of neurodegeneration in a Drosophila model of HD. This high-content validation assay showed that 27 of 60 orthologs tested were high-confidence genetic modifiers, as modification was observed with more than one allele. The 45% hit rate for genetic modifiers seen among the interactors is an order of magnitude higher than the 1%-4% typically observed in unbiased genetic screens. Genetic modifiers were similarly represented among proteins discovered using yeast two-hybrid and pull-down/mass spectrometry methods, supporting the notion that these complementary technologies are equally useful in identifying biologically relevant proteins. Interacting proteins confirmed as modifiers of the neurodegeneration phenotype represent a diverse array of biological functions, including synaptic transmission, cytoskeletal organization, signal transduction, and transcription. Among the modifiers were 17 loss-of-function suppressors of neurodegeneration, which can be considered potential targets for therapeutic intervention. 
Finally, we show that seven interacting proteins from among 11 tested were able to co-immunoprecipitate with full-length Htt from mouse brain. These studies demonstrate that high-throughput screening for protein interactions combined with genetic validation in a model organism is a powerful approach for identifying novel candidate modifiers of polyglutamine toxicity.
A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000 year-old Neandertal individual with 8341 mtDNA sequences identified among 4.8 Gb of DNA generated from approximately 0.3 g of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs, and allows an estimate of the divergence date between the two mtDNA lineages of 660,000 +/- 140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared with other primate lineages, suggesting that the effective population size of Neandertals was small.
The nucleotide sequences of the mitochondrial DNA (mtDNA) molecules of two nematodes, Caenorhabditis elegans [13,794 nucleotide pairs (ntp)], and Ascaris suum (14,284 ntp) are presented and compared. Each molecule contains the genes for two ribosomal RNAs (s-rRNA and l-rRNA), 22 transfer RNAs (tRNAs) and 12 proteins, all of which are transcribed in the same direction. The protein genes are the same as 12 of the 13 protein genes found in other metazoan mtDNAs: Cyt b, cytochrome b; COI-III, cytochrome c oxidase subunits I-III; ATPase6, Fo ATPase subunit 6; ND1-6 and 4L, NADH dehydrogenase subunits 1-6 and 4L. A gene for ATPase subunit 8, common to other metazoan mtDNAs, has not been identified in nematode mtDNAs. The C. elegans and A. suum mtDNA molecules both include an apparently noncoding sequence that contains runs of AT dinucleotides, and direct and inverted repeats (the AT region: 466 and 886 ntp, respectively). A second, apparently noncoding sequence in the C. elegans and A. suum mtDNA molecules (109 and 117 ntp, respectively) includes a single, hairpin-forming structure. There are only 38 and 89 other intergenic nucleotides in the C. elegans and A. suum mtDNAs, and no introns. Gene arrangements are identical in the C. elegans and A. suum mtDNA molecules except that the AT regions have different relative locations. However, the arrangement of genes in the two nematode mtDNAs differs extensively from gene arrangements in all other sequenced metazoan mtDNAs. Unusual features regarding nematode mitochondrial tRNA genes and mitochondrial protein gene initiation codons, previously described by us, are reviewed. In the C. elegans and A. suum mt-genetic codes, AGA and AGG specify serine, TGA specifies tryptophan and ATA specifies methionine. From considerations of amino acid and nucleotide sequence similarities it appears likely that the C. elegans and A. suum ancestral lines diverged close to the time of divergence of the cow and human ancestral lines, about 80 million years ago.
The entire mitochondrial genome of rapeseed (Brassica napus L.) was sequenced and compared with that of Arabidopsis thaliana. The 221,853 bp genome contains 34 protein-coding genes, three rRNA genes and 17 tRNA genes. This gene content is almost identical to that of Arabidopsis. However, the rps14 gene, which is a pseudogene in Arabidopsis, is intact in rapeseed. On the other hand, five tRNA genes are missing in rapeseed compared to Arabidopsis, although the set of mitochondrially encoded tRNA species is identical in the two Cruciferae. RNA editing events were systematically investigated on the basis of the sequence of the rapeseed mitochondrial genome. A total of 427 C to U conversions were identified in ORFs, which is nearly identical to the number in Arabidopsis (441 sites). The gene sequences and intron structures are mostly conserved (more than 99% similarity for protein-coding regions); however, only 358 editing sites (83% of the total) are shared by rapeseed and Arabidopsis. Non-coding regions are mostly divergent between the two plants. One-third (about 78.7 kb) and two-thirds (about 223.8 kb) of the rapeseed and Arabidopsis mitochondrial genomes, respectively, cannot be aligned with each other and most of these regions do not show any homology to sequences registered in the DNA databases. The results of the comparative analysis between the rapeseed and Arabidopsis mitochondrial genomes suggest that higher plant mitochondria are extremely conservative with respect to coding sequences and somewhat conservative with respect to RNA editing, but that non-coding parts of plant mitochondrial DNA are extraordinarily dynamic with respect to structural changes, sequence acquisition and/or sequence loss.
On the basis of the sequence of the mitochondrial genome in the flowering plant Arabidopsis thaliana, RNA editing events were systematically investigated in the respective RNA population. A total of 456 C to U, but no U to C, conversions were identified exclusively in mRNAs, 441 in ORFs, 8 in introns, and 7 in leader and trailer sequences. No RNA editing was seen in any of the rRNAs or in several tRNAs investigated for potential mismatch corrections. RNA editing affects individual coding regions with frequencies varying between 0 and 18.9% of the codons. The predominance of RNA editing events in the first two codon positions is not related to translational decoding, because it is not correlated with codon usage. As a general effect, RNA editing increases the hydrophobicity of the coded mitochondrial proteins. Concerning the selection of RNA editing sites, little significant nucleotide preference is observed in their vicinity in comparison to unedited C residues. This sequence bias is, per se, not sufficient to specify individual C nucleotides in the total RNA population in Arabidopsis mitochondria.
Deuterostomes comprise vertebrates, the related invertebrate chordates (tunicates and cephalochordates) and three other invertebrate taxa: hemichordates, echinoderms and Xenoturbella. The relationships between invertebrate and vertebrate deuterostomes are clearly important for understanding our own distant origins. Recent phylogenetic studies of chordate classes and a sea urchin have indicated that urochordates might be the closest invertebrate sister group of vertebrates, rather than cephalochordates, as traditionally believed. More remarkable is the suggestion that cephalochordates are closer to echinoderms than to vertebrates and urochordates, meaning that chordates are paraphyletic. To study the relationships among all deuterostome groups, we have assembled an alignment of more than 35,000 homologous amino acids, including new data from a hemichordate, starfish and Xenoturbella. We have also sequenced the mitochondrial genome of Xenoturbella. We support the clades Olfactores (urochordates and vertebrates) and Ambulacraria (hemichordates and echinoderms). Analyses using our new data, however, do not support a cephalochordate and echinoderm grouping and we conclude that chordates are monophyletic. Finally, nuclear and mitochondrial data place Xenoturbella as the sister group of the two ambulacrarian phyla. As such, Xenoturbella is shown to be an independent phylum, Xenoturbellida, bringing the number of living deuterostome phyla to four.
Proteomic analyses of different subcellular compartments, so-called organellar proteomics, facilitate the understanding of cellular functions on a molecular level. In this work, various orthogonal multidimensional separation techniques both on the protein and on the peptide level are compared with regard to the number of identified proteins as well as the classes of proteins accessible by the respective methodology. The most complete overview was achieved by a combination of such orthogonal techniques as shown by the analysis of the yeast mitochondrial proteome. A total of 851 different proteins (PROMITO dataset) were identified by use of multidimensional LC-MS/MS, 1D-SDS-PAGE combined with nano-LC-MS/MS and 2D-PAGE with subsequent MALDI-mass fingerprinting. Our PROMITO approach identified the 749 proteins, which were found in the largest previous study on the yeast mitochondrial proteome, and additionally 102 proteins including 42 open reading frames with unknown function, providing the basis for a more detailed elucidation of mitochondrial processes. Comparison of the different approaches emphasizes a bias of 2D-PAGE against proteins with very high isoelectric points as well as large and hydrophobic proteins, which can be accessed more appropriately by the other methods. While 2D-PAGE has advantages in the possible separation of protein isoforms and quantitative differential profiling, 1D-SDS-PAGE with nano-LC-MS/MS and multidimensional LC-MS/MS are better suited for efficient protein identification as they are less biased against distinct classes of proteins. Thus, comprehensive proteome analyses can only be realized by a combination of such orthogonal approaches, leading to the largest dataset available for the mitochondrial proteome of yeast.
The complete sequence of honeybee (Apis mellifera) mitochondrial DNA is reported; it is 16,343 bp long in the strain sequenced. Relative to their positions in the Drosophila map, 11 of the tRNA genes are in altered positions, but the other genes and regions are in the same relative positions. Comparisons of the predicted protein sequences indicate that the honeybee mitochondrial genetic code is the same as that for Drosophila, but the anticodons of two tRNAs differ between these two insects. The base composition shows extreme bias, being 84.9% AT (cf. 78.6% in Drosophila yakuba). In protein-encoding genes, the AT bias is strongest at the third codon positions (which in some cases lack guanines altogether), and least in second codon positions. Multiple stepwise regression analysis of the predicted products of the protein-encoding genes shows a significant association between the numbers of occurrences of amino acids and %T in codon family, but not with the number of codons per codon family or other parameters associated with codon family base composition. Differences in amino acid abundances are apparent between the predicted Apis and Drosophila proteins, with a relative abundance in the Apis proteins of lysine and a relative deficiency of alanine. Drosophila alanine residues are as often replaced by serine as conserved in Apis. The differences in abundances between Drosophila and Apis are associated with %AT in the codon families, and the degree of divergence in amino acid composition between proteins correlates with the divergence in %AT at the second codon positions. Overall, transversions are about twice as abundant as transitions when comparing Drosophila and Apis protein-encoding genes, but this ratio varies between codon positions. Marked excesses of transitions over chance expectation are seen for the third positions of protein-coding genes and for the gene for the small subunit of ribosomal RNA.
For the third codon positions the excess of transitions is adequately explained as due to the restriction of observable substitutions to transitions for conserved amino acids with two-codon families; the excess of transitions over expectation for the small ribosomal subunit suggests that the conservation of nucleotide size is favored by selection.
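The AT-bias measurements discussed in the honeybee abstract (overall %AT and %AT by codon position) can be sketched as follows. This is a minimal illustration, not the authors' analysis; the function names and the toy coding sequence are hypothetical.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T among the unambiguous bases (A, C, G, T) in seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(base in "AT" for base in counted)
    return 100.0 * at_count / len(counted)


def at_content_by_codon_position(cds: str) -> tuple:
    """Return %AT at first, second and third codon positions of an in-frame CDS."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # cds[i::3] collects every base sitting at codon position i + 1.
    return tuple(at_content(cds[i::3]) for i in range(3))


# Hypothetical toy coding sequence (codons ATG, ATA, ATT).
toy_cds = "ATGATAATT"
print(f"overall %AT: {at_content(toy_cds):.1f}")
print("by codon position:", [f"{p:.1f}" for p in at_content_by_codon_position(toy_cds)])
```

Applied to real protein-coding genes, the per-position values would allow the kind of comparison the abstract draws between third and second codon positions.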
With the exception of Neanderthals, from which DNA sequences of numerous individuals have now been determined, the number and genetic relationships of other hominin lineages are largely unknown. Here we report a complete mitochondrial (mt) DNA sequence retrieved from a bone excavated in 2008 in Denisova Cave in the Altai Mountains in southern Siberia. It represents a hitherto unknown type of hominin mtDNA that shares a common ancestor with anatomically modern human and Neanderthal mtDNAs about 1.0 million years ago. This indicates that it derives from a hominin migration out of Africa distinct from that of the ancestors of Neanderthals and of modern humans. The stratigraphy of the cave where the bone was found suggests that the Denisova hominin lived close in time and space with Neanderthals as well as with modern humans.
The 15,650 base-pair mitochondrial genome of the sea urchin Strongylocentrotus purpuratus has been cloned and sequenced. It exhibits a novel organization that suggests the primacy of post-transcriptional gene regulation. The same 13 polypeptides, two rRNAs and 22 tRNAs are encoded as in other animal mitochondrial DNAs, but are organized with extreme economy; non-coding information between genes is almost completely absent, some stop codons are generated post-transcriptionally and tRNA sequences are interspersed between only a minority of other structural genes. The genome uses a variant genetic code, in which AAA specifies asparagine, ATA isoleucine, TGA tryptophan and AGN serine, and has an unusual pattern of codon bias. The order of genes shows several differences from that of vertebrates. The genes for the large (16 S) ribosomal RNA and for NADH dehydrogenase subunit 4L (ND4L) are in different positions, located respectively between those encoding ND2 and cytochrome oxidase subunit I (COI) and between COI and COII. This organization is conserved amongst at least four regular echinoids diverging by some 225 million years. Most tRNA genes are also in different positions. The only long unassigned sequence in the genome (121 base-pairs) is located within a cluster of 15 tRNA genes. It contains elements resembling some of those found in the displacement (D) loop of vertebrate mtDNAs, notably polypurine/polypyrimidine tracts that may play a role in regulating transcription and the initiation of replication. The separation of the ribosomal RNA genes from each other and from the putative control region imposes special demands on the transcription of the genome.
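The variant genetic code described in the sea urchin abstract can be illustrated by overriding a few entries of the standard codon table. The sketch below applies only the assignments named in the abstract (AAA asparagine, ATA isoleucine, TGA tryptophan, AGN serine); any other departures from the standard code are not modeled, and the test sequence is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (third base fastest).
STANDARD_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AMINO_ACIDS)
}

# Assignments taken from the abstract; AGN expands to AGT, AGC, AGA, AGG.
URCHIN_OVERRIDES = {"AAA": "N", "ATA": "I", "TGA": "W"}
URCHIN_OVERRIDES.update({"AG" + n: "S" for n in BASES})
URCHIN_MT_CODE = {**STANDARD_CODE, **URCHIN_OVERRIDES}


def translate(dna: str, code: dict) -> str:
    """Translate an in-frame DNA sequence using the given codon table."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(code[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequence covering the reassigned codons: AAA ATA TGA AGA AGG.
seq = "AAAATATGAAGAAGG"
print("standard:", translate(seq, STANDARD_CODE))
print("urchin mt:", translate(seq, URCHIN_MT_CODE))
```

Note that ATA already encodes isoleucine in the standard code, so that override changes nothing here; it is kept to mirror the abstract's list, which contrasts with mitochondrial codes in which ATA specifies methionine.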
We analyzed the complete mitochondrial DNA (mtDNA) sequences of three humans (African, European, and Japanese), three African apes (common and pygmy chimpanzees, and gorilla), and one orangutan in an attempt to estimate most accurately the substitution rates and divergence times of hominoid mtDNAs. Nonsynonymous substitutions and substitutions in RNA genes have accumulated with an approximately clock-like regularity. From these substitutions and under the assumption that the orangutan and African apes diverged 13 million years ago, we obtained a divergence time for humans and chimpanzees of 4.9 million years. This divergence time permitted calibration of the synonymous substitution rate (3.89 × 10^-8/site per year). To obtain the substitution rate in the displacement (D)-loop region, we compared the three human mtDNAs and measured the relative abundance of substitutions in the D-loop region and at synonymous sites. The estimated substitution rate in the D-loop region was 7.00 × 10^-8/site per year. Using both synonymous and D-loop substitutions, we inferred the age of the last common ancestor of the human mtDNAs as 143,000 +/- 18,000 years. The shallow ancestry of human mtDNAs, together with the observation that the African sequence is the most diverged among humans, strongly supports the recent African origin of modern humans, Homo sapiens sapiens.
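To make the rate figures above concrete, the sketch below uses the textbook strict-clock relation T = d / (2r), where d is the pairwise distance per site and r the substitution rate per site per year. This is an assumption for illustration and is not necessarily the estimation procedure used in the paper; the function name and the example distance are hypothetical.

```python
# Rates quoted in the abstract (substitutions per site per year).
SYNONYMOUS_RATE = 3.89e-8
D_LOOP_RATE = 7.00e-8


def divergence_time_years(pairwise_distance: float, rate: float) -> float:
    """Strict-clock divergence time: substitutions accumulate along both lineages."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return pairwise_distance / (2.0 * rate)


# How much faster the D-loop evolves relative to synonymous sites.
rate_ratio = D_LOOP_RATE / SYNONYMOUS_RATE
print(f"D-loop / synonymous rate ratio: {rate_ratio:.2f}")

# Hypothetical distance of 0.01 substitutions per synonymous site.
example_distance = 0.01
print(f"T for d = {example_distance}: "
      f"{divergence_time_years(example_distance, SYNONYMOUS_RATE):,.0f} years")
```

Under this simple relation, the same elapsed time produces roughly 1.8 times as much divergence in the D-loop as at synonymous sites.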
We determined the complete nucleotide sequence of the mitochondrial genome of an angiosperm, sugar beet (Beta vulgaris cv TK81-O). The 368,799 bp genome contains 29 protein, five rRNA and 25 tRNA genes, most of which are also shared by the mitochondrial genome of Arabidopsis thaliana, the only other completely sequenced angiosperm mitochondrial genome. However, four genes identified here (namely rps13, trnF-GAA, ccb577 and trnC2-GCA) are missing in Arabidopsis mitochondria. In addition, four genes found in Arabidopsis (ccb228, rpl2, rpl16 and trnY2-GUA) are entirely absent in sugar beet or present only in severely truncated form. Introns, duplicated sequences, additional reading frames and inserted foreign sequences (chloroplast, nuclear and plasmid DNA sequences) contribute significantly to the overall size of the sugar beet mitochondrial genome. Nevertheless, 55.6% of the genome has no obvious features of information. We identified a novel tRNA(Cys) gene (trnC2-GCA) which shows no sequence homology with any tRNA(Cys) genes reported so far in higher plants. Intriguingly, this tRNA gene is actually transcribed into a mature tRNA, whereas the native tRNA(Cys) gene (trnC1-GCA) is most likely a pseudogene.
Pythium ultimum is a ubiquitous oomycete plant pathogen responsible for a variety of diseases on a broad range of crop and ornamental species.
The P. ultimum genome (42.8 Mb) encodes 15,290 genes and has extensive sequence similarity and synteny with related Phytophthora species, including the potato blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86% of genes, with detectable differential expression of suites of genes under abiotic stress and in the presence of a host. The predicted proteome includes a large repertoire of proteins involved in plant pathogen interactions, although, surprisingly, the P. ultimum genome does not encode any classical RXLR effectors and relatively few Crinkler genes in comparison to related phytopathogenic oomycetes. A lower number of enzymes involved in carbohydrate metabolism were present compared to Phytophthora species, with the notable absence of cutinases, suggesting a significant difference in virulence mechanisms between P. ultimum and more host-specific oomycete species. Although we observed a high degree of orthology with Phytophthora genomes, there were novel features of the P. ultimum proteome, including an expansion of genes involved in proteolysis and genes unique to Pythium. We identified a small gene family of cadherins, proteins involved in cell adhesion, the first report of these in a genome outside the metazoans.
Access to the P. ultimum genome has revealed not only core pathogenic mechanisms within the oomycetes but also lineage-specific genes associated with the alternative virulence and lifestyles found within the pythiaceous lineages compared to the Peronosporaceae.