Masatoshi Nei
Publications
(70)
Publication
Journal: Molecular Biology and Evolution
March/24/2012
Abstract
Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven, making it easier to use for both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net.
Publication
Journal: Molecular Biology and Evolution
October/28/2007
Abstract
We announce the release of the fourth version of MEGA software, which expands on the existing facilities for editing DNA sequence data from autosequencers, mining Web-databases, performing automatic and manual sequence alignment, analyzing sequence alignments to estimate evolutionary distances, inferring phylogenetic trees, and testing evolutionary hypotheses. Version 4 includes a unique facility to generate captions, written in figure legend format, in order to provide natural language descriptions of the models and methods used in the analyses. This facility aims to promote a better understanding of the underlying assumptions used in analyses, and of the results generated. Another new feature is the Maximum Composite Likelihood (MCL) method for estimating evolutionary distances between all pairs of sequences simultaneously, with and without incorporating rate variation among sites and substitution pattern heterogeneities among lineages. This MCL method also can be used to estimate transition/transversion bias and nucleotide substitution pattern without knowledge of the phylogenetic tree. This new version is a native 32-bit Windows application with multi-threading and multi-user supports, and it is also available to run in a Linux desktop environment (via the Wine compatibility layer) and on Intel-based Macintosh computers under the Parallels program. The current version of MEGA is available free of charge at (http://www.megasoftware.net).
Publication
Journal: Briefings in Bioinformatics
March/28/2005
Abstract
With its theoretical basis firmly established in molecular evolutionary and population genetics, the comparative DNA and protein sequence analysis plays a central role in reconstructing the evolutionary histories of species and multigene families, estimating rates of molecular evolution, and inferring the nature and extent of selective forces shaping the evolution of genes and genomes. The scope of these investigations has now expanded greatly owing to the development of high-throughput sequencing techniques and novel statistical and computational methods. These methods require easy-to-use computer programs. One such effort has been to produce Molecular Evolutionary Genetics Analysis (MEGA) software, with its focus on facilitating the exploration and analysis of the DNA and protein sequence variation from an evolutionary perspective. Currently in its third major release, MEGA3 contains facilities for automatic and manual sequence alignment, web-based mining of databases, inference of the phylogenetic trees, estimation of evolutionary distances and testing evolutionary hypotheses. This paper provides an overview of the statistical methods, computational tools, and visual exploration modules for data input and the results obtainable in MEGA.
Publication
Journal: Briefings in Bioinformatics
July/6/2008
Abstract
The Molecular Evolutionary Genetics Analysis (MEGA) software is a desktop application designed for comparative analysis of homologous gene sequences either from multigene families or from different species with a special emphasis on inferring evolutionary relationships and patterns of DNA and protein evolution. In addition to the tools for statistical analysis of data, MEGA provides many convenient facilities for the assembly of sequence data sets from files or web-based repositories, and it includes tools for visual presentation of the results obtained in the form of interactive phylogenetic trees and evolutionary distance matrices. Here we discuss the motivation, design principles and priorities that have shaped the development of MEGA. We also discuss how MEGA might evolve in the future to assist researchers in their growing need to analyze large data sets using new computational methods.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
August/24/2004
Abstract
Current efforts to reconstruct the tree of life and histories of multigene families demand the inference of phylogenies consisting of thousands of gene sequences. However, for such large data sets even a moderate exploration of the tree space needed to identify the optimal tree is virtually impossible. For these cases the neighbor-joining (NJ) method is frequently used because of its demonstrated accuracy for smaller data sets and its computational speed. As data sets grow, however, the fraction of the tree space examined by the NJ algorithm becomes minuscule. Here, we report the results of our computer simulation for examining the accuracy of NJ trees for inferring very large phylogenies. First we present a likelihood method for the simultaneous estimation of all pairwise distances by using biologically realistic models of nucleotide substitution. Use of this method corrects up to 60% of NJ tree errors. Our simulation results show that the accuracy of NJ trees declines only by approximately 5% when the number of sequences used increases from 32 to 4,096 (128 times) even in the presence of extensive variation in the evolutionary rate among lineages or significant biases in the nucleotide composition and transition/transversion ratio. Our results encourage the use of complex models of nucleotide substitution for estimating evolutionary distances and hint at bright prospects for the application of the NJ and related methods in inferring large phylogenies.
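As a rough illustration of why NJ examines so little of the tree space, the sketch below shows a single pair-selection step of the standard neighbor-joining criterion: given a symmetric distance matrix, it picks the one pair to join and never revisits the alternatives. The function name `nj_pick_pair` and the plain list-of-lists input are assumptions made for this example; this is not the implementation used in the paper or in MEGA, and it does not include the likelihood-based distance estimation the abstract describes.

```python
from typing import List, Tuple


def nj_pick_pair(dist: List[List[float]]) -> Tuple[int, int]:
    """Return the pair of taxa (i, j) that one neighbor-joining step would join.

    Uses the standard NJ criterion
        Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    and returns the pair with the smallest Q. `dist` is assumed to be a
    symmetric matrix with zeros on the diagonal and at least two taxa.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]  # total distance from each taxon

    best_pair = (0, 1)
    best_q = float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:  # keep the first pair reaching the minimum
                best_q = q
                best_pair = (i, j)
    return best_pair
```

A full NJ run repeats this step, replacing the joined pair with a new node and recomputing distances, until the tree is resolved; only one pair is evaluated as "best" at each step, which is what keeps the method fast for thousands of sequences.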
Publication
Journal: Annual Review of Genetics
January/31/2006
Abstract
Until around 1990, most multigene families were thought to be subject to concerted evolution, in which all member genes of a family evolve as a unit in concert. However, phylogenetic analysis of MHC and other immune system genes showed a quite different evolutionary pattern, and a new model called birth-and-death evolution was proposed. In this model, new genes are created by gene duplication and some duplicate genes stay in the genome for a long time, whereas others are inactivated or deleted from the genome. Later investigations have shown that most non-rRNA genes including highly conserved histone or ubiquitin genes are subject to this type of evolution. However, the controversy over the two models is still continuing because the distinction between the two models becomes difficult when sequence differences are small. Unlike concerted evolution, the model of birth-and-death evolution can give some insights into the origins of new genetic systems or new phenotypic characters.
Publication
Journal: Nature Reviews Genetics
December/16/2008
Abstract
Chemosensory receptors are essential for the survival of organisms that range from bacteria to mammals. Recent studies have shown that the numbers of functional chemosensory receptor genes and pseudogenes vary enormously among the genomes of different animal species. Although much of the variation can be explained by the adaptation of organisms to different environments, it has become clear that a substantial portion is generated by genomic drift, a random process of gene duplication and deletion. Genomic drift also generates a substantial amount of copy-number variation in chemosensory receptor genes within species. It seems that mutation by gene duplication and inactivation has important roles in both the adaptive and non-adaptive evolution of chemosensation.
Publication
Journal: Molecular Biology and Evolution
October/22/2003
Abstract
Although the phylogenetic relationships of major lineages of primate species are relatively well established, the times of divergence of these lineages as estimated by molecular data are still controversial. This controversy has been generated in part because different authors have used different types of molecular data, different statistical methods, and different calibration points. We have therefore examined the effects of these factors on the estimates of divergence times and reached the following conclusions: (1) It is advisable to concatenate many gene sequences and use a multigene gamma distance for estimating divergence times rather than using the individual gene approach. (2) When sequence data from many nuclear genes are available, protein sequences appear to give more robust estimates than DNA sequences. (3) Nuclear proteins are generally more suitable than mitochondrial proteins for time estimation. (4) It is important first to construct a phylogenetic tree for a group of species using some outgroups and then estimate the branch lengths. (5) It appears to be better to use a few reliable calibration points rather than many unreliable ones. Considering all these factors and using two calibration points, we estimated that the human lineage diverged from the chimpanzee, gorilla, orangutan, Old World monkey, and New World monkey lineages approximately 6 MYA (with a range of 5-7), 7 MYA (range, 6-8), 13 MYA (range, 12-15), 23 MYA (range, 21-25), and 33 MYA (range 32-36).
Publication
Journal: Molecular Biology and Evolution
May/11/2010
Abstract
Currently, there is a demand in many fields of research for software with an easily accessible interface to analyze polymorphism data such as microsatellite DNA and single nucleotide polymorphisms. In this article, we announce POPTREE2, a computer program package that can perform evolutionary analyses of allele frequency data. The original version (POPTREE) was a command-line program that runs on the Command Prompt of Windows and Unix. In POPTREE2, genetic distances (measures of the extent of genetic differentiation between populations) for constructing phylogenetic trees, average heterozygosities (H) (a measure of genetic variation within populations), and G(ST) (a measure of genetic differentiation of subdivided populations) are computed through a simple and intuitive Windows interface. It will facilitate statistical analyses of polymorphism data for researchers in many different fields. POPTREE2 is available at http://www.med.kagawa-u.ac.jp/~genomelb/takezaki/poptree2/index.html.
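To make the two summary statistics named above concrete, here is a minimal sketch of expected heterozygosity and G(ST) for a single locus, computed from allele frequencies. It assumes equal population weights and applies no sample-size bias correction; the function names and the dictionary input format are hypothetical choices for this example and do not reflect POPTREE2's actual code or file formats.

```python
from typing import Dict, List


def heterozygosity(freqs: Dict[str, float]) -> float:
    """Expected heterozygosity at one locus: H = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def gst(populations: List[Dict[str, float]]) -> float:
    """G_ST = (H_T - H_S) / H_T for one locus, weighting populations equally.

    H_S is the mean within-population heterozygosity; H_T is the
    heterozygosity of the pooled (averaged) allele frequencies.
    """
    n = len(populations)
    alleles = set().union(*populations)  # all alleles seen in any population

    h_s = sum(heterozygosity(pop) for pop in populations) / n
    mean_freqs = {
        a: sum(pop.get(a, 0.0) for pop in populations) / n for a in alleles
    }
    h_t = heterozygosity(mean_freqs)

    # With no variation at all, differentiation is reported as zero here.
    return (h_t - h_s) / h_t if h_t > 0 else 0.0
```

Two populations fixed for different alleles give a G(ST) of 1 under this definition, and two populations with identical frequencies give 0. Averaging over loci and correcting for sample size are left out of this sketch.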
Publication
Journal: PLoS ONE
March/11/2010
Abstract
Odor perception in mammals is mediated by a large multigene family of olfactory receptor (OR) genes. The number of OR genes varies extensively among different species of mammals, and most species have a substantial number of pseudogenes. To gain some insight into the evolutionary dynamics of mammalian OR genes, we identified the entire set of OR genes in platypuses, opossums, cows, dogs, rats, and macaques and studied the evolutionary change of the genes together with those of humans and mice. We found that platypuses and primates have <400 functional OR genes while the other species have 800-1,200 functional OR genes. We then estimated the numbers of gains and losses of OR genes for each branch of the phylogenetic tree of mammals. This analysis showed that (i) gene expansion occurred in the placental lineage each time after it diverged from monotremes and from marsupials and (ii) hundreds of gains and losses of OR genes have occurred in an order-specific manner, making the gene repertoires highly variable among different orders. It appears that the number of OR genes is determined primarily by the functional requirement for each species, but once the number reaches the required level, it fluctuates by random duplication and deletion of genes. This fluctuation seems to have been aided by the stochastic nature of OR gene expression.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
February/4/2009
Abstract
F-box proteins are substrate-recognition components of the Skp1-Rbx1-Cul1-F-box protein (SCF) ubiquitin ligases. In plants, F-box genes form one of the largest multigene superfamilies and control many important biological functions. However, it is unclear how and why plants have acquired a large number of F-box genes. Here we identified 692, 337, and 779 F-box genes in Arabidopsis, poplar and rice, respectively, and studied their phylogenetic relationships and evolutionary patterns. We found that the plant F-box superfamily can be divided into 42 families, each of which has a distinct domain organization. We also estimated the number of ancestral genes for each family and identified highly conservative versus divergent families. In conservative families, there has been little or no change in the number of genes since the divergence between eudicots and monocots approximately 145 million years ago. In divergent families, however, the numbers have increased dramatically during the same period. In two cases, the numbers of genes in extant species are >100 times greater than that in the most recent common ancestor (MRCA) of the three species. Proteins encoded by highly conservative genes always have the same domain organization, suggesting that they interact with the same or similar substrates. In contrast, proteins of rapidly duplicating genes sometimes have quite different domain structures, mainly caused by unusually frequent shifts of exon-intron boundaries and/or frameshift mutations. Our results indicate that different F-box families, or different clusters of the same family, have experienced dramatically different modes of sequence divergence, apparently having resulted in adaptive changes in function.
Publication
Journal: Evolution; international journal of organic evolution
May/30/2017
Publication
Journal: Molecular Biology and Evolution
April/12/2006
Abstract
Charles Darwin proposed that evolution occurs primarily by natural selection, but this view has been controversial from the beginning. Two of the major opposing views have been mutationism and neutralism. Early molecular studies suggested that most amino acid substitutions in proteins are neutral or nearly neutral and the functional change of proteins occurs by a few key amino acid substitutions. This suggestion generated an intense controversy over selectionism and neutralism. This controversy is partially caused by Kimura's definition of neutrality, which was too strict (|2Ns| ≤ 1). If we define neutral mutations as the mutations that do not change the function of gene products appreciably, many controversies disappear because slightly deleterious and slightly advantageous mutations are engulfed by neutral mutations. The ratio of the rate of nonsynonymous nucleotide substitution to that of synonymous substitution is a useful quantity to study positive Darwinian selection operating at highly variable genetic loci, but it does not necessarily detect adaptively important codons. Previously, multigene families were thought to evolve following the model of concerted evolution, but new evidence indicates that most of them evolve by a birth-and-death process of duplicate genes. It is now clear that most phenotypic characters or genetic systems such as the adaptive immune system in vertebrates are controlled by the interaction of a number of multigene families, which are often evolutionarily related and are subject to birth-and-death evolution. Therefore, it is important to study the mechanisms of gene family interaction for understanding phenotypic evolution. Because gene duplication occurs more or less at random, phenotypic evolution contains some fortuitous elements, though the environmental factors also play an important role. The randomness of phenotypic evolution is qualitatively different from allele frequency changes by random genetic drift.
However, there is some similarity between phenotypic and molecular evolution with respect to functional or environmental constraints and evolutionary rate. It appears that mutation (including gene duplication and other DNA changes) is the driving force of evolution at both the genic and the phenotypic levels.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
July/7/2005
Abstract
Olfaction, which is an important physiological function for the survival of mammals, is controlled by a large multigene family of olfactory receptor (OR) genes. Fishes also have this gene family, but the number of genes is known to be substantially smaller than in mammals. To understand the evolutionary dynamics of OR genes, we conducted a phylogenetic analysis of all functional genes identified from the genome sequences of zebrafish, pufferfish, frogs, chickens, humans, and mice. The results suggested that the most recent common ancestor between fishes and tetrapods had at least nine ancestral OR genes, and all OR genes identified were classified into nine groups, each of which originated from one ancestral gene. Eight of the nine group genes are still observed in current fish species, whereas only two group genes were found from mammalian genomes, showing that the OR gene family in fishes is much more diverse than in mammals. In mammals, however, one group of genes, gamma, expanded enormously, containing approximately 90% of the entire gene family. Interestingly, the gene groups observed in mammals or birds are nearly absent in fishes. The OR gene repertoire in frogs is as diverse as that in fishes, but the expansion of group gamma genes also occurred, indicating that the frog OR gene family has both mammal- and fish-like characters. All of these observations can be explained by the environmental change that organisms have experienced from the time of the common ancestor of all vertebrates to the present.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
August/15/2006
Abstract
The bacterial recA gene and its eukaryotic homolog RAD51 are important for DNA repair, homologous recombination, and genome stability. Members of the recA/RAD51 family have functions that have differentiated during evolution. However, the evolutionary history and relationships of these members remain unclear. Homolog searches in prokaryotes and eukaryotes indicated that most eubacteria contain only one recA. However, many archaeal species have two recA/RAD51 homologs (RADA and RADB), and eukaryotes possess multiple members (RAD51, RAD51B, RAD51C, RAD51D, DMC1, XRCC2, XRCC3, and recA). Phylogenetic analyses indicated that the recA/RAD51 family can be divided into three subfamilies: (i) RADalpha, with highly conserved functions; (ii) RADbeta, with relatively divergent functions; and (iii) recA, functioning in eubacteria and eukaryotic organelles. The RADalpha and RADbeta subfamilies each contain archaeal and eukaryotic members, suggesting that a gene duplication occurred before the archaea/eukaryote split. In the RADalpha subfamily, eukaryotic RAD51 and DMC1 genes formed two separate monophyletic groups when archaeal RADA genes were used as an outgroup. This result suggests that another duplication event occurred in the early stage of eukaryotic evolution, producing the DMC1 clade with meiosis-specific genes. The RADbeta subfamily has a basal archaeal clade and five eukaryotic clades, suggesting that four eukaryotic duplication events occurred before animals and plants diverged. The eukaryotic recA genes were detected in plants and protists and showed strikingly high levels of sequence similarity to recA genes from proteobacteria or cyanobacteria. These results suggest that endosymbiotic transfer of recA genes occurred from mitochondria and chloroplasts to nuclear genomes of ancestral eukaryotes.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
January/27/2003
Abstract
Bayesian phylogenetics has recently been proposed as a powerful method for inferring molecular phylogenies, and it has been reported that the mammalian and some plant phylogenies were resolved by using this method. The statistical confidence of interior branches as judged by posterior probabilities in Bayesian analysis is generally higher than that as judged by bootstrap probabilities in maximum likelihood analysis, and this difference has been interpreted as an indication that bootstrap support may be too conservative. However, it is possible that the posterior probabilities are too high or too liberal instead. Here, we show by computer simulation that posterior probabilities in Bayesian analysis can be excessively liberal when concatenated gene sequences are used, whereas bootstrap probabilities in neighbor-joining and maximum likelihood analyses are generally slightly conservative. These results indicate that bootstrap probabilities are more suitable for assessing the reliability of phylogenetic trees than posterior probabilities and that the mammalian and plant phylogenies may not have been fully resolved.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
December/3/2003
Abstract
Olfactory receptor (OR) genes form the largest known multigene family in the human genome. To obtain some insight into their evolutionary history, we have identified the complete set of OR genes and their chromosomal locations from the latest human genome sequences. We detected 388 potentially functional genes that have intact ORFs and 414 apparent pseudogenes. The number and the fraction (48%) of functional genes are considerably larger than the ones previously reported. The human OR genes can clearly be divided into class I and class II genes, as was previously noted. Our phylogenetic analysis has shown that the class II OR genes can further be classified into 19 phylogenetic clades supported by high bootstrap values. We have also found that there are many tandem arrays of OR genes that are phylogenetically closely related. These genes appear to have been generated by tandem gene duplication. However, the relationships between genomic clusters and phylogenetic clades are very complicated. There are a substantial number of cases in which the genes in the same phylogenetic clade are located on different chromosomal regions. In addition, OR genes belonging to distantly related phylogenetic clades are sometimes located very closely in a chromosomal region and form a tight genomic cluster. These observations can be explained by the assumption that several chromosomal rearrangements have occurred at the regions of OR gene clusters and the OR genes contained in different genomic clusters are shuffled.
Publication
Journal: Journal of Human Genetics
September/27/2006
Abstract
The numbers of functional olfactory receptor (OR) genes in humans and mice are about 400 and 1,000 respectively. In both humans and mice, these genes exist as genomic clusters and are scattered over almost all chromosomes. The difference in the number of genes between the two species is apparently caused by massive inactivation of OR genes in the human lineage and a substantial increase of OR genes in the mouse lineage after the human-mouse divergence. Compared with mammals, fishes have a much smaller number of OR genes. However, the OR gene family in fishes is much more divergent than that in mammals. Fishes have many different groups of genes that are absent in mammals, suggesting that the mammalian OR gene family is characterized by the loss of many group genes that existed in the ancestor of vertebrates and the subsequent expansion of specific groups of genes. Therefore, this gene family apparently changed dynamically depending on the evolutionary lineage and evolved under the birth-and-death model of evolution. Study of the evolutionary changes of two gene families for vomeronasal receptors and two gene families for taste receptors, which are structurally similar, but remotely related to OR genes, showed that some of the gene families evolved in the same fashion as the OR gene family. It appears that the number and types of genes in chemosensory receptor gene families have evolved in response to environmental needs, but they are also affected by fortuitous factors.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
March/15/2004
Abstract
Plant MADS-box genes form a large gene family for transcription factors and are involved in various aspects of developmental processes, including flower development. They are known to be subject to birth-and-death evolution, but the detailed features of this mode of evolution remain unclear. To have a deeper insight into the evolutionary pattern of this gene family, we enumerated all available functional and nonfunctional (pseudogene) MADS-box genes from the Arabidopsis and rice genomes. Plant MADS-box genes can be classified into types I and II genes on the basis of phylogenetic analysis. Conducting extensive homology search and phylogenetic analysis, we found 64 presumed functional and 37 nonfunctional type I genes and 43 presumed functional and 4 nonfunctional type II genes in Arabidopsis. We also found 24 presumed functional and 6 nonfunctional type I genes and 47 presumed functional and 1 nonfunctional type II genes in rice. Our phylogenetic analysis indicated there were at least about four to eight type I genes and approximately 15-20 type II genes in the most recent common ancestor of Arabidopsis and rice. It has also been suggested that type I genes have experienced a higher rate of birth-and-death evolution than type II genes in angiosperms. Furthermore, the higher rate of birth-and-death evolution in type I genes appeared partly due to a higher frequency of segmental gene duplication and weaker purifying selection in type I than in type II genes.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
September/9/2007
Abstract
Recent studies of developmental biology have shown that the genes controlling phenotypic characters expressed in the early stage of development are highly conserved and that recent evolutionary changes have occurred primarily in the characters expressed in later stages of development. Even the genes controlling the latter characters are generally conserved, but there is a large component of neutral or nearly neutral genetic variation within and between closely related species. Phenotypic evolution occurs primarily by mutation of genes that interact with one another in the developmental process. The enormous amount of phenotypic diversity among different phyla or classes of organisms is a product of accumulation of novel mutations and their conservation that have facilitated adaptation to different environments. Novel mutations may be incorporated into the genome by natural selection (elimination of preexisting genotypes) or by random processes such as genetic and genomic drift. However, once the mutations are incorporated into the genome, they may generate developmental constraints that will affect the future direction of phenotypic evolution. It appears that the driving force of phenotypic evolution is mutation, and natural selection is of secondary importance.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
May/28/2009
Abstract
Natural selection operating in protein-coding genes is often studied by examining the ratio (omega) of the rates of nonsynonymous to synonymous nucleotide substitution. The branch-site method (BSM) based on a likelihood ratio test is one such test to detect positive selection for a predetermined branch of a phylogenetic tree. However, because the number of nucleotide substitutions involved is often very small, we conducted a computer simulation to examine the reliability of BSM in comparison with the small-sample method (SSM) based on Fisher's exact test. The results indicate that BSM often generates false positives compared with SSM when the number of nucleotide substitutions is approximately 80 or smaller. Because the omega value is also used for predicting positively selected sites, we examined the reliabilities of the site-prediction methods, using nucleotide sequence data for the dim-light and color vision genes in vertebrates. The results showed that the site-prediction methods have a low probability of identifying functional changes of amino acids experimentally determined and often falsely identify other sites where amino acid substitutions are unlikely to be important. This low rate of predictability occurs because most of the current statistical methods are designed to identify codon sites with high omega values, which may not have anything to do with functional changes. The codon sites showing functional changes generally do not show a high omega value. To understand adaptive evolution, some form of experimental confirmation is necessary.
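The abstract says only that the small-sample method is based on Fisher's exact test; the sketch below shows what a one-sided Fisher's exact test on a 2x2 table looks like in plain Python. The table layout (nonsynonymous versus synonymous sites, changed versus unchanged on a branch) and the rounding of site counts to integers are assumptions made for this illustration, not details taken from the paper, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric distribution with all
    margins fixed. In the assumed layout for this example:
        a = nonsynonymous sites with a change
        b = nonsynonymous sites without a change
        c = synonymous sites with a change
        d = synonymous sites without a change
    A small p-value then indicates an excess of nonsynonymous changes.
    """
    row1 = a + b           # total nonsynonymous sites
    col1 = a + c           # total changed sites
    total = a + b + c + d
    denom = comb(total, col1)

    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        # Probability of exactly x changed sites falling in row 1.
        p += comb(row1, x) * comb(total - row1, col1 - x) / denom
    return p
```

Because the test is exact, it remains valid when only a handful of substitutions occurred on the branch, which is the situation the abstract identifies as problematic for the likelihood ratio test.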
Publication
Journal: Annual Review of Genomics and Human Genetics
September/29/2010
Abstract
The neutral theory of molecular evolution has been widely accepted and is the guiding principle for studying evolutionary genomics and the molecular basis of phenotypic evolution. Recent data on genomic evolution are generally consistent with the neutral theory. However, many recently published papers claim the detection of positive Darwinian selection via the use of new statistical methods. Examination of these methods has shown that their theoretical bases are not well established and often result in high rates of false-positive and false-negative results. When the deficiencies of these statistical methods are rectified, the results become largely consistent with the neutral theory. At present, genome-wide analyses of natural selection consist of collections of single-locus analyses. However, because phenotypic evolution is controlled by the interaction of many genes, the study of natural selection ought to take such interactions into account. Experimental studies of evolution will also be crucial.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
January/27/2008
Abstract
The number of sensory receptor genes varies extensively among different mammalian species. This variation is believed to be caused partly by physiological requirements of animals and partly by genomic drift due to random duplication and deletion of genes. If the contribution of genomic drift is substantial, each species should contain a significant amount of copy number variation (CNV). We therefore investigated CNVs in sensory receptor genes among 270 healthy humans by using published CNV data. The results indicated that olfactory receptor (OR), taste receptor type 2, and vomeronasal receptor type 1 genes show a high level of intraspecific CNVs. In particular, >30% of the approximately 800 OR gene loci in humans were polymorphic with respect to copy number, and two randomly chosen individuals showed a copy number difference of approximately 11 in functional OR genes on average. There was no significant difference in the amount of CNVs between functional and nonfunctional OR genes. Because pseudogenes are expected to evolve in a neutral fashion, this observation suggests that functional OR genes also have evolved in a similar manner with respect to copy number change. In addition, we found that the evolutionary change of copy number of OR genes approximately follows the Gaussian process in probability theory, and the copy number divergence between populations has increased with evolutionary time. We therefore conclude that genomic drift plays an important role for generating intra- and interspecific CNVs of sensory receptor genes. Similar results were obtained when all annotated genes were analyzed.
Publication
Journal: Genome Biology and Evolution
July/2/2012
Abstract
MicroRNAs (miRNAs) are among the most important regulatory elements of gene expression in animals and plants. However, their origin and evolutionary dynamics have not been studied systematically. In this paper, we identified putative miRNA genes in 11 plant species using the bioinformatic technique and examined their evolutionary changes. Our homology search indicated that no miRNA gene is currently shared between green algae and land plants. The number of miRNA genes has increased substantially in the land plant lineage, but after the divergence of eudicots and monocots, the number has changed in a lineage-specific manner. We found that miRNA genes have originated mainly by duplication of preexisting miRNA genes or protein-coding genes. Transposable elements also seem to have contributed to the generation of species-specific miRNA genes. The relative importance of these mechanisms in plants is quite different from that in Drosophila species, where the formation of hairpin structures in the genomes seems to be a major source of miRNA genes. This difference in the origin of miRNA genes between plants and Drosophila may be explained by the difference in the binding to target mRNAs between plants and animals. We also found that young miRNA genes are less conserved than old genes in plants as well as in Drosophila species. Yet, nearly half of the gene families in the ancestor of flowering plants have been lost in at least one species examined. This indicates that the repertoires of miRNA genes have changed more dynamically than previously thought during plant evolution.