Sudhir Kumar
Publications (416)
Publication
Journal: Molecular Biology and Evolution
March/24/2012
Abstract
Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven, making it easier to use for both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net.
Publication
Journal: Molecular Biology and Evolution
October/28/2007
Abstract
We announce the release of the fourth version of MEGA software, which expands on the existing facilities for editing DNA sequence data from autosequencers, mining Web-databases, performing automatic and manual sequence alignment, analyzing sequence alignments to estimate evolutionary distances, inferring phylogenetic trees, and testing evolutionary hypotheses. Version 4 includes a unique facility to generate captions, written in figure legend format, in order to provide natural language descriptions of the models and methods used in the analyses. This facility aims to promote a better understanding of the underlying assumptions used in analyses, and of the results generated. Another new feature is the Maximum Composite Likelihood (MCL) method for estimating evolutionary distances between all pairs of sequences simultaneously, with and without incorporating rate variation among sites and substitution pattern heterogeneities among lineages. This MCL method also can be used to estimate transition/transversion bias and nucleotide substitution pattern without knowledge of the phylogenetic tree. This new version is a native 32-bit Windows application with multi-threading and multi-user support, and it is also available to run in a Linux desktop environment (via the Wine compatibility layer) and on Intel-based Macintosh computers under the Parallels program. The current version of MEGA is available free of charge at http://www.megasoftware.net.
Publication
Journal: Molecular Biology and Evolution
July/7/2014
Abstract
We announce the release of an advanced version of the Molecular Evolutionary Genetics Analysis (MEGA) software, which currently contains facilities for building sequence alignments, inferring phylogenetic histories, and conducting molecular evolutionary analysis. In version 6.0, MEGA now enables the inference of timetrees, as it implements the RelTime method for estimating divergence times for all branching points in a phylogeny. A new Timetree Wizard in MEGA6 facilitates this timetree inference by providing a graphical user interface (GUI) to specify the phylogeny and calibration constraints step-by-step. This version also contains enhanced algorithms to search for the optimal trees under evolutionary criteria and implements a more advanced memory management that can double the size of sequence data sets to which MEGA can be applied. Both GUI and command-line versions of MEGA6 can be downloaded from www.megasoftware.net free of charge.
Publication
Journal: Molecular Biology and Evolution
July/11/2017
Abstract
We present the latest version of the Molecular Evolutionary Genetics Analysis (MEGA) software, which contains many sophisticated methods and tools for phylogenomics and phylomedicine. In this major upgrade, MEGA has been optimized for use on 64-bit computing systems for analyzing larger datasets. Researchers can now explore and analyze tens of thousands of sequences in MEGA. The new version also provides an advanced wizard for building timetrees and includes new functionality to automatically predict gene duplication events in gene family trees. The 64-bit MEGA is made available in two interfaces: graphical and command line. The graphical user interface (GUI) is a native Microsoft Windows application that can also be used on Mac OS X. The command line MEGA is available as native applications for Windows, Linux, and Mac OS X. They are intended for use in high-throughput and scripted analysis. Both versions are available from www.megasoftware.net free of charge.
Publication
Journal: Briefings in Bioinformatics
March/28/2005
Abstract
With its theoretical basis firmly established in molecular evolutionary and population genetics, the comparative DNA and protein sequence analysis plays a central role in reconstructing the evolutionary histories of species and multigene families, estimating rates of molecular evolution, and inferring the nature and extent of selective forces shaping the evolution of genes and genomes. The scope of these investigations has now expanded greatly owing to the development of high-throughput sequencing techniques and novel statistical and computational methods. These methods require easy-to-use computer programs. One such effort has been to produce Molecular Evolutionary Genetics Analysis (MEGA) software, with its focus on facilitating the exploration and analysis of the DNA and protein sequence variation from an evolutionary perspective. Currently in its third major release, MEGA3 contains facilities for automatic and manual sequence alignment, web-based mining of databases, inference of the phylogenetic trees, estimation of evolutionary distances and testing evolutionary hypotheses. This paper provides an overview of the statistical methods, computational tools, and visual exploration modules for data input and the results obtainable in MEGA.
Publication
Journal: Molecular Biology and Evolution
November/13/2018
Abstract
The Molecular Evolutionary Genetics Analysis (MEGA) software implements many analytical methods and tools for phylogenomics and phylomedicine. Here, we report a transformation of MEGA to enable cross-platform use on Microsoft Windows and Linux operating systems. MEGA X does not require virtualization or emulation software and provides a uniform user experience across platforms. MEGA X has additionally been upgraded to use multiple computing cores for many molecular evolutionary analyses. MEGA X is available in two interfaces (graphical and command line) and can be downloaded from www.megasoftware.net free of charge.
Publication
Journal: Briefings in Bioinformatics
July/6/2008
Abstract
The Molecular Evolutionary Genetics Analysis (MEGA) software is a desktop application designed for comparative analysis of homologous gene sequences either from multigene families or from different species with a special emphasis on inferring evolutionary relationships and patterns of DNA and protein evolution. In addition to the tools for statistical analysis of data, MEGA provides many convenient facilities for the assembly of sequence data sets from files or web-based repositories, and it includes tools for visual presentation of the results obtained in the form of interactive phylogenetic trees and evolutionary distance matrices. Here we discuss the motivation, design principles and priorities that have shaped the development of MEGA. We also discuss how MEGA might evolve in the future to assist researchers in their growing need to analyze large data sets using new computational methods.
Publication
Journal: Nature
January/2/2008
Abstract
Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
August/24/2004
Abstract
Current efforts to reconstruct the tree of life and histories of multigene families demand the inference of phylogenies consisting of thousands of gene sequences. However, for such large data sets even a moderate exploration of the tree space needed to identify the optimal tree is virtually impossible. For these cases the neighbor-joining (NJ) method is frequently used because of its demonstrated accuracy for smaller data sets and its computational speed. As data sets grow, however, the fraction of the tree space examined by the NJ algorithm becomes minuscule. Here, we report the results of our computer simulation for examining the accuracy of NJ trees for inferring very large phylogenies. First we present a likelihood method for the simultaneous estimation of all pairwise distances by using biologically realistic models of nucleotide substitution. Use of this method corrects up to 60% of NJ tree errors. Our simulation results show that the accuracy of NJ trees declines only by approximately 5% when the number of sequences used increases from 32 to 4,096 (128 times) even in the presence of extensive variation in the evolutionary rate among lineages or significant biases in the nucleotide composition and transition/transversion ratio. Our results encourage the use of complex models of nucleotide substitution for estimating evolutionary distances and hint at bright prospects for the application of the NJ and related methods in inferring large phylogenies.
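As a rough illustration of why NJ is fast, the sketch below shows the textbook pair-selection step of neighbor joining on a small distance matrix. It is not the likelihood distance method described in this abstract; the function name `nj_pick_pair` and the example matrix are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def nj_pick_pair(d: List[List[float]]) -> Tuple[int, int]:
    """Return the pair (i, j) that the standard NJ criterion would join first.

    Uses the usual Q-matrix: Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j),
    where r(i) is the sum of distances from taxon i to all other taxa.
    The pair with the smallest Q is joined. Ties go to the first pair found.
    """
    n = len(d)
    if n < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    row_sums = [sum(row) for row in d]
    best_pair = (0, 1)
    best_q = float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:
                best_q = q
                best_pair = (i, j)
    return best_pair


# Hypothetical symmetric distance matrix for five taxa (a, b, c, d, e).
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
first_pair = nj_pick_pair(example)  # taxa a and b (indices 0 and 1)
```

A full NJ run repeats this step n - 2 times, replacing the joined pair with a new node each time, so only a single tree is ever built rather than many candidate trees being scored.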
Publication
Journal: Bioinformatics
December/12/2006
Abstract
Biologists and other scientists routinely need to know times of divergence between species and to construct phylogenies calibrated to time (timetrees). Published studies reporting time estimates from molecular data have been increasing rapidly, but the data have been largely inaccessible to the greater community of scientists because of their complexity. TimeTree brings these data together in a consistent format and uses a hierarchical structure, corresponding to the tree of life, to maximize their utility. Results are presented and summarized, allowing users to quickly determine the range and robustness of time estimates and the degree of consensus from the published literature.
AVAILABILITY
TimeTree is available at http://www.timetree.net
Publication
Journal: Molecular Biology and Evolution
October/20/2004
Abstract
Drosophila melanogaster has been a canonical model organism to study genetics, development, behavior, physiology, evolution, and population genetics for nearly a century. Despite this emphasis and the completion of its nuclear genome sequence, the timing of major speciation events leading to the origin of this fruit fly remains elusive because of the paucity of extensive fossil records and biogeographic data. Use of molecular clocks as an alternative has been fraught with non-clock-like accumulation of nucleotide and amino-acid substitutions. Here we present a novel methodology in which genomic mutation distances are used to overcome these limitations and to make use of all available gene sequence data for constructing a fruit fly molecular time scale. Our analysis of 2977 pairwise sequence comparisons from 176 nuclear genes reveals a long-term fruit fly mutation clock ticking at a rate of 11.1 mutations per kilobase pair per Myr. Genomic mutation clock-based timings of the landmark speciation events leading to the evolution of D. melanogaster show that it shared most recent common ancestry 5.4 MYA with D. simulans, 12.6 MYA with D. erecta+D. orena, 12.8 MYA with D. yakuba+D. teissieri, 35.6 MYA with the takahashii subgroup, 41.3 MYA with the montium subgroup, 44.2 MYA with the ananassae subgroup, 54.9 MYA with the obscura group, 62.2 MYA with the willistoni group, and 62.9 MYA with the subgenus Drosophila. These and other estimates are compatible with those known from limited biogeographic and fossil records. The inferred temporal pattern of fruit fly evolution shows correspondence with the cooling patterns of paleoclimate changes and habitat fragmentation in the Cenozoic.
Publication
Journal: Molecular Biology and Evolution
October/30/2017
Abstract
Evolutionary information on species divergence times is fundamental to studies of biodiversity, development, and disease. Molecular dating has enhanced our understanding of the temporal patterns of species divergences over the last five decades, and the number of studies is increasing quickly due to an exponential growth in the available collection of molecular sequences from diverse species and large number of genes. Our TimeTree resource is a public knowledge-base with the primary focus to make available all species divergence times derived using molecular sequence data to scientists, educators, and the general public in a consistent and accessible format. Here, we report a major expansion of the TimeTree resource, which more than triples the number of species (>97,000) and more than triples the number of studies assembled (>3,000). Furthermore, scientists can access not only the divergence time between two species or higher taxa, but also a timetree of a group of species and a timeline that traces a species' evolution through time. The new timetree and timeline visualizations are integrated with display of events on earth and environmental history over geological time, which will lead to broader and better understanding of the interplay of the change in the biosphere with the diversity of species on Earth. The next generation TimeTree resource is publicly available online at http://www.timetree.org.
Publication
Journal: Molecular Biology and Evolution
May/12/2016
Abstract
Genomic data are rapidly resolving the tree of living species calibrated to time, the timetree of life, which will provide a framework for research in diverse fields of science. Previous analyses of taxonomically restricted timetrees have found a decline in the rate of diversification in many groups of organisms, often attributed to ecological interactions among species. Here, we have synthesized a global timetree of life from 2,274 studies representing 50,632 species and examined the pattern and rate of diversification as well as the timing of speciation. We found that species diversity has been mostly expanding overall and in many smaller groups of species, and that the rate of diversification in eukaryotes has been mostly constant. We also identified, and avoided, potential biases that may have influenced previous analyses of diversification including low levels of taxon sampling, small clade size, and the inclusion of stem branches in clade analyses. We found consistency in time-to-speciation among plants and animals, ∼2 My, as measured by intervals of crown and stem species times. Together, this clock-like change at different levels suggests that speciation and diversification are processes dominated by random events and that adaptive change is largely a separate process.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
April/28/2002
Abstract
Knowledge of the rate of point mutation is of fundamental importance, because mutations are a vital source of genetic novelty and a significant cause of human diseases. Currently, mutation rate is thought to vary many fold among genes within a genome and among lineages in mammals. We have conducted a computational analysis of 5,669 genes (17,208 sequences) from species representing major groups of placental mammals to characterize the extent of mutation rate differences among genes in a genome and among diverse mammalian lineages. We find that mutation rate is approximately constant per year and largely similar among genes. Similarity of mutation rates among lineages with vastly different generation lengths and physiological attributes points to a much greater contribution of replication-independent mutational processes to the overall mutation rate. Our results suggest that the average mammalian genome mutation rate is 2.2 × 10⁻⁹ per base pair per year, which provides further opportunities for estimating species and population divergence times by using molecular clocks.
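The closing remark about molecular clocks can be made concrete with a minimal sketch. It assumes a strict clock, a neutral pairwise distance measured in substitutions per site, and that the quoted rate applies to each lineage separately, so the time since divergence is T = d / (2r). The function name and the example distance are hypothetical and do not come from the paper.

```python
def divergence_time_years(distance_per_site: float,
                          rate_per_site_per_year: float = 2.2e-9) -> float:
    """Estimate time since two lineages split under a strict molecular clock.

    Assumption: both lineages accumulate changes at the same per-lineage rate r,
    so a pairwise distance d corresponds to T = d / (2 * r) years.
    The default rate is the mammalian average quoted in the abstract.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    if distance_per_site < 0:
        raise ValueError("distance cannot be negative")
    return distance_per_site / (2.0 * rate_per_site_per_year)


# Hypothetical example: a neutral distance of 0.044 substitutions per site
# corresponds to roughly 10 million years under these assumptions.
t_years = divergence_time_years(0.044)
```

The factor of two reflects that changes accumulate along both lineages after the split. If a rate is instead reported for the pair as a whole, the division by two should be dropped.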
Publication
Journal: Nature
January/2/2008
Abstract
Both genome content and deployment contribute to phenotypic differences between species. Sex is the most important difference between individuals in a species and has long been posited to be rapidly evolving. Indeed, in the Drosophila genus, traits such as sperm length, genitalia, and gonad size are the most obvious differences between species. Comparative analysis of sex-biased expression should deepen our understanding of the relationship between genome content and deployment during evolution. Using existing and newly assembled genomes, we designed species-specific microarrays to examine sex-biased expression of orthologues and species-restricted genes in D. melanogaster, D. simulans, D. yakuba, D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis. We show that averaged sex-biased expression changes accumulate monotonically over time within the genus. However, different genes contribute to expression variance within species groups compared to between groups. We observed greater turnover of species-restricted genes with male-biased expression, indicating that gene formation and extinction may play a significant part in species differences. Genes with male-biased expression also show the greatest expression and DNA sequence divergence. This higher divergence and turnover of genes with male-biased expression may be due to high transcription rates in the male germline, greater functional pleiotropy of genes expressed in females, and/or sexual competition.
Publication
Journal: Genetics
April/11/2005
Abstract
Natural selection leaves its footprints on protein-coding sequences by modulating their silent and replacement evolutionary rates. In highly expressed genes in invertebrates, these footprints are seen in the higher codon usage bias and lower synonymous divergence. In mammals, the highly expressed genes have a shorter gene length in the genome and the breadth of expression is known to constrain the rate of protein evolution. Here we have examined how the rates of evolution of proteins encoded by the vertebrate genomes are modulated by the amount (intensity) of gene expression. To understand how natural selection operates on proteins that appear to have arisen in earlier and later phases of animal evolution, we have contrasted patterns of mouse proteins that have homologs in invertebrate and protist genomes (Precambrian genes) with those that do not have such detectable homologs (vertebrate-specific genes). We find that the intensity of gene expression relates inversely to the rate of protein sequence evolution on a genomic scale. The most highly expressed genes actually show the lowest total number of substitutions per polypeptide, consistent with cumulative effects of purifying selection on individual amino acid replacements. Precambrian genes exhibit a more pronounced difference in protein evolutionary rates (up to three times) between the genes with high and low expression levels as compared to the vertebrate-specific genes, which appears to be due to the narrower breadth of expression of the vertebrate-specific genes. These results provide insights into the differential relationship and effect of the increasing complexity of animal body form on evolutionary rates of proteins.
Publication
Journal: Molecular Biology and Evolution
January/6/2020
Abstract
The Molecular Evolutionary Genetics Analysis (MEGA) software enables comparative analysis of molecular sequences in phylogenetics and evolutionary medicine. Here, we introduce the macOS version of the MEGA software. This new version eliminates the need for virtualization and emulation programs previously required to use MEGA on Apple computers. MEGA for macOS utilizes memory and computing resources efficiently for conducting evolutionary analyses on Apple computers. It has a native Cocoa graphical user interface that is programmed to provide a consistent user experience across macOS, Windows, and Linux. MEGA for macOS is available from www.megasoftware.net free of charge.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
January/28/2013
Abstract
Molecular dating of species divergences has become an important means to add a temporal dimension to the Tree of Life. Increasingly larger datasets encompassing greater taxonomic diversity are becoming available to generate molecular timetrees by using sophisticated methods that model rate variation among lineages. However, the practical application of these methods is challenging because of the exorbitant calculation times required by current methods for contemporary data sizes, the difficulty in correctly modeling the rate heterogeneity in highly diverse taxonomic groups, and the lack of reliable clock calibrations and their uncertainty distributions for most groups of species. Here, we present a method that estimates relative times of divergences for all branching points (nodes) in very large phylogenetic trees without assuming a specific model for lineage rate variation or specifying any clock calibrations. The method (RelTime) performed better than existing methods when applied to very large computer simulated datasets where evolutionary rates were varied extensively among lineages by following autocorrelated and uncorrelated models. On average, RelTime completed calculations 1,000 times faster than the fastest Bayesian method, with even greater speed differences for larger numbers of sequences. This speed and accuracy will enable molecular dating analysis of very large datasets. Relative time estimates will be useful for determining the relative ordering and spacing of speciation events, identifying lineages with significantly slower or faster evolutionary rates, diagnosing the effect of selected calibrations on absolute divergence times, and estimating absolute times of divergence when highly reliable calibration points are available.
Publication
Journal: Trends in Genetics
June/10/2003
Abstract
For decades, molecular clocks have helped to illuminate the evolutionary timescale of life, but now genomic data pose a challenge for time estimation methods. It is unclear how to integrate data from many genes, each potentially evolving under a different model of substitution and at a different rate. Current methods can be grouped by the way the data are handled (genes considered separately or combined into a 'supergene') and the way gene-specific rate models are applied (global versus local clock). There are advantages and disadvantages to each of these approaches, and the optimal method has not yet emerged. Fortunately, time estimates inferred using many genes or proteins have greater precision and appear to be robust to different approaches.
Publication
Journal: Journal of Experimental Zoology Part B: Molecular and Developmental Evolution
March/28/2005
Abstract
Phylogenetic trees from multiple genes can be obtained in two fundamentally different ways. In one, gene sequences are concatenated into a super-gene alignment, which is then analyzed to generate the species tree. In the other, phylogenies are inferred separately from each gene, and a consensus of these gene phylogenies is used to represent the species tree. Here, we have compared these two approaches by means of computer simulation, using 448 parameter sets, including evolutionary rate, sequence length, base composition, and transition/transversion rate bias. In these simulations, we emphasized a worst-case scenario analysis in which 100 replicate datasets for each evolutionary parameter set (gene) were generated, and the replicate dataset that produced a tree topology showing the largest number of phylogenetic errors was selected to represent that parameter set. Both randomly selected and worst-case replicates were utilized to compare the consensus and concatenation approaches primarily using the neighbor-joining (NJ) method. We find that the concatenation approach yields more accurate trees, even when the sequences concatenated have evolved with very different substitution patterns and no attempts are made to accommodate these differences while inferring phylogenies. These results appear to hold true for parsimony and likelihood methods as well. The concatenation approach shows >95% accuracy with only 10 genes. However, this gain in accuracy is sometimes accompanied by reinforcement of certain systematic biases, resulting in spuriously high bootstrap support for incorrect partitions, whether we employ site, gene, or a combined bootstrap resampling approach. Therefore, it will be prudent to report the number of individual genes supporting an inferred clade in the concatenated sequence tree, in addition to the bootstrap support.
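The concatenation approach described here can be sketched in a few lines. The example below joins per-gene alignments into one super-gene alignment and pads taxa that are missing from a gene with gap characters. The function name, the gap-padding choice, and the toy sequences are assumptions for illustration, not the procedure used in the study.

```python
from typing import Dict, List


def concatenate_alignments(genes: List[Dict[str, str]],
                           gap: str = "-") -> Dict[str, str]:
    """Join per-gene alignments (taxon -> aligned sequence) into a super-gene.

    Each gene alignment must have sequences of equal length. A taxon absent
    from a gene is filled with gap characters for that gene's columns.
    """
    taxa = sorted({taxon for gene in genes for taxon in gene})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for gene in genes:
        lengths = {len(seq) for seq in gene.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene must share one length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(gene.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy data: two genes, three taxa, one taxon missing per gene.
gene1 = {"A": "ACGT", "B": "ACGA"}
gene2 = {"A": "GG", "C": "GT"}
supergene = concatenate_alignments([gene1, gene2])
```

The resulting alignment would then be passed to a single tree-building run, whereas the consensus approach would infer one tree per gene and summarize them afterwards.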
Publication
Journal: Nature Reviews Genetics
September/25/2005
Abstract
During the past four decades, the molecular-clock hypothesis has provided an invaluable tool for building evolutionary timescales, and has served as a null model for testing evolutionary and mutation rates in different species. Molecular clocks have also influenced the development of theories of molecular evolution. As DNA-sequencing technologies have progressed, the use of molecular clocks has increased, with a profound effect on our understanding of the temporal diversification of species and genomes.
Publication
Journal: International Journal of Nanomedicine
June/18/2012
Abstract
BACKGROUND
Zinc oxide nanoparticles (ZnO NPs) have received much attention for their implications in cancer therapy. It has been reported that ZnO NPs induce selective killing of cancer cells. However, the underlying molecular mechanisms behind the anticancer response of ZnO NPs remain unclear.
RESULTS
We investigated the cytotoxicity of ZnO NPs against three types of cancer cells (human hepatocellular carcinoma HepG2, human lung adenocarcinoma A549, and human bronchial epithelial BEAS-2B) and two primary rat cells (astrocytes and hepatocytes). Results showed that ZnO NPs exert distinct effects on mammalian cell viability via killing of all three types of cancer cells while posing no impact on normal rat astrocytes and hepatocytes. The toxicity mechanisms of ZnO NPs were further investigated using human liver cancer HepG2 cells. Both the mRNA and protein levels of tumor suppressor gene p53 and apoptotic gene bax were upregulated while the antiapoptotic gene bcl-2 was downregulated in ZnO NP-treated HepG2 cells. ZnO NPs were also found to induce activity of caspase-3 enzyme, DNA fragmentation, reactive oxygen species generation, and oxidative stress in HepG2 cells.
CONCLUSIONS
Overall, our data demonstrated that ZnO NPs selectively induce apoptosis in cancer cells, which is likely to be mediated by reactive oxygen species via p53 pathway, through which most of the anticancer drugs trigger apoptosis. This study provides preliminary guidance for the development of liver cancer therapy using ZnO NPs.
Publication
Journal: Bioinformatics
July/18/2013
Abstract
There is a growing need in the research community to apply the molecular evolutionary genetics analysis (MEGA) software tool for batch processing a large number of datasets and to integrate it into analysis workflows. Therefore, we now make available the computing core of the MEGA software as a stand-alone executable (MEGA-CC), along with an analysis prototyper (MEGA-Proto). MEGA-CC provides users with access to all the computational analyses available through MEGA's graphical user interface version. This includes methods for multiple sequence alignment, substitution model selection, evolutionary distance estimation, phylogeny inference, substitution rate and pattern estimation, tests of natural selection and ancestral sequence inference. Additionally, we have upgraded the source code for phylogenetic analysis using the maximum likelihood methods for parallel execution on multiple processors and cores. Here, we describe MEGA-CC and outline the steps for using MEGA-CC in tandem with MEGA-Proto for iterative and automated data analysis.
AVAILABILITY
http://www.megasoftware.net/.
Publication
Journal: Molecular Biology and Evolution
March/16/2003
Abstract
Most of the sophisticated methods to estimate evolutionary divergence between DNA sequences assume that the two sequences have evolved with the same pattern of nucleotide substitution after their divergence from their most recent common ancestor (homogeneity assumption). If this assumption is violated, the evolutionary distance estimated will be biased, which may result in biased estimates of divergence times and substitution rates, and may lead to erroneous branching patterns in the inferred phylogenies. Here we present a simple modification for existing distance estimation methods to relax the assumption of the substitution pattern homogeneity among lineages when analyzing DNA and protein sequences. Results from computer simulations and empirical data analyses for human and mouse genes are presented to demonstrate that the proposed modification reduces the estimation bias considerably and that the modified method performs much better than the LogDet methods, which do not require the homogeneity assumption in estimating the number of substitutions per site. We also discuss the relationship of the substitution and mutation rate estimates when the substitution pattern is not the same in the lineages leading to the two sequences compared.