Wei Wang
Publications (13K+)
Publication
Journal: Journal of computational chemistry
December/1/2005
Abstract
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD scales to hundreds of processors on high-end parallel platforms, as well as tens of processors on low-cost commodity clusters, and also runs on individual desktop and laptop computers. NAMD works with AMBER and CHARMM potential functions, parameters, and file formats. This article, directed to novices as well as experts, first introduces concepts and methods used in the NAMD program, describing the classical molecular dynamics force field, equations of motion, and integration methods along with the efficient electrostatics evaluation algorithms employed and temperature and pressure controls used. Features for steering the simulation across barriers and for calculating both alchemical and conformational free energy differences are presented. The motivations for and a roadmap to the internal design of NAMD, implemented in C++ and based on Charm++ parallel objects, are outlined. The factors affecting the serial and parallel performance of a simulation are discussed. Finally, typical NAMD use is illustrated with representative applications to a small, a medium, and a large biomolecular system, highlighting particular features of NAMD, for example, the Tcl scripting language. The article also provides a list of the key features of NAMD and discusses the benefits of combining NAMD with the molecular graphics/sequence analysis software VMD and the grid computing/collaboratory software BioCoRE. NAMD is distributed free of charge with source code at www.ks.uiuc.edu.
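The abstract mentions equations of motion and integration methods without naming them. As a rough illustration only, the sketch below uses velocity Verlet, a common molecular dynamics integrator chosen here as an assumption, on a one-dimensional harmonic oscillator. The function names, the unit mass, and the unit spring constant are hypothetical and are not taken from NAMD.

```python
import math


def harmonic_force(x, k=1.0):
    """Restoring force of a 1D harmonic spring (toy force field)."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance one particle with velocity Verlet; return position and velocity histories."""
    xs, vs = [x], [v]
    a = force(x) / mass
    for _ in range(steps):
        # Position update uses the current velocity and acceleration.
        x = x + v * dt + 0.5 * a * dt * dt
        # Velocity update averages the old and new accelerations.
        a_new = force(x) / mass
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        xs.append(x)
        vs.append(v)
    return xs, vs


def analytic_position(t, x0=1.0):
    """Exact solution for unit mass and spring constant, starting at rest at x0."""
    return x0 * math.cos(t)
```

Running `velocity_verlet(1.0, 0.0, harmonic_force, 1.0, 0.01, 1000)` and comparing the final position with `analytic_position(10.0)` gives a quick check that the integrator tracks the exact trajectory and keeps the total energy nearly constant.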
Publication
Journal: Nature
November/20/2007
Abstract
We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10-30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.
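The "average maximum r²" figures above rest on the standard squared-correlation measure of linkage disequilibrium between two biallelic SNPs. The sketch below shows one way that quantity could be computed from phased haplotypes; the 0/1 allele coding, the data layout, and the function names are assumptions made for illustration and do not reproduce the HapMap analysis pipeline.

```python
def ld_r2(haplotypes, i, j):
    """Squared correlation r^2 between SNP columns i and j.

    haplotypes: list of rows, each a list of 0/1 alleles (hypothetical layout).
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n          # allele-1 frequency at SNP i
    p_b = sum(h[j] for h in haplotypes) / n          # allele-1 frequency at SNP j
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ab - p_a * p_b                              # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


def max_r2(haplotypes, target, tags):
    """Best r^2 between an untyped target SNP and any of the typed tag SNPs."""
    return max(ld_r2(haplotypes, target, t) for t in tags)
```

Averaging `max_r2` over all untyped common SNPs would give a figure of the kind quoted in the abstract, under the assumptions stated above.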
Publication
Journal: Nature genetics
June/19/2007
Abstract
Eukaryotic gene transcription is accompanied by acetylation and methylation of nucleosomes near promoters, but the locations and roles of histone modifications elsewhere in the genome remain unclear. We determined the chromatin modification states in high resolution along 30 Mb of the human genome and found that active promoters are marked by trimethylation of Lys4 of histone H3 (H3K4), whereas enhancers are marked by monomethylation, but not trimethylation, of H3K4. We developed computational algorithms using these distinct chromatin signatures to identify new regulatory elements, predicting over 200 promoters and 400 enhancers within the 30-Mb region. This approach accurately predicted the location and function of independently identified regulatory elements with high sensitivity and specificity and uncovered a novel functional enhancer for the carnitine transporter SLC22A5 (OCTN2). Our results give insight into the connections between chromatin modifications and transcriptional regulatory activity and provide a new tool for the functional annotation of the human genome.
Publication
Journal: Nature
March/3/2015
Abstract
The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.
Publication
Journal: Nature
November/20/2007
Abstract
With the advent of dense maps of human genetic variation, it is now possible to detect positive natural selection across the human genome. Here we report an analysis of over 3 million polymorphisms from the International HapMap Project Phase 2 (HapMap2). We used 'long-range haplotype' methods, which were developed to identify alleles segregating in a population that have undergone recent selection, and we also developed new methods that are based on cross-population comparisons to discover alleles that have swept to near-fixation within a population. The analysis reveals more than 300 strong candidate regions. Focusing on the strongest 22 regions, we develop a heuristic for scrutinizing these regions to identify candidate targets of selection. In a complementary analysis, we identify 26 non-synonymous, coding, single nucleotide polymorphisms showing regional evidence of positive selection. Examination of these candidates highlights three cases in which two genes in a common biological process have apparently undergone positive selection in the same population: LARGE and DMD, both related to infection by the Lassa virus, in West Africa; SLC24A5 and SLC45A2, both involved in skin pigmentation, in Europe; and EDAR and EDA2R, both involved in development of hair follicles, in Asia.
Publication
Journal: Nature
April/19/2009
Abstract
Transgenic expression of just four defined transcription factors (c-Myc, Klf4, Oct4 and Sox2) is sufficient to reprogram somatic cells to a pluripotent state. The resulting induced pluripotent stem (iPS) cells resemble embryonic stem cells in their properties and potential to differentiate into a spectrum of adult cell types. Current reprogramming strategies involve retroviral, lentiviral, adenoviral and plasmid transfection to deliver reprogramming factor transgenes. Although the latter two methods are transient and minimize the potential for insertion mutagenesis, they are currently limited by diminished reprogramming efficiencies. piggyBac (PB) transposition is host-factor independent, and has recently been demonstrated to be functional in various human and mouse cell lines. The PB transposon/transposase system requires only the inverted terminal repeats flanking a transgene and transient expression of the transposase enzyme to catalyse insertion or excision events. Here we demonstrate successful and efficient reprogramming of murine and human embryonic fibroblasts using doxycycline-inducible transcription factors delivered by PB transposition. Stable iPS cells thus generated express characteristic pluripotency markers and succeed in a series of rigorous differentiation assays. By taking advantage of the natural propensity of the PB system for seamless excision, we show that the individual PB insertions can be removed from established iPS cell lines, providing an invaluable tool for discovery. In addition, we have demonstrated the traceless removal of reprogramming factors joined with viral 2A sequences delivered by a single transposon from murine iPS lines. We anticipate that the unique properties of this virus-independent simplification of iPS cell production will accelerate this field further towards full exploration of the reprogramming process and future cell-based therapies.
Publication
Journal: Nature genetics
February/25/2008
Abstract
Systemic lupus erythematosus (SLE) is a common systemic autoimmune disease with complex etiology but strong clustering in families (λ_S ≈ 30). We performed a genome-wide association scan using 317,501 SNPs in 720 women of European ancestry with SLE and in 2,337 controls, and we genotyped consistently associated SNPs in two additional independent sample sets totaling 1,846 affected women and 1,825 controls. Aside from the expected strong association between SLE and the HLA region on chromosome 6p21 and the previously confirmed non-HLA locus IRF5 on chromosome 7q32, we found evidence of association with replication (1.1 × 10⁻⁷ < P_overall < 1.6 × 10⁻²³; odds ratio = 0.82-1.62) in four regions: 16p11.2 (ITGAM), 11p15.5 (KIAA1542), 3p14.3 (PXK) and 1q25.1 (rs10798269). We also found evidence for association (P < 1 × 10⁻⁵) at FCGR2A, PTPN22 and STAT4, regions previously associated with SLE and other autoimmune diseases, as well as at ≥9 other loci (P < 2 × 10⁻⁷). Our results show that numerous genes, some with known immune-related functions, predispose to SLE.
Publication
Journal: Nature
December/3/2008
Abstract
Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics.
Publication
Journal: Journal of molecular graphics & modelling
January/7/2007
Abstract
In molecular mechanics (MM) studies, atom types and/or bond types of molecules must be determined prior to energy calculations. We present here an automatic algorithm for perceiving atom types that are defined in a description table, and an automatic algorithm for assigning bond types based solely on atomic connectivity. The algorithms have been implemented in a new module of the AMBER packages. This auxiliary module, antechamber (roughly meaning "before AMBER"), can be applied to generate the necessary inputs of leap, the AMBER program that generates topologies for minimization, molecular dynamics, etc., for most organic molecules. The algorithms behind the manipulations may be useful for other molecular mechanical packages as well as applications that need to designate atom types and bond types.
Publication
Journal: Cell
June/21/2012
Abstract
Tumor maintenance relies on continued activity of driver oncogenes, although their rate-limiting role is highly context dependent. Oncogenic Kras mutation is the signature event in pancreatic ductal adenocarcinoma (PDAC), serving a critical role in tumor initiation. Here, an inducible Kras(G12D)-driven PDAC mouse model establishes that advanced PDAC remains strictly dependent on Kras(G12D) expression. Transcriptome and metabolomic analyses indicate that Kras(G12D) serves a vital role in controlling tumor metabolism through stimulation of glucose uptake and channeling of glucose intermediates into the hexosamine biosynthesis and pentose phosphate pathways (PPP). These studies also reveal that oncogenic Kras promotes ribose biogenesis. Unlike canonical models, we demonstrate that Kras(G12D) drives glycolysis intermediates into the nonoxidative PPP, thereby decoupling ribose biogenesis from NADP/NADPH-mediated redox control. Together, this work provides in vivo mechanistic insights into how oncogenic Kras promotes metabolic reprogramming in native tumors and illuminates potential metabolic targets that can be exploited for therapeutic benefit in PDAC.
Publication
Journal: Science (New York, N.Y.)
July/12/2010
Abstract
Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18x per individual. Genes showing population-specific allele frequency changes, which represent strong candidates for altitude adaptation, were identified. The strongest signal of natural selection came from endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1), a transcription factor involved in response to hypoxia. One single-nucleotide polymorphism (SNP) at EPAS1 shows a 78% frequency difference between Tibetan and Han samples, representing the fastest allele frequency change observed at any human gene to date. This SNP's association with erythrocyte abundance supports the role of EPAS1 in adaptation to hypoxia. Thus, a population genomic survey has revealed a functionally important locus in genetic adaptation to high altitude.
Publication
Journal: Cell stem cell
September/20/2010
Abstract
Human embryonic stem cells (hESCs) share an identical genome with lineage-committed cells, yet possess the remarkable properties of self-renewal and pluripotency. The diverse cellular properties in different cells have been attributed to their distinct epigenomes, but how much epigenomes differ remains unclear. Here, we report that epigenomic landscapes in hESCs and lineage-committed cells are drastically different. By comparing the chromatin-modification profiles and DNA methylomes in hESCs and primary fibroblasts, we find that nearly one-third of the genome differs in chromatin structure. Most changes arise from dramatic redistributions of repressive H3K9me3 and H3K27me3 marks, which form blocks that significantly expand in fibroblasts. A large number of potential regulatory sequences also exhibit a high degree of dynamics in chromatin modifications and DNA methylation. Additionally, we observe novel, context-dependent relationships between DNA methylation and chromatin modifications. Our results provide new insights into epigenetic mechanisms underlying properties of pluripotency and cell fate commitment.
Publication
Journal: Nature
November/25/2012
Abstract
The Pacific oyster Crassostrea gigas belongs to one of the most species-rich but genomically poorly explored phyla, the Mollusca. Here we report the sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy, along with transcriptomes of development and stress response and the proteome of the shell. The oyster genome is highly polymorphic and rich in repetitive sequences, with some transposable elements still actively shaping variation. Transcriptome studies reveal an extensive set of genes responding to environmental stress. The expansion of genes coding for heat shock protein 70 and inhibitors of apoptosis is probably central to the oyster's adaptation to sessile life in the highly stressful intertidal zone. Our analyses also show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes. The oyster genome sequence fills a void in our understanding of the Lophotrochozoa.
Publication
Journal: Nature
June/10/2008
Abstract
Papaya, a fruit crop cultivated in tropical and subtropical regions, is known for its nutritional benefits and medicinal applications. Here we report a 3x draft genome sequence of 'SunUp' papaya, the first commercial virus-resistant transgenic fruit tree to be sequenced. The papaya genome is three times the size of the Arabidopsis genome, but contains fewer genes, including significantly fewer disease-resistance gene analogues. Comparison of the five sequenced genomes suggests a minimal angiosperm gene set of 13,311. A lack of recent genome duplication, atypical of other angiosperm genomes sequenced so far, may account for the smaller papaya gene number in most functional groups. Nonetheless, striking amplifications in gene number within particular functional groups suggest roles in the evolution of tree-like habit, deposition and remobilization of starch reserves, attraction of seed dispersal agents, and adaptation to tropical daylengths. Transgenesis at three locations is closely associated with chloroplast insertions into the nuclear genome, and with topoisomerase I recognition sites. Papaya offers numerous advantages as a system for fruit-tree functional genomics, and this draft genome sequence provides the foundation for revealing the basis of Carica's distinguishing morpho-physiological, medicinal and nutritional properties.
Publication
Journal: Nature genetics
May/26/2010
Abstract
Chronic kidney disease (CKD) is a significant public health problem, and recent genetic studies have identified common CKD susceptibility variants. The CKDGen consortium performed a meta-analysis of genome-wide association data in 67,093 individuals of European ancestry from 20 predominantly population-based studies in order to identify new susceptibility loci for reduced renal function as estimated by serum creatinine (eGFRcrea), serum cystatin c (eGFRcys) and CKD (eGFRcrea < 60 ml/min/1.73 m²; n = 5,807 individuals with CKD [cases]). Follow-up of the 23 new genome-wide-significant loci (P < 5 × 10⁻⁸) in 22,982 replication samples identified 13 new loci affecting renal function and CKD (in or near LASS2, GCKR, ALMS1, TFDP2, DAB2, SLC34A1, VEGFA, PRKAG2, PIP5K1B, ATXN2, DACH1, UBE2Q2 and SLC7A9) and 7 loci suspected to affect creatinine production and secretion (CPS1, SLC22A2, TMEM60, WDR37, SLC6A13, WDR72 and BCAS3). These results further our understanding of the biologic mechanisms of kidney function by identifying loci that potentially influence nephrogenesis, podocyte function, angiogenesis, solute transport and metabolic functions of the kidney.