Citations
All
Search in:AllTitleAbstractAuthor name
Publications
(1M+)
Patents
Grants
Pathways
Clinical trials
Publication
Journal: Nucleic Acids Research
October/1/1997
Abstract
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
Pulse
Views:
17
Posts:
No posts
Rating:
Not rated
Publication
Journal: BMJ (Clinical research ed.)
October/22/1997
Abstract
OBJECTIVE
Funnel plots (plots of effect estimates against sample size) may be useful to detect bias in meta-analyses that were later contradicted by large trials. We examined whether a simple test of asymmetry of funnel plots predicts discordance of results when meta-analyses are compared to large trials, and we assessed the prevalence of bias in published meta-analyses.
METHODS
Medline search to identify pairs consisting of a meta-analysis and a single large trial (concordance of results was assumed if effects were in the same direction and the meta-analytic estimate was within 30% of the trial); analysis of funnel plots from 37 meta-analyses identified from a hand search of four leading general medicine journals 1993-6 and 38 meta-analyses from the second 1996 issue of the Cochrane Database of Systematic Reviews.
METHODS
Degree of funnel plot asymmetry as measured by the intercept from regression of standard normal deviates against precision.
RESULTS
In the eight pairs of meta-analysis and large trial that were identified (five from cardiovascular medicine, one from diabetic medicine, one from geriatric medicine, one from perinatal medicine) there were four concordant and four discordant pairs. In all cases discordance was due to meta-analyses showing larger effects. Funnel plot asymmetry was present in three out of four discordant pairs but in none of concordant pairs. In 14 (38%) journal meta-analyses and 5 (13%) Cochrane reviews, funnel plot asymmetry indicated that there was bias.
CONCLUSIONS
A simple analysis of funnel plots provides a useful test for the likely presence of bias in meta-analyses, but as the capacity to detect bias will be limited when meta-analyses are based on a limited number of small trials the results from such analyses should be treated with considerable caution.
Pulse
Views:
4
Posts:
No posts
Rating:
Not rated
Publication
Journal: Cell
January/29/2008
Abstract
Successful reprogramming of differentiated human somatic cells into a pluripotent state would allow creation of patient- and disease-specific stem cells. We previously reported generation of induced pluripotent stem (iPS) cells, capable of germline transmission, from mouse somatic cells by transduction of four defined transcription factors. Here, we demonstrate the generation of iPS cells from adult human dermal fibroblasts with the same four factors: Oct3/4, Sox2, Klf4, and c-Myc. Human iPS cells were similar to human embryonic stem (ES) cells in morphology, proliferation, surface antigens, gene expression, epigenetic status of pluripotent cell-specific genes, and telomerase activity. Furthermore, these cells could differentiate into cell types of the three germ layers in vitro and in teratomas. These findings demonstrate that iPS cells can be generated from adult human fibroblasts.
Pulse
Views:
1
Posts:
No posts
Rating:
Not rated
Publication
Journal: Systematic Biology
December/23/2003
Abstract
The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum- likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/.
Publication
Journal: Bioinformatics
July/28/2013
Abstract
BACKGROUND
Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases.
RESULTS
To align our large (>80 billon reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy.
METHODS
STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.
Pulse
Views:
1
Posts:
No posts
Rating:
Not rated
Publication
Journal: Genetics
July/16/1989
Abstract
A series of yeast shuttle vectors and host strains has been created to allow more efficient manipulation of DNA in Saccharomyces cerevisiae. Transplacement vectors were constructed and used to derive yeast strains containing nonreverting his3, trp1, leu2 and ura3 mutations. A set of YCp and YIp vectors (pRS series) was then made based on the backbone of the multipurpose plasmid pBLUESCRIPT. These pRS vectors are all uniform in structure and differ only in the yeast selectable marker gene used (HIS3, TRP1, LEU2 and URA3). They possess all of the attributes of pBLUESCRIPT and several yeast-specific features as well. Using a pRS vector, one can perform most standard DNA manipulations in the same plasmid that is introduced into yeast.
Publication
Journal: Nature
September/13/2000
Abstract
Human breast tumours are diverse in their natural history and in their responsiveness to treatments. Variation in transcriptional programs accounts for much of the biological diversity of human cells and tumours. In each cell, signal transduction and regulatory systems transduce information from the cell's identity to its environmental status, thereby controlling the level of expression of every gene in the genome. Here we have characterized variation in gene expression patterns in a set of 65 surgical specimens of human breast tumours from 42 different individuals, using complementary DNA microarrays representing 8,102 human genes. These patterns provided a distinctive molecular portrait of each tumour. Twenty of the tumours were sampled twice, before and after a 16-week course of doxorubicin chemotherapy, and two tumours were paired with a lymph node metastasis from the same patient. Gene expression patterns in two tumour samples from the same individual were almost always more similar to each other than either was to any other sample. Sets of co-expressed genes were identified for which variation in messenger RNA levels could be related to specific features of physiological variation. The tumours could be classified into subtypes distinguished by pervasive differences in their gene expression patterns.
Pulse
Views:
1
Posts:
No posts
Rating:
Not rated
Publication
Journal: NeuroImage
March/19/2002
Abstract
An anatomical parcellation of the spatially normalized single-subject high-resolution T1 volume provided by the Montreal Neurological Institute (MNI) (D. L. Collins et al., 1998, Trans. Med. Imag. 17, 463-468) was performed. The MNI single-subject main sulci were first delineated and further used as landmarks for the 3D definition of 45 anatomical volumes of interest (AVOI) in each hemisphere. This procedure was performed using a dedicated software which allowed a 3D following of the sulci course on the edited brain. Regions of interest were then drawn manually with the same software every 2 mm on the axial slices of the high-resolution MNI single subject. The 90 AVOI were reconstructed and assigned a label. Using this parcellation method, three procedures to perform the automated anatomical labeling of functional studies are proposed: (1) labeling of an extremum defined by a set of coordinates, (2) percentage of voxels belonging to each of the AVOI intersected by a sphere centered by a set of coordinates, and (3) percentage of voxels belonging to each of the AVOI intersected by an activated cluster. An interface with the Statistical Parametric Mapping package (SPM, J. Ashburner and K. J. Friston, 1999, Hum. Brain Mapp. 7, 254-266) is provided as a freeware to researchers of the neuroimaging community. We believe that this tool is an improvement for the macroscopical labeling of activated area compared to labeling assessed using the Talairach atlas brain in which deformations are well known. However, this tool does not alleviate the need for more sophisticated labeling strategies based on anatomical or cytoarchitectonic probabilistic maps.
Pulse
Views:
13
Posts:
No posts
Rating:
Not rated
Publication
Journal: Biometrics
February/2/1989
Abstract
Methods of evaluating and comparing the performance of diagnostic tests are of increasing importance as new tests are developed and marketed. When a test is based on an observed variable that lies on a continuous or graded scale, an assessment of the overall value of the test can be made through the use of a receiver operating characteristic (ROC) curve. The curve is constructed by varying the cutpoint used to determine which values of the observed variable will be considered abnormal and then plotting the resulting sensitivities against the corresponding false positive rates. When two or more empirical curves are constructed based on tests performed on the same individuals, statistical analysis on differences between curves must take into account the correlated nature of the data. This paper presents a nonparametric approach to the analysis of areas under correlated ROC curves, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
Publication
Journal: Proceedings of the National Academy of Sciences of the United States of America
May/19/1988
Abstract
We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
Pulse
Views:
2
Posts:
No posts
Rating:
Not rated
Publication
Journal: Nature
June/20/2005
Abstract
Recent work has revealed the existence of a class of small non-coding RNA species, known as microRNAs (miRNAs), which have critical functions across various biological processes. Here we use a new, bead-based flow cytometric miRNA expression profiling method to present a systematic expression analysis of 217 mammalian miRNAs from 334 samples, including multiple human cancers. The miRNA profiles are surprisingly informative, reflecting the developmental lineage and differentiation state of the tumours. We observe a general downregulation of miRNAs in tumours compared with normal tissues. Furthermore, we were able to successfully classify poorly differentiated tumours using miRNA expression profiles, whereas messenger RNA profiles were highly inaccurate when applied to the same samples. These findings highlight the potential of miRNA profiling in cancer diagnosis.
Publication
Journal: Medicine and Science in Sports and Exercise
December/10/2003
Abstract
BACKGROUND
Physical inactivity is a global concern, but diverse physical activity measures in use prevent international comparisons. The International Physical Activity Questionnaire (IPAQ) was developed as an instrument for cross-national monitoring of physical activity and inactivity.
METHODS
Between 1997 and 1998, an International Consensus Group developed four long and four short forms of the IPAQ instruments (administered by telephone interview or self-administration, with two alternate reference periods, either the "last 7 d" or a "usual week" of recalled physical activity). During 2000, 14 centers from 12 countries collected reliability and/or validity data on at least two of the eight IPAQ instruments. Test-retest repeatability was assessed within the same week. Concurrent (inter-method) validity was assessed at the same administration, and criterion IPAQ validity was assessed against the CSA (now MTI) accelerometer. Spearman's correlation coefficients are reported, based on the total reported physical activity.
RESULTS
Overall, the IPAQ questionnaires produced repeatable data (Spearman's rho clustered around 0.8), with comparable data from short and long forms. Criterion validity had a median rho of about 0.30, which was comparable to most other self-report validation studies. The "usual week" and "last 7 d" reference periods performed similarly, and the reliability of telephone administration was similar to the self-administered mode.
CONCLUSIONS
The IPAQ instruments have acceptable measurement properties, at least as good as other established self-reports. Considering the diverse samples in this study, IPAQ has reasonable measurement properties for monitoring population levels of physical activity among 18- to 65-yr-old adults in diverse settings. The short IPAQ form "last 7 d recall" is recommended for national monitoring and the long form for research requiring more detailed assessment.
Publication
Journal: JAMA - Journal of the American Medical Association
February/7/2020
Abstract
In December 2019, novel coronavirus (2019-nCoV)-infected pneumonia (NCIP) occurred in Wuhan, China. The number of cases has increased rapidly but information on the clinical characteristics of affected patients is limited.To describe the epidemiological and clinical characteristics of NCIP.Retrospective, single-center case series of the 138 consecutive hospitalized patients with confirmed NCIP at Zhongnan Hospital of Wuhan University in Wuhan, China, from January 1 to January 28, 2020; final date of follow-up was February 3, 2020.Documented NCIP.Epidemiological, demographic, clinical, laboratory, radiological, and treatment data were collected and analyzed. Outcomes of critically ill patients and noncritically ill patients were compared. Presumed hospital-related transmission was suspected if a cluster of health professionals or hospitalized patients in the same wards became infected and a possible source of infection could be tracked.Of 138 hospitalized patients with NCIP, the median age was 56 years (interquartile range, 42-68; range, 22-92 years) and 75 (54.3%) were men. Hospital-associated transmission was suspected as the presumed mechanism of infection for affected health professionals (40 [29%]) and hospitalized patients (17 [12.3%]). Common symptoms included fever (136 [98.6%]), fatigue (96 [69.6%]), and dry cough (82 [59.4%]). Lymphopenia (lymphocyte count, 0.8 × 109/L [interquartile range {IQR}, 0.6-1.1]) occurred in 97 patients (70.3%), prolonged prothrombin time (13.0 seconds [IQR, 12.3-13.7]) in 80 patients (58%), and elevated lactate dehydrogenase (261 U/L [IQR, 182-403]) in 55 patients (39.9%). Chest computed tomographic scans showed bilateral patchy shadows or ground glass opacity in the lungs of all patients. Most patients received antiviral therapy (oseltamivir, 124 [89.9%]), and many received antibacterial therapy (moxifloxacin, 89 [64.4%]; ceftriaxone, 34 [24.6%]; azithromycin, 25 [18.1%]) and glucocorticoid therapy (62 [44.9%]). Thirty-six patients (26.1%) were transferred to the intensive care unit (ICU) because of complications, including acute respiratory distress syndrome (22 [61.1%]), arrhythmia (16 [44.4%]), and shock (11 [30.6%]). The median time from first symptom to dyspnea was 5.0 days, to hospital admission was 7.0 days, and to ARDS was 8.0 days. Patients treated in the ICU (n = 36), compared with patients not treated in the ICU (n = 102), were older (median age, 66 years vs 51 years), were more likely to have underlying comorbidities (26 [72.2%] vs 38 [37.3%]), and were more likely to have dyspnea (23 [63.9%] vs 20 [19.6%]), and anorexia (24 [66.7%] vs 31 [30.4%]). Of the 36 cases in the ICU, 4 (11.1%) received high-flow oxygen therapy, 15 (41.7%) received noninvasive ventilation, and 17 (47.2%) received invasive ventilation (4 were switched to extracorporeal membrane oxygenation). As of February 3, 47 patients (34.1%) were discharged and 6 died (overall mortality, 4.3%), but the remaining patients are still hospitalized. Among those discharged alive (n = 47), the median hospital stay was 10 days (IQR, 7.0-14.0).In this single-center case series of 138 hospitalized patients with confirmed NCIP in Wuhan, China, presumed hospital-related transmission of 2019-nCoV was suspected in 41% of patients, 26% of patients received ICU care, and mortality was 4.3%.
Pulse
Views:
12
Posts:
No posts
Rating:
Not rated
Publication
Journal: BMC Bioinformatics
October/31/2011
Abstract
BACKGROUND
RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments.
RESULTS
We present RSEM, an user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene.
CONCLUSIONS
RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. In addition, RSEM has enabled valuable guidance for cost-efficient design of quantification experiments with RNA-Seq, which is currently relatively expensive.
Pulse
Views:
1
Posts:
No posts
Rating:
Not rated
Publication
Journal: Nature
February/4/2020
Abstract
Since the SARS outbreak 18 years ago, a large number of severe acute respiratory syndrome-related coronaviruses (SARSr-CoV) have been discovered in their natural reservoir host, bats1-4. Previous studies indicated that some of those bat SARSr-CoVs have the potential to infect humans5-7. Here we report the identification and characterization of a novel coronavirus (2019-nCoV) which caused an epidemic of acute respiratory syndrome in humans in Wuhan, China. The epidemic, which started from 12 December 2019, has caused 2,050 laboratory-confirmed infections with 56 fatal cases by 26 January 2020. Full-length genome sequences were obtained from five patients at the early stage of the outbreak. They are almost identical to each other and share 79.5% sequence identify to SARS-CoV. Furthermore, it was found that 2019-nCoV is 96% identical at the whole-genome level to a bat coronavirus. The pairwise protein sequence analysis of seven conserved non-structural proteins show that this virus belongs to the species of SARSr-CoV. The 2019-nCoV virus was then isolated from the bronchoalveolar lavage fluid of a critically ill patient, which can be neutralized by sera from several patients. Importantly, we have confirmed that this novel CoV uses the same cell entry receptor, ACE2, as SARS-CoV.
Pulse
Views:
1
Posts:
No posts
Rating:
Not rated
Publication
Journal: Arthritis and rheumatism
December/20/1982
Abstract
The 1971 preliminary criteria for the classification of systemic lupus erythematosus (SLE) were revised and updated to incorporate new immunologic knowledge and improve disease classification. The 1982 revised criteria include fluorescence antinuclear antibody and antibody to native DNA and Sm antigen. Some criteria involving the same organ systems were aggregated into single criteria. Raynaud's phenomenon and alopecia were not included in the 1982 revised criteria because of low sensitivity and specificity. The new criteria were 96% sensitive and 96% specific when tested with SLE and control patient data gathered from 18 participating clinics. When compared with the 1971 criteria, the 1982 revised criteria showed gains in sensitivity and specificity.
Publication
Journal: Radiology
May/20/1982
Abstract
A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics, is presented. It is shown that in such a setting the area represents the probability that a randomly chosen diseased subject is (correctly) rated or ranked with greater suspicion than a randomly chosen non-diseased subject. Moreover, this probability of a correct ranking is the same quantity that is estimated by the already well-studied nonparametric Wilcoxon statistic. These two relationships are exploited to (a) provide rapid closed-form expressions for the approximate magnitude of the sampling variability, i.e., standard error that one uses to accompany the area under a smoothed ROC curve, (b) guide in determining the size of the sample required to provide a sufficiently reliable estimate of this area, and (c) determine how large sample sizes should be to ensure that one can statistically detect differences in the accuracy of diagnostic techniques.
Publication
Journal: Nature
March/11/2002
Abstract
Breast cancer patients with the same stage of disease can have markedly different treatment responses and overall outcome. The strongest predictors for metastases (for example, lymph node status and histological grade) fail to classify accurately breast tumours according to their clinical behaviour. Chemotherapy or hormonal therapy reduces the risk of distant metastases by approximately one-third; however, 70-80% of patients receiving this treatment would have survived without it. None of the signatures of breast cancer gene expression reported to date allow for patient-tailored therapy strategies. Here we used DNA microarray analysis on primary breast tumours of 117 young patients, and applied supervised classification to identify a gene expression signature strongly predictive of a short interval to distant metastases ('poor prognosis' signature) in patients without tumour cells in local lymph nodes at diagnosis (lymph node negative). In addition, we established a signature that identifies tumours of BRCA1 carriers. The poor prognosis signature consists of genes regulating cell cycle, invasion, metastasis and angiogenesis. This gene expression profile will outperform all currently used clinical parameters in predicting disease outcome. Our findings provide a strategy to select patients who would benefit from adjuvant therapy.
Publication
Journal: Molecular Biology and Evolution
June/14/2000
Abstract
The use of some multiple-sequence alignments in phylogenetic analysis, particularly those that are not very well conserved, requires the elimination of poorly aligned positions and divergent regions, since they may not be homologous or may have been saturated by multiple substitutions. A computerized method that eliminates such positions and at the same time tries to minimize the loss of informative sites is presented here. The method is based on the selection of blocks of positions that fulfill a simple set of requirements with respect to the number of contiguous conserved positions, lack of gaps, and high conservation of flanking positions, making the final alignment more suitable for phylogenetic analysis. To illustrate the efficiency of this method, alignments of 10 mitochondrial proteins from several completely sequenced mitochondrial genomes belonging to diverse eukaryotes were used as examples. The percentages of removed positions were higher in the most divergent alignments. After removing divergent segments, the amino acid composition of the different sequences was more uniform, and pairwise distances became much smaller. Phylogenetic trees show that topologies can be different after removing conserved blocks, particularly when there are several poorly resolved nodes. Strong support was found for the grouping of animals and fungi but not for the position of more basal eukaryotes. The use of a computerized method such as the one presented here reduces to a certain extent the necessity of manually editing multiple alignments, makes the automation of phylogenetic analysis of large data sets feasible, and facilitates the reproduction of the final alignment by other researchers.
Publication
Journal: Analytical Chemistry
September/12/1996
Abstract
Proteins from silver-stained gels can be digested enzymatically and the resulting peptide analyzed and sequenced by mass spectrometry. Standard proteins yield the same peptide maps when extracted from Coomassie- and silver-stained gels, as judged by electrospray and MALDI mass spectrometry. The low nanogram range can be reached by the protocols described here, and the method is robust. A silver-stained one-dimensional gel of a fraction from yeast proteins was analyzed by nano-electrospray tandem mass spectrometry. In the sequencing, more than 1000 amino acids were covered, resulting in no evidence of chemical modifications due to the silver staining procedure. Silver staining allows a substantial shortening of sample preparation time and may, therefore, be preferable over Coomassie staining. This work removes a major obstacle to the low-level sequence analysis of proteins separated on polyacrylamide gels.
Publication
Journal: Bioinformatics
September/7/2006
Abstract
BACKGROUND
In 2001 and 2002, we published two papers (Bioinformatics, 17, 282-283, Bioinformatics, 18, 77-82) describing an ultrafast protein sequence clustering program called cd-hit. This program can efficiently cluster a huge protein database with millions of sequences. However, the applications of the underlying algorithm are not limited to only protein sequences clustering, here we present several new programs using the same algorithm including cd-hit-2d, cd-hit-est and cd-hit-est-2d. Cd-hit-2d compares two protein datasets and reports similar matches between them; cd-hit-est clusters a DNA/RNA sequence database and cd-hit-est-2d compares two nucleotide datasets. All these programs can handle huge datasets with millions of sequences and can be hundreds of times faster than methods based on the popular sequence comparison and database search tools, such as BLAST.
Publication
Journal: New England Journal of Medicine
March/15/2012
Abstract
BACKGROUND
Intratumor heterogeneity may foster tumor evolution and adaptation and hinder personalized-medicine strategies that depend on results from single tumor-biopsy samples.
METHODS
To examine intratumor heterogeneity, we performed exome sequencing, chromosome aberration analysis, and ploidy profiling on multiple spatially separated samples obtained from primary renal carcinomas and associated metastatic sites. We characterized the consequences of intratumor heterogeneity using immunohistochemical analysis, mutation functional analysis, and profiling of messenger RNA expression.
RESULTS
Phylogenetic reconstruction revealed branched evolutionary tumor growth, with 63 to 69% of all somatic mutations not detectable across every tumor region. Intratumor heterogeneity was observed for a mutation within an autoinhibitory domain of the mammalian target of rapamycin (mTOR) kinase, correlating with S6 and 4EBP phosphorylation in vivo and constitutive activation of mTOR kinase activity in vitro. Mutational intratumor heterogeneity was seen for multiple tumor-suppressor genes converging on loss of function; SETD2, PTEN, and KDM5C underwent multiple distinct and spatially separated inactivating mutations within a single tumor, suggesting convergent phenotypic evolution. Gene-expression signatures of good and poor prognosis were detected in different regions of the same tumor. Allelic composition and ploidy profiling analysis revealed extensive intratumor heterogeneity, with 26 of 30 tumor samples from four tumors harboring divergent allelic-imbalance profiles and with ploidy heterogeneity in two of four tumors.
CONCLUSIONS
Intratumor heterogeneity can lead to underestimation of the tumor genomics landscape portrayed from single tumor-biopsy samples and may present major challenges to personalized-medicine and biomarker development. Intratumor heterogeneity, associated with heterogeneous protein function, may foster tumor adaptation and therapeutic failure through Darwinian selection. (Funded by the Medical Research Council and others.).
Pulse
Views:
1
Posts:
No posts
Rating:
Not rated
Publication
Journal: Evolution; international journal of organic evolution
May/30/2017
Abstract
The recently-developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. It involves resampling points from one's own data, with replacement, to create a series of bootstrap samples of the same size as the original data. Each of these is analyzed, and the variation among the resulting estimates taken to indicate the size of the error involved in making estimates from the original data. In the case of phylogenies, it is argued that the proper method of resampling is to keep all of the original species while sampling characters with replacement, under the assumption that the characters have been independently drawn by the systematist and have evolved independently. Majority-rule consensus trees can be used to construct a phylogeny showing all of the inferred monophyletic groups that occurred in a majority of the bootstrap samples. If a group shows up 95% of the time or more, the evidence for it is taken to be statistically significant. Existing computer programs can be used to analyze different bootstrap samples by using weights on the characters, the weight of a character being how many times it was drawn in bootstrap sampling. When all characters are perfectly compatible, as envisioned by Hennig, bootstrap sampling becomes unnecessary; the bootstrap method would show significant evidence for a group if it is defined by three or more characters.
Publication
Journal: American Journal of Respiratory and Critical Care Medicine
October/17/2007
Abstract
Chronic obstructive pulmonary disease (COPD) remains a major public health problem. It is the fourth leading cause of chronic morbidity and mortality in the United States, and is projected to rank fifth in 2020 in burden of disease worldwide, according to a study published by the World Bank/World Health Organization. Yet, COPD remains relatively unknown or ignored by the public as well as public health and government officials. In 1998, in an effort to bring more attention to COPD, its management, and its prevention, a committed group of scientists encouraged the U.S. National Heart, Lung, and Blood Institute and the World Health Organization to form the Global Initiative for Chronic Obstructive Lung Disease (GOLD). Among the important objectives of GOLD are to increase awareness of COPD and to help the millions of people who suffer from this disease and die prematurely of it or its complications. The first step in the GOLD program was to prepare a consensus report, Global Strategy for the Diagnosis, Management, and Prevention of COPD, published in 2001. The present, newly revised document follows the same format as the original consensus report, but has been updated to reflect the many publications on COPD that have appeared. GOLD national leaders, a network of international experts, have initiated investigations of the causes and prevalence of COPD in their countries, and developed innovative approaches for the dissemination and implementation of COPD management guidelines. We appreciate the enormous amount of work the GOLD national leaders have done on behalf of their patients with COPD. Despite the achievements in the 5 years since the GOLD report was originally published, considerable additional work is ahead of us if we are to control this major public health problem. The GOLD initiative will continue to bring COPD to the attention of governments, public health officials, health care workers, and the general public, but a concerted effort by all involved in health care will be necessary.
load more...