The two most commonly used methods to analyze data from real-time, quantitative PCR experiments are absolute quantification and relative quantification. Absolute quantification determines the input copy number, usually by relating the PCR signal to a standard curve. Relative quantification relates the PCR signal of the target transcript in a treatment group to that of another sample such as an untreated control. The 2^(-ΔΔC_T) method is a convenient way to analyze the relative changes in gene expression from real-time quantitative PCR experiments. The purpose of this report is to present the derivation, assumptions, and applications of the 2^(-ΔΔC_T) method. In addition, we present the derivation and applications of two variations of the 2^(-ΔΔC_T) method that may be useful in the analysis of real-time, quantitative PCR data.
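The relative-quantification arithmetic can be sketched in a few lines. This is a minimal illustration, not the report's own code: it assumes each sample's target C_T is normalized to a reference (internal control) gene measured in the same sample, and that both amplicons amplify with an efficiency close to a doubling per cycle. The function and argument names are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression (treated vs. control) as 2^(-ddCT).

    Assumes roughly 100% amplification efficiency for both the target
    and the reference gene (an assumption of this sketch).
    """
    # Normalize the target to the reference gene within each sample
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized treated sample with the normalized control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Treated: dCT = 22 - 18 = 4; control: dCT = 25 - 18 = 7; ddCT = -3
print(fold_change_ddct(22.0, 18.0, 25.0, 18.0))  # 8.0
```

A result above 1 indicates higher expression in the treated sample than in the control; a result below 1 indicates lower expression.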
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
A new method of total RNA isolation by a single extraction with an acid guanidinium thiocyanate-phenol-chloroform mixture is described. The method provides a pure preparation of undegraded RNA in high yield and can be completed within 4 h. It is particularly useful for processing large numbers of samples and for isolation of RNA from minute quantities of cells or tissue samples.
The global burden of cancer continues to increase largely because of the aging and growth of the world population alongside an increasing adoption of cancer-causing behaviors, particularly smoking, in economically developing countries. Based on the GLOBOCAN 2008 estimates, about 12.7 million cancer cases and 7.6 million cancer deaths are estimated to have occurred in 2008; of these, 56% of the cases and 64% of the deaths occurred in the economically developing world. Breast cancer is the most frequently diagnosed cancer and the leading cause of cancer death among females, accounting for 23% of the total cancer cases and 14% of the cancer deaths. Lung cancer is the leading cancer site in males, comprising 17% of the total new cancer cases and 23% of the total cancer deaths. Breast cancer is now also the leading cause of cancer death among females in economically developing countries, a shift from the previous decade during which the most common cause of cancer death was cervical cancer. Further, the mortality burden for lung cancer among females in developing countries is as high as the burden for cervical cancer, with each accounting for 11% of the total female cancer deaths. Although overall cancer incidence rates in the developing world are half those seen in the developed world in both sexes, the overall cancer mortality rates are generally similar. Cancer survival tends to be poorer in developing countries, most likely because of a combination of a late stage at diagnosis and limited access to timely and standard treatment. A substantial proportion of the worldwide burden of cancer could be prevented through the application of existing cancer control knowledge and by implementing programs for tobacco control, vaccination (for liver and cervical cancers), and early detection and treatment, as well as public health campaigns promoting physical activity and a healthier dietary intake. Clinicians, public health professionals, and policy makers can play an active role in accelerating the application of such interventions globally.
The hallmarks of cancer comprise six biological capabilities acquired during the multistep development of human tumors. The hallmarks constitute an organizing principle for rationalizing the complexities of neoplastic disease. They include sustaining proliferative signaling, evading growth suppressors, resisting cell death, enabling replicative immortality, inducing angiogenesis, and activating invasion and metastasis. Underlying these hallmarks are genome instability, which generates the genetic diversity that expedites their acquisition, and inflammation, which fosters multiple hallmark functions. Conceptual progress in the last decade has added two emerging hallmarks of potential generality to this list: reprogramming of energy metabolism and evading immune destruction. In addition to cancer cells, tumors exhibit another dimension of complexity: they contain a repertoire of recruited, ostensibly normal cells that contribute to the acquisition of hallmark traits by creating the "tumor microenvironment." Recognition of the widespread applicability of these concepts will increasingly affect the development of new means to treat human cancer.
MicroRNAs (miRNAs) are endogenous RNAs of approximately 22 nucleotides (nt) that can play important regulatory roles in animals and plants by targeting mRNAs for cleavage or translational repression. Although they escaped notice until relatively recently, miRNAs comprise one of the more abundant classes of gene regulatory molecules in multicellular organisms and likely influence the output of many protein-coding genes.
Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.
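To make the identity-by-state idea concrete, the sketch below computes the average proportion of alleles shared IBS between two individuals. It is an illustrative calculation only and is not PLINK code: genotypes are assumed to be coded as minor-allele counts (0, 1, or 2) at biallelic markers, with `None` marking a missing call, and the function name is hypothetical.

```python
def ibs_similarity(genotypes_a, genotypes_b):
    """Average proportion of alleles shared identical by state.

    Genotypes are minor-allele counts (0, 1, 2); None means missing.
    At each marker the pair shares 2 - |a - b| alleles IBS.
    """
    shared = 0
    usable = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # skip markers with a missing call in either person
        shared += 2 - abs(a - b)
        usable += 1
    if usable == 0:
        raise ValueError("no markers with calls in both individuals")
    return shared / (2 * usable)


# Three usable markers sharing 2, 2, and 0 alleles: 4 of 6 alleles
print(ibs_similarity([0, 1, 2, 2], [0, 1, 0, None]))
```

Pairwise values of this kind across all individuals form the similarity matrix on which stratification checks can be built; estimating identity by descent requires allele frequencies as well and is not attempted here.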
Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.
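The core of a gene-set enrichment score is a running sum over a ranked gene list. The sketch below is a simplified, unweighted (Kolmogorov-Smirnov-like) version written for illustration; it is not the GSEA software, it ignores correlation-based weighting, and it does not estimate significance by permutation. All names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    Walks down the ranked list, stepping up at set members and down at
    non-members, and returns the maximum deviation from zero.
    """
    members = set(gene_set)
    n_total = len(ranked_genes)
    n_hit = sum(1 for g in ranked_genes if g in members)
    if n_hit == 0 or n_hit == n_total:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_step = 1.0 / n_hit
    miss_step = 1.0 / (n_total - n_hit)
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += hit_step if gene in members else -miss_step
        if abs(running) > abs(best):
            best = running  # keep the signed extreme of the walk
    return best


ranked = ["a", "b", "c", "d", "e", "f"]
print(enrichment_score(ranked, {"a", "b"}))  # 1.0  (set at the top)
print(enrichment_score(ranked, {"e", "f"}))  # -1.0 (set at the bottom)
```

A score near +1 or -1 means the set's members cluster at one end of the ranking; a score near 0 means they are spread throughout it.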
A technique for conveniently radiolabeling DNA restriction endonuclease fragments to high specific activity is described. DNA fragments are purified from agarose gels directly by ethanol precipitation and are then denatured and labeled with the large fragment of DNA polymerase I, using random oligonucleotides as primers. Over 70% of the precursor triphosphate is routinely incorporated into complementary DNA, and specific activities of over 10^9 dpm/µg of DNA can be obtained using relatively small amounts of precursor. These "oligolabeled" DNA fragments serve as efficient probes in filter hybridization experiments.
This paper examines eight published reviews each reporting results from several related trials. Each review pools the results from the relevant trials in order to evaluate the efficacy of a certain treatment for a specified medical condition. These reviews lack consistent assessment of homogeneity of treatment effect before pooling. We discuss a random effects approach to combining evidence from a series of experiments comparing two treatments. This approach incorporates the heterogeneity of effects in the analysis of the overall treatment efficacy. The model can be extended to include relevant covariates which would reduce the heterogeneity and allow for more specific therapeutic recommendations. We suggest a simple noniterative procedure for characterizing the distribution of treatment effects in a series of studies.
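A noniterative random-effects pooling of this kind can be sketched with a method-of-moments estimate of the between-study variance. The code below is a minimal illustration under stated assumptions: each study supplies an effect estimate and its within-study variance, the between-study variance is estimated from Cochran's Q and truncated at zero, and covariates are not modeled. Function and variable names are hypothetical.

```python
def random_effects_pool(effects, variances):
    """Pool study effects with a moment-based random-effects model.

    Returns (pooled effect, its standard error, between-study variance).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Moment estimate of between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Re-weight each study by its total (within + between) variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2


pooled, se, tau2 = random_effects_pool([0.0, 1.0], [0.1, 0.1])
print(pooled, se, tau2)  # approximately 0.5, 0.5, 0.4
```

When the studies are homogeneous the estimated between-study variance is zero and the result coincides with the fixed-effect pooled estimate.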
In clinical measurement, comparison of a new measurement technique with an established one is often needed to see whether they agree sufficiently for the new to replace the old. Such investigations are often analysed inappropriately, notably by using correlation coefficients. The use of correlation is misleading. An alternative approach, based on graphical techniques and simple calculations, is described, together with the relation between this analysis and the assessment of repeatability.
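The "simple calculations" behind an agreement analysis can be sketched as the mean difference between paired measurements and an interval around it. This is an illustrative sketch, not the authors' code: the 1.96 multiplier assumes the paired differences are approximately normally distributed, and the function name is hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Bias and approximate 95% limits of agreement for paired readings."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # systematic difference between the methods
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


bias, lower, upper = limits_of_agreement([10, 12, 14, 16], [9, 12, 13, 16])
print(bias, lower, upper)
```

Plotting each pair's difference against its mean, with horizontal lines at the bias and the two limits, gives the graphical part of the analysis. Whether the limits are narrow enough for the new method to replace the old is a clinical judgment, not a statistical one.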
This paper presents a general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies. The procedure essentially involves the construction of functions of the observed proportions which are directed at the extent to which the observers agree among themselves and the construction of test statistics for hypotheses involving these functions. Tests for interobserver bias are presented in terms of first-order marginal homogeneity and measures of interobserver agreement are developed as generalized kappa-type statistics. These procedures are illustrated with a clinical diagnosis example from the epidemiological literature.
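As a concrete instance of a kappa-type agreement measure, the sketch below computes the basic unweighted kappa for two observers from a square contingency table. It illustrates only the simplest special case, not the generalized statistics or the hypothesis tests the paper develops; the function name and table layout (rows for observer 1, columns for observer 2) are assumptions of this sketch.

```python
def kappa_two_observers(table):
    """Unweighted kappa from a square table of rating counts."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Chance agreement from the two observers' marginal proportions
    row_p = [sum(row) / n for row in table]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_exp = sum(r * c for r, c in zip(row_p, col_p))
    return (p_obs - p_exp) / (1.0 - p_exp)


# Observed agreement 0.7, chance agreement 0.5
print(kappa_two_observers([[20, 5], [10, 15]]))  # approximately 0.4
```

Kappa equals 1 for perfect agreement and 0 when agreement is no better than expected from the marginal totals alone.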
The objective of this study was to develop a prospectively applicable method for classifying comorbid conditions which might alter the risk of mortality for use in longitudinal studies. A weighted index that takes into account the number and the seriousness of comorbid disease was developed in a cohort of 559 medical patients. The 1-yr mortality rates for the different scores were: "0", 12% (181); "1-2", 26% (225); "3-4", 52% (71); and "≥5", 85% (82). The index was tested for its ability to predict risk of death from comorbid disease in a second cohort of 685 patients during a 10-yr follow-up. The percentages of patients who died of comorbid disease for the different scores were: "0", 8% (588); "1", 25% (54); "2", 48% (25); and "≥3", 59% (18). With each increased level of the comorbidity index, there were stepwise increases in the cumulative mortality attributable to comorbid disease (log-rank χ² = 165; p < 0.0001). In this longer follow-up, age was also a predictor of mortality (p < 0.001). The new index performed similarly to a previous system devised by Kaplan and Feinstein. The method of classifying comorbidity provides a simple, readily applicable and valid method of estimating risk of death from comorbid disease for use in longitudinal studies. Further work in larger populations is still required to refine the approach because the number of patients with any given condition in this study was relatively small.
Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source (http://bowtie.cbcb.umd.edu).
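The Burrows-Wheeler indexing that underlies this kind of aligner can be illustrated with a toy exact-match counter. The sketch below is not Bowtie's implementation: it builds the index naively by sorting suffixes, stores full occurrence tables, and handles exact matches only, with no mismatches, quality values, or backtracking. All names are hypothetical.

```python
def build_bw_index(text):
    """Build a small Burrows-Wheeler index for exact-match counting."""
    text += "$"  # sentinel that sorts before A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$")
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # first[c]: number of characters in the text smaller than c
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][i]: occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return first, occ, len(bwt)


def count_exact(pattern, index):
    """Count occurrences of pattern by backward search over the index."""
    first, occ, n = index
    lo, hi = 0, n  # half-open range of matching suffixes
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_bw_index("GATTACAGATTACA")
print(count_exact("GATTACA", index))  # 2
print(count_exact("TTT", index))      # 0
```

Each character of the read narrows the suffix range in constant time, so the search cost depends on read length rather than genome size; production aligners replace the full tables used here with sampled, compressed structures to keep memory small.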
A 36-item short-form (SF-36) was constructed to survey health status in the Medical Outcomes Study. The SF-36 was designed for use in clinical practice and research, health policy evaluations, and general population surveys. The SF-36 includes one multi-item scale that assesses eight health concepts: 1) limitations in physical activities because of health problems; 2) limitations in social activities because of physical or emotional problems; 3) limitations in usual role activities because of physical health problems; 4) bodily pain; 5) general mental health (psychological distress and well-being); 6) limitations in usual role activities because of emotional problems; 7) vitality (energy and fatigue); and 8) general health perceptions. The survey was constructed for self-administration by persons 14 years of age and older, and for administration by a trained interviewer in person or by telephone. The history of the development of the SF-36, the origin of specific items, and the logic underlying their selection are summarized. The content and features of the SF-36 are compared with the 20-item Medical Outcomes Study short-form.
Differentiated cells can be reprogrammed to an embryonic-like state by transfer of nuclear contents into oocytes or by fusion with embryonic stem (ES) cells. Little is known about factors that induce this reprogramming. Here, we demonstrate induction of pluripotent stem cells from mouse embryonic or adult fibroblasts by introducing four factors, Oct3/4, Sox2, c-Myc, and Klf4, under ES cell culture conditions. Unexpectedly, Nanog was dispensable. These cells, which we designated iPS (induced pluripotent stem) cells, exhibit the morphology and growth properties of ES cells and express ES cell marker genes. Subcutaneous transplantation of iPS cells into nude mice resulted in tumors containing a variety of tissues from all three germ layers. Following injection into blastocysts, iPS cells contributed to mouse embryonic development. These data demonstrate that pluripotent stem cells can be directly generated from fibroblast cultures by the addition of only a few defined factors.
The steady-state basal plasma glucose and insulin concentrations are determined by their interaction in a feedback loop. A computer-solved model has been used to predict the homeostatic concentrations which arise from varying degrees of beta-cell deficiency and insulin resistance. Comparison of a patient's fasting values with the model's predictions allows a quantitative assessment of the contributions of insulin resistance and deficient beta-cell function to the fasting hyperglycaemia (homeostasis model assessment, HOMA). The accuracy and precision of the estimate have been determined by comparison with independent measures of insulin resistance and beta-cell function using hyperglycaemic and euglycaemic clamps and an intravenous glucose tolerance test. The estimate of insulin resistance obtained by homeostasis model assessment correlated with estimates obtained by use of the euglycaemic clamp (Rs = 0.88, p < 0.0001), the fasting insulin concentration (Rs = 0.81, p < 0.0001), and the hyperglycaemic clamp (Rs = 0.69, p < 0.01). There was no correlation with any aspect of insulin-receptor binding. The estimate of deficient beta-cell function obtained by homeostasis model assessment correlated with that derived using the hyperglycaemic clamp (Rs = 0.61, p < 0.01) and with the estimate from the intravenous glucose tolerance test (Rs = 0.64, p < 0.05). The low precision of the estimates from the model (coefficients of variation: 31% for insulin resistance and 32% for beta-cell deficit) limits its use, but the correlation of the model's estimates with patient data accords with the hypothesis that basal glucose and insulin interactions are largely determined by a simple feedback loop.
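In practice the model's predictions are usually applied through two closed-form approximations, which the abstract itself does not state. The sketch below uses the commonly cited forms and should be read as an illustration under those assumptions: fasting glucose is in mmol/L, fasting insulin is in µU/mL, and the function names are hypothetical.

```python
def homa_ir(glucose_mmol_l, insulin_uU_ml):
    """Approximate insulin resistance: glucose x insulin / 22.5."""
    return glucose_mmol_l * insulin_uU_ml / 22.5


def homa_beta(glucose_mmol_l, insulin_uU_ml):
    """Approximate beta-cell function (%): 20 x insulin / (glucose - 3.5)."""
    if glucose_mmol_l <= 3.5:
        raise ValueError("approximation is undefined for glucose <= 3.5 mmol/L")
    return 20.0 * insulin_uU_ml / (glucose_mmol_l - 3.5)


# Reference-like fasting values give resistance 1.0 and function 100%
print(homa_ir(4.5, 5.0))    # 1.0
print(homa_beta(4.5, 5.0))  # 100.0
```

Given the coefficients of variation reported above (about 30%), single-patient values from these formulas are best treated as rough estimates.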
MicroRNAs (miRNAs) are endogenous RNAs of approximately 23 nucleotides (nt) that play important gene-regulatory roles in animals and plants by pairing to the mRNAs of protein-coding genes to direct their posttranscriptional repression. This review outlines the current understanding of miRNA target recognition in animals and discusses the widespread impact of miRNAs on both the expression and evolution of protein-coding genes.
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
Estimates of the worldwide incidence, mortality and prevalence of 26 cancers in the year 2002 are now available in the GLOBOCAN series of the International Agency for Research on Cancer. The results are presented here in summary form, including the geographic variation between 20 large "areas" of the world. Overall, there were 10.9 million new cases, 6.7 million deaths, and 24.6 million persons alive with cancer (within three years of diagnosis). The most commonly diagnosed cancers are lung (1.35 million), breast (1.15 million), and colorectal (1 million); the most common causes of cancer death are lung cancer (1.18 million deaths), stomach cancer (700,000 deaths), and liver cancer (598,000 deaths). The most prevalent cancer in the world is breast cancer (4.4 million survivors up to 5 years following diagnosis). There are striking variations in the risk of different cancers by geographic area. Most of the international variation is due to exposure to known or suspected risk factors related to lifestyle or environment, and provides a clear challenge to prevention.