Publications about ESA

Publication

Phaser crystallographic software.

Journal: Journal of Applied Crystallography

September/21/2017

Abstract

Phaser is a program for phasing macromolecular crystal structures by both molecular replacement and experimental phasing methods. The novel phasing algorithms implemented in Phaser have been developed using maximum likelihood and multivariate statistics. For molecular replacement, the new algorithms have proved to be significantly better than traditional methods in discriminating correct solutions from noise, and for single-wavelength anomalous dispersion experimental phasing, the new algorithms, which account for correlations between F⁺ and F⁻, give better phases (lower mean phase error with respect to the phases given by the refined structure) than those that use mean F and anomalous differences ΔF. One of the design concepts of Phaser was that it be capable of a high degree of automation. To this end, Phaser (written in C++) can be called directly from Python, although it can also be called using traditional CCP4 keyword-style input. Phaser is a platform for future development of improved phasing methods and their release, including source code, to the crystallographic community.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2483472/bin/j-40-00658-fig1.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2483472/bin/j-40-00658-fig2.jpg

Authors

Airlie J McCoy; Ralf W Grosse-Kunstleve; Paul D Adams; Martyn D Winn; Laurent C Storoni; Randy J Read

Publication

MicroRNAs: target recognition and regulatory functions.

Journal: Cell

February/9/2009

Abstract

MicroRNAs (miRNAs) are endogenous ~23 nt RNAs that play important gene-regulatory roles in animals and plants by pairing to the mRNAs of protein-coding genes to direct their posttranscriptional repression. This review outlines the current understanding of miRNA target recognition in animals and discusses the widespread impact of miRNAs on both the expression and evolution of protein-coding genes.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3794896/bin/nihms513712f1.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3794896/bin/nihms513712f2.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3794896/bin/nihms513712f3.jpg

Authors

David P Bartel

Publication

edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Journal: Bioinformatics

March/1/2010

Abstract

SUMMARY

It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data.

AVAILABILITY

The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2796818/bin/btp616f1.jpg

Authors

Mark D Robinson; Davis J McCarthy; Gordon K Smyth

Publication

Trimmomatic: a flexible trimmer for Illumina sequence data.

Journal: Bioinformatics

October/1/2014

Abstract

BACKGROUND

Although many next-generation sequencing (NGS) read preprocessing tools already existed, we could not find any tool or combination of tools that met our requirements in terms of flexibility, correct handling of paired-end data and high performance. We have developed Trimmomatic as a more flexible and efficient preprocessing tool, which could correctly handle paired-end data.

RESULTS

The value of NGS read preprocessing is demonstrated for both reference-based and reference-free tasks. Trimmomatic is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested.

AVAILABILITY AND IMPLEMENTATION

Trimmomatic is licensed under GPL V3. It is cross-platform (Java 1.5+ required) and available at http://www.usadellab.org/cms/index.php?page=trimmomatic

CONTACT

usadel@bio1.rwth-aachen.de

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4103590/bin/btu170f1p.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4103590/bin/btu170f2p.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4103590/bin/btu170f3p.jpg

Authors

Anthony M Bolger; Marc Lohse; Bjoern Usadel

Publication

The hospital anxiety and depression scale.

Journal: Acta Psychiatrica Scandinavica

September/22/1983

Abstract

A self-assessment scale has been developed and found to be a reliable instrument for detecting states of depression and anxiety in the setting of an hospital medical outpatient clinic. The anxiety and depressive subscales are also valid measures of severity of the emotional disorder. It is suggested that the introduction of the scales into general hospital practice would facilitate the large task of detection and management of emotional disorder in patients under investigation and treatment in medical and surgical departments.

Authors

A S Zigmond; R P Snaith

Publication

Refinement of macromolecular structures by the maximum-likelihood method.

Journal: Acta crystallographica. Section D, Biological crystallography

February/16/2005

Abstract

This paper reviews the mathematical basis of maximum likelihood. The likelihood function for macromolecular structures is extended to include prior phase information and experimental standard uncertainties. The assumption that different parts of a structure might have different errors is considered. A method for estimating σA using 'free' reflections is described and its effects analysed. The derived equations have been implemented in the program REFMAC. This has been tested on several proteins at different stages of refinement (bacterial alpha-amylase, cytochrome c', cross-linked insulin and oligopeptide binding protein). The results derived using the maximum-likelihood residual are consistently better than those obtained from least-squares refinement.

Authors

G N Murshudov; A A Vagin; E J Dodson

Publication

Crystallography & NMR system: A new software suite for macromolecular structure determination.

Journal: Acta crystallographica. Section D, Biological crystallography

December/13/1998

Abstract

A new software suite, called Crystallography & NMR System (CNS), has been developed for macromolecular structure determination by X-ray crystallography or solution nuclear magnetic resonance (NMR) spectroscopy. In contrast to existing structure-determination programs, the architecture of CNS is highly flexible, allowing for extension to other structure-determination methods, such as electron microscopy and solid-state NMR spectroscopy. CNS has a hierarchical structure: a high-level hypertext markup language (HTML) user interface, task-oriented user input files, module files, a symbolic structure-determination language (CNS language), and low-level source code. Each layer is accessible to the user. The novice user may just use the HTML interface, while the more advanced user may use any of the other layers. The source code will be distributed, thus source-code modification is possible. The CNS language is sufficiently powerful and flexible that many new algorithms can be easily implemented in the CNS language without changes to the source code. The CNS language allows the user to perform operations on data structures, such as structure factors, electron-density maps, and atomic properties. The power of the CNS language has been demonstrated by the implementation of a comprehensive set of crystallographic procedures for phasing, density modification and refinement. User-friendly task-oriented input files are available for nearly all aspects of macromolecular structure determination by X-ray crystallography and solution NMR.

Authors

A T Brünger; P D Adams; G M Clore; W L DeLano; P Gros; R W Grosse-Kunstleve; +8 authors

Publication

Features and development of Coot.

Journal: Acta crystallographica. Section D, Biological crystallography

May/26/2010

Abstract

Coot is a molecular-graphics application for model building and validation of biological macromolecules. The program displays electron-density maps and atomic models and allows model manipulations such as idealization, real-space refinement, manual rotation/translation, rigid-body fitting, ligand search, solvation, mutations, rotamers and Ramachandran idealization. Furthermore, tools are provided for model validation as well as interfaces to external programs for refinement, validation and graphics. The software is designed to be easy to learn for novice users, which is achieved by ensuring that tools for common tasks are 'discoverable' through familiar user-interface elements (menus and toolbars) or by intuitive behaviour (mouse controls). Recent developments have focused on providing tools for expert users, with customisable key bindings, extensions and an extensive scripting interface. The software is under rapid development, but has already achieved very widespread use within the crystallographic community. The current state of the software is presented, with a description of the facilities available and of some of the underlying methods employed.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2852313/bin/d-66-00486-fig1.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2852313/bin/d-66-00486-fig2.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2852313/bin/d-66-00486-fig3.jpg

Authors

P Emsley; B Lohkamp; W G Scott; K Cowtan

Publication

Inference of population structure using multilocus genotype data.

Journal: Genetics

September/7/2000

Abstract

We describe a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations. We assume a model in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned (probabilistically) to populations, or jointly to two or more populations if their genotypes indicate that they are admixed. Our model does not assume a particular mutation process, and it can be applied to most of the commonly used genetic markers, provided that they are not closely linked. Applications of our method include demonstrating the presence of population structure, assigning individuals to populations, studying hybrid zones, and identifying migrants and admixed individuals. We show that the method can produce highly accurate assignments using modest numbers of loci, e.g., seven microsatellite loci in an example using genotype data from an endangered bird species. The software used for this article is available from http://www.stats.ox.ac.uk/~pritch/home.html.
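
The probabilistic assignment idea in this abstract can be illustrated with a much simpler calculation than the published method. The sketch below assumes the allele frequencies of each population are already known, that loci are unlinked, that genotypes follow Hardy-Weinberg proportions within a population, and that the prior over populations is uniform. The actual method infers allele frequencies and assignments jointly and also handles admixture, none of which is attempted here. The function name and the toy frequencies are hypothetical.

```python
from math import prod


def assignment_posterior(genotype, allele_freqs):
    """Posterior probability of each population for one diploid individual.

    genotype: list of (allele_a, allele_b) tuples, one per locus.
    allele_freqs: one entry per population; each entry is a list (one item
                  per locus) of dicts mapping allele -> frequency.
    Assumptions (not from the abstract): known allele frequencies, uniform
    prior, unlinked loci, Hardy-Weinberg proportions, no admixture.
    """
    likelihoods = []
    for population in allele_freqs:
        per_locus = []
        for (a, b), freqs in zip(genotype, population):
            p = freqs.get(a, 0.0) * freqs.get(b, 0.0)
            if a != b:
                p *= 2.0  # a heterozygote can arise in two allele orders
            per_locus.append(p)
        likelihoods.append(prod(per_locus))
    total = sum(likelihoods)
    if total == 0.0:
        raise ValueError("genotype has zero likelihood in every population")
    return [lk / total for lk in likelihoods]


# Toy data: two populations, two loci (frequencies are made up).
freqs = [
    [{"A": 0.9, "B": 0.1}, {"C": 0.8, "D": 0.2}],  # population 1
    [{"A": 0.2, "B": 0.8}, {"C": 0.3, "D": 0.7}],  # population 2
]
posterior = assignment_posterior([("A", "A"), ("C", "D")], freqs)
```

With these made-up frequencies the individual is assigned mostly to population 1, because the A/A genotype at the first locus is far more likely there.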

Authors

J K Pritchard; M Stephens; P Donnelly

Publication

Initial sequencing and analysis of the human genome.

Journal: Nature

March/21/2001

Abstract

The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

Authors

E S Lander; L M Linton; B Birren; C Nusbaum; M C Zody; J Baldwin; +251 authors

Publication

Fiji: an open-source platform for biological-image analysis.

Journal: Nature Methods

September/16/2012

Abstract

Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image-processing algorithms. Fiji facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3855844/bin/nihms517436f1.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3855844/bin/nihms517436f4.jpg

Authors

Johannes Schindelin; Ignacio Arganda-Carreras; Erwin Frise; Verena Kaynig; Mark Longair; Tobias Pietzsch; +10 authors

Publication

Improved patch-clamp techniques for high-resolution current recording from cells and cell-free membrane patches.

Journal: Pflugers Archiv European Journal of Physiology

December/20/1981

Abstract

1. The extracellular patch clamp method, which first allowed the detection of single channel currents in biological membranes, has been further refined to enable higher current resolution, direct membrane patch potential control, and physical isolation of membrane patches.

2. A description of a convenient method for the fabrication of patch recording pipettes is given together with procedures followed to achieve giga-seals, i.e. pipette-membrane seals with resistances of 10^9 to 10^11 Ω.

3. The basic patch clamp recording circuit, and designs for improved frequency response are described along with the present limitations in recording the currents from single channels.

4. Procedures for preparation and recording from three representative cell types are given. Some properties of single acetylcholine-activated channels in muscle membrane are described to illustrate the improved current and time resolution achieved with giga-seals.

5. A description is given of the various ways that patches of membrane can be physically isolated from cells. This isolation enables the recording of single channel currents with well-defined solutions on both sides of the membrane. Two types of isolated cell-free patch configurations can be formed: an inside-out patch with its cytoplasmic membrane face exposed to the bath solution, and an outside-out patch with its extracellular membrane face exposed to the bath solution.

6. The application of the method for the recording of ionic currents and internal dialysis of small cells is considered. Single channel resolution can be achieved when recording from whole cells, if the cell diameter is small (less than 20 μm).

7. The wide range of cell types amenable to giga-seal formation is discussed.

Authors

O P Hamill; A Marty; E Neher; B Sakmann; F J Sigworth

Publication

Haploview: analysis and visualization of LD and haplotype maps.

Journal: Bioinformatics

March/7/2005

Abstract

Research over the last few years has revealed significant haplotype structure in the human genome. The characterization of these patterns, particularly in the context of medical genetic association studies, is becoming a routine research activity. Haploview is a software package that provides computation of linkage disequilibrium statistics and population haplotype patterns from primary genotype data in a visually appealing and interactive interface.

AVAILABILITY

http://www.broad.mit.edu/mpg/haploview/

CONTACT

jcbarret@broad.mit.edu

Authors

J C Barrett; B Fry; J Maller; M J Daly

Publication

Quantifying heterogeneity in a meta-analysis.

Journal: Statistics in Medicine

August/6/2002

Abstract

The extent of heterogeneity in a meta-analysis partly determines the difficulty in drawing overall conclusions. This extent may be measured by estimating a between-study variance, but interpretation is then specific to a particular treatment effect metric. A test for the existence of heterogeneity exists, but depends on the number of studies in the meta-analysis. We develop measures of the impact of heterogeneity on a meta-analysis, from mathematical criteria, that are independent of the number of studies and the treatment effect metric. We derive and propose three suitable statistics: H is the square root of the χ² heterogeneity statistic divided by its degrees of freedom; R is the ratio of the standard error of the underlying mean from a random effects meta-analysis to the standard error of a fixed effect meta-analytic estimate, and I² is a transformation of H that describes the proportion of total variation in study estimates that is due to heterogeneity. We discuss interpretation, interval estimates and other properties of these measures and examine them in five example data sets showing different amounts of heterogeneity. We conclude that H and I², which can usually be calculated for published meta-analyses, are particularly useful summaries of the impact of heterogeneity. One or both should be presented in published meta-analyses in preference to the test for heterogeneity.
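
As a small worked example of two of the statistics named above, the sketch below computes H and I² from study effect estimates and their standard errors. It assumes the χ² heterogeneity statistic is the usual inverse-variance weighted sum of squared deviations from the fixed effect estimate (often written Q), takes I² = (H² − 1)/H², and truncates negative values to zero. R is not computed. The function name and the example numbers are hypothetical.

```python
from math import sqrt


def heterogeneity(effects, std_errors):
    """Return (Q, H, I2) for a set of study estimates.

    Assumes inverse-variance weights and a fixed effect pooled estimate
    when forming the chi-squared heterogeneity statistic Q.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1          # degrees of freedom: studies minus one
    h = sqrt(q / df)               # H = sqrt(Q / df)
    # I2 = (H^2 - 1) / H^2 = (Q - df) / Q, set to zero when negative.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, h, i2


# Three made-up studies with equal precision.
q, h, i2 = heterogeneity([0.1, 0.3, 0.5], [0.1, 0.1, 0.1])
```

For these three studies Q is about 8 on 2 degrees of freedom, so H is about 2 and I² is about 0.75, i.e. roughly three quarters of the variation between the estimates would be attributed to heterogeneity.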

Authors

Julian P T Higgins; Simon G Thompson

Publication

Clinical diagnosis of Alzheimer's disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer's Disease.

Journal: Neurology

July/29/1984

Abstract

Clinical criteria for the diagnosis of Alzheimer's disease include insidious onset and progressive impairment of memory and other cognitive functions. There are no motor, sensory, or coordination deficits early in the disease. The diagnosis cannot be determined by laboratory tests. These tests are important primarily in identifying other possible causes of dementia that must be excluded before the diagnosis of Alzheimer's disease may be made with confidence. Neuropsychological tests provide confirmatory evidence of the diagnosis of dementia and help to assess the course and response to therapy. The criteria proposed are intended to serve as a guide for the diagnosis of probable, possible, and definite Alzheimer's disease; these criteria will be revised as more definitive information becomes available.

Authors

G McKhann; D Drachman; M Folstein; R Katzman; D Price; E M Stadlan

Publication

Induction of pluripotent stem cells from adult human fibroblasts by defined factors.

Journal: Cell

January/29/2008

Abstract

Successful reprogramming of differentiated human somatic cells into a pluripotent state would allow creation of patient- and disease-specific stem cells. We previously reported generation of induced pluripotent stem (iPS) cells, capable of germline transmission, from mouse somatic cells by transduction of four defined transcription factors. Here, we demonstrate the generation of iPS cells from adult human dermal fibroblasts with the same four factors: Oct3/4, Sox2, Klf4, and c-Myc. Human iPS cells were similar to human embryonic stem (ES) cells in morphology, proliferation, surface antigens, gene expression, epigenetic status of pluripotent cell-specific genes, and telomerase activity. Furthermore, these cells could differentiate into cell types of the three germ layers in vitro and in teratomas. These findings demonstrate that iPS cells can be generated from adult human fibroblasts.

Authors

Kazutoshi Takahashi; Koji Tanabe; Mari Ohnuki; Megumi Narita; Tomoko Ichisaka; Kiichiro Tomoda; Shinya Yamanaka

Publication

A comprehensive set of sequence analysis programs for the VAX.

Journal: Nucleic Acids Research

March/7/1984

Abstract

The University of Wisconsin Genetics Computer Group (UWGCG) has been organized to develop computational tools for the analysis and publication of biological sequence data. A group of programs that will interact with each other has been developed for the Digital Equipment Corporation VAX computer using the VMS operating system. The programs available and the conditions for transfer are described.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC321012/bin/nar00343-0394.tif

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC321012/bin/nar00343-0395.tif

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC321012/bin/nar00343-0396.tif

Authors

J Devereux; P Haeberli; O Smithies

Publication

Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes.

Journal: Genome Biology

September/19/2002

Abstract

BACKGROUND

Gene-expression analysis is increasingly important in biological research, with real-time reverse transcription PCR (RT-PCR) becoming the method of choice for high-throughput and accurate expression profiling of selected genes. Given the increased sensitivity, reproducibility and large dynamic range of this methodology, the requirements for a proper internal control gene for normalization have become increasingly stringent. Although housekeeping gene expression has been reported to vary considerably, no systematic survey has properly determined the errors related to the common practice of using only one control gene, nor presented an adequate way of working around this problem.

RESULTS

We outline a robust and innovative strategy to identify the most stably expressed control genes in a given set of tissues, and to determine the minimum number of genes required to calculate a reliable normalization factor. We have evaluated ten housekeeping genes from different abundance and functional classes in various human tissues, and demonstrated that the conventional use of a single gene for normalization leads to relatively large errors in a significant proportion of samples tested. The geometric mean of multiple carefully selected housekeeping genes was validated as an accurate normalization factor by analyzing publicly available microarray data.

CONCLUSIONS

The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which, among other things, opens up the possibility of studying the biological relevance of small expression differences.
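
The core arithmetic of the normalization factor described above can be sketched as follows. The example assumes the control genes have already been selected and that each gene's expression has been converted to a relative quantity on a linear scale. The step that ranks control genes by expression stability is not shown. Function names and the numbers are hypothetical.

```python
from math import exp, log


def normalization_factor(control_quantities):
    """Geometric mean of the relative quantities of the control genes."""
    logs = [log(q) for q in control_quantities]
    return exp(sum(logs) / len(logs))


def normalize(target_quantity, control_quantities):
    """Divide a target gene's quantity by the sample's normalization factor."""
    return target_quantity / normalization_factor(control_quantities)


# One made-up sample: two control genes and one gene of interest.
nf = normalization_factor([2.0, 8.0])
normalized = normalize(12.0, [2.0, 8.0])
```

A geometric mean is less sensitive than an arithmetic mean to one control gene with a much larger abundance, which is one reason it suits genes from different abundance classes.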

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC126239/bin/gb-2002-3-7-research0034-1.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC126239/bin/gb-2002-3-7-research0034-2.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC126239/bin/gb-2002-3-7-research0034-3.jpg

Authors

Jo Vandesompele; Katleen De Preter; Filip Pattyn; Bruce Poppe; Nadine Van Roy; Anne De Paepe; Frank Speleman

Publication

The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data.

Journal: Genome Research

December/22/2010

Abstract

Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS--the 1000 Genome pilot alone includes nearly five terabases--make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
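
To make the map and reduce split concrete, here is a toy coverage calculator written in that style. It is not the GATK API: the read representation (a start position and a length), the function names, and the data are all hypothetical, and real tools work on aligned read records rather than tuples.

```python
from collections import Counter
from functools import reduce


def map_read(read):
    """Map step: turn one aligned read into per-position depth counts."""
    start, length = read
    return Counter(range(start, start + length))


def reduce_depth(total, partial):
    """Reduce step: fold one read's counts into the running total."""
    total.update(partial)  # Counter.update adds counts together
    return total


# Three made-up reads given as (start position, read length).
reads = [(100, 5), (102, 5), (104, 3)]
depth = reduce(reduce_depth, map(map_read, reads), Counter())
```

Because the map step looks at one read at a time and the reduce step only combines partial results, the same two functions could in principle be applied to separate chunks of the data and merged afterwards.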

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928508/bin/1297tbl1.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928508/bin/1297fig1.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928508/bin/1297fig2.jpg

Authors

Aaron McKenna; Matthew Hanna; Eric Banks; Andrey Sivachenko; Kristian Cibulskis; Andrew Kernytsky; +5 authors

Publication

One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products.

Journal: Proceedings of the National Academy of Sciences of the United States of America

July/12/2000

Abstract

We have developed a simple and highly efficient method to disrupt chromosomal genes in Escherichia coli in which PCR primers provide the homology to the targeted gene(s). In this procedure, recombination requires the phage lambda Red recombinase, which is synthesized under the control of an inducible promoter on an easily curable, low copy number plasmid. To demonstrate the utility of this approach, we generated PCR products by using primers with 36- to 50-nt extensions that are homologous to regions adjacent to the gene to be inactivated and template plasmids carrying antibiotic resistance genes that are flanked by FRT (FLP recognition target) sites. By using the respective PCR products, we made 13 different disruptions of chromosomal genes. Mutants of the arcB, cyaA, lacZYA, ompR-envZ, phnR, pstB, pstCA, pstS, pstSCAB-phoU, recA, and torSTRCAD genes or operons were isolated as antibiotic-resistant colonies after the introduction into bacteria carrying a Red expression plasmid of synthetic (PCR-generated) DNA. The resistance genes were then eliminated by using a helper plasmid encoding the FLP recombinase which is also easily curable. This procedure should be widely useful, especially in genome analysis of E. coli and other bacteria because the procedure can be done in wild-type cells.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC18686/bin/pq1201632001.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC18686/bin/pq1201632002.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC18686/bin/pq1201632003.jpg

Authors

K A Datsenko; B L Wanner

Publication

The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations.

Journal: Journal of Personality and Social Psychology

February/26/1987

Abstract

In this article, we attempt to distinguish between the properties of moderator and mediator variables at a number of levels. First, we seek to make theorists and researchers aware of the importance of not using the terms moderator and mediator interchangeably by carefully elaborating, both conceptually and strategically, the many ways in which moderators and mediators differ. We then go beyond this largely pedagogical function and delineate the conceptual and strategic implications of making use of such distinctions with regard to a wide range of phenomena, including control and stress, attitudes, and personality traits. We also provide a specific compendium of analytic procedures appropriate for making the most effective use of the moderator and mediator distinction, both separately and in terms of a broader causal system that includes both moderators and mediators.

Authors

R M Baron; D A Kenny

Publication

Cluster analysis and display of genome-wide expression patterns.

Journal: Proceedings of the National Academy of Sciences of the United States of America

January/13/1999

Abstract

A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.
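
The abstract does not name the algorithms, so the following is only one plausible reading: agglomerative clustering of genes with Pearson correlation as the similarity measure and average linkage between clusters. Both choices, along with the function names and the toy profiles, are assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def cluster(profiles):
    """Repeatedly merge the two most similar clusters; return the merge order.

    profiles: dict mapping gene name -> list of expression values.
    """
    clusters = {name: [name] for name in profiles}  # cluster id -> member genes
    merges = []
    while len(clusters) > 1:
        def similarity(pair):
            a, b = pair
            # Average linkage: mean correlation over all cross-cluster gene pairs.
            return mean(pearson(profiles[g], profiles[h])
                        for g in clusters[a] for h in clusters[b])

        a, b = max(combinations(clusters, 2), key=similarity)
        merges.append((a, b))
        clusters[f"({a},{b})"] = clusters.pop(a) + clusters.pop(b)
    return merges


# Made-up expression profiles over four conditions.
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [8, 6, 4, 1],
}
merges = cluster(profiles)
```

The merge order is what a dendrogram next to the expression matrix would display: the two rising genes join first, the two falling genes join next, and the two groups are combined last.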

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC24541/bin/pq2583917001.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC24541/bin/pq2583917002.jpg

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC24541/bin/pq2583917003.jpg

Authors

M B Eisen; P T Spellman; P O Brown; D Botstein

Publication

Linear models and empirical bayes methods for assessing differential expression in microarray experiments.

Journal: Statistical Applications in Genetics and Molecular Biology

May/17/2006

Abstract

The problem of identifying differentially expressed genes in designed microarray experiments is considered. Lönnstedt and Speed (2002) derived an expression for the posterior odds of differential expression in a replicated two-color experiment using a simple hierarchical parametric model. The purpose of this paper is to develop the hierarchical model of Lönnstedt and Speed (2002) into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples. The model is reset in the context of general linear models with arbitrary coefficients and contrasts of interest. The approach applies equally well to both single-channel and two-color microarray experiments. Consistent, closed form estimators are derived for the hyperparameters in the model. The estimators proposed have robust behavior even for small numbers of arrays and allow for incomplete data arising from spot filtering or spot quality weights. The posterior odds statistic is reformulated in terms of a moderated t-statistic in which posterior residual standard deviations are used in place of ordinary standard deviations. The empirical Bayes approach is equivalent to shrinkage of the estimated sample variances towards a pooled estimate, resulting in far more stable inference when the number of arrays is small. The use of moderated t-statistics has the advantage over the posterior odds that the number of hyperparameters which need to be estimated is reduced; in particular, knowledge of the non-null prior for the fold changes is not required. The moderated t-statistic is shown to follow a t-distribution with augmented degrees of freedom. The moderated t inferential approach extends to accommodate tests of composite null hypotheses through the use of moderated F-statistics. The performance of the methods is demonstrated in a simulation study. Results are presented for two publicly available data sets.
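
The shrinkage step and the moderated t-statistic can be written out in a few lines. The sketch below treats the prior variance and prior degrees of freedom as given inputs, whereas the paper estimates these hyperparameters from the data; that estimation is not shown. The posterior variance is taken as the degrees-of-freedom weighted average of the prior and sample variances. Function and parameter names, and the example numbers, are hypothetical.

```python
from math import sqrt


def moderated_t(beta_hat, sample_var, resid_df, unscaled_var, prior_var, prior_df):
    """Return (moderated t, total degrees of freedom, posterior variance).

    beta_hat     : estimated coefficient or contrast for one gene
    sample_var   : ordinary residual variance for that gene
    resid_df     : residual degrees of freedom of the gene's linear model
    unscaled_var : factor v such that var(beta_hat) = v * sigma^2
    prior_var    : pooled prior variance (assumed known in this sketch)
    prior_df     : prior degrees of freedom (assumed known in this sketch)
    """
    total_df = prior_df + resid_df
    # Shrink the gene's own variance towards the pooled prior value.
    posterior_var = (prior_df * prior_var + resid_df * sample_var) / total_df
    t = beta_hat / sqrt(posterior_var * unscaled_var)
    return t, total_df, posterior_var


# One made-up gene whose sample variance is unusually small.
t_mod, df_mod, post_var = moderated_t(
    beta_hat=1.0, sample_var=0.01, resid_df=2,
    unscaled_var=0.5, prior_var=0.09, prior_df=4,
)
t_ordinary = 1.0 / sqrt(0.01 * 0.5)
```

In this example the ordinary t-statistic is inflated by the small sample variance, while the moderated version is pulled back towards a more typical value and gains degrees of freedom from the prior.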

Authors

Gordon K Smyth

Publication

NMRPipe: a multidimensional spectral processing system based on UNIX pipes.

Journal: Journal of Biomolecular NMR

January/24/1996

Abstract

The NMRPipe system is a UNIX software environment of processing, graphics, and analysis tools designed to meet current routine and research-oriented multidimensional processing requirements, and to anticipate and accommodate future demands and developments. The system is based on UNIX pipes, which allow programs running simultaneously to exchange streams of data under user control. In an NMRPipe processing scheme, a stream of spectral data flows through a pipeline of processing programs, each of which performs one component of the overall scheme, such as Fourier transformation or linear prediction. Complete multidimensional processing schemes are constructed as simple UNIX shell scripts. The processing modules themselves maintain and exploit accurate records of data sizes, detection modes, and calibration information in all dimensions, so that schemes can be constructed without the need to explicitly define or anticipate data sizes or storage details of real and imaginary channels during processing. The asynchronous pipeline scheme provides other substantial advantages, including high flexibility, favorable processing speeds, choice of both all-in-memory and disk-bound processing, easy adaptation to different data formats, simpler software development and maintenance, and the ability to distribute processing tasks on multi-CPU computers and computer networks.
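
The pipeline idea can be mimicked inside a single Python process with generators, where each stage consumes vectors from the previous stage and yields processed vectors to the next. This is only an analogy: NMRPipe itself chains separate programs with UNIX pipes in shell scripts, and the stage names, the naive discrete Fourier transform, and the data below are hypothetical.

```python
import cmath


def read_vectors(vectors):
    """Source stage: emit one time-domain vector at a time."""
    for vec in vectors:
        yield list(vec)


def zero_fill(stream, size):
    """Pad each vector with zeros up to the requested size."""
    for vec in stream:
        yield vec + [0j] * (size - len(vec))


def fourier_transform(stream):
    """Naive discrete Fourier transform of each vector (O(n^2), for clarity)."""
    for vec in stream:
        n = len(vec)
        yield [
            sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(vec))
            for k in range(n)
        ]


# Two made-up complex vectors flow through the three-stage pipeline.
raw = [[1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j]]
spectra = list(fourier_transform(zero_fill(read_vectors(raw), 4)))
```

As with shell pipes, no stage needs to know the total number of vectors in advance; each one processes whatever arrives and passes it on.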

Authors

F Delaglio; S Grzesiek; G W Vuister; G Zhu; J Pfeifer; A Bax