The plasma peptides of breast versus ovarian cancer.
Journal: 2019/December - Clinical Proteomics
ISSN: 1542-6416
Abstract:
Background
There is a need to demonstrate a proof of principle that proteomics has the capacity to analyze plasma from breast cancer versus other diseases and controls in a multisite clinical trial design. The peptides or proteins that show a high observation frequency, and/or precursor intensity, specific to breast cancer plasma might be discovered by comparison to other diseases and matched controls. The endogenous tryptic peptides of breast cancer plasma were compared to ovarian cancer, female normal, sepsis, heart attack, Alzheimer's and multiple sclerosis along with the institution-matched normal and control samples collected directly onto ice.

Methods
Endogenous tryptic peptides were extracted from individual breast cancer and control EDTA plasma samples in a step gradient of acetonitrile, and collected over preparative C18 for LC-ESI-MS/MS with a set of LTQ XL linear quadrupole ion traps working together in parallel to randomly and independently sample clinical populations. The MS/MS spectra were fit to fully tryptic peptides or phosphopeptides within proteins using the X!TANDEM algorithm. The protein observation frequency was counted using the SEQUEST algorithm after selecting the single best charge state and peptide sequence for each MS/MS spectrum. The observation frequency was subsequently tested by Chi Square analysis. The log10 precursor intensity was compared by ANOVA in the R statistical system.

Results
Peptides and/or phosphopeptides of common plasma proteins such as APOE, C4A, C4B, C3, APOA1, APOC2, APOC4, ITIH3 and ITIH4 showed increased observation frequency and/or precursor intensity in breast cancer. Many cellular proteins also showed large changes in frequency by Chi Square (χ² > 100, p < 0.0001) in the breast cancer samples such as CPEB1, LTBP4, HIF-1A, IGHE, RAB44, NEFM, C19orf82, SLC35B1, 1D12A, C8orf34, HIF1A, OCLN, EYA1, HLA-DRB1, LARS, PTPDC1, WWC1, ZNF562, PTMA, MGAT1, NDUFA1, NOGOC, OR1E1, OR1E2, CFI, HSA12, GCSH, ELTD1, TBX15, NR2C2, FLJ00045, PDLIM1, GALNT9, ASH2L, PPFIBP1, LRRC4B, SLCO3A1, BHMT2, CS, FAM188B2, LGALS7, SAT2, SFRS8, SLC22A12, WNT9B, SLC2A4, ZNF101, WT1, CCDC47, ERLIN1, SPFH1, EID2, THOC1, DDX47, MREG, PTPRE, EMILIN1, DKFZp779G1236 and MAP3K8 among others. The protein gene symbols with large Chi Square values were significantly enriched in proteins that showed a complex set of previously established functional and structural relationships by STRING analysis. An increase in mean precursor intensity of peptides was observed for QSER1 as well as SLC35B1, IQCJ-SCHIP1, MREG, BHMT2, LGALS7, THOC1, ANXA4, DHDDS, SAT2, PTMA and FYCO1 among others. In contrast, the QSER1 peptide QPKVKAEPPPK was apparently specific to ovarian cancer.

Conclusion
There was striking agreement between the breast cancer plasma peptides and proteins discovered by LC-ESI-MS/MS and previous biomarkers from tumors, cell lines or body fluids identified by genetic or biochemical methods. The results indicate that variation in plasma peptides from breast cancer versus ovarian cancer may be directly discovered by LC-ESI-MS/MS, which may be a powerful tool for clinical research. It may be possible to use a battery of sensitive and robust linear quadrupole ion traps for random and independent sampling of plasma from a multisite clinical trial.
Clin Proteomics 16: 43

Keywords: Human EDTA plasma, Organic extraction, Nano chromatography, Electrospray ionization tandem mass spectrometry, LC–ESI–MS/MS, Linear quadrupole ion trap, Discovery of variation, Breast cancer, Random and independent sampling, Chi Square test and ANOVA, SQL SERVER and R

Introduction

Blood peptides

The endogenous peptides of human serum and plasma were first detected by highly sensitive MALDI [1–3]. The MALDI “patterns” formed by the ex vivo degradation of the major peptides of human blood fluids have been compared using complex multivariate approaches [4–6]. It was suggested that pattern analysis of endo-proteinases or exo-peptidases would permit the diagnosis of cancer [7, 8]. However, there was no evidence that multivariate pattern analysis of the peptides or exo-peptidase activity will serve as a valid diagnostic [9]. Multivariate pattern analysis is prone to over-interpretation of laboratory or clinical experiments [10, 11]. Univariate ANOVA of the main feature(s) provided about the same statistical power as multivariate analysis [12]. The endogenous peptides of human blood were first identified by MS/MS fragmentation using MALDI-Qq-TOF and LC–ESI–MS/MS with an ion trap mass spectrometer, which showed excellent agreement with exogenous digestions, and the intensity values were compared by ANOVA [12, 13]. Random and independent sampling of the endogenous tryptic peptides from clinical plasma samples revealed individual peptides or proteins that show significant variation by standard statistical methods such as the Chi Square test and ANOVA [12, 14–18]. Pre-analytical variation was exhaustively studied between fresh EDTA plasma samples on ice versus plasma samples degraded for various lengths of time to control for differences in sample handling and storage. The observation frequency of peptides from many proteins may increase by on average twofold after incubation at room temperature [17–19], and Complement C3 and C4B vary with time of incubation ex vivo [17, 18], in agreement with previous results [12].

Sample preparation

The sensitive analysis of human blood fluids by LC–ESI–MS/MS is dependent on effective fractionation strategies, such as partition chromatography or organic extraction, to relieve suppression and competition for ionization, resulting in high signal to noise ratios and thus low error rates of identification and quantification [20]. Without step-wise sample partition only a few high abundance proteins may be observed from blood fluid [13, 21, 22]. In contrast, with sufficient sample preparation, low abundance proteins of ≤ 1 ng/ml could be detected and quantified in blood samples by mass spectrometry [22, 23]. Simple and single-use, i.e. disposable, preparative and analytical separation apparatus permits the identification and quantification of blood peptides and proteins with no possibility of cross contamination between patients, which guarantees that sampling is statistically independent [12, 13, 17, 22, 23]. Previously, the use of precipitation and selective extraction of the pellet [23–26] was shown to be superior to precipitation and analysis of the ACN supernatant [27], ultra-filtration [28], albumin depletion chromatography [29] or C18 partition chromatography alone [13]. Precipitating all of the polypeptides with 90% ACN followed by step-wise extraction of the peptides with mixtures of organic solvent and water was the optimal method to sensitively detect peptides from blood [21]. Here a step gradient of acetonitrile/water to extract 200 µl of EDTA plasma for analysis by LC–ESI–MS/MS showed a high signal to noise ratio [21] and resulted in the confident identification of tryptic peptides [17] from breast cancer versus normal control samples.

Computation and statistics

Partition of each clinical sample into multiple sub-fractions, each of which must be randomly and independently sampled by analytical C18 LC–ESI–MS/MS, provides sensitivity [21] but also creates a large computational challenge. Previously, 32-bit computing power was insufficient to identify and compare all the peptides and proteins from thousands of LC–ESI–MS/MS recordings in a large multisite clinical experiment [30]. Here we show that the MS/MS spectra from random and independent sampling of peptides from 1508 LC–ESI–MS/MS experiments from multiple clinical treatments and sites may be fit to peptides using a 64-bit server, and that the observation frequency and precursor intensity may then be compared across treatments using SQL SERVER/R, which shows excellent data compression and relation [14, 17]. The protein p-values and FDR q-values were computed from organic extraction or chromatography of blood fluid, and the peptide-to-protein distribution of the precursor ions of greater than ~10,000 (E4) counts was compared to a null (i.e. known false positive) model of noise or computer generated random MS/MS spectra [15, 17, 31–34]. Peptides may be identified from the fit of MS/MS spectra to peptide sequences [35], which permits an accurate estimate of the type I error rate (p-value) of protein identification that may be corrected by the method of Benjamini and Hochberg [36] to yield the FDR (q-value) [17, 21, 31]. The peptide fits may be filtered from redundant results to the single best fit of the peptide sequence and charge state using a complex key in SQL Server [17, 31, 37, 38]. Simulations using random or noise MS/MS spectra distributions may be used to control the type I error of experimental MS/MS spectra correlations to tryptic peptides [15–17, 31–34, 37]. The peptide and protein observation counts (frequency) may be analyzed using classical statistical methods such as Chi Square analysis [33, 39]. Log10 transformation of precursor intensity yields a normal distribution that permits comparison of peptide and protein expression levels by ANOVA [15, 16]. The SQL Server system permits the direct interrogation of the related data by the open source R statistical system without proteomic-specific software packages. Here the use of SQL/R has permitted the detailed statistical analysis of randomly and independently sampled LC–ESI–MS/MS data from multiple hospitals in parallel, as would be requisite for a multisite clinical trial [37, 39].
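The Benjamini and Hochberg correction mentioned above can be illustrated with a minimal sketch. The study itself used SQL Server and R; the Python below is only a stand-in, the function name `benjamini_hochberg` is hypothetical, and the p-values are invented for illustration rather than taken from the data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as p_values."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Illustrative (made-up) protein p-values.
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
example_q = benjamini_hochberg(example_p)
```

In R the equivalent adjustment is available as `p.adjust(p, method = "BH")`.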

Cancer proteins in blood fluids

Markers of breast cancer [40] have been examined from nano vesicles [41] that may mediate tumor invasion [42], in proximal fluid [43, 44] or from serum or plasma [45–47]. Many non-specific, i.e. “common distress” or “acute phase” proteins have been detected to increase by the analysis of blood fluids such as amyloids, haptoglobin, alpha 1 antitrypsin, clusterin, apolipoproteins, complement components, heat shock proteins, fibrinogens, hemopexin, alpha 2 macroglobulin and others that may be of limited diagnostic value [20, 48, 49]. There is good evidence that cellular proteins may exist in circulation, and even form supramolecular complexes with other molecules, in the blood [50]. Proteins and nucleic acids may be packaged in exosomes that are challenging to isolate [51, 52] and it appears that cellular proteins may be secreted into circulation [50, 53, 54]. Here, the combination of step-wise organic partition [21], random and independent sampling by nano electrospray LC–ESI–MS/MS [17], and 64-bit computation with SQL SERVER/R [14] permitted the sensitive detection of peptides and/or phosphopeptides from human plasma. The variation in endogenous peptides within parent protein chains in computed complexes from breast cancer patients versus ovarian cancer and other disease and normal plasma was compared by the classical statistical approaches of the Chi Square test followed by univariate ANOVA [12, 15, 16].

Materials and methods

Materials

Anonymous human EDTA plasma samples with no identifying information from multiple disease and control populations were transported frozen and stored in a −80 °C freezer. Breast cancer versus ovarian cancer disease and matched normal female human EDTA plasma was obtained from the Ontario Tumor Bank of the Ontario Institute of Cancer Research, Toronto, Ontario. Additional controls of heart attack (venous and arterial) and normal pre-operative orthopedic samples were from St. Joseph’s Hospital of McMaster University. ICU-Sepsis and ICU-Alone were obtained from St. Michael’s Hospital Toronto. Multiple sclerosis, Alzheimer’s dementia and normal controls were from Amsterdam University Medical Center, Vrije Universiteit Amsterdam. In addition, EDTA plasma samples collected onto ice as baseline degradation controls were obtained from IBBL Luxembourg and stored freeze dried. The anonymous plasma samples with no identifying information from the multiple clinical locations were analyzed under the Ryerson Research Ethics Board Protocol REB 2015-207. C18 zip tips were obtained from Millipore (Bedford, MA) and C18 HPLC resin was from Agilent (Zorbax 300 SB-C18 5-micron). Solvents were obtained from Caledon Laboratories (Georgetown, Ontario, Canada). All other salts and reagents were obtained from Sigma-Aldrich-Fluka (St Louis, MO) except where indicated. The level of replication in the LC–ESI–MS/MS experiments was typically between 9 and 26 independent patient plasma samples for each disease and control.

Sample preparation

Human EDTA plasma samples (200 μl) were precipitated with 9 volumes of acetonitrile (90% ACN) [23], followed by the selective extraction of the pellet using a step gradient to achieve selectivity across sub-fractions and thus greater sensitivity [21]. Disposable plastic 2 ml sample tubes and plastic pipette tips were used to handle samples. The acetonitrile suspension was separated with a centrifuge at 12,000 RCF for 5 min. The acetonitrile supernatant, which contains few peptides, was collected, transferred to a fresh sample tube and dried in a rotary lyophilizer. The organic precipitate (pellet), which contains a much larger total amount of endogenous polypeptides [23], was manually re-suspended using a step gradient of increasing water content to yield 10 fractions from those soluble in 90% ACN to 10% ACN, followed by 100% H2O, and then 5% formic acid [21]. The step-wise extracts were clarified with a centrifuge at 12,000 RCF for 5 min. The extracted sample fractions were dried under vacuum in a rotary lyophilizer and stored at −80 °C for subsequent analysis.

Preparative C18 chromatography

The peptides of EDTA plasma were precipitated in ACN, extracted from the pellet in a step-gradient with increasing water, dried and then collected over C18 preparative partition chromatography. Preparative C18 separation provided the best results for peptide and phosphopeptide analysis in a “blind” analysis [55]. Solid phase extraction with C18 for LC–ESI–MS/MS was performed as previously described [12, 13, 22–24]. The C18 chromatography resin (Zip Tip) was wetted with 65% acetonitrile and 5% formic acid before equilibration in water with 5% formic acid. The plasma extract was dissolved in 200 μl of 5% formic acid in water for C18 binding. The resin was washed with at least five volumes of the binding buffer. The resin was eluted with ≥ 3 column volumes of 65% acetonitrile (2 µl) in 5% formic acid. In order to avoid cross-contamination the preparative C18 resin was discarded after a single use.

LC–ESI–MS/MS

In order to entirely prevent any possibility of cross contamination, a new disposable nano analytical HPLC column and nano emitter was fabricated for recording each patient sample-fraction set. The ion traps were cleaned and tested for sensitivity with angiotensin and glu fibrinogen prior to recordings. The new column was conditioned and quality controlled with a mixture of three non-human protein standards [32] using a digest of Bovine Cytochrome C, Yeast alcohol dehydrogenase (ADH) and Rabbit Glycogen Phosphorylase B to confirm the sensitivity and mass accuracy of the system prior to each patient sample set. The statistical validity of the LTQ XL (Thermo Electron Corporation, Waltham, MA, USA) linear quadrupole ion trap for LC–ESI–MS/MS of human plasma [21] was in agreement with the results from the 3D Paul ion trap [15, 32–34]. The stepwise extractions were collected and desalted over C18 preparative micro columns, eluted in 2 µl of 65% ACN and 5% formic acid, diluted tenfold with 5% formic acid in water and immediately loaded manually into a 20 μl metal sample loop before injecting onto the analytical column via a Rheodyne injector. Endogenous peptide samples were analyzed over a discontinuous gradient generated at a flow rate of ~10 μl per minute with an Agilent 1100 series capillary pump and split upstream of the injector during recording to ~200 nl per minute. The separation was performed with a C18 (150 mm × 0.15 mm) fritted capillary column. The acetonitrile profile was started at 5%, ramped to 12% after 5 min and then increased to 65% over ~90 min, remained at 65% for 5 min, decreased to 50% for 15 min and then declined to a final proportion of 5% prior to injection of the next step fraction from the same patient. The nano HPLC effluent was analyzed by ESI ionization with detection by MS and fragmentation by MS/MS with a linear quadrupole ion trap [56]. The device was set to collect the precursors for up to 200 ms prior to MS/MS fragmentation with up to four fragmentations per precursor ion that were averaged. Individual, independent samples from disease, normal and ice cold control were precipitated, fractionated over a step gradient and collected over C18 for manual injection.

Correlation analysis

Correlation analysis of ion trap data was performed using a goodness of fit test by X!TANDEM [35] and by cross-correlation using SEQUEST [57] on separate servers to match tandem mass spectra to peptide sequences from the Homo sapiens RefSeq, Ensembl and SwissProt databases, including hypothetical proteins XP or Genomic loci [13, 14, 58]. Endogenous peptides with precursors greater than 10,000 (E4) arbitrary counts were searched only as fully tryptic peptides (TRYP) and/or phosphopeptides (TYRP STYP) and compared in SQL Server/R. The X!TANDEM default ion trap data settings were used: ±3 m/z around precursor peptides, considered from 300 to 2000 m/z, with a tolerance of 0.5 Da error in the fragments [15, 22, 33–35, 59]. The best fit peptide of the MS/MS spectra to fully tryptic and/or phospho-tryptic peptides at charge states of +2 versus +3 was accepted with additional acetylation, or oxidation of methionine, and with possible loss of water or ammonia. The resulting accession numbers, actual and estimated masses, correlated peptide sequences, peptide and protein scores, resulting protein sequences and other associated data were captured and assembled together in an SQL Server relational database [14].
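The reduction of redundant correlations to a single best fit per MS/MS spectrum can be sketched as follows. This is an illustration in Python rather than the SQL Server key used in the study; the record fields, the assumption that a higher score means a better fit, and all peptide strings other than QPKVKAEPPPK are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fit:
    file_id: str                # hypothetical LC-ESI-MS/MS recording identifier
    scan: int                   # MS/MS spectrum number within the recording
    charge: int                 # assumed precursor charge state
    peptide: str                # correlated peptide sequence
    score: float                # assumed: higher means better fit
    precursor_intensity: float  # arbitrary counts


MIN_INTENSITY = 1e4  # precursors must exceed 10,000 (E4) counts


def best_fit_per_spectrum(fits):
    """Keep one peptide and charge state per (file_id, scan) spectrum."""
    best = {}
    for fit in fits:
        if fit.precursor_intensity <= MIN_INTENSITY:
            continue  # below the intensity threshold
        if fit.charge not in (2, 3):
            continue  # only +2 and +3 charge states are considered
        key = (fit.file_id, fit.scan)
        if key not in best or fit.score > best[key].score:
            best[key] = fit
    return list(best.values())


# Illustrative records: two competing fits for scan 101, one weak precursor.
example_fits = [
    Fit("run_A", 101, 2, "QPKVKAEPPPK", 42.0, 5.0e4),
    Fit("run_A", 101, 3, "AAAAK", 17.5, 5.0e4),
    Fit("run_A", 102, 2, "GGGGR", 30.0, 9.0e3),
]
example_best = best_fit_per_spectrum(example_fits)
```

Keying on the recording and scan number mirrors the idea of a composite key: any later count of observation frequency then sees each spectrum exactly once.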

Data sampling, sorting, transformation and visualization

Each disease and normal treatment was represented by 9 to 26 independent patient samples that were resolved into 10 organic/water sub-fractions, resulting in 90 to 260 sub-samples per treatment for a total of 1508 LC–ESI–MS/MS experiments that were archived together in SQL Server for statistical analysis [37, 39]. The linear quadrupole ion trap provided the precursor ion intensity values and the peptide fragment MS/MS spectra. The peptides and proteins were identified from MS/MS spectra by X!TANDEM and the observation frequency was counted by the SEQUEST algorithm. The large number of redundant correlations to each MS/MS spectrum at different charge states or to different peptide sequences may be a source of type I error that can be filtered out by a complex key or hashtag in SQL Server to ensure that each MS/MS spectrum is fit to only one peptide and charge state. The MS and MS/MS spectra together with the results of the X!TANDEM and SEQUEST algorithms were parsed into an SQL Server database and filtered [14] before statistical and graphical analysis with the generic R data system [14–16, 32, 58]. The MS/MS spectra collected in breast versus ovarian cancer were summed to correct the observation frequency using Eq. 1, and the χ² p-values were converted to FDR q-values by the method of Benjamini and Hochberg [36]:

$$\frac{(\mathrm{Breast} - \mathrm{Ovarian})^2}{\mathrm{Ovarian} + 1} \tag{1}$$
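As a worked illustration of Eq. 1, the sketch below evaluates the expression for a pair of observation counts. The function name `eq1_statistic` and the counts are hypothetical and are not values from the study.

```python
def eq1_statistic(breast_count, ovarian_count):
    """Evaluate Eq. 1: (Breast - Ovarian)^2 / (Ovarian + 1)."""
    return (breast_count - ovarian_count) ** 2 / (ovarian_count + 1)


# Made-up counts: a protein observed 120 times in breast cancer plasma
# and 19 times in ovarian cancer plasma.
example_value = eq1_statistic(120, 19)   # (101 ** 2) / 20

# The +1 in the denominator keeps the value defined when a protein was
# never observed in the comparison group.
example_zero = eq1_statistic(30, 0)      # (30 ** 2) / 1
```

With these invented counts the expression gives 510.05 and 900.0, both above the χ² > 100 level quoted in the abstract.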

Correction by sum correlations yielded similar results (not shown). The precursor intensity data for MS/MS spectra were log10 transformed, tested for normality and analyzed across institution/study and diseases versus controls by means, standard errors and ANOVA [15, 16, 32]. The entirely independent analysis of the precursor intensity using the rigorous ANOVA with Tukey–Kramer HSD test versus multiple controls was achieved using a 64-bit R server.
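The log10 transformation followed by one-way ANOVA can be sketched as below. The analysis in the study was done in R and included a Tukey–Kramer HSD test; this Python stand-in computes only the F statistic and its degrees of freedom, and the intensities and the function name `one_way_anova_f` are invented for illustration.

```python
import math
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviation from each group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up precursor intensities (arbitrary counts) for two treatments.
breast_raw = [1e5, 1e6, 1e7]
control_raw = [1e4, 1e5, 1e6]

# Log10 transform before comparing means.
breast_log = [math.log10(x) for x in breast_raw]
control_log = [math.log10(x) for x in control_raw]

f_stat, df_b, df_w = one_way_anova_f([breast_log, control_log])
```

In R the same comparison could be written with `aov()` on the log10 intensities, with `TukeyHSD()` applied to the fitted model for the pairwise contrasts.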

Materials

Anonymous human EDTA plasma with no identifying information from multiple disease and control populations were transported frozen and stored in a − 80 ºC freezer. Breast cancer vs ovarian cancer disease and matched normal female human EDTA plasma was obtained from the Ontario Tumor Bank of the Ontario Institute of Cancer Research, Toronto Ontario. Additional controls of heart attack (venous and arterial) and normal pre-operative orthopedic samples were from St. Joseph’s Hospital of McMaster University. ICU-Sepsis and ICU-Alone were obtained from St. Michael’s Hospital Toronto. Multiple sclerosis, Alzheimer’s dementia and normal controls were from Amsterdam University Medical Center, Vrije Universiteit Amsterdam. In addition, EDTA plasma samples collected onto ice as a baseline degradation controls were obtained from IBBL Luxembourg and stored freeze dried. The anonymous plasma samples with no identifying information from the multiple clinical locations were analyzed under the Ryerson Research Ethics Board Protocol REB 2015-207. C18 zip tips were obtained from Millipore (Bedford, MA), C18 HPLC resin was from Agilent (Zorbax 300 SB-C18 5-micron). Solvents were obtained from Caledon Laboratories (Georgetown, Ontario, Canada). All other salts and reagents were obtained from Sigma-Aldrich-Fluka (St Louis, MO) except where indicated. The level of replication in the LC–ESI–MS-MS experiments was typically between 9 and 26 independent patient plasma samples for each disease and control.

Sample preparation

Human EDTA plasma samples (200 μl) were precipitated with 9 volumes of acetonitrile (90% ACN) [23], followed by the selective extraction of the pellet using a step gradient to achieve selectivity across sub-fractions and thus greater sensitivity [21]. Disposable plastic 2 ml sample tubes and plastic pipette tips were used to handle samples. The acetonitrile suspension was separated with a centrifuge at 12,000 RCF for 5 min. The acetonitrile supernatant, which contains few peptides, was collected, transferred to a fresh sample tube and dried in a rotary lyophilizer. The organic precipitate (pellet), which contains a much larger total amount of endogenous polypeptides [23], was manually re-suspended using a step gradient of increasing water content to yield 10 fractions from those soluble in 90% ACN to 10% ACN, followed by 100% H2O, and then 5% formic acid [21]. The step-wise extracts were clarified with a centrifuge at 12,000 RCF for 5 min. The extracted sample fractions were dried under vacuum in a rotary lyophilizer and stored at − 80 °C for subsequent analysis.

Preparative C18 chromatography

The peptides of EDTA plasma were precipitated in ACN, extracted from the pellet in a step-gradient with increasing water, dried and then collected over C18 preparative partition chromatography. Preparative C18 separation provided the best results for peptide and phosphopeptide analysis in a “blind” analysis [55]. Solid phase extraction with C18 for LC–ESI–MS/MS was performed as previously described [12, 13, 22–24]. The C18 chromatography resin (Zip Tip) was wet with 65% acetonitrile and 5% formic acid before equilibration in water with 5% formic acid. The plasma extract was dissolved in 200 μl of 5% formic acid in water for C18 binding. The resin was washed with at least five volumes of the binding buffer. The resin was eluted with ≥ 3 column volumes of 65% acetonitrile (2 µl) in 5% formic acid. In order to avoid cross-contamination, the preparative C18 resin was discarded after a single use.

LC–ESI–MS/MS

In order to entirely prevent any possibility of cross contamination, a new disposable nano analytical HPLC column and nano emitter were fabricated for recording each patient sample-fraction set. The ion traps were cleaned and tested for sensitivity with angiotensin and glu fibrinogen prior to recordings. The new column was conditioned and quality controlled with a mixture of three non-human protein standards [32] using a digest of bovine cytochrome C, yeast alcohol dehydrogenase (ADH) and rabbit glycogen phosphorylase B to confirm the sensitivity and mass accuracy of the system prior to each patient sample set. The statistical validity of the LTQ XL (Thermo Electron Corporation, Waltham, MA, USA) linear quadrupole ion trap for LC–ESI–MS/MS of human plasma [21] was in agreement with the results from the 3D Paul ion trap [15, 32–34]. The stepwise extractions were collected and desalted over C18 preparative micro columns, eluted in 2 µl of 65% ACN and 5% formic acid, diluted tenfold with 5% formic acid in water and immediately loaded manually into a 20 μl metal sample loop before injecting onto the analytical column via a Rheodyne injector. Endogenous peptide samples were analyzed over a discontinuous gradient generated at a flow rate of ~ 10 μl per minute with an Agilent 1100 series capillary pump and split upstream of the injector during recording to ~ 200 nl per minute. The separation was performed with a C18 (150 mm × 0.15 mm) fritted capillary column. The acetonitrile profile was started at 5%, ramped to 12% after 5 min and then increased to 65% over ~ 90 min, remained at 65% for 5 min, decreased to 50% for 15 min and then declined to a final proportion of 5% prior to injection of the next step fraction from the same patient. The nano HPLC effluent was analyzed by ESI with detection by MS and fragmentation by MS/MS with a linear quadrupole ion trap [56]. The device was set to collect the precursors for up to 200 ms prior to MS/MS fragmentation, with up to four fragmentations per precursor ion that were averaged. Individual, independent samples from disease, normal and ice cold control were precipitated, fractionated over a step gradient and collected over C18 for manual injection.

Correlation analysis

Correlation analysis of ion trap data was performed using a goodness of fit test by X!TANDEM [35] and by cross-correlation using SEQUEST [57] on separate servers to match tandem mass spectra to peptide sequences from the Homo sapiens RefSeq, Ensembl and SwissProt databases, including hypothetical proteins (XP) or genomic loci [13, 14, 58]. Endogenous peptides with precursors greater than 10,000 (E4) arbitrary counts were searched only as fully tryptic peptides (TRYP) and/or phosphopeptides (TRYP STYP) and compared in SQL Server/R. The X!TANDEM default ion trap data settings of ± 3 m/z from precursor peptides considered from 300 to 2000 m/z with a tolerance of 0.5 Da error in the fragments were used [15, 22, 33–35, 59]. The best fit peptide of the MS/MS spectra to fully tryptic and/or phospho-tryptic peptides at charge states of + 2 versus + 3 was accepted with additional acetylation, or oxidation of methionine and with possible loss of water or ammonia. The resulting accession numbers, actual and estimated masses, correlated peptide sequences, peptide and protein scores, resulting protein sequences and other associated data were captured and assembled together in an SQL Server relational database [14].

Data sampling, sorting, transformation and visualization

Each disease and normal treatment was represented by 9 to 26 independent patient samples that were resolved into 10 organic/water sub-fractions resulting in 90 to 260 sub-samples per treatment for a total of 1508 LC–ESI–MS/MS experiments that were archived together in SQL Server for statistical analysis [37, 39]. The linear quadrupole ion trap provided the precursor ion intensity values and the peptide fragment MS/MS spectra. The peptides and proteins were identified from MS/MS spectra by X!TANDEM and the observation frequency was counted by the SEQUEST algorithm. The large number of redundant correlations to each MS/MS spectrum at different charge states or to different peptide sequences may be a source of type I error that can be filtered out by a complex key or hashtag in SQL Server to ensure that each MS/MS spectrum is only fit to one peptide and charge state. The MS and MS/MS spectra together with the results of the X!TANDEM and SEQUEST algorithms were parsed into an SQL Server database and filtered [14] before statistical and graphical analysis with the generic R data system [14–16, 32, 58]. The MS/MS spectra collected in breast versus ovarian cancer were summed to correct the observation frequency using Eq. 1, and the χ² p-values were converted to FDR q-values by the method of Benjamini and Hochberg [36]:

$$\frac{(\mathrm{Breast}-\mathrm{Ovarian})^{2}}{\mathrm{Ovarian}+1} \qquad (1)$$

Correction by sum correlations yielded similar results (not shown). The precursor intensity data for MS/MS spectra were log10 transformed, tested for normality and analyzed across institution/study and diseases versus controls by means, standard errors and ANOVA [15, 16, 32]. The entirely independent analysis of the precursor intensity by ANOVA with the Tukey–Kramer HSD test versus multiple controls was performed using a 64-bit R server.
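As a minimal sketch of the arithmetic described above, the expression in Eq. 1 can be evaluated per gene symbol, converted to a p-value at 1 degree of freedom, and adjusted to q-values by the Benjamini and Hochberg procedure. The gene names and counts below are hypothetical placeholders, and this Python outline is an illustration only; the analysis reported here was performed in SQL Server and R.

```python
import math

def chi_square_eq1(breast: float, ovarian: float) -> float:
    """Evaluate Eq. 1: (Breast - Ovarian)^2 / (Ovarian + 1)."""
    return (breast - ovarian) ** 2 / (ovarian + 1)

def chi_square_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Hypothetical (breast, ovarian) observation counts, for illustration only.
counts = {"GENE_A": (120, 20), "GENE_B": (15, 12), "GENE_C": (40, 9)}
chi2 = {gene: chi_square_eq1(b, o) for gene, (b, o) in counts.items()}
p_vals = {gene: chi_square_p_value_df1(x) for gene, x in chi2.items()}
q_vals = dict(zip(p_vals, benjamini_hochberg(list(p_vals.values()))))

for gene in counts:
    print(gene, round(chi2[gene], 2), p_vals[gene], q_vals[gene])
```

The "+ 1" in the denominator keeps the value finite when a protein is not observed at all in the ovarian samples.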

Results

Partition of plasma samples using differential solubility in organic/water mixtures combined with random and independent sampling by LC–ESI–MS/MS detected peptides from proteins that were more frequently observed and/or showed greater intensity in breast versus ovarian cancer. Here four independent lines of evidence, Chi Square analysis of observation frequency, previously established structural/functional relationships from STRING, ANOVA analysis of peptide intensity, and agreement with the previous genetic or biochemical experiments, all indicated that there was significant variation in the peptides of breast cancer patients compared to ovarian cancer and other diseases or normal plasma samples.

LC–ESI–MS/MS

The pool of endogenous tryptic (TRYP) and/or tryptic phosphopeptides (TRYP STYP) was randomly and independently sampled without replacement by liquid chromatography, nano electrospray ionization and tandem mass spectrometry (LC–ESI–MS/MS) [17] from breast vs ovarian cancer, or female normal, other disease and normal plasma, and ice cold controls to serve as a baseline [18, 19]. Some 15,968,550 MS/MS spectra ≥ E4 intensity counts were correlated by the SEQUEST and X!TANDEM algorithms, which resulted in a total of 19,197,152 redundant matches of MS/MS spectra to peptides in proteins. The redundant correlations from SEQUEST were filtered to retain only the best fit by charge state and peptide sequence in SQL Server to entirely avoid re-use of the same MS/MS spectra [17, 31, 37, 39]. The filtered results were then analyzed by the generic R statistical system in a matrix of disease and controls that reveals the set of blood peptides and proteins specific to each disease state. The statistical validity of the extraction and sampling system was previously established by computation of protein (gene symbol) p-values and FDR corrected q-values by the method of Benjamini and Hochberg [36] and frequency comparison to false positive noise or random spectra [17, 21].
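The filtering of redundant correlations described above can be sketched as keeping a single best-scoring match per MS/MS spectrum. The record fields, the spectrum key and the "higher score is better" convention are assumptions made for this illustration; the study itself performed this step with a key in SQL Server, and the peptide strings below are placeholders.

```python
from typing import NamedTuple

class Match(NamedTuple):
    spectrum_id: str  # hypothetical unique key for one MS/MS spectrum
    charge: int       # precursor charge state assumed by the search
    peptide: str      # correlated peptide sequence
    score: float      # fit score; higher is assumed to mean a better fit

def best_match_per_spectrum(matches):
    """Keep one match (peptide and charge state) per MS/MS spectrum."""
    best = {}
    for m in matches:
        current = best.get(m.spectrum_id)
        if current is None or m.score > current.score:
            best[m.spectrum_id] = m
    return list(best.values())

# Placeholder records: spectrum "s1" was matched three times, "s2" once.
redundant = [
    Match("s1", 2, "PEPTIDEK", 2.1),
    Match("s1", 3, "PEPTIDEK", 1.4),
    Match("s1", 2, "OTHERPEPR", 3.0),
    Match("s2", 2, "SAMPLEK", 1.9),
]
filtered = best_match_per_spectrum(redundant)
print(filtered)
```

Each spectrum then contributes at most one count to the observation frequency, which is the property the filtering step is meant to guarantee.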

Frequency correction

A total of 455,426 MS/MS ≥ E4 counts were collected from breast cancer samples and 498,616 MS/MS ≥ E4 counts were collected from ovarian cancer plasma and these sums were used to correct observation frequency. A small subset of proteins show large increases or decreases in observation frequency between breast versus ovarian cancer resulting in large Chi Square values (Fig. 1). Similar results were obtained from comparison to female normal (not shown).
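The exact form of the correction is not spelled out in this section. One plausible reading, assumed here purely for illustration, is that ovarian counts are rescaled by the ratio of the two MS/MS totals so that both groups are compared on the same sampling depth before the Chi Square expression of Eq. 1 is applied. The function names are hypothetical.

```python
# Totals of MS/MS spectra >= E4 counts reported for each group.
BREAST_TOTAL = 455_426
OVARIAN_TOTAL = 498_616

def correct_ovarian_count(ovarian_count: float,
                          breast_total: int = BREAST_TOTAL,
                          ovarian_total: int = OVARIAN_TOTAL) -> float:
    """Rescale an ovarian count to the breast sampling depth (assumed scheme)."""
    return ovarian_count * breast_total / ovarian_total

def corrected_chi_square(breast_count: float, ovarian_count: float) -> float:
    """Apply the Eq. 1 expression after rescaling the ovarian count."""
    expected = correct_ovarian_count(ovarian_count)
    return (breast_count - expected) ** 2 / (expected + 1)

# Hypothetical protein observed 50 times in breast and 20 times in ovarian.
print(correct_ovarian_count(20))
print(corrected_chi_square(50, 20))
```

Because the two totals differ by less than 10%, a proportional correction of this kind changes individual counts only slightly.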

Fig. 1

Quantile plots of the corrected difference and Chi Square values of the breast cancer versus ovarian cancer results after frequency correction. The difference between breast cancer (n ≥ 9) and ovarian cancer (n ≥ 9) is shown using a quantile plot that tended to zero (see quantile line). Similar results were obtained by comparison to breast cancer or other controls (not shown). Plots: a quantile plot of the observation frequency of tryptic peptides from breast cancer–ovarian cancer; b χ² plot of the observation frequency of tryptic peptides from breast cancer–ovarian cancer; c quantile plot of the observation frequency of tryptic STYP peptides from breast cancer–ovarian cancer; d χ² plot of the observation frequency of tryptic STYP peptides from breast cancer–ovarian cancer

Comparison of breast cancer to ovarian cancer by Chi square analysis

A set of ~ 500 gene symbols showed Chi Square (χ²) values ≥ 15 between breast cancer versus ovarian cancer. Specific peptides and/or phosphopeptides from cellular proteins, membrane proteins, nucleic acid binding proteins, signaling factors, metabolic enzymes and others, including uncharacterized proteins, showed significantly greater observation frequency in breast cancer. In agreement with the literature, peptides from many established plasma proteins including acute phase or common distress proteins such as APOE, C4A, C4B, C4B2, C3, CFI, APOA1, APOC2, APOC4-APOC2, IGHE, ITIH3, and ITIH4 [60, 61] were observed to vary between cancer and control samples. The Chi Square analysis showed some proteins with χ² values that were apparently too large (χ² ≥ 60, p < 0.0001, d.f. 1) to all have resulted from random sampling error. Many cellular proteins also showed large changes in frequency by Chi Square (χ² > 100, p < 0.0001) in the breast cancer samples such as CPEB1, LTBP4, HIF-1A, IGHE, RAB44, NEFM, C19orf82, SLC35B1, 1D12A, C8orf34, HIF1A, OCLN, EYA1, HLA-DRB1, LARS, PTPDC1, WWC1, ZNF562, PTMA, MGAT1, NDUFA1, NOGOC, OR1E1, OR1E2, CFI, HSA12, GCSH, ELTD1, TBX15, NR2C2, FLJ00045, PDLIM1, GALNT9, ASH2L, PPFIBP1, LRRC4B, SLCO3A1, BHMT2, CS, FAM188B2, LGALS7, SAT2, SFRS8, SLC22A12, WNT9B, SLC2A4, ZNF101, WT1, CCDC47, ERLIN1, SPFH1, EID2, THOC1, DDX47, MREG, PTPRE, EMILIN1, DKFZp779G1236 and MAP3K8 among others (Table 1). The full list of Chi Square results is found in Additional file 1: Table S1.

Table 1

Breast cancer specific proteins detected by fully tryptic peptides and/or fully tryptic phosphopeptides (STYP) that show a Chi Square (χ²) value of ≥ 200. n is the number of protein accessions per Gene Symbol

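| Tryptic: Gene Symbol | Mean χ² | n | Tryptic STYP: Gene Symbol | Mean χ² | n |
|---|---|---|---|---|---|
| CPEB1 | 3632.919337 | 8 | LTBP4 | 4340.217566 | 1 |
| LTBP4 | 2560.471517 | 1 | C19orf82 | 3256.703566 | 1 |
| HIF-1A | 1640.975019 | 1 | PMEPA1 | 1849.257201 | 1 |
| C4A | 1626.866928 | 2 | C4A | 1703.128264 | 2 |
| C4B | 1626.866928 | 2 | HIF-1A | 1668.954624 | 1 |
| C4B_2 | 1612.006355 | 1 | C4B_2 | 1648.102936 | 1 |
| C3 | 757.057969 | 2 | C4B | 1637.227896 | 2 |
| IGHE | 656.105042 | 1 | CA7 | 1582.270693 | 1 |
| RAB44 | 656.105042 | 1 | PCDHGA5 | 1462.852842 | 2 |
| NEFM | 652.140957 | 5 | C8orf34 | 1189.441768 | 5 |
| C19orf82 | 613.883173 | 1 | C3 | 835.343196 | 2 |
| SLC35B1 | 479.46677 | 1 | KNOP1 | 822.636731 | 3 |
| C8orf34 | 460.113072 | 5 | AMMECR1L | 794.024811 | 5 |
| 1D12A | 432.71876 | 1 | HMMR | 699.705336 | 1 |
| HIF1A | 352.516679 | 3 | HTR3B | 670.791156 | 1 |
| OCLN | 341.835514 | 3 | PCDHJ | 611.647195 | 1 |
| APOE | 336.148697 | 3 | ZFAND1 | 522.966422 | 2 |
| PTPDC1 | 316.183187 | 2 | PPID | 522.527735 | 1 |
| EYA1 | 306.858733 | 1 | OXER1 | 509.701516 | 1 |
| HLA-DRB1 | 306.858733 | 1 | DCHS2 | 507.103436 | 1 |
| WWC1 | 294.679057 | 9 | RAB44 | 449.029189 | 1 |
| ZNF562 | 273.551291 | 13 | NUP50 | 431.635555 | 4 |
| CFI | 251.996191 | 7 | HLA-DRB1 | 417.238656 | 1 |
| MGAT1 | 241.814491 | 1 | PCED1A | 375.630369 | 4 |
| NDUFA1 | 241.814491 | 1 | HIF1A | 304.82744 | 3 |
| NOGOC | 241.814491 | 1 | CHMP5 | 297.080368 | 2 |
| OR1E1 | 241.814491 | 1 | HMP19 | 289.436434 | 5 |
| OR1E2 | 241.814491 | 1 | LOC102723665 | 286.501857 | 1 |
| PTMA | 234.938717 | 1 | CYC1 | 260.817537 | 2 |
| HSA12 | 218.336655 | 1 | GCSH | 260.051794 | 1 |
| ELTD1 | 206.644334 | 1 | CNBP | 259.243457 | 7 |
| GCSH | 202.57471 | 1 | SMIM12 | 256.548507 | 1 |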

Pathway and gene ontology analysis using the STRING algorithm

The protein gene symbols with large Chi Square values were significantly enriched in proteins that showed a complex set of previously established functional and structural relationships by STRING analysis. In a computationally independent method to ensure the variation in proteins associated with breast cancer was not just the result of some random process, we analyzed the distribution of the known protein–protein interactions and the distribution of the cellular location, molecular function and biological processes of the proteins identified from endogenous peptides with respect to a random sampling of the human genome. There were many protein interactions apparent between the proteins computed to be specific to breast cancer from fully tryptic (Fig. 2) and/or phospho tryptic peptides (Fig. 3). The breast cancer samples showed statistically significant enrichment of protein interactions and Gene Ontology terms that were consistent with structural and functional relationships between the proteins identified in breast cancer compared to a random sampling of the human genome (Tables 2, 3, 4). STRING analysis of the breast cancer specific proteins detected by fully tryptic peptides and/or fully tryptic phosphopeptides with a Chi Square (χ²) value of ≥ 9 showed a significant protein interaction [Network Stats: number of nodes, 1580; number of edges, 9987; average node degree, 12.6; avg. local clustering coefficient, 0.272; expected number of edges, 8736; PPI enrichment p-value < 1.0e−16].
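As a quick consistency check on the network statistics quoted in this section, the average node degree of an undirected network equals twice the number of edges divided by the number of nodes. The short sketch below recomputes that value from the node and edge counts reported here; the function name and dictionary labels are illustrative only.

```python
def average_node_degree(nodes: int, edges: int) -> float:
    """Mean degree of an undirected graph: each edge touches two nodes."""
    return 2 * edges / nodes

# (nodes, edges) pairs as reported in the text, figure legends and Table 2.
networks = {
    "chi-square >= 9, TRYP and STYP": (1580, 9987),
    "Fig. 2, fully tryptic": (173, 260),
    "Fig. 3, phospho tryptic": (191, 182),
    "Table 2 footnote": (485, 1148),
}

for label, (n_nodes, n_edges) in networks.items():
    print(label, round(average_node_degree(n_nodes, n_edges), 2))
```

The recomputed values agree with the reported average node degrees of 12.6, 3.01, 1.91 and 4.73 after rounding.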

Fig. 2

The breast cancer STRING network where Chi Square χ² ≥ 15 from fully tryptic peptides. Breast cancer tryptic peptide frequency difference greater than 15 and χ² value greater than 15 at degrees of freedom of 1 (p < 0.0001). Network Stats: number of nodes, 173; number of edges, 260; average node degree, 3.01; avg. local clustering coefficient, 0.378; expected number of edges, 206; PPI enrichment p-value, 0.000175

Fig. 3

The breast cancer STRING network where Chi Square χ² ≥ 15 from fully tryptic phospho peptides. Breast cancer TRYP STYP, frequency difference greater than 15 and χ² value greater than 15 at degrees of freedom of 1 (p < 0.0001). Network Stats: number of nodes, 191; number of edges, 182; average node degree, 1.91; avg. local clustering coefficient, 0.335; expected number of edges, 152; PPI enrichment p-value, 0.00911

Table 2

STRING analysis of Biological Process of Gene Symbol distributions from the TRYP and TRYP STYP where delta and χ² were both greater than 9 after correction

| Term ID | Term description | Observed gene count | Background gene count | False discovery rate |
|---|---|---|---|---|
| GO:0016043 | Cellular component organization | 551 | 5163 | 4.00E−09 |
| GO:0071840 | Cellular component organization or biogenesis | 567 | 5342 | 4.00E−09 |
| GO:0007017 | Microtubule-based process | 106 | 605 | 1.17E−08 |
| GO:0051641 | Cellular localization | 267 | 2180 | 7.59E−08 |
| GO:0006996 | Organelle organization | 356 | 3131 | 8.96E−08 |
| GO:0007010 | Cytoskeleton organization | 139 | 953 | 3.36E−07 |
| GO:0007018 | Microtubule-based movement | 57 | 276 | 3.34E−06 |
| GO:0007399 | Nervous system development | 257 | 2206 | 7.50E−06 |
| GO:0008104 | Protein localization | 230 | 1966 | 3.52E−05 |
| GO:0120036 | Plasma membrane bounded cell projection organization | 138 | 1034 | 3.52E−05 |
| GO:0048731 | System development | 427 | 4144 | 4.68E−05 |
| GO:0033036 | Macromolecule localization | 257 | 2268 | 4.69E−05 |
| GO:0070727 | Cellular macromolecule localization | 171 | 1374 | 4.83E−05 |
| GO:0030030 | Cell projection organization | 140 | 1067 | 4.96E−05 |
| GO:0034613 | Cellular protein localization | 169 | 1367 | 7.45E−05 |
| GO:0009987 | Cellular process | 1271 | 14652 | 0.00013 |
| GO:0051179 | Localization | 516 | 5233 | 0.00015 |
| GO:0043170 | Macromolecule metabolic process | 702 | 7453 | 0.00018 |
| GO:0007275 | Multicellular organism development | 470 | 4726 | 0.00025 |
| GO:0032502 | Developmental process | 528 | 5401 | 0.00025 |
| GO:0051649 | Establishment of localization in cell | 189 | 1616 | 0.0003 |
| GO:0046907 | Intracellular transport | 167 | 1390 | 0.00031 |
| GO:0090304 | Nucleic acid metabolic process | 399 | 3941 | 0.00043 |
| GO:0051128 | Regulation of cellular component organization | 252 | 2306 | 0.00047 |
| GO:0007156 | Homophilic cell adhesion via plasma membrane adhesion molecules | 34 | 158 | 0.00063 |
| GO:0048856 | Anatomical structure development | 496 | 5085 | 0.00066 |
| GO:0006139 | Nucleobase-containing compound metabolic process | 449 | 4551 | 0.00082 |
| GO:0007155 | Cell adhesion | 110 | 843 | 0.00082 |
| GO:0006928 | Movement of cell or subcellular component | 160 | 1355 | 0.001 |
| GO:0051276 | Chromosome organization | 125 | 999 | 0.001 |
| GO:0097435 | Supramolecular fiber organization | 60 | 383 | 0.0012 |
| GO:0046483 | Heterocycle metabolic process | 459 | 4716 | 0.002 |
| GO:0048666 | Neuron development | 99 | 758 | 0.002 |
| GO:0000226 | Microtubule cytoskeleton organization | 60 | 393 | 0.0022 |
| GO:0019219 | Regulation of nucleobase-containing compound metabolic process | 408 | 4133 | 0.0022 |
| GO:0044260 | Cellular macromolecule metabolic process | 602 | 6413 | 0.0022 |
| GO:0051130 | Positive regulation of cellular component organization | 135 | 1128 | 0.0025 |
| GO:0006725 | Cellular aromatic compound metabolic process | 460 | 4754 | 0.0028 |
| GO:0060255 | Regulation of macromolecule metabolic process | 572 | 6072 | 0.0028 |
| GO:0098609 | Cell–cell adhesion | 62 | 416 | 0.0028 |
| GO:0044085 | Cellular component biogenesis | 267 | 2556 | 0.0029 |
| GO:0051252 | Regulation of RNA metabolic process | 385 | 3890 | 0.0029 |
| GO:0010468 | Regulation of gene expression | 440 | 4533 | 0.0033 |
| GO:0022607 | Cellular component assembly | 247 | 2343 | 0.0034 |
| GO:0048699 | Generation of neurons | 162 | 1422 | 0.0034 |
| GO:0071166 | Ribonucleoprotein complex localization | 27 | 125 | 0.0034 |
| GO:0030182 | Neuron differentiation | 115 | 940 | 0.0038 |
| GO:0032989 | Cellular component morphogenesis | 93 | 720 | 0.0038 |
| GO:0098742 | Cell–cell adhesion via plasma-membrane adhesion molecules | 40 | 230 | 0.0038 |
| GO:0031175 | Neuron projection development | 82 | 616 | 0.0043 |
| GO:0006611 | Protein export from nucleus | 29 | 144 | 0.0048 |
| GO:0016070 | RNA metabolic process | 342 | 3430 | 0.0048 |
| GO:0031323 | Regulation of cellular metabolic process | 569 | 6082 | 0.0048 |
| GO:0050794 | Regulation of cellular process | 929 | 10484 | 0.0048 |
| GO:1901360 | Organic cyclic compound metabolic process | 474 | 4963 | 0.0049 |
| GO:0051168 | Nuclear export | 31 | 161 | 0.005 |
| GO:0080090 | Regulation of primary metabolic process | 560 | 5982 | 0.005 |
| GO:0051640 | Organelle localization | 77 | 574 | 0.0051 |
| GO:0006403 | RNA localization | 37 | 211 | 0.0053 |
| GO:0019222 | Regulation of metabolic process | 604 | 6516 | 0.0053 |
| GO:0035023 | Regulation of Rho protein signal transduction | 27 | 131 | 0.0053 |
| GO:2000112 | Regulation of cellular macromolecule biosynthetic process | 395 | 4050 | 0.0053 |
| GO:0000902 | Cell morphogenesis | 82 | 626 | 0.0054 |
| GO:0051171 | Regulation of nitrogen compound metabolic process | 546 | 5827 | 0.0054 |
| GO:0071426 | Ribonucleoprotein complex export from nucleus | 26 | 124 | 0.0054 |
| GO:0033043 | Regulation of organelle organization | 134 | 1155 | 0.0058 |
| GO:0048468 | Cell development | 166 | 1493 | 0.0058 |
| GO:0050658 | RNA transport | 34 | 189 | 0.006 |
| GO:0006355 | Regulation of transcription, DNA-templated | 360 | 3661 | 0.0061 |
| GO:0006405 | RNA export from nucleus | 27 | 134 | 0.0061 |
| GO:0010467 | Gene expression | 366 | 3733 | 0.0061 |
| GO:0022008 | Neurogenesis | 168 | 1519 | 0.0061 |
| GO:0051056 | Regulation of small GTPase mediated signal transduction | 48 | 310 | 0.0061 |
| GO:0065007 | Biological regulation | 1026 | 11740 | 0.0061 |
| GO:0003205 | Cardiac chamber development | 31 | 166 | 0.0062 |
| GO:1903506 | Regulation of nucleic acid-templated transcription | 361 | 3683 | 0.0068 |
| GO:0010556 | Regulation of macromolecule biosynthetic process | 400 | 4143 | 0.0079 |
| GO:0006406 | mRNA export from nucleus | 23 | 107 | 0.0083 |
| GO:0015833 | Peptide transport | 157 | 1416 | 0.0084 |
| GO:0032501 | Multicellular organismal process | 599 | 6507 | 0.0092 |
| GO:0051493 | Regulation of cytoskeleton organization | 65 | 477 | 0.0092 |

The protein–protein interaction statistics were: 485 nodes; 1148 edges; average node degree, 4.73; avg. local clustering coefficient, 0.325; expected number of edges: 851; PPI enrichment p-value: < 1.0e−16

Table 3

STRING analysis of Molecular Function of Gene Symbol distributions from the TRYP and TRYP STYP where delta and χ² were both greater than 9 after correction

| Term ID | Term description | Observed gene count | Background gene count | False discovery rate |
|---|---|---|---|---|
| GO:0005488 | Binding | 1152 | 11878 | 9.77E−20 |
| GO:0005515 | Protein binding | 694 | 6605 | 3.83E−13 |
| GO:0005524 | ATP binding | 209 | 1462 | 1.30E−11 |
| GO:0043167 | Ion binding | 637 | 6066 | 1.30E−11 |
| GO:0032559 | Adenyl ribonucleotide binding | 213 | 1514 | 1.80E−11 |
| GO:0008144 | Drug binding | 227 | 1710 | 3.62E−10 |
| GO:0035639 | Purine ribonucleoside triphosphate binding | 232 | 1794 | 1.81E−09 |
| GO:0032553 | Ribonucleotide binding | 238 | 1868 | 3.13E−09 |
| GO:0032555 | Purine ribonucleotide binding | 236 | 1853 | 3.33E−09 |
| GO:0097159 | Organic cyclic compound binding | 560 | 5382 | 3.33E−09 |
| GO:1901363 | Heterocyclic compound binding | 552 | 5305 | 4.36E−09 |
| GO:0097367 | Carbohydrate derivative binding | 265 | 2163 | 4.89E−09 |
| GO:0000166 | Nucleotide binding | 258 | 2097 | 5.95E−09 |
| GO:0008092 | Cytoskeletal protein binding | 130 | 882 | 5.00E−08 |
| GO:0003779 | Actin binding | 76 | 413 | 6.27E−08 |
| GO:0043168 | Anion binding | 309 | 2696 | 6.27E−08 |
| GO:0016887 | ATPase activity | 73 | 392 | 7.90E−08 |
| GO:0036094 | Small molecule binding | 282 | 2460 | 3.63E−07 |
| GO:0042623 | ATPase activity, coupled | 60 | 320 | 1.76E−06 |
| GO:0017111 | Nucleoside-triphosphatase activity | 111 | 778 | 3.30E−06 |
| GO:0004386 | Helicase activity | 36 | 147 | 4.31E−06 |
| GO:0016462 | Pyrophosphatase activity | 114 | 819 | 6.34E−06 |
| GO:0046872 | Metal ion binding | 420 | 4087 | 6.91E−06 |
| GO:0043169 | Cation binding | 425 | 4170 | 1.22E−05 |
| GO:0003777 | Microtubule motor activity | 29 | 110 | 1.74E−05 |
| GO:0008017 | Microtubule binding | 48 | 253 | 2.25E−05 |
| GO:0051015 | Actin filament binding | 35 | 158 | 3.76E−05 |
| GO:0019899 | Enzyme binding | 241 | 2197 | 9.02E−05 |
| GO:0003774 | Motor activity | 30 | 131 | 0.00012 |
| GO:0015631 | Tubulin binding | 55 | 344 | 0.00032 |
| GO:0051020 | GTPase binding | 83 | 614 | 0.00064 |
| GO:0017048 | Rho GTPase binding | 32 | 162 | 0.00073 |
| GO:0003682 | Chromatin binding | 69 | 501 | 0.0018 |
| GO:0005089 | Rho guanyl-nucleotide exchange factor activity | 19 | 76 | 0.0025 |
| GO:0003676 | Nucleic acid binding | 330 | 3332 | 0.0028 |
| GO:0005198 | Structural molecule activity | 86 | 679 | 0.0032 |
| GO:0031267 | Small GTPase binding | 70 | 525 | 0.0036 |
| GO:0004672 | Protein kinase activity | 81 | 635 | 0.0039 |
| GO:0140096 | Catalytic activity, acting on a protein | 225 | 2176 | 0.005 |
| GO:0019904 | Protein domain specific binding | 87 | 706 | 0.0061 |
| GO:0005085 | Guanyl-nucleotide exchange factor activity | 46 | 311 | 0.0066 |
| GO:0005509 | Calcium ion binding | 86 | 700 | 0.007 |
| GO:0017016 | Ras GTPase binding | 66 | 510 | 0.0103 |
| GO:0005516 | Calmodulin binding | 32 | 194 | 0.0106 |
| GO:0004674 | Protein serine/threonine kinase activity | 59 | 444 | 0.011 |
| GO:0051010 | Microtubule plus-end binding | 7 | 13 | 0.0119 |
| GO:0005088 | Ras guanyl-nucleotide exchange factor activity | 37 | 243 | 0.0143 |
| GO:0005096 | GTPase activator activity | 40 | 278 | 0.023 |
| GO:0004004 | ATP-dependent RNA helicase activity | 15 | 66 | 0.0237 |
| GO:0016773 | Phosphotransferase activity, alcohol group as acceptor | 89 | 767 | 0.0237 |
| GO:0030695 | GTPase regulator activity | 43 | 307 | 0.0237 |
| GO:0060589 | Nucleoside-triphosphatase regulator activity | 47 | 345 | 0.0237 |
| GO:0044877 | Protein-containing complex binding | 108 | 968 | 0.0241 |
| GO:0016772 | Transferase activity, transferring phosphorus-containing groups | 109 | 982 | 0.0255 |
| GO:0032947 | Protein-containing complex scaffold activity | 15 | 68 | 0.0267 |
| GO:0008047 | Enzyme activator activity | 63 | 510 | 0.0303 |
| GO:0097493 | Structural molecule activity conferring elasticity | 7 | 17 | 0.0325 |
| GO:0016301 | Kinase activity | 94 | 835 | 0.0352 |
| GO:0051959 | Dynein light intermediate chain binding | 9 | 29 | 0.0352 |
| GO:0042800 | Histone methyltransferase activity (H3-K4 specific) | 5 | 8 | 0.0388 |
| GO:0008094 | DNA-dependent ATPase activity | 14 | 66 | 0.0482 |
| GO:0140030 | Modification-dependent protein binding | 22 | 131 | 0.0482 |
| GO:0008026 | ATP-dependent helicase activity | 17 | 90 | 0.0499 |

Additional details see Table 2

Table 4

STRING analysis of Cellular Component of Gene Symbol distributions from the TRYP and TRYP STYP where delta and χ² were both greater than 9 after correction

| Term ID | Term description | Observed gene count | Background gene count | False discovery rate |
|---|---|---|---|---|
| GO:0005622 | Intracellular | 1302 | 14286 | 1.22E−14 |
| GO:0044424 | Intracellular part | 1282 | 13996 | 1.22E−14 |
| GO:0005856 | Cytoskeleton | 281 | 2068 | 4.49E−14 |
| GO:0043232 | Intracellular non-membrane-bounded organelle | 467 | 4005 | 4.49E−14 |
| GO:0044464 | Cell part | 1417 | 16244 | 4.88E−11 |
| GO:0043226 | Organelle | 1143 | 12432 | 7.73E−11 |
| GO:0043229 | Intracellular organelle | 1124 | 12193 | 9.50E−11 |
| GO:0044430 | Cytoskeletal part | 207 | 1547 | 1.04E−09 |
| GO:0032991 | Protein-containing complex | 501 | 4792 | 2.68E−08 |
| GO:0042995 | Cell projection | 242 | 1969 | 2.68E−08 |
| GO:0044422 | Organelle part | 862 | 9111 | 4.13E−08 |
| GO:0120025 | Plasma membrane bounded cell projection | 234 | 1900 | 4.13E−08 |
| GO:0005737 | Cytoplasm | 1030 | 11238 | 5.39E−08 |
| GO:0005634 | Nucleus | 676 | 6892 | 9.10E−08 |
| GO:0044428 | Nuclear part | 455 | 4359 | 2.36E−07 |
| GO:0031981 | Nuclear lumen | 425 | 4030 | 2.95E−07 |
| GO:0015630 | Microtubule cytoskeleton | 150 | 1118 | 4.50E−07 |
| GO:0044446 | Intracellular organelle part | 834 | 8882 | 4.50E−07 |
| GO:0044451 | Nucleoplasm part | 145 | 1073 | 4.89E−07 |
| GO:0043005 | Neuron projection | 149 | 1142 | 2.14E−06 |
| GO:0099081 | Supramolecular polymer | 122 | 880 | 2.14E−06 |
| GO:0070013 | Intracellular organelle lumen | 516 | 5162 | 2.48E−06 |
| GO:0120038 | Plasma membrane bounded cell projection part | 165 | 1316 | 3.34E−06 |
| GO:0099568 | Cytoplasmic region | 68 | 402 | 3.64E−06 |
| GO:0099512 | Supramolecular fiber | 118 | 873 | 8.36E−06 |
| GO:0030054 | Cell junction | 131 | 1006 | 1.11E−05 |
| GO:0043227 | Membrane-bounded organelle | 1007 | 11244 | 1.79E−05 |
| GO:0005930 | Axoneme | 28 | 107 | 1.90E−05 |
| GO:0005654 | Nucleoplasm | 357 | 3446 | 2.24E−05 |
| GO:0043231 | Intracellular membrane-bounded organelle | 936 | 10365 | 2.24E−05 |
| GO:0044420 | Extracellular matrix component | 20 | 59 | 2.55E−05 |
| GO:0097458 | Neuron part | 171 | 1449 | 4.59E−05 |
| GO:0005829 | Cytosol | 485 | 4958 | 5.91E−05 |
| GO:0032838 | Plasma membrane bounded cell projection cytoplasm | 36 | 179 | 9.90E−05 |
| GO:0098644 | Complex of collagen trimers | 11 | 19 | 0.00014 |
| GO:0015629 | Actin cytoskeleton | 65 | 432 | 0.00016 |
| GO:0030424 | Axon | 75 | 530 | 0.00023 |
| GO:0030016 | Myofibril | 39 | 216 | 0.00034 |
| GO:0005911 | Cell–cell junction | 60 | 402 | 0.00042 |
| GO:0043292 | Contractile fiber | 40 | 228 | 0.00045 |
| GO:0062023 | Collagen-containing extracellular matrix | 29 | 144 | 0.00069 |
| GO:0016604 | Nuclear body | 94 | 742 | 0.00088 |
| GO:0044449 | Contractile fiber part | 37 | 212 | 0.00093 |
| GO:0031012 | Extracellular matrix | 45 | 283 | 0.0011 |
| GO:0016459 | Myosin complex | 18 | 69 | 0.0012 |
| GO:0031965 | Nuclear membrane | 46 | 300 | 0.0019 |
| GO:0005874 | Microtubule | 55 | 385 | 0.0022 |
| GO:0005581 | Collagen trimer | 20 | 88 | 0.0024 |
| GO:0098862 | Cluster of actin-based cell projections | 27 | 143 | 0.0028 |
| GO:0005815 | Microtubule organizing center | 85 | 683 | 0.0029 |
| GO:0044444 | Cytoplasmic part | 832 | 9377 | 0.0029 |
| GO:0044441 | Ciliary part | 58 | 421 | 0.0032 |
| GO:0005583 | Fibrillar collagen trimer | 7 | 11 | 0.0033 |
| GO:0016460 | Myosin II complex | 11 | 32 | 0.0039 |
| GO:0033267 | Axon part | 49 | 341 | 0.0039 |
| GO:0014704 | Intercalated disc | 14 | 51 | 0.0041 |
| GO:0005859 | Muscle myosin complex | 10 | 27 | 0.0044 |
| GO:0008023 | Transcription elongation factor complex | 14 | 52 | 0.0047 |
| GO:0032982 | Myosin filament | 9 | 22 | 0.0047 |
| GO:0034399 | Nuclear periphery | 25 | 134 | 0.0047 |
| GO:0044291 | Cell–cell contact zone | 16 | 67 | 0.0055 |
| GO:0005915 | Zonula adherens | 6 | 9 | 0.0069 |
| GO:0005694 | Chromosome | 108 | 950 | 0.0076 |
| GO:0005929 | Cilium | 71 | 570 | 0.0076 |
| GO:0030496 | Midbody | 28 | 165 | 0.0076 |
| GO:0043034 | Costamere | 8 | 19 | 0.008 |
| GO:0044447 | Axoneme part | 10 | 31 | 0.0093 |
| GO:0005913 | Cell–cell adherens junction | 16 | 72 | 0.0098 |
| GO:0032420 | Stereocilium | 12 | 44 | 0.0098 |
| GO:0005875 | Microtubule associated complex | 25 | 144 | 0.01 |
| GO:0016607 | Nuclear speck | 51 | 381 | 0.01 |
| GO:0031252 | Cell leading edge | 50 | 371 | 0.01 |
| GO:0032421 | Stereocilium bundle | 13 | 51 | 0.01 |
| GO:0033268 | Node of Ranvier | 7 | 15 | 0.01 |
| GO:0097060 | Synaptic membrane | 43 | 308 | 0.0114 |
| GO:0034708 | Methyltransferase complex | 18 | 90 | 0.0124 |
| GO:0042383 | Sarcolemma | 22 | 122 | 0.0124 |
| GO:0030056 | Hemidesmosome | 5 | 7 | 0.0137 |
| GO:0098590 | Plasma membrane region | 116 | 1061 | 0.0141 |
| GO:0044450 | Microtubule organizing center part | 27 | 167 | 0.0147 |
| GO:0090543 | Flemming body | 9 | 28 | 0.0147 |
| GO:0005814 | Centriole | 22 | 125 | 0.0152 |
| GO:0030017 | Sarcomere | 30 | 195 | 0.0159 |
| GO:0042405 | Nuclear inclusion body | 6 | 12 | 0.016 |
| GO:0070161 | Anchoring junction | 38 | 270 | 0.0172 |
| GO:0005635 | Nuclear envelope | 56 | 446 | 0.0183 |
| GO:0036396 | RNA N6-methyladenosine methyltransferase complex | 5 | 8 | 0.019 |
| GO:0005813 | Centrosome | 58 | 468 | 0.0194 |
| GO:0005730 | Nucleolus | 102 | 926 | 0.0196 |
| GO:0030427 | Site of polarized growth | 26 | 164 | 0.0203 |
| GO:0045211 | Postsynaptic membrane | 34 | 237 | 0.0207 |
| GO:0030018 | Z disc | 21 | 122 | 0.0217 |
| GO:0098858 | Actin-based cell projection | 29 | 192 | 0.0217 |
| GO:0016363 | Nuclear matrix | 19 | 106 | 0.0228 |
| GO:0005938 | Cell cortex | 33 | 230 | 0.0229 |
| GO:0030027 | Lamellipodium | 28 | 185 | 0.024 |
| GO:0044304 | Main axon | 14 | 67 | 0.0242 |
| GO:0070449 | Elongin complex | 5 | 9 | 0.0246 |
| GO:0005604 | Basement membrane | 17 | 91 | 0.0248 |
| GO:0043194 | Axon initial segment | 6 | 14 | 0.0248 |
| GO:0005912 | Adherens junction | 35 | 252 | 0.0263 |
| GO:0099513 | Polymeric cytoskeletal fiber | 73 | 645 | 0.0402 |
| GO:0005587 | Collagen type IV trimer | 4 | 6 | 0.0406 |
| GO:1990752 | Microtubule end | 7 | 22 | 0.0413 |
| GO:0030426 | Growth cone | 24 | 159 | 0.0442 |
| GO:0044427 | Chromosomal part | 89 | 819 | 0.0442 |
| GO:0005858 | Axonemal dynein complex | 6 | 17 | 0.0499 |
| GO:0035371 | Microtubule plus-end | 6 | 17 | 0.0499 |

Additional details see Table 2

ANOVA analysis across disease, normal and control plasma treatments

Many proteins that showed greater observation frequency in breast cancer also showed significant variation in precursor intensity compared to ovarian cancer, the female normal controls and male or female EDTA plasma from other disease and normal plasma by ANOVA comparison. The mean precursor intensity values from gene symbols that varied by Chi Square (χ² > 15) were subsequently analyzed by univariate ANOVA in R to look for proteins that showed differences in ion precursor intensity values across treatments [12, 16] (Figs. 4, 5, 6). Common plasma proteins including APOE, ITIH4 and C3 showed significantly different intensity between breast cancer versus ovarian cancer and normal plasma (Fig. 4). Analysis of the frequently observed proteins by quantile box plots and ANOVA confirmed increases in mean precursor intensity in cancer associated proteins such as SLC35B1, IQCJ-SCHIP1, MREG, BHMT2, LGALS7, THOC1, ANXA4, DHDDS, SAT2, PTMA, FYCO1 and ZNF562 among others between breast cancer versus ovarian cancer and/or other disease or normal plasma (Fig. 5). HSA12 represents many proteins that were observed only in breast cancer but were apparently only sporadically detected and require further consideration. Glutamine Serine Rich Protein 1 (QSER1) was observed most frequently in ovarian cancer (Table 5). In contrast, QSER1 showed higher average intensity in breast cancer than ovarian cancer or any other disease and normal by ANOVA followed by the Tukey–Kramer HSD test (Fig. 6) when all peptides were considered. However, the peptide QPKVKAEPPPK, which was specific to QSER1 by BLAST [62], was observed in ovarian cancer but was not observed in other samples (Fig. 6d).
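The one-way ANOVA on log10 precursor intensity referred to above can be sketched as follows. This is a minimal standard-library illustration of the F statistic only, with made-up intensities and treatment labels; it does not implement the Tukey–Kramer HSD comparison and is not the R code used in the study.

```python
import math

def one_way_anova_log10(groups):
    """One-way ANOVA on log10-transformed intensities.

    groups maps a treatment label to a list of positive raw intensities.
    """
    logged = {k: [math.log10(v) for v in vals] for k, vals in groups.items()}
    all_values = [x for vals in logged.values() for x in vals]
    grand_mean = sum(all_values) / len(all_values)
    means = {k: sum(v) / len(v) for k, v in logged.items()}

    # Between-treatment and within-treatment (residual) sums of squares.
    ss_between = sum(len(v) * (means[k] - grand_mean) ** 2
                     for k, v in logged.items())
    ss_within = sum((x - means[k]) ** 2
                    for k, v in logged.items() for x in v)

    df_between = len(logged) - 1
    df_within = len(all_values) - len(logged)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Made-up precursor intensities for two hypothetical treatments.
example = {
    "treatment_a": [1e4, 1e5, 1e6],
    "treatment_b": [1e5, 1e6, 1e7],
}
result = one_way_anova_log10(example)
print(result)
```

The F value is the ratio of the treatment mean square to the residual mean square, which is the same ratio reported in the Mean Sq and F value columns of an R ANOVA table.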

Fig. 4

The distributions of log10 precursor intensity by quantile and quantile box plots of APOE, ITIH4, and C3 across the disease and control treatments. a APOE log10 peptide intensity quantile plot; b APOE log10 peptide intensity quantile box plot; c ITIH4 log10 peptide intensity quantile plot; d ITIH4 log10 peptide intensity quantile box plot; e C3 log10 peptide intensity quantile plot; f C3 log10 peptide intensity quantile box plot. Treatment ID numbers: 1, Alzheimer normal; 2, Alzheimer’s normal control STYP; 3, Alzheimer’s dementia; 4, Alzheimer’s dementia STYP; 5, Cancer breast; 6, Cancer breast STYP; 7, Cancer control; 8, Cancer control STYP; 9, Cancer ovarian; 10, Cancer ovarian STYP; 11, Ice Cold; 12, Ice Cold STYP; 13, Heart attack Arterial; 14, Heart attack Arterial STYP; 15, Heart attack normal control; 16, Heart attack normal control STYP; 17, Heart attack; 18, Heart attack STYP; 19, Multiple sclerosis normal control; 20, Multiple sclerosis normal control STYP; 21, Multiple sclerosis; 22, Multiple sclerosis STYP; 23, Sepsis; 24, Sepsis STYP; 25, Sepsis normal control; 26, Sepsis normal control STYP. There were significant effects of treatments and peptides by two-way ANOVA. Analysis of the proteins shown across treatments produced a significant F statistic by one-way ANOVA. Note that many proteins were not detected in the ice cold plasma

An external file that holds a picture, illustration, etc.
Object name is 12014_2019_9262_Fig5_HTML.jpg

Quantile box plots showing the distribution of log10 precursor intensity by quantile box plots of HSA12, BHMT2, DHDDS, SLC35B1, LGALS7, SAT2, IQCJ-SCHIP1 fusion, THOC1, PTMA, MREG, ANXA4 and FYCO1 across the disease and control treatments. Box plots show log10 intensity versus treatment number for gene symbol indicated. Treatment ID numbers: 1, Alzheimer normal; 2, Alzheimer’s normal control STYP; 3, Alzheimer’s dementia; 4, Alzheimer’s dementia STYP; 5, Cancer breast; 6, Cancer breast STYP; 7, Cancer control; 8, Cancer control STYP; 9, Cancer ovarian; 10, Cancer ovarian STYP; 11, Ice Cold; 12, Ice Cold STYP; 13, Heart attack Arterial; 14 Heart attack Arterial STYP; 15, Heart attack normal control, 16, Heart attack normal Control STYP; 17, Heart attack; 18, Heart attack STYP; 19, Multiple Sclerosis normal control; 20, Multiple sclerosis normal control STYP; Multiple Sclerosis; 22, Multiple sclerosis STYP, 23 Sepsis; 24, Sepsis STYP; 25, Sepsis normal control; 26, Sepsis normal control STYP. There was significant effects of treatments and peptides by two-way ANOVA. Analysis of the proteins shown across treatments produced a significant F Statistic by one-way ANOVA. Note that many proteins were not detected in the ice cold plasma

An external file that holds a picture, illustration, etc.
Object name is 12014_2019_9262_Fig6_HTML.jpg

QSER1 ANOVA analysis and Tukey–Kramer HSD multiple means comparison of breast versus ovarian cancer and other diseases and normal treatments. a All QSER1 peptides quantile plot; b QSER1 peptide QPKVKAEPPPK quantile plot; c All QSER1 peptides box plot see ANOVA below; d QSER1 peptide QPKVKAEPPPK box plot. Treatment ID numbers: 1, Alzheimer normal; 2, Alzheimer’s normal control STYP; 3, Alzheimer’s dementia; 4, Alzheimer’s dementia STYP; 5, Cancer breast; 6, Cancer breast STYP; 7, Cancer control; 8, Cancer control STYP; 9, Cancer ovarian; 10, Cancer ovarian STYP; 11, Ice Cold; 12, Ice Cold STYP; 13, Heart attack Arterial; 14 Heart attack Arterial STYP; 15, Heart attack normal control, 16, Heart attack normal Control STYP; 17, Heart attack; 18, Heart attack STYP; 19, Multiple Sclerosis normal control; 20, Multiple Sclerosis normal control STYP; Multiple sclerosis; 22, Multiple sclerosis STYP, 23 Sepsis; 24, Sepsis STYP; 25, Sepsis normal control; 26, Sepsis normal control STYP. There was significant effects of treatments and peptides by two-way ANOVA (not shown). One way ANOVA:Df Sum Sq Mean Sq F value Pr(> F), Treatment_ID 23 113.0 4.912 16.55 < 2e−16 ***Residuals 808 239.9 0.297

Table 5

The analysis of mean peptide intensity per gene symbol for QSER1 protein by ANOVA with Tukey–Kramer multiple means comparison

TreatmentMeanSDData NTukey–Kramer
15.0727690.30298621d
24.5934090.51198967cde
34.6334970.328526bde
44.0563120.16103733a
55.9182120.76085125h
65.7175920.76334618h
74.8372760.2165738bdef
94.5426930.65645141ceg
104.6002090.64009766cde
114.5121030.5156318acde
124.02977404acde
134.4529350.49166450aceg
144.124790.35146935af
154.4193550.19876353ace
164.3242120.50453832ace
174.9288810.94731922dg
184.1734030.47833936ab
194.7403430.42814258cde
204.801510.47590735de
214.7495830.51368636cde
224.7555530.51711725cde
234.583920.46614711acde
243.73629304abc
254.8817610.95309818de

Treatment ID numbers: 1, Alzheimer normal; 2, Alzheimer’s normal control STYP; 3, Alzheimer’s dementia; 4, Alzheimer’s dementia STYP; 5, Cancer breast; 6, Cancer breast STYP; 7, Cancer control; 8, Cancer control STYP; 9, Cancer ovarian; 10, Cancer ovarian STYP; 11, Ice Cold; 12, Ice Cold STYP; 13, Heart attack Arterial; 14 Heart attack Arterial STYP; 15, Heart attack normal control, 16, Heart attack normal Control STYP; 17, Heart attack; 18, Heart attack STYP; 19, Multiple Sclerosis normal control; 20, Multiple sclerosis normal control STYP; Multiple sclerosis; 22, Multiple Sclerosis STYP, 23 Sepsis; 24, Sepsis STYP; 25, Sepsis normal control; 26, Sepsis normal control STYP. The Tukey–Kramer multiple comparison ranking of mean intensity from R is shown by letters

LC–ESI–MS/MS

The pool of endogenous tryptic (TRYP) and/or tryptic phosphopeptides (TRYP STYP) was randomly and independently sampled without replacement by liquid chromatography, nano electrospray ionization and tandem mass spectrometry (LC–ESI–MS/MS) [17] from breast cancer versus ovarian cancer, female normal, other disease and normal plasma, with ice cold controls serving as a baseline [18, 19]. Some 15,968,550 MS/MS spectra with ≥ E4 intensity counts were correlated by the SEQUEST and X!TANDEM algorithms, which resulted in a total of 19,197,152 redundant MS/MS spectra to peptide-in-protein matches. The redundant correlations from SEQUEST were filtered in SQL Server to retain only the best fit by charge state and peptide sequence, which entirely avoids re-use of the same MS/MS spectra [17, 31, 37, 39]. The filtered results were then analyzed with the generic R statistical system in a matrix of disease and controls that reveals the set of blood peptides and proteins specific to each disease state. The statistical validity of the extraction and sampling system was previously established by computation of protein (gene symbol) p-values and FDR corrected q-values by the method of Benjamini and Hochberg [36] and by frequency comparison to false positive noise or random spectra [17, 21].
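The filtering of redundant correlations can be pictured with a minimal sketch. It assumes that each correlation is a record holding a spectrum identifier, a charge state, a peptide sequence and a fit score where higher is better. The field names (`spectrum_id`, `charge`, `peptide`, `score`) and the example records are hypothetical, and the study performed this step in SQL Server rather than in Python.

```python
from typing import Dict, Iterable, List, NamedTuple


class Match(NamedTuple):
    """One redundant MS/MS-spectrum-to-peptide correlation (hypothetical fields)."""
    spectrum_id: str   # identifier of the MS/MS spectrum
    charge: int        # assumed precursor charge state
    peptide: str       # candidate peptide sequence
    score: float       # assumed fit score, higher is better


def best_fit_per_spectrum(matches: Iterable[Match]) -> List[Match]:
    """Keep the single best-scoring match per spectrum so no spectrum is used twice."""
    best: Dict[str, Match] = {}
    for match in matches:
        current = best.get(match.spectrum_id)
        if current is None or match.score > current.score:
            best[match.spectrum_id] = match
    return list(best.values())


# Hypothetical redundant correlations: three candidate fits for one spectrum.
redundant = [
    Match("scan_0001", 2, "QPKVKAEPPPK", 2.9),
    Match("scan_0001", 3, "QPKVKAEPPPK", 1.7),
    Match("scan_0001", 2, "AEPPPKQPKVK", 1.1),
    Match("scan_0002", 2, "SAMPLEPEPTIDEK", 3.4),
]
filtered = best_fit_per_spectrum(redundant)
for match in filtered:
    print(match.spectrum_id, match.charge, match.peptide, match.score)
```

Counting proteins from `filtered` rather than from `redundant` means each MS/MS spectrum contributes to exactly one peptide and charge state.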

Frequency correction

A total of 455,426 MS/MS spectra ≥ E4 counts were collected from breast cancer plasma and 498,616 MS/MS spectra ≥ E4 counts were collected from ovarian cancer plasma, and these sums were used to correct observation frequency. A small subset of proteins showed large increases or decreases in observation frequency between breast and ovarian cancer, resulting in large Chi Square values (Fig. 1). Similar results were obtained from comparison to female normal (not shown).
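One way to picture the frequency correction is sketched below. It assumes that the correction scales the ovarian cancer count of a protein by the ratio of the two MS/MS totals quoted above, and that the Chi Square statistic is the usual (observed − expected)²/expected with a pseudo count of one to avoid division by zero. The exact formula used in the study is not given in this section, so both the scaling and the pseudo count are assumptions, and the per-protein counts in the example are hypothetical.

```python
BREAST_TOTAL = 455_426    # MS/MS spectra >= E4 counts from breast cancer (from the text)
OVARIAN_TOTAL = 498_616   # MS/MS spectra >= E4 counts from ovarian cancer (from the text)


def corrected_expected(ovarian_count: float) -> float:
    """Scale an ovarian cancer count to the breast cancer sampling depth (assumed correction)."""
    return ovarian_count * BREAST_TOTAL / OVARIAN_TOTAL


def chi_square(breast_count: float, ovarian_count: float, pseudo: float = 1.0) -> float:
    """(observed - expected)^2 / expected, with an assumed pseudo count on both terms."""
    expected = corrected_expected(ovarian_count) + pseudo
    observed = breast_count + pseudo
    return (observed - expected) ** 2 / expected


# Hypothetical counts for one gene symbol: 120 observations in breast, 10 in ovarian.
example = chi_square(breast_count=120, ovarian_count=10)
print(f"corrected expected count: {corrected_expected(10):.2f}")
print(f"Chi Square: {example:.1f}")
```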

Fig. 1

Quantile plots of the corrected difference and Chi Square values of the breast cancer versus ovarian cancer results after frequency correction. The difference between breast cancer (n ≥ 9) and ovarian cancer (n ≥ 9) is shown by quantile plot and tended to zero (see quantile line). Similar results were obtained by comparison to breast cancer or other controls (not shown). Plots: a quantile plot of the observation frequency of tryptic peptides from breast cancer–ovarian cancer; b χ² plot of the observation frequency of tryptic peptides from breast cancer–ovarian cancer; c quantile plot of the observation frequency of tryptic STYP peptides from breast cancer–ovarian cancer; d χ² plot of the observation frequency of tryptic STYP peptides from breast cancer–ovarian cancer

Comparison of breast cancer to ovarian cancer by Chi square analysis

A set of ~ 500 gene symbols showed Chi Square (χ²) values ≥ 15 between breast cancer and ovarian cancer. Specific peptides and/or phosphopeptides from cellular proteins, membrane proteins, nucleic acid binding proteins, signaling factors, metabolic enzymes and others, including uncharacterized proteins, showed significantly greater observation frequency in breast cancer. In agreement with the literature, peptides from many established plasma proteins, including acute phase or common distress proteins such as APOE, C4A, C4B, C4B2, C3, CFI, APOA1, APOC2, APOC4-APOC2, IGHE, ITIH3, and ITIH4 [60, 61], were observed to vary between cancer and control samples. The Chi Square analysis showed some proteins with χ² values that were apparently too large (χ² ≥ 60, p < 0.0001, d.f. 1) to have all resulted from random sampling error. Many cellular proteins also showed large changes in frequency by Chi Square (χ² > 100, p < 0.0001) in the breast cancer samples, such as CPEB1, LTBP4, HIF-1A, IGHE, RAB44, NEFM, C19orf82, SLC35B1, 1D12A, C8orf34, HIF1A, OCLN, EYA1, HLA-DRB1, LARS, PTPDC1, WWC1, ZNF562, PTMA, MGAT1, NDUFA1, NOGOC, OR1E1, OR1E2, CFI, HSA12, GCSH, ELTD1, TBX15, NR2C2, FLJ00045, PDLIM1, GALNT9, ASH2L, PPFIBP1, LRRC4B, SLCO3A1, BHMT2, CS, FAM188B2, LGALS7, SAT2, SFRS8, SLC22A12, WNT9B, SLC2A4, ZNF101, WT1, CCDC47, ERLIN1, SPFH1, EID2, THOC1, DDX47, MREG, PTPRE, EMILIN1, DKFZp779G1236 and MAP3K8 among others (Table 1). The full list of Chi Square results is found in Additional file 1: Table S1.
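The thresholds quoted above can be related to p-values with a short worked example. With one degree of freedom the Chi Square statistic is the square of a standard normal deviate, so the upper-tail probability has a closed form in terms of the complementary error function. The function name below is illustrative, and the p-values it returns are not adjusted for the number of gene symbols tested.

```python
import math


def chi_square_p_value_df1(chi2: float) -> float:
    """Upper-tail probability of a Chi Square statistic with 1 degree of freedom."""
    if chi2 < 0:
        raise ValueError("Chi Square statistic must be non-negative")
    # For d.f. = 1, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


# Thresholds used in the text: 15, 60 and 100.
for threshold in (15.0, 60.0, 100.0):
    print(f"chi-square = {threshold:5.1f}  p = {chi_square_p_value_df1(threshold):.3g}")
```

Under this calculation a value of 15 sits at roughly the 0.0001 level, while values of 60 and 100 fall many orders of magnitude below it.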

Table 1

Breast cancer specific proteins detected by fully tryptic peptides and/or fully tryptic phosphopeptides (STYP) that show a Chi Square (χ²) value of ≥ 200. n is the number of protein accessions per Gene Symbol

Gene Symbol (tryptic) | Mean χ² | n | Gene Symbol (tryptic STYP) | Mean χ² | n
CPEB1 | 3632.919337 | 8 | LTBP4 | 4340.217566 | 1
LTBP4 | 2560.471517 | 1 | C19orf82 | 3256.703566 | 1
HIF-1A | 1640.975019 | 1 | PMEPA1 | 1849.257201 | 1
C4A | 1626.866928 | 2 | C4A | 1703.128264 | 2
C4B | 1626.866928 | 2 | HIF-1A | 1668.954624 | 1
C4B_2 | 1612.006355 | 1 | C4B_2 | 1648.102936 | 1
C3 | 757.057969 | 2 | C4B | 1637.227896 | 2
IGHE | 656.105042 | 1 | CA7 | 1582.270693 | 1
RAB44 | 656.105042 | 1 | PCDHGA5 | 1462.852842 | 2
NEFM | 652.140957 | 5 | C8orf34 | 1189.441768 | 5
C19orf82 | 613.883173 | 1 | C3 | 835.343196 | 2
SLC35B1 | 479.46677 | 1 | KNOP1 | 822.636731 | 3
C8orf34 | 460.113072 | 5 | AMMECR1L | 794.024811 | 5
1D12A | 432.71876 | 1 | HMMR | 699.705336 | 1
HIF1A | 352.516679 | 3 | HTR3B | 670.791156 | 1
OCLN | 341.835514 | 3 | PCDHJ | 611.647195 | 1
APOE | 336.148697 | 3 | ZFAND1 | 522.966422 | 2
PTPDC1 | 316.183187 | 2 | PPID | 522.527735 | 1
EYA1 | 306.858733 | 1 | OXER1 | 509.701516 | 1
HLA-DRB1 | 306.858733 | 1 | DCHS2 | 507.103436 | 1
WWC1 | 294.679057 | 9 | RAB44 | 449.029189 | 1
ZNF562 | 273.551291 | 13 | NUP50 | 431.635555 | 4
CFI | 251.996191 | 7 | HLA-DRB1 | 417.238656 | 1
MGAT1 | 241.814491 | 1 | PCED1A | 375.630369 | 4
NDUFA1 | 241.814491 | 1 | HIF1A | 304.82744 | 3
NOGOC | 241.814491 | 1 | CHMP5 | 297.080368 | 2
OR1E1 | 241.814491 | 1 | HMP19 | 289.436434 | 5
OR1E2 | 241.814491 | 1 | LOC102723665 | 286.501857 | 1
PTMA | 234.938717 | 1 | CYC1 | 260.817537 | 2
HSA12 | 218.336655 | 1 | GCSH | 260.051794 | 1
ELTD1 | 206.644334 | 1 | CNBP | 259.243457 | 7
GCSH | 202.57471 | 1 | SMIM12 | 256.548507 | 1

Pathway and gene ontology analysis using the STRING algorithm

The protein gene symbols with large Chi Square values were significantly enriched in proteins that showed a complex set of previously established functional and structural relationships by STRING analysis. As a computationally independent check that the variation in proteins associated with breast cancer was not just the result of some random process, we analyzed the distribution of the known protein–protein interactions, and the distribution of the cellular location, molecular function and biological processes, of the proteins identified from endogenous peptides with respect to a random sampling of the human genome. Many protein interactions were apparent between the proteins computed to be specific to breast cancer from fully tryptic (Fig. 2) and/or phospho tryptic peptides (Fig. 3). The breast cancer samples showed statistically significant enrichment of protein interactions and Gene Ontology terms that were consistent with structural and functional relationships between the proteins identified in breast cancer compared to a random sampling of the human genome (Tables 2, 3, 4). STRING analysis of the breast cancer specific proteins detected by fully tryptic peptides and/or fully tryptic phosphopeptides with a Chi Square (χ²) value of ≥ 9 showed significant protein interaction [Network Stats: number of nodes, 1580; number of edges, 9987; average node degree, 12.6; avg. local clustering coefficient, 0.272; expected number of edges, 8736; PPI enrichment p-value < 1.0e−16].

Fig. 2

The breast cancer STRING network where Chi Square χ² ≥ 15 from fully tryptic peptides. Breast cancer tryptic peptide frequency difference greater than 15 and χ² value greater than 15 at degrees of freedom of 1 (p < 0.0001). Network Stats: number of nodes, 173; number of edges, 260; average node degree, 3.01; avg. local clustering coefficient, 0.378; expected number of edges, 206; PPI enrichment p-value, 0.000175

Fig. 3

The breast cancer STRING network where Chi Square χ² ≥ 15 from fully tryptic phospho peptides. Breast cancer TRYP STYP frequency difference greater than 15 and χ² value greater than 15 at degrees of freedom of 1 (p < 0.0001). Network Stats: number of nodes, 191; number of edges, 182; average node degree, 1.91; avg. local clustering coefficient, 0.335; expected number of edges, 152; PPI enrichment p-value, 0.00911
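The network statistics quoted for the three STRING networks can be checked with simple arithmetic. Each edge joins two nodes, so the average node degree is twice the number of edges divided by the number of nodes, and the ratio of observed to expected edges gives a rough sense of the excess of interactions. The sketch below uses only the node and edge counts quoted in the text and captions; the function names are illustrative, and the PPI enrichment p-value itself comes from STRING's own model and is not reproduced here.

```python
from typing import Dict, Tuple


def average_node_degree(nodes: int, edges: int) -> float:
    """Each edge joins two nodes, so the mean degree is 2 * edges / nodes."""
    return 2.0 * edges / nodes


def observed_to_expected(edges: int, expected_edges: int) -> float:
    """Ratio of observed to expected edges; above 1 means more interactions than expected."""
    return edges / expected_edges


# (nodes, observed edges, expected edges) as quoted in the text and figure captions
networks: Dict[str, Tuple[int, int, int]] = {
    "chi-square >= 9, tryptic and STYP": (1580, 9987, 8736),
    "Fig. 2, tryptic, chi-square >= 15": (173, 260, 206),
    "Fig. 3, tryptic STYP, chi-square >= 15": (191, 182, 152),
}

for label, (nodes, edges, expected) in networks.items():
    degree = average_node_degree(nodes, edges)
    ratio = observed_to_expected(edges, expected)
    print(f"{label}: average degree {degree:.2f}, observed/expected edges {ratio:.2f}")
```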

Table 2

STRING analysis of Biological Process of Gene Symbol distributions from the TRYP and TRYP STYP where delta and χ² were both greater than 9 after correction

Term ID | Term description | Observed gene count | Background gene count | False discovery rate
GO:0016043 | Cellular component organization | 551 | 5163 | 4.00E−09
GO:0071840 | Cellular component organization or biogenesis | 567 | 5342 | 4.00E−09
GO:0007017 | Microtubule-based process | 106 | 605 | 1.17E−08
GO:0051641 | Cellular localization | 267 | 2180 | 7.59E−08
GO:0006996 | Organelle organization | 356 | 3131 | 8.96E−08
GO:0007010 | Cytoskeleton organization | 139 | 953 | 3.36E−07
GO:0007018 | Microtubule-based movement | 57 | 276 | 3.34E−06
GO:0007399 | Nervous system development | 257 | 2206 | 7.50E−06
GO:0008104 | Protein localization | 230 | 1966 | 3.52E−05
GO:0120036 | Plasma membrane bounded cell projection organization | 138 | 1034 | 3.52E−05
GO:0048731 | System development | 427 | 4144 | 4.68E−05
GO:0033036 | Macromolecule localization | 257 | 2268 | 4.69E−05
GO:0070727 | Cellular macromolecule localization | 171 | 1374 | 4.83E−05
GO:0030030 | Cell projection organization | 140 | 1067 | 4.96E−05
GO:0034613 | Cellular protein localization | 169 | 1367 | 7.45E−05
GO:0009987 | Cellular process | 1271 | 14652 | 0.00013
GO:0051179 | Localization | 516 | 5233 | 0.00015
GO:0043170 | Macromolecule metabolic process | 702 | 7453 | 0.00018
GO:0007275 | Multicellular organism development | 470 | 4726 | 0.00025
GO:0032502 | Developmental process | 528 | 5401 | 0.00025
GO:0051649 | Establishment of localization in cell | 189 | 1616 | 0.0003
GO:0046907 | Intracellular transport | 167 | 1390 | 0.00031
GO:0090304 | Nucleic acid metabolic process | 399 | 3941 | 0.00043
GO:0051128 | Regulation of cellular component organization | 252 | 2306 | 0.00047
GO:0007156 | Homophilic cell adhesion via plasma membrane adhesion molecules | 34 | 158 | 0.00063
GO:0048856 | Anatomical structure development | 496 | 5085 | 0.00066
GO:0006139 | Nucleobase-containing compound metabolic process | 449 | 4551 | 0.00082
GO:0007155 | Cell adhesion | 110 | 843 | 0.00082
GO:0006928 | Movement of cell or subcellular component | 160 | 1355 | 0.001
GO:0051276 | Chromosome organization | 125 | 999 | 0.001
GO:0097435 | Supramolecular fiber organization | 60 | 383 | 0.0012
GO:0046483 | Heterocycle metabolic process | 459 | 4716 | 0.002
GO:0048666 | Neuron development | 99 | 758 | 0.002
GO:0000226 | Microtubule cytoskeleton organization | 60 | 393 | 0.0022
GO:0019219 | Regulation of nucleobase-containing compound metabolic process | 408 | 4133 | 0.0022
GO:0044260 | Cellular macromolecule metabolic process | 602 | 6413 | 0.0022
GO:0051130 | Positive regulation of cellular component organization | 135 | 1128 | 0.0025
GO:0006725 | Cellular aromatic compound metabolic process | 460 | 4754 | 0.0028
GO:0060255 | Regulation of macromolecule metabolic process | 572 | 6072 | 0.0028
GO:0098609 | Cell–cell adhesion | 62 | 416 | 0.0028
GO:0044085 | Cellular component biogenesis | 267 | 2556 | 0.0029
GO:0051252 | Regulation of RNA metabolic process | 385 | 3890 | 0.0029
GO:0010468 | Regulation of gene expression | 440 | 4533 | 0.0033
GO:0022607 | Cellular component assembly | 247 | 2343 | 0.0034
GO:0048699 | Generation of neurons | 162 | 1422 | 0.0034
GO:0071166 | Ribonucleoprotein complex localization | 27 | 125 | 0.0034
GO:0030182 | Neuron differentiation | 115 | 940 | 0.0038
GO:0032989 | Cellular component morphogenesis | 93 | 720 | 0.0038
GO:0098742 | Cell–cell adhesion via plasma-membrane adhesion molecules | 40 | 230 | 0.0038
GO:0031175 | Neuron projection development | 82 | 616 | 0.0043
GO:0006611 | Protein export from nucleus | 29 | 144 | 0.0048
GO:0016070 | RNA metabolic process | 342 | 3430 | 0.0048
GO:0031323 | Regulation of cellular metabolic process | 569 | 6082 | 0.0048
GO:0050794 | Regulation of cellular process | 929 | 10484 | 0.0048
GO:1901360 | Organic cyclic compound metabolic process | 474 | 4963 | 0.0049
GO:0051168 | Nuclear export | 31 | 161 | 0.005
GO:0080090 | Regulation of primary metabolic process | 560 | 5982 | 0.005
GO:0051640 | Organelle localization | 77 | 574 | 0.0051
GO:0006403 | RNA localization | 37 | 211 | 0.0053
GO:0019222 | Regulation of metabolic process | 604 | 6516 | 0.0053
GO:0035023 | Regulation of Rho protein signal transduction | 27 | 131 | 0.0053
GO:2000112 | Regulation of cellular macromolecule biosynthetic process | 395 | 4050 | 0.0053
GO:0000902 | Cell morphogenesis | 82 | 626 | 0.0054
GO:0051171 | Regulation of nitrogen compound metabolic process | 546 | 5827 | 0.0054
GO:0071426 | Ribonucleoprotein complex export from nucleus | 26 | 124 | 0.0054
GO:0033043 | Regulation of organelle organization | 134 | 1155 | 0.0058
GO:0048468 | Cell development | 166 | 1493 | 0.0058
GO:0050658 | RNA transport | 34 | 189 | 0.006
GO:0006355 | Regulation of transcription, DNA-templated | 360 | 3661 | 0.0061
GO:0006405 | RNA export from nucleus | 27 | 134 | 0.0061
GO:0010467 | Gene expression | 366 | 3733 | 0.0061
GO:0022008 | Neurogenesis | 168 | 1519 | 0.0061
GO:0051056 | Regulation of small GTPase mediated signal transduction | 48 | 310 | 0.0061
GO:0065007 | Biological regulation | 1026 | 11740 | 0.0061
GO:0003205 | Cardiac chamber development | 31 | 166 | 0.0062
GO:1903506 | Regulation of nucleic acid-templated transcription | 361 | 3683 | 0.0068
GO:0010556 | Regulation of macromolecule biosynthetic process | 400 | 4143 | 0.0079
GO:0006406 | mRNA export from nucleus | 23 | 107 | 0.0083
GO:0015833 | Peptide transport | 157 | 1416 | 0.0084
GO:0032501 | Multicellular organismal process | 599 | 6507 | 0.0092
GO:0051493 | Regulation of cytoskeleton organization | 65 | 477 | 0.0092

The protein–protein interaction statistics were: 485 nodes; 1148 edges; average node degree, 4.73; avg. local clustering coefficient, 0.325; expected number of edges: 851; PPI enrichment p-value: < 1.0e−16
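The False discovery rate column reports p-values adjusted for testing many Gene Ontology terms at once. As a minimal sketch of how such an adjustment works, the block below implements the Benjamini and Hochberg step-up procedure, the method cited for the q-values in this study [36]. Whether STRING applies exactly this procedure to the enrichment p-values is an assumption here, and the input p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical raw enrichment p-values for four terms.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
for p, q in zip(raw, adjusted):
    print(f"p = {p:.3f}  q = {q:.3f}")
```

A term is then reported as enriched when its adjusted value falls below the chosen false discovery rate cut-off.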

Table 3

STRING analysis of Molecular Function of Gene Symbol distributions from the TRYP and TRYP STYP where delta and χ² were both greater than 9 after correction

Term ID | Term description | Observed gene count | Background gene count | False discovery rate
GO:0005488 | Binding | 1152 | 11878 | 9.77E−20
GO:0005515 | Protein binding | 694 | 6605 | 3.83E−13
GO:0005524 | ATP binding | 209 | 1462 | 1.30E−11
GO:0043167 | Ion binding | 637 | 6066 | 1.30E−11
GO:0032559 | Adenyl ribonucleotide binding | 213 | 1514 | 1.80E−11
GO:0008144 | Drug binding | 227 | 1710 | 3.62E−10
GO:0035639 | Purine ribonucleoside triphosphate binding | 232 | 1794 | 1.81E−09
GO:0032553 | Ribonucleotide binding | 238 | 1868 | 3.13E−09
GO:0032555 | Purine ribonucleotide binding | 236 | 1853 | 3.33E−09
GO:0097159 | Organic cyclic compound binding | 560 | 5382 | 3.33E−09
GO:1901363 | Heterocyclic compound binding | 552 | 5305 | 4.36E−09
GO:0097367 | Carbohydrate derivative binding | 265 | 2163 | 4.89E−09
GO:0000166 | Nucleotide binding | 258 | 2097 | 5.95E−09
GO:0008092 | Cytoskeletal protein binding | 130 | 882 | 5.00E−08
GO:0003779 | Actin binding | 76 | 413 | 6.27E−08
GO:0043168 | Anion binding | 309 | 2696 | 6.27E−08
GO:0016887 | ATPase activity | 73 | 392 | 7.90E−08
GO:0036094 | Small molecule binding | 282 | 2460 | 3.63E−07
GO:0042623 | ATPase activity, coupled | 60 | 320 | 1.76E−06
GO:0017111 | Nucleoside-triphosphatase activity | 111 | 778 | 3.30E−06
GO:0004386 | Helicase activity | 36 | 147 | 4.31E−06
GO:0016462 | Pyrophosphatase activity | 114 | 819 | 6.34E−06
GO:0046872 | Metal ion binding | 420 | 4087 | 6.91E−06
GO:0043169 | Cation binding | 425 | 4170 | 1.22E−05
GO:0003777 | Microtubule motor activity | 29 | 110 | 1.74E−05
GO:0008017 | Microtubule binding | 48 | 253 | 2.25E−05
GO:0051015 | Actin filament binding | 35 | 158 | 3.76E−05
GO:0019899 | Enzyme binding | 241 | 2197 | 9.02E−05
GO:0003774 | Motor activity | 30 | 131 | 0.00012
GO:0015631 | Tubulin binding | 55 | 344 | 0.00032
GO:0051020 | GTPase binding | 83 | 614 | 0.00064
GO:0017048 | Rho GTPase binding | 32 | 162 | 0.00073
GO:0003682 | Chromatin binding | 69 | 501 | 0.0018
GO:0005089 | Rho guanyl-nucleotide exchange factor activity | 19 | 76 | 0.0025
GO:0003676 | Nucleic acid binding | 330 | 3332 | 0.0028
GO:0005198 | Structural molecule activity | 86 | 679 | 0.0032
GO:0031267 | Small GTPase binding | 70 | 525 | 0.0036
GO:0004672 | Protein kinase activity | 81 | 635 | 0.0039
GO:0140096 | Catalytic activity, acting on a protein | 225 | 2176 | 0.005
GO:0019904 | Protein domain specific binding | 87 | 706 | 0.0061
GO:0005085 | Guanyl-nucleotide exchange factor activity | 46 | 311 | 0.0066
GO:0005509 | Calcium ion binding | 86 | 700 | 0.007
GO:0017016 | Ras GTPase binding | 66 | 510 | 0.0103
GO:0005516 | Calmodulin binding | 32 | 194 | 0.0106
GO:0004674 | Protein serine/threonine kinase activity | 59 | 444 | 0.011
GO:0051010 | Microtubule plus-end binding | 7 | 13 | 0.0119
GO:0005088 | Ras guanyl-nucleotide exchange factor activity | 37 | 243 | 0.0143
GO:0005096 | GTPase activator activity | 40 | 278 | 0.023
GO:0004004 | ATP-dependent RNA helicase activity | 15 | 66 | 0.0237
GO:0016773 | Phosphotransferase activity, alcohol group as acceptor | 89 | 767 | 0.0237
GO:0030695 | GTPase regulator activity | 43 | 307 | 0.0237
GO:0060589 | Nucleoside-triphosphatase regulator activity | 47 | 345 | 0.0237
GO:0044877 | Protein-containing complex binding | 108 | 968 | 0.0241
GO:0016772 | Transferase activity, transferring phosphorus-containing groups | 109 | 982 | 0.0255
GO:0032947 | Protein-containing complex scaffold activity | 15 | 68 | 0.0267
GO:0008047 | Enzyme activator activity | 63 | 510 | 0.0303
GO:0097493 | Structural molecule activity conferring elasticity | 7 | 17 | 0.0325
GO:0016301 | Kinase activity | 94 | 835 | 0.0352
GO:0051959 | Dynein light intermediate chain binding | 9 | 29 | 0.0352
GO:0042800 | Histone methyltransferase activity (H3-K4 specific) | 5 | 8 | 0.0388
GO:0008094 | DNA-dependent ATPase activity | 14 | 66 | 0.0482
GO:0140030 | Modification-dependent protein binding | 22 | 131 | 0.0482
GO:0008026 | ATP-dependent helicase activity | 17 | 90 | 0.0499

For additional details see Table 2

Table 4

STRING analysis of Cellular Component of Gene Symbol distributions from the TRYP and TRYP STYP where delta and χ² were both greater than 9 after correction

Term ID | Term description | Observed gene count | Background gene count | False discovery rate
GO:0005622 | Intracellular | 1302 | 14286 | 1.22E−14
GO:0044424 | Intracellular part | 1282 | 13996 | 1.22E−14
GO:0005856 | Cytoskeleton | 281 | 2068 | 4.49E−14
GO:0043232 | Intracellular non-membrane-bounded organelle | 467 | 4005 | 4.49E−14
GO:0044464 | Cell part | 1417 | 16244 | 4.88E−11
GO:0043226 | Organelle | 1143 | 12432 | 7.73E−11
GO:0043229 | Intracellular organelle | 1124 | 12193 | 9.50E−11
GO:0044430 | Cytoskeletal part | 207 | 1547 | 1.04E−09
GO:0032991 | Protein-containing complex | 501 | 4792 | 2.68E−08
GO:0042995 | Cell projection | 242 | 1969 | 2.68E−08
GO:0044422 | Organelle part | 862 | 9111 | 4.13E−08
GO:0120025 | Plasma membrane bounded cell projection | 234 | 1900 | 4.13E−08
GO:0005737 | Cytoplasm | 1030 | 11238 | 5.39E−08
GO:0005634 | Nucleus | 676 | 6892 | 9.10E−08
GO:0044428 | Nuclear part | 455 | 4359 | 2.36E−07
GO:0031981 | Nuclear lumen | 425 | 4030 | 2.95E−07
GO:0015630 | Microtubule cytoskeleton | 150 | 1118 | 4.50E−07
GO:0044446 | Intracellular organelle part | 834 | 8882 | 4.50E−07
GO:0044451 | Nucleoplasm part | 145 | 1073 | 4.89E−07
GO:0043005 | Neuron projection | 149 | 1142 | 2.14E−06
GO:0099081 | Supramolecular polymer | 122 | 880 | 2.14E−06
GO:0070013 | Intracellular organelle lumen | 516 | 5162 | 2.48E−06
GO:0120038 | Plasma membrane bounded cell projection part | 165 | 1316 | 3.34E−06
GO:0099568 | Cytoplasmic region | 68 | 402 | 3.64E−06
GO:0099512 | Supramolecular fiber | 118 | 873 | 8.36E−06
GO:0030054 | Cell junction | 131 | 1006 | 1.11E−05
GO:0043227 | Membrane-bounded organelle | 1007 | 11244 | 1.79E−05
GO:0005930 | Axoneme | 28 | 107 | 1.90E−05
GO:0005654 | Nucleoplasm | 357 | 3446 | 2.24E−05
GO:0043231 | Intracellular membrane-bounded organelle | 936 | 10365 | 2.24E−05
GO:0044420 | Extracellular matrix component | 20 | 59 | 2.55E−05
GO:0097458 | Neuron part | 171 | 1449 | 4.59E−05
GO:0005829 | Cytosol | 485 | 4958 | 5.91E−05
GO:0032838 | Plasma membrane bounded cell projection cytoplasm | 36 | 179 | 9.90E−05
GO:0098644 | Complex of collagen trimers | 11 | 19 | 0.00014
GO:0015629 | Actin cytoskeleton | 65 | 432 | 0.00016
GO:0030424 | Axon | 75 | 530 | 0.00023
GO:0030016 | Myofibril | 39 | 216 | 0.00034
GO:0005911 | Cell–cell junction | 60 | 402 | 0.00042
GO:0043292 | Contractile fiber | 40 | 228 | 0.00045
GO:0062023 | Collagen-containing extracellular matrix | 29 | 144 | 0.00069
GO:0016604 | Nuclear body | 94 | 742 | 0.00088
GO:0044449 | Contractile fiber part | 37 | 212 | 0.00093
GO:0031012 | Extracellular matrix | 45 | 283 | 0.0011
GO:0016459 | Myosin complex | 18 | 69 | 0.0012
GO:0031965 | Nuclear membrane | 46 | 300 | 0.0019
GO:0005874 | Microtubule | 55 | 385 | 0.0022
GO:0005581 | Collagen trimer | 20 | 88 | 0.0024
GO:0098862 | Cluster of actin-based cell projections | 27 | 143 | 0.0028
GO:0005815 | Microtubule organizing center | 85 | 683 | 0.0029
GO:0044444 | Cytoplasmic part | 832 | 9377 | 0.0029
GO:0044441 | Ciliary part | 58 | 421 | 0.0032
GO:0005583 | Fibrillar collagen trimer | 7 | 11 | 0.0033
GO:0016460 | Myosin II complex | 11 | 32 | 0.0039
GO:0033267 | Axon part | 49 | 341 | 0.0039
GO:0014704 | Intercalated disc | 14 | 51 | 0.0041
GO:0005859 | Muscle myosin complex | 10 | 27 | 0.0044
GO:0008023 | Transcription elongation factor complex | 14 | 52 | 0.0047
GO:0032982 | Myosin filament | 9 | 22 | 0.0047
GO:0034399 | Nuclear periphery | 25 | 134 | 0.0047
GO:0044291 | Cell–cell contact zone | 16 | 67 | 0.0055
GO:0005915 | Zonula adherens | 6 | 9 | 0.0069
GO:0005694 | Chromosome | 108 | 950 | 0.0076
GO:0005929 | Cilium | 71 | 570 | 0.0076
GO:0030496 | Midbody | 28 | 165 | 0.0076
GO:0043034 | Costamere | 8 | 19 | 0.008
GO:0044447 | Axoneme part | 10 | 31 | 0.0093
GO:0005913 | Cell–cell adherens junction | 16 | 72 | 0.0098
GO:0032420 | Stereocilium | 12 | 44 | 0.0098
GO:0005875 | Microtubule associated complex | 25 | 144 | 0.01
GO:0016607 | Nuclear speck | 51 | 381 | 0.01
GO:0031252 | Cell leading edge | 50 | 371 | 0.01
GO:0032421 | Stereocilium bundle | 13 | 51 | 0.01
GO:0033268 | Node of Ranvier | 7 | 15 | 0.01
GO:0097060 | Synaptic membrane | 43 | 308 | 0.0114
GO:0034708 | Methyltransferase complex | 18 | 90 | 0.0124
GO:0042383 | Sarcolemma | 22 | 122 | 0.0124
GO:0030056 | Hemidesmosome | 5 | 7 | 0.0137
GO:0098590 | Plasma membrane region | 116 | 1061 | 0.0141
GO:0044450 | Microtubule organizing center part | 27 | 167 | 0.0147
GO:0090543 | Flemming body | 9 | 28 | 0.0147
GO:0005814 | Centriole | 22 | 125 | 0.0152
GO:0030017 | Sarcomere | 30 | 195 | 0.0159
GO:0042405 | Nuclear inclusion body | 6 | 12 | 0.016
GO:0070161 | Anchoring junction | 38 | 270 | 0.0172
GO:0005635 | Nuclear envelope | 56 | 446 | 0.0183
GO:0036396 | RNA N6-methyladenosine methyltransferase complex | 5 | 8 | 0.019
GO:0005813 | Centrosome | 58 | 468 | 0.0194
GO:0005730 | Nucleolus | 102 | 926 | 0.0196
GO:0030427 | Site of polarized growth | 26 | 164 | 0.0203
GO:0045211 | Postsynaptic membrane | 34 | 237 | 0.0207
GO:0030018 | Z disc | 21 | 122 | 0.0217
GO:0098858 | Actin-based cell projection | 29 | 192 | 0.0217
GO:0016363 | Nuclear matrix | 19 | 106 | 0.0228
GO:0005938 | Cell cortex | 33 | 230 | 0.0229
GO:0030027 | Lamellipodium | 28 | 185 | 0.024
GO:0044304 | Main axon | 14 | 67 | 0.0242
GO:0070449 | Elongin complex | 5 | 9 | 0.0246
GO:0005604 | Basement membrane | 17 | 91 | 0.0248
GO:0043194 | Axon initial segment | 6 | 14 | 0.0248
GO:0005912 | Adherens junction | 35 | 252 | 0.0263
GO:0099513 | Polymeric cytoskeletal fiber | 73 | 645 | 0.0402
GO:0005587 | Collagen type IV trimer | 4 | 6 | 0.0406
GO:1990752 | Microtubule end | 7 | 22 | 0.0413
GO:0030426 | Growth cone | 24 | 159 | 0.0442
GO:0044427 | Chromosomal part | 89 | 819 | 0.0442
GO:0005858 | Axonemal dynein complex | 6 | 17 | 0.0499
GO:0035371 | Microtubule plus-end | 6 | 17 | 0.0499

For additional details see Table 2

ANOVA analysis across disease, normal and control plasma treatments

Many proteins that showed greater observation frequency in breast cancer also showed significant variation in precursor intensity compared to ovarian cancer, the female normal controls, and male or female EDTA plasma from other disease and normal plasma by ANOVA comparison. The mean precursor intensity values from gene symbols that varied by Chi Square (χ² > 15) were subsequently analyzed by univariate ANOVA in R to look for proteins that showed differences in ion precursor intensity values across treatments [12, 16] (Figs. 4, 5, 6). Common plasma proteins including APOE, ITIH4 and C3 showed significantly different intensity between breast cancer versus ovarian cancer and normal plasma (Fig. 4). Analysis of the frequently observed proteins by quantile box plots and ANOVA confirmed increases in mean precursor intensity in cancer associated proteins such as SLC35B1, IQCJ-SCHIP1, MREG, BHMT2, LGALS7, THOC1, ANXA4, DHDDS, SAT2, PTMA, FYCO1 and ZNF562 among others between breast cancer versus ovarian cancer and/or other disease or normal plasma (Fig. 5). HSA12 represents many proteins that were observed only in breast cancer but were apparently only sporadically detected and require further consideration. Glutamine Serine Rich Protein 1 (QSER1) was observed most frequently in ovarian cancer (Table 5). In contrast, QSER1 showed higher average intensity in breast cancer than in ovarian cancer or any other disease and normal treatment by ANOVA followed by the Tukey–Kramer HSD test (Fig. 6) when all peptides were considered. However, the peptide QPKVKAEPPPK, which was specific to QSER1 by BLAST [62], was observed in ovarian cancer but was not observed in other samples (Fig. 6d).

Fig. 4

The distributions of log10 precursor intensity by quantile and quantile box plots of APOE, ITIH4, and C3 across the disease and control treatments. a APOE log10 peptide intensity quantile plot; b APOE log10 peptide intensity quantile box plot; c ITIH4 log10 peptide intensity quantile plot; d ITIH4 log10 peptide intensity quantile box plot; e C3 log10 peptide intensity quantile plot; f C3 log10 peptide intensity quantile box plot. Treatment ID numbers: 1, Alzheimer’s normal control; 2, Alzheimer’s normal control STYP; 3, Alzheimer’s dementia; 4, Alzheimer’s dementia STYP; 5, Cancer breast; 6, Cancer breast STYP; 7, Cancer control; 8, Cancer control STYP; 9, Cancer ovarian; 10, Cancer ovarian STYP; 11, Ice Cold; 12, Ice Cold STYP; 13, Heart attack arterial; 14, Heart attack arterial STYP; 15, Heart attack normal control; 16, Heart attack normal control STYP; 17, Heart attack; 18, Heart attack STYP; 19, Multiple sclerosis normal control; 20, Multiple sclerosis normal control STYP; 21, Multiple sclerosis; 22, Multiple sclerosis STYP; 23, Sepsis; 24, Sepsis STYP; 25, Sepsis normal control; 26, Sepsis normal control STYP. There were significant effects of treatments and peptides by two-way ANOVA. Analysis of the proteins shown across treatments produced a significant F statistic by one-way ANOVA. Note that many proteins were not detected in the ice cold plasma

Fig. 5

Quantile box plots showing the distribution of log10 precursor intensity of HSA12, BHMT2, DHDDS, SLC35B1, LGALS7, SAT2, IQCJ-SCHIP1 fusion, THOC1, PTMA, MREG, ANXA4 and FYCO1 across the disease and control treatments. Box plots show log10 intensity versus treatment number for the gene symbol indicated. Treatment ID numbers: 1, Alzheimer’s normal control; 2, Alzheimer’s normal control STYP; 3, Alzheimer’s dementia; 4, Alzheimer’s dementia STYP; 5, Cancer breast; 6, Cancer breast STYP; 7, Cancer control; 8, Cancer control STYP; 9, Cancer ovarian; 10, Cancer ovarian STYP; 11, Ice Cold; 12, Ice Cold STYP; 13, Heart attack arterial; 14, Heart attack arterial STYP; 15, Heart attack normal control; 16, Heart attack normal control STYP; 17, Heart attack; 18, Heart attack STYP; 19, Multiple sclerosis normal control; 20, Multiple sclerosis normal control STYP; 21, Multiple sclerosis; 22, Multiple sclerosis STYP; 23, Sepsis; 24, Sepsis STYP; 25, Sepsis normal control; 26, Sepsis normal control STYP. There were significant effects of treatments and peptides by two-way ANOVA. Analysis of the proteins shown across treatments produced a significant F statistic by one-way ANOVA. Note that many proteins were not detected in the ice cold plasma

Fig. 6

QSER1 ANOVA analysis and Tukey–Kramer HSD multiple means comparison of breast versus ovarian cancer and other diseases and normal treatments. a All QSER1 peptides quantile plot; b QSER1 peptide QPKVKAEPPPK quantile plot; c All QSER1 peptides box plot (see ANOVA below); d QSER1 peptide QPKVKAEPPPK box plot. Treatment ID numbers: 1, Alzheimer’s normal control; 2, Alzheimer’s normal control STYP; 3, Alzheimer’s dementia; 4, Alzheimer’s dementia STYP; 5, Cancer breast; 6, Cancer breast STYP; 7, Cancer control; 8, Cancer control STYP; 9, Cancer ovarian; 10, Cancer ovarian STYP; 11, Ice Cold; 12, Ice Cold STYP; 13, Heart attack arterial; 14, Heart attack arterial STYP; 15, Heart attack normal control; 16, Heart attack normal control STYP; 17, Heart attack; 18, Heart attack STYP; 19, Multiple sclerosis normal control; 20, Multiple sclerosis normal control STYP; 21, Multiple sclerosis; 22, Multiple sclerosis STYP; 23, Sepsis; 24, Sepsis STYP; 25, Sepsis normal control; 26, Sepsis normal control STYP. There were significant effects of treatments and peptides by two-way ANOVA (not shown). One-way ANOVA:

Source | Df | Sum Sq | Mean Sq | F value | Pr(>F)
Treatment_ID | 23 | 113.0 | 4.912 | 16.55 | < 2e−16 ***
Residuals | 808 | 239.9 | 0.297 | |

Table 5

The analysis of mean peptide intensity per gene symbol for QSER1 protein by ANOVA with Tukey–Kramer multiple means comparison

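Treatment | Mean | SD | Data N | Tukey–Kramer
1 | 5.072769 | 0.302986 | 21 | d
2 | 4.593409 | 0.511989 | 67 | cde
3 | 4.633497 | 0.32852 | 6 | bde
4 | 4.056312 | 0.161037 | 33 | a
5 | 5.918212 | 0.760851 | 25 | h
6 | 5.717592 | 0.763346 | 18 | h
7 | 4.837276 | 0.216573 | 8 | bdef
9 | 4.542693 | 0.656451 | 41 | ceg
10 | 4.600209 | 0.640097 | 66 | cde
11 | 4.512103 | 0.515631 | 8 | acde
12 | 4.029774 | 0 | 4 | acde
13 | 4.452935 | 0.491664 | 50 | aceg
14 | 4.12479 | 0.351469 | 35 | af
15 | 4.419355 | 0.198763 | 53 | ace
16 | 4.324212 | 0.504538 | 32 | ace
17 | 4.928881 | 0.947319 | 22 | dg
18 | 4.173403 | 0.478339 | 36 | ab
19 | 4.740343 | 0.428142 | 58 | cde
20 | 4.80151 | 0.475907 | 35 | de
21 | 4.749583 | 0.513686 | 36 | cde
22 | 4.755553 | 0.517117 | 25 | cde
23 | 4.58392 | 0.466147 | 11 | acde
24 | 3.736293 | 0 | 4 | abc
25 | 4.881761 | 0.953098 | 18 | de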

Treatment ID numbers: 1, Alzheimer normal; 2, Alzheimer’s normal control STYP; 3, Alzheimer’s dementia; 4, Alzheimer’s dementia STYP; 5, Cancer breast; 6, Cancer breast STYP; 7, Cancer control; 8, Cancer control STYP; 9, Cancer ovarian; 10, Cancer ovarian STYP; 11, Ice Cold; 12, Ice Cold STYP; 13, Heart attack arterial; 14, Heart attack arterial STYP; 15, Heart attack normal control; 16, Heart attack normal control STYP; 17, Heart attack; 18, Heart attack STYP; 19, Multiple sclerosis normal control; 20, Multiple sclerosis normal control STYP; 21, Multiple sclerosis; 22, Multiple sclerosis STYP; 23, Sepsis; 24, Sepsis STYP; 25, Sepsis normal control; 26, Sepsis normal control STYP. The Tukey–Kramer multiple comparison ranking of mean intensity from R is shown by letters.
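
The letter codes in Table 5 can be read programmatically. The sketch below assumes the letters follow the usual compact letter display convention, in which two treatments that share at least one letter are not significantly different and two treatments with no letter in common are. That convention is an assumption here, since the table note only states that the letters give the Tukey–Kramer ranking. The letter strings are copied from Table 5; the function name `differ` is illustrative.

```python
# Tukey-Kramer letter groups for selected treatments, copied from Table 5.
letters = {
    5: "h",      # Cancer breast
    6: "h",      # Cancer breast STYP
    7: "bdef",   # Cancer control
    9: "ceg",    # Cancer ovarian
    10: "cde",   # Cancer ovarian STYP
}


def differ(t1, t2, groups=letters):
    """Return True when two treatments share no letter (assumed to mean
    that their means differ under the compact letter display convention)."""
    return not (set(groups[t1]) & set(groups[t2]))


for pair in [(5, 9), (5, 7), (5, 6), (9, 10)]:
    print(pair, "differ" if differ(*pair) else "not distinguished")
```

Read this way, the breast cancer treatments (5 and 6) form their own group, separate from the ovarian cancer and cancer control treatments.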

Discussion

A simple and direct strategy to discover breast cancer-specific variation may be to compare plasma peptides and proteins to those of ovarian cancer and other disease and control sample sets under identical conditions. The aim of this study was a proof of concept for a method to compare the endogenous tryptic peptides of breast cancer plasma to those from multiple clinical treatments and locations, using random and independent sampling by a battery of robust and sensitive linear quadrupole ion traps, with the results compiled using the standard SQL Server and R statistical systems. Random and independent sampling of peptides from step-wise fractionation followed by LC–ESI–MS/MS is a time- and labor-intensive approach that is sensitive, direct, and rests on few assumptions [17, 38]. A high signal-to-noise ratio for blood peptides depends on sample preparation that breaks the sample into many sub-fractions to relieve competition and suppression of ionization and thus achieve sensitivity [13, 21, 22], but it then requires large computing power to re-assemble the sub-fractions, samples and treatments [14, 21, 38]. The careful study of pre-clinical variation over time, and under various storage and preservation conditions, seems to rule out pre-clinical variation as the most important source of variation between breast cancer and other disease and control treatments [17–19]. Together the results amount to a successful proof of principle for the application of random and independent sampling of plasma from multiple clinical locations by LC–ESI–MS/MS to identify and quantify proteins and peptides that show variation between sample populations. The approach shows great sensitivity and flexibility but relies on the fit of MS/MS spectra to assign peptide identity, and on statistical analysis of precursor ion counts and intensity by Chi Square and ANOVA, and so is computationally intensive.

Chi Square analysis of breast cancer versus ovarian cancer

The SQL Server and R statistical systems permit the rapid statistical and graphical analysis of the data at the level of gene symbols, proteins or peptides. Testing for large differences in observation frequency between breast and ovarian cancer by Chi Square, after correction for the number of mass spectra collected, was a simple means to reveal proteins that may vary in expression between the related disease states. Examining the observation frequency across all twelve disease and control clinical sample sets was a direct means to look for gene symbols that showed greater frequency in one sample set, such as QSER1, or to look for peptides such as QPKVKAEPPPK of QSER1 that was highly specific to ovarian cancer [39].
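
The frequency comparison described above can be illustrated with a short sketch. The exact correction formula is not stated in this section, so the version below rests on labeled assumptions: the reference (ovarian) count is scaled by the ratio of total spectra collected in the two sample sets, and a pseudo-count of 1 is added to the denominator to avoid division by zero. The function name, the gene labels and all counts are hypothetical and are not data from the study.

```python
def corrected_chi_square(obs_count, ref_count, obs_total, ref_total, pseudo=1.0):
    """Chi Square for one gene symbol after correcting for sampling depth.

    Assumption: the reference count is rescaled by the ratio of total
    MS/MS spectra so both sample sets are compared at equal depth.
    """
    scale = obs_total / ref_total          # depth correction factor
    expected = ref_count * scale           # reference count at equal depth
    delta = obs_count - expected           # corrected difference in frequency
    chi2 = delta ** 2 / (expected + pseudo)
    return delta, chi2


# Hypothetical observation counts: gene -> (breast, ovarian).
counts = {"GENE_A": (120, 10), "GENE_B": (15, 14), "GENE_C": (0, 45)}
breast_total, ovarian_total = 500_000, 400_000  # hypothetical spectra totals

for gene, (breast, ovarian) in counts.items():
    delta, chi2 = corrected_chi_square(breast, ovarian, breast_total, ovarian_total)
    direction = "breast" if delta > 0 else "ovarian"
    print(f"{gene}: delta = {delta:+.1f}, chi2 = {chi2:.1f}, higher in {direction}")
```

Keeping the sign of the corrected difference alongside the Chi Square value shows which of the two sample sets carries the excess observations, since the Chi Square value alone is always positive.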

Pathway and gene ontology analysis by the STRING algorithm

The set of breast cancer gene symbols that was significant by Chi Square analysis of the peptide frequency counts was independently confirmed by STRING analysis. The network analysis by STRING indicated that the peptides and proteins detected were not merely a random selection of the proteins from the human genome but showed statistically significant protein–protein interactions, and enrichment of specific cellular components, biological processes, and molecular functions associated with the biology of cancer. The significant results from STRING analysis indicated that the differences between breast and ovarian cancer could not have resulted from random sampling error. The previously established structural or functional relationships observed among the breast cancer-specific gene symbols filtered by χ2 were consistent with the detection of bona fide variation between breast and ovarian cancer. The STRING results apparently indicated that specific cellular protein complexes are released into the circulation of breast cancer patients [50]. The enrichment of proteins associated with cell polarity, cytoskeleton, plasma membrane bounded cell projection, microtubule cytoskeleton, supramolecular fiber and membrane-bounded organelle was consistent with the activation of phagocytic functions in motile cancer cells.

Breast versus ovarian cancer specific variation by ANOVA

ANOVA may be an independent means to confirm the results of frequency analysis. However, the interpretation of mean precursor intensity data by ANOVA [12] and the use of the Tukey–Kramer multiple comparison [15, 16] may be confounded by the different peptide sequences within each protein [32]. Specific endogenous tryptic peptides were detected from breast cancer versus the corresponding ovarian cancer or the other disease and normal plasma after filtering proteins by Chi Square and ANOVA. When all peptides were considered, QSER1 showed significantly higher mean intensity in breast cancer, but the QSER1 peptide QPKVKAEPPPK was observed more frequently in ovarian cancer. The exclusive observation of the peptide QPKVKAEPPPK in ovarian cancer samples seemed to indicate the presence or activation of a tryptic protease with a different selectivity for QSER1. An automated examination at the level of both peptides and proteins may be required, which is an even larger computational challenge. It should be possible to specifically compare and confirm the disease-specific expression of peptides and parent proteins by automatic targeted proteomics [18] after extraction of peptides [25], or after collection of the parent protein over the best partition chromatography resin [22] followed by tryptic digestion and analysis, to test the discoveries from this small experiment on a larger set of samples. For example, C4B peptides discovered by random and independent sampling were shown to be a marker of sample degradation by automatic targeted assays [17–19]. Automatic targeted analysis of peptides from independent analysis provided relative quantification to rapidly confirm the potential utility of the C4B peptide as a marker of sample degradation [18]. Subsequently, the best performing peptides and proteins may be absolutely quantified using external or internal isotopic standards.
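
The comparison of log10 precursor intensity across treatments can be sketched as a one-way ANOVA computed from first principles. The study used the R statistical system for this step; the Python version below is only an illustration of the underlying sums of squares, the function name `one_way_anova` is illustrative, and the intensity values are hypothetical rather than data from the study.

```python
import math
from statistics import mean


def one_way_anova(groups):
    """One-way ANOVA on a dict mapping treatment label -> list of values.

    Returns (df_between, df_within, ss_between, ss_within, f_value).
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Between-treatment sum of squares, weighted by group size.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-treatment (residual) sum of squares.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, ss_between, ss_within, f_value


# Hypothetical raw precursor intensities for one gene symbol.
raw = {
    "breast": [8.0e5, 1.2e6, 6.5e5, 9.0e5],
    "ovarian": [4.0e4, 5.5e4, 3.0e4, 6.0e4],
    "control": [5.0e4, 4.5e4, 7.0e4, 3.5e4],
}
# Intensities are compared on the log10 scale, as in the text.
log_groups = {k: [math.log10(x) for x in v] for k, v in raw.items()}

df_b, df_w, ss_b, ss_w, f_value = one_way_anova(log_groups)
print(f"Df = {df_b}, {df_w}; Sum Sq = {ss_b:.3f}, {ss_w:.3f}; F = {f_value:.2f}")
```

A p-value would additionally require the F distribution, and the pairwise ranking of treatments requires a post hoc procedure such as Tukey–Kramer; neither is reproduced in this sketch.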

Agreement with previous genetic and biochemical experiments

The striking agreement between the peptides and proteins observed in the plasma of breast cancer patients and the previous literature on breast cancer tumors, adjacent fluids, cell lines or blood fluids indicates that LC–ESI–MS/MS of blood peptides will be a powerful tool for selecting plasma proteins and peptides for further research and confirmation. The results of mass spectrometry show striking agreement with previous genetic or biochemical experiments on cancer tissues, tumors, biopsies or cell lines: CPEB1 [63], LTBP4 [64], HIF1A [65, 66], IGHE [67], RAB44 [68], NEFM [39], C19orf82, SLC35B1 [69], 1D12A that shows a cryptic alignment with cyclin-dependent kinase-like isoform 1 [70], C8orf34 [71], OCLN [72], EYA1 [73], HLA-DRB1 [74], LAR [75] and LRRC4B that interacts with the LARS receptor phosphatases [76], PTPDC1 [77], WWC1 [78], ZNF562, PTMA [79], MGAT1 [80], NDUFA1 [81], NOGOC [82], olfactory receptors OR1E or the HSA12 protein [83], GCSH [84], ELTD1 [85], TBX15 [86], orphan nuclear receptors such as NR2C2 [87], autophagy-related proteins such as ATG16L1 (FLJ00045) that regulate the production of extracellular vesicles called exosomes [88], PDLIM1 [89, 90], GALNT9 [91], ASH2L [92], PPFIBP1 [93], SLCO3A1 [94], BHMT2 [95], CS citrate synthase [96], FAM188B2 inactive ubiquitin carboxyl-terminal hydrolase MINDY4B that is expressed in breast cancer tissue, LGALS7 [97], SAT2 [98], SFRS8, SLC22A12 [99], WNT9B [100], SLC2A4 [101], ZNF101, WT1 (Wilms Tumor Protein) [102], CCDC47 [103], ERLIN1 (SPFH1) and MREG [104], EID2 [105], THOC1 [106, 107], DDX47 [108], PTPRE [109], EMILIN1 [110], DKFZp779G1236 (piccolo, or piBRCA2) [111], MAP3K8 [112] regulated by Serine/Arginine-Rich Splicing Factor Kinase [113], QSER1 [39], IQCJ-SCHIP1 [114, 115], ANXA4 [116] and DHDDS [117] among others. The disease-specific proteins and peptides may result from the introduction of new proteins into circulation, or the release/activation of proteases in circulation, as a result of disease.
The striking agreement of the plasma proteins observed here with the previous genomic, RNA expression and proteomic experiments on cancer tumors, fluids and cells indicates that comparing many disease and control plasma samples by random and independent sampling with LC–ESI–MS/MS may be a direct and practical means to look for selective diagnostic and prognostic markers.

Conclusion

The step-wise organic extraction of peptides [21] provided for the enrichment of endogenous tryptic peptides with high signal to noise for random sampling [18] across disease and normal treatments. A large amount of proteomic data from multiple diseases, controls and institutions may be collected by random and independent sampling with a battery of robust and sensitive linear quadrupole ion traps and the results stored, related and statistically analyzed in 64 bit SQL Server/R. The LC–ESI–MS/MS of plasma endogenous tryptic peptides identified many blood proteins elevated in breast cancer that were previously associated with the biology of cancer or that have been shown to be biomarkers of solid tumors by genetic or biochemical methods. The striking level of agreement between the results of random and independent sampling of plasma by mass spectrometry and those from cancer tissues, fluids or cells indicated that discovery in clinical plasma by LC–ESI–MS/MS will be a powerful tool for clinical research. Peptides or proteins discovered by random and independent sampling of test samples might be confirmed by automatic targeted LC–ESI–MS/MS [17–19] in a larger cohort of independent samples. It was possible to discover peptides and/or proteins specific to breast cancer versus ovarian cancer and other diseases or normal plasma samples from many institutions using simple and disposable sample preparation and common instrumentation, with the fit of MS/MS spectra by simple cross correlation or goodness of fit, storage in a standard SQL database, and classical statistical analysis with generic software.

Ryerson Analytical Biochemistry Laboratory (RABL), Department of Chemistry and Biology, Faculty of Science, Ryerson University, 350 Victoria St., Toronto, ON Canada
Institute of Cardiovascular Sciences, St. Boniface Hospital Research Center, University of Manitoba, Winnipeg, Canada
Division of Cardiology, Department of Medicine, McMaster University, Hamilton, Canada
St. Michael’s Hospital, Keenan Chair in Medicine, Professor of Medicine, Surgery & Biomedical Engineering, University of Toronto, Toronto, Canada
St. Michael’s Hospital, Keenan Research Centre for Biomedical Science, Toronto, Canada
Program for Cancer Therapeutics, Ottawa Hospital Research Institute, Ottawa, Canada
Clinical Evaluation Research Unit, Kingston General Hospital, Kingston, Canada
Alzheimer Center, Dept of Neurology, Amsterdam University Medical Centers, Vrije Universiteit, Amsterdam Neuroscience, Amsterdam, The Netherlands
MS Center, Dept of Neurology, Amsterdam University Medical Centers, Vrije Universiteit, Amsterdam Neuroscience, Amsterdam, The Netherlands
Neurochemistry Lab and Biobank, Dept of Clinical Chemistry, Amsterdam University Medical Centers, Vrije Universiteit, Amsterdam Neuroscience, Amsterdam, The Netherlands
Mount Sinai Hospital Research Institute, University of Toronto, Toronto, Canada
University of Windsor, Windsor, Canada
International Biobank of Luxembourg (IBBL), Luxembourg Institute of Health (formerly CRP Sante Luxembourg), Strassen, Luxembourg
Received 2019 Aug 2; Accepted 2019 Dec 5.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Additional file 1: Table S1. Breast versus ovarian MSMS TRYP and STYP where both X2 where the corrected delta frequency is greater than 9. (60K, xlsx)

Acknowledgements

We thank Dr. R.A. Phillips for his long running support for this program of research, his aid in obtaining human EDTA plasma from the Ontario Tumor Bank, which is funded by the Ontario Institute for Cancer Research, and his help and opinions in the preparation of the manuscript.

Abbreviations

TRYP: fully tryptic
TRYP STYP: fully tryptic and/or S, T or Y tryptic phosphopeptide

Authors’ contributions

JD, prepared samples and performed LC–ESI–MS/MS analysis. PB, performed SEQUEST and X!TANDEM correlation and parsed the results into an SQL Server database. TT, prepared samples and performed LC–ESI–MS/MS analysis. AFM, prepared samples, performed LC–ESI–MS/MS analysis, and proofed the manuscript. ZZC, prepared samples and performed LC–ESI–MS/MS analysis. MT, prepared samples and performed LC–ESI–MS/MS analysis. TN, performed LC–ESI–MS/MS analysis. MTH, performed LC–ESI–MS/MS analysis. MP, performed LC–ESI–MS/MS analysis. NM, performed LC–ESI–MS/MS analysis. AR, planned the study and collected heart attack samples. ES, planned the study and collected heart attack samples. ASS, planned the study and wrote a grant in support of the study. CCS, planned the study and collected sepsis samples. AR, planned the study, collected sepsis samples, and devised the peptide collection and sample injection method. JCM, planned the study and collected sepsis samples. CA, planned the study and collected cancer samples. SM, planned the study and collected cancer samples. DH, planned the study and collected sepsis samples. PS, planned the study and collected Alzheimer’s dementia samples. JK, planned the study and collected multiple sclerosis samples. CET, planned the study, collected multiple sclerosis and Alzheimer’s samples and helped write the study. EPD, planned the study and wrote a grant in support of the study. KWMS, planned the study and wrote a grant in support of the study. JGM, planned the study, wrote grants in support of the study, performed the R statistical analysis and wrote the manuscript. All authors read and approved the final manuscript.

Funding

Funding to develop the SQL SERVER-R computation platform, and to sample the breast and ovarian cancer samples, was provided by the Ontario Institute of Cancer Research through the Ontario Cancer Biomarker Network to KWS, EPD, and JGM. The funding to create the reference control samples and sample the AD and MS plasma and controls was from Fonds National de la Recherche, through Luxembourg Institute of Health LIH (formerly CRP Sante) and the Integrated Biobank of Luxembourg (IBBL) to JGM. The heart attack results were collected using funding from the Heart and Stroke Foundation of Ontario and Canada to JGM. Funding for wet lab and LC–ESI–MS/MS instruments and for sampling sepsis was from the Natural Science and Engineering Research Council of Canada (NSERC) through the Discovery Grant and CRD Grant with YYZ Pharmatech to JGM.

Availability of data and materials

The raw data are provided in a companion publication and in the supplemental data.

Ethics approval and consent to participate

Human EDTA plasma samples were obtained under Ryerson Ethical Reviews Board Protocol REB 2015-207.

Consent for publication

No original figures or tables from any other publisher were reproduced in this publication.

Competing interests

The authors declare that they have no competing interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Jaimie Dufresne, Email: ac.nosreyr@enserfud.eimiaj.

Pete Bowden, Email: ac.psai@nedwobp.

Thanusi Thavarajah, Email: ac.nosreyr@aravahtt.

Angelique Florentinus-Mefailoski, Email: ac.nosreyr@nerolf3a.

Zhuo Zhen Chen, Email: ac.nosreyr@nehczz.

Monika Tucholska, Email: moc.liamg@akslohcut.akinom.

Tenzin Norzin, Email: ac.nosreyr@nizront.

Margaret Truc Ho, Email: ac.owu@642ohm.

Morla Phan, Email: ac.hpleugou@alrom.

Nargiz Mohamed, Email: ac.nosreyr@ihsahdammahom.zigran.

Amir Ravandi, Email: ac.latropdem@idnavar.rima.

Eric Stanton, Email: ac.retsamcm@enotnats.

Arthur S. Slutsky, Email: ac.hms@AYKSTULS.

Claudia C. dos Santos, Email: ac.hms@CsotnaSsoD.

Alexander Romaschin, Email: ac.hms@AnihcsamoR.

John C. Marshall, Email: ac.hms@jllahsram.

Christina Addison, Email: ac.irho@nosiddac.

Shawn Malone, Email: ac.hot@enolaMS.

Daren Heyland, Email: ac.usneeuq@2hkd.

Philip Scheltens, Email: ln.cmuv@snetlehcs.p.

Joep Killestein, Email: ln.cmuv@nietselliK.J.

Charlotte Teunissen, Email: ln.cmuv@nessinuet.c.

Eleftherios P. Diamandis, Email: ac.no.ianistm@sidnamaide.

K. W. M. Siu, Email: ac.rosdniwu@uiS.leahciM.

John G. Marshall, Email: ac.nosreyr@lahsram4.

References

  • 1. Tiss A, et al A well-characterised peak identification list of MALDI MS profile peaks for human blood serum. Proteomics. 2010;10(18):3388–3392. doi: 10.1002/pmic.201000100.] [[PubMed][Google Scholar]
  • 2. Oleschuk RD, et al. Characterization of plasma proteins adsorbed onto biomaterials. By MALDI-TOFMS. Biomaterials. 2000;21(16):1701–1710. doi: 10.1016/S0142-9612(00)00054-5.] [[PubMed]
  • 3. Tammen H, et al Detection of low-molecular-mass plasma peptides in the cavernous and systemic blood of healthy men during penile flaccidity and rigidity–an experimental approach using the novel differential peptide display technology. Urology. 2002;59(5):784–789. doi: 10.1016/S0090-4295(01)01659-4.] [[PubMed][Google Scholar]
  • 4. Ardekani AM, Liotta LA, Petricoin EF., 3rd Clinical potential of proteomics in the diagnosis of ovarian cancer. Expert Rev Mol Diagn. 2002;2(4):312–320. doi: 10.1586/14737159.2.4.312.] [[PubMed]
  • 5. Petricoin EF, et al Use of proteomic patterns in serum to identify ovarian cancer. Lancet. 2002;359(9306):572–577. doi: 10.1016/S0140-6736(02)07746-2.] [[PubMed][Google Scholar]
  • 6. Villanueva J, et al Differential exoprotease activities confer tumor-specific serum peptidome patterns. J Clin Invest. 2006;116(1):271–284. doi: 10.1172/JCI26022.] [[Google Scholar]
  • 7. Villanueva J, et al Serum peptidome patterns that distinguish metastatic thyroid carcinoma from cancer-free controls are unbiased by gender and age. Mol Cell Proteomics. 2006;5(10):1840–1852. doi: 10.1074/mcp.M600229-MCP200.] [[PubMed][Google Scholar]
  • 8. Villanueva J, et al A sequence-specific exopeptidase activity test (SSEAT) for “functional” biomarker discovery. Mol Cell Proteomics. 2008;7(3):509–518. doi: 10.1074/mcp.M700397-MCP200.] [[PubMed][Google Scholar]
  • 9. Timms JF, et al Peptides generated ex vivo from serum proteins by tumor-specific exopeptidases are not useful biomarkers in ovarian cancer. Clin Chem. 2010;56(2):262–271. doi: 10.1373/clinchem.2009.133363.] [[PubMed][Google Scholar]
  • 10. Eckel-Passow JE, et al An insight into high-resolution mass-spectrometry data. Biostatistics. 2009;10(3):481–500. doi: 10.1093/biostatistics/kxp006.] [[Google Scholar]
  • 11. Baggerly KA, et al A comprehensive approach to the analysis of matrix-assisted laser desorption/ionization-time of flight proteomics spectra from serum samples. Proteomics. 2003;3(9):1667–1672. doi: 10.1002/pmic.200300522.] [[PubMed][Google Scholar]
  • 12. Marshall J, et al Processing of serum proteins underlies the mass spectral fingerprinting of myocardial infarction. J Proteome Res. 2003;2:361–372. doi: 10.1021/pr030003l.] [[PubMed][Google Scholar]
  • 13. Marshall J, et al Human serum proteins preseparated by electrophoresis or chromatography followed by tandem mass spectrometry. J Proteome Res. 2004;3(3):364–382. doi: 10.1021/pr034039p.] [[PubMed][Google Scholar]
  • 14. Bowden P, Beavis R, Marshall JTandem mass spectrometry of human tryptic blood peptides calculated by a statistical algorithm and captured by a relational database with exploration by a general statistical analysis system. J Proteomics. 2009;73:103–111. doi: 10.1016/j.jprot.2009.08.004.] [[PubMed][Google Scholar]
  • 15. Florentinus AK, et al Identification and quantification of peptides and proteins secreted from prostate epithelial cells by unbiased liquid chromatography tandem mass spectrometry using goodness of fit and analysis of variance. J Proteomics. 2012;75:1303–1317. doi: 10.1016/j.jprot.2011.11.002.] [[PubMed][Google Scholar]
  • 16. Florentinus AK, et al The Fc receptor-cytoskeleton complex from human neutrophils. J Proteomics. 2011;75:450–468. doi: 10.1016/j.jprot.2011.08.011.] [[PubMed][Google Scholar]
  • 17. Dufresne J, et al Random and independent sampling of endogenous tryptic peptides from normal human EDTA plasma by liquid chromatography micro electrospray ionization and tandem mass spectrometry. Clin Proteomics. 2017;14:41. doi: 10.1186/s12014-017-9176-7.] [[Google Scholar]
  • 18. Dufresne J, et al Freeze-dried plasma proteins are stable at room temperature for at least 1 year. Clin Proteomics. 2017;14:35. doi: 10.1186/s12014-017-9170-0.] [[Google Scholar]
  • 19. Dufresne J, et al The proteins cleaved by endogenous tryptic proteases in normal EDTA plasma by C18 collection of peptides for liquid chromatography micro electrospray ionization and tandem mass spectrometry. Clin Proteomics. 2017;14:39. doi: 10.1186/s12014-017-9174-9.] [[Google Scholar]
  • 20. Zhu P, et al Mass spectrometry of peptides and proteins from human blood. Mass Spectrom Rev. 2011;30(5):685–732.[PubMed][Google Scholar]
  • 21. Dufresne J, et al. A method for the extraction of the endogenous tryptic peptides (peptidome) from human EDTA plasma. Anal Biochem. 2018;549:188–196. doi: 10.1016/j.ab.2018.02.025.
  • 22. Tucholska M, et al. Human serum proteins fractionated by preparative partition chromatography prior to LC-ESI-MS/MS. J Proteome Res. 2009;8:1143–1155. doi: 10.1021/pr8005217.
  • 23. Tucholska M, et al. Endogenous peptides from biophysical and biochemical fractionation of serum analyzed by matrix-assisted laser desorption/ionization and electrospray ionization hybrid quadrupole time-of-flight. Anal Biochem. 2007;370:228–245. doi: 10.1016/j.ab.2007.07.029.
  • 24. Williams D, Zhu P, Bowden P, Stacey C, McDonell M, Kowalski P, Kowalski JM, Evans K, Diamandis EP, Siu KM, Marshall J. Comparison of methods to examine the endogenous peptides of fetal calf serum. Clin Proteomics. 2006;2(1):67. doi: 10.1385/CP:2:1:67.
  • 25. Tucholska M, et al. The endogenous peptides of normal human serum extracted from the acetonitrile-insoluble precipitate using modified aqueous buffer with analysis by LC-ESI-Paul ion trap and Qq-TOF. J Proteomics. 2010;73(6):1254–1269. doi: 10.1016/j.jprot.2010.02.022.
  • 26. Williams D, et al. Precipitation and selective extraction of human serum endogenous peptides with analysis by quadrupole time-of-flight mass spectrometry reveals posttranslational modifications and low-abundance peptides. Anal Bioanal Chem. 2010;396:1223–1247. doi: 10.1007/s00216-009-3345-0.
  • 27. Chertov O, et al. Organic solvent extraction of proteins and peptides from serum as an effective sample preparation for detection and identification of biomarkers by mass spectrometry. Proteomics. 2004;4(4):1195–1203. doi: 10.1002/pmic.200300677.
  • 28. Tirumalai RS, et al. Characterization of the low molecular weight human serum proteome. Mol Cell Proteomics. 2003;2(10):1096–1103. doi: 10.1074/mcp.M300031-MCP200.
  • 29. Pieper R, et al. The human serum proteome: display of nearly 3700 chromatographically separated protein spots on two-dimensional electrophoresis gels and identification of 325 distinct proteins. Proteomics. 2003;3(7):1345–1364. doi: 10.1002/pmic.200300449.
  • 30. Patterson SD. Data analysis-the Achilles heel of proteomics. Nat Biotechnol. 2003;21(3):221–222. doi: 10.1038/nbt0303-221.
  • 31. Dufresne J, et al. Re-evaluation of the rabbit myosin protein standard used to create the empirical statistical model for decoy library searching. Anal Biochem. 2018;560:39–49. doi: 10.1016/j.ab.2018.08.025.
  • 32. Bowden P, et al. Quantitative statistical analysis of standard and human blood proteins from liquid chromatography, electrospray ionization, and tandem mass spectrometry. J Proteome Res. 2012;11:2032–2047. doi: 10.1021/pr2000013.
  • 33. Zhu P, et al. Chi square comparison of tryptic peptide-to-protein distributions of tandem mass spectrometry from blood with those of random expectation. Anal Biochem. 2011;409(2):189–194. doi: 10.1016/j.ab.2010.10.027.
  • 34. Zhu P, et al. Peptide-to-protein distribution versus a competition for significance to estimate error rate in blood protein identification. Anal Biochem. 2011;411:241–253. doi: 10.1016/j.ab.2010.12.003.
  • 35. Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics. 2004;20(9):1466–1467. doi: 10.1093/bioinformatics/bth092.
  • 36. Benjamini Y, Hochberg Y. Controlling false discovery rate: a practical approach to multiple testing. J Roy Stat Soc. 1995;57(1):289–300.
  • 37. Dufresne J, et al. The plasma peptidome. Clin Proteomics. 2018;15:39. doi: 10.1186/s12014-018-9211-3.
  • 38. Howard JC, et al. OxLDL receptor chromatography from live human U937 cells identifies SYK(L) that regulates phagocytosis of oxLDL. Anal Biochem. 2016;513:7–20. doi: 10.1016/j.ab.2016.07.021.
  • 39. Dufresne J, et al. The plasma peptides of ovarian cancer. Clin Proteomics. 2018;15:41. doi: 10.1186/s12014-018-9215-z.
  • 40. Lam SW, Jimenez CR, Boven E. Breast cancer classification by proteomic technologies: current state of knowledge. Cancer Treat Rev. 2014;40(1):129–138. doi: 10.1016/j.ctrv.2013.06.006.
  • 41. Zhang HG, et al. Isolation, identification, and characterization of novel nanovesicles. Oncotarget. 2016;7(27):41346–41362.
  • 42. Diaz-Vera J, et al. A proteomic approach to identify endosomal cargoes controlling cancer invasiveness. J Cell Sci. 2017;130(4):697–711. doi: 10.1242/jcs.190835.
  • 43. Whelan SA, et al. Mass spectrometry (LC–MS/MS) identified proteomic biosignatures of breast cancer in proximal fluid. J Proteome Res. 2012;11(10):5034–5045. doi: 10.1021/pr300606e.
  • 44. Celis JE, et al. Proteomic characterization of the interstitial fluid perfusing the breast tumor microenvironment: a novel resource for biomarker and therapeutic target discovery. Mol Cell Proteomics. 2004;3(4):327–344. doi: 10.1074/mcp.M400009-MCP200.
  • 45. Hu L, et al. Selective on-line serum peptide extraction and multidimensional separation by coupling a restricted-access material-based capillary trap column with nanoliquid chromatography-tandem mass spectrometry. J Chromatogr A. 2009;1216(28):5377–5384. doi: 10.1016/j.chroma.2009.05.030.
  • 46. Hu X, et al. Comparative serum proteome analysis of human lymph node negative/positive invasive ductal carcinoma of the breast and benign breast disease controls via label-free semiquantitative shotgun technology. OMICS. 2009;13(4):291–300. doi: 10.1089/omi.2009.0016.
  • 47. Yang Z, et al. Multilectin affinity chromatography for characterization of multiple glycoprotein biomarker candidates in serum from breast cancer patients. Clin Chem. 2006;52(10):1897–1905. doi: 10.1373/clinchem.2005.065862.
  • 48. Zhang R, et al. Mining biomarkers in human sera using proteomic tools. Proteomics. 2004;4(1):244–256. doi: 10.1002/pmic.200300495.
  • 49. Ye B, et al. Haptoglobin-alpha subunit as potential serum biomarker in ovarian cancer: identification and characterization using proteomic profiling and mass spectrometry. Clin Cancer Res. 2003;9(8):2904–2911.
  • 50. Marshall J, et al. Creation of a federated database of blood proteins: a powerful new tool for finding and characterizing biomarkers in serum. Clin Proteomics. 2014;11(1):3. doi: 10.1186/1559-0275-11-3.
  • 51. Looze C, et al. Proteomic profiling of human plasma exosomes identifies PPARgamma as an exosome-associated protein. Biochem Biophys Res Commun. 2008;378(3):433–438. doi: 10.1016/j.bbrc.2008.11.050.
  • 52. Melo SA, et al. Glypican-1 identifies cancer exosomes and detects early pancreatic cancer. Nature. 2015;523(7559):177–182. doi: 10.1038/nature14581.
  • 53. Bery A, et al. Deciphering the ovarian cancer ascites fluid peptidome. Clin Proteomics. 2014;11(1):13. doi: 10.1186/1559-0275-11-13.
  • 54. Karagiannis GS, et al. In-depth proteomic delineation of the colorectal cancer exoproteome: mechanistic insight and identification of potential biomarkers. J Proteomics. 2014;103:121–136. doi: 10.1016/j.jprot.2014.03.018.
  • 55. Krokhin OV, Ens W, Standing KG. MALDI QqTOF MS combined with off-line HPLC for characterization of protein primary structure and post-translational modifications. J Biomol Tech. 2005;16(4):429–440.
  • 56. Schwartz JC, Senko MW, Syka JE. A two-dimensional quadrupole ion trap mass spectrometer. J Am Soc Mass Spectrom. 2002;13(6):659–669. doi: 10.1016/S1044-0305(02)00384-7.
  • 57. Yates JR, 3rd, et al. Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal Chem. 1995;67(8):1426–1436. doi: 10.1021/ac00104a020.
  • 58. Bowden P, et al. Meta sequence analysis of human blood peptides and their parent proteins. J Proteomics. 2010;73:1163–1175. doi: 10.1016/j.jprot.2010.02.007.
  • 59. Chick JM, et al. A mass-tolerant database search identifies a large proportion of unassigned spectra in shotgun proteomics as modified peptides. Nat Biotechnol. 2015;33(7):743–749. doi: 10.1038/nbt.3267.
  • 60. Suman S, et al. Quantitative proteomics revealed novel proteins associated with molecular subtypes of breast cancer. J Proteomics. 2016;148:183–193. doi: 10.1016/j.jprot.2016.07.033.
  • 61. van den Broek I, et al. Quantitative assay for six potential breast cancer biomarker peptides in human serum by liquid chromatography coupled to tandem mass spectrometry. J Chromatogr B Analyt Technol Biomed Life Sci. 2010;878(5–6):590–602. doi: 10.1016/j.jchromb.2010.01.011.
  • 62. Altschul SF, et al. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2.
  • 63. Lee J, et al. Transition into inflammatory cancer-associated adipocytes in breast cancer microenvironment requires microRNA regulatory mechanism. PLoS ONE. 2017;12(3):e0174126. doi: 10.1371/journal.pone.0174126.
  • 64. Cao B, et al. Latent transforming growth factor-beta binding protein-1 in circulating plasma as a novel biomarker for early detection of hepatocellular carcinoma. Int J Clin Exp Pathol. 2015;8(12):16046–16054.
  • 65. Semenza GL. Regulation of the breast cancer stem cell phenotype by hypoxia-inducible factors. Clin Sci (Lond). 2015;129(12):1037–1045. doi: 10.1042/CS20150451.
  • 66. Nie C, et al. Hypoxia-inducible factor 1-alpha expression correlates with response to neoadjuvant chemotherapy in women with breast cancer. Medicine (Baltimore). 2018;97(51):e13551. doi: 10.1097/MD.0000000000013551.
  • 67. Akay OM, et al. BCL2, BCL6, IGH, TP53, and MYC protein expression and gene rearrangements as prognostic markers in diffuse large B-cell lymphoma: a study of 44 Turkish patients. Cancer Genet. 2014;207(3):87–93. doi: 10.1016/j.cancergen.2014.02.001.
  • 68. Yiu CC, et al. Changes in protein expression after neoadjuvant use of aromatase inhibitors in primary breast cancer: a proteomic approach to search for potential biomarkers to predict response or resistance. Expert Opin Investig Drugs. 2010;19(Suppl 1):S79–S89. doi: 10.1517/13543781003701011.
  • 69. Klein MC, et al. AXER is an ATP/ADP exchanger in the membrane of the endoplasmic reticulum. Nat Commun. 2018;9(1):3489. doi: 10.1038/s41467-018-06003-9.
  • 70. Liang WJ, et al. Differentially expressed genes between upward and downward progressing types of nasopharyngeal carcinoma. Ai Zheng. 2008;27(5):460–465.
  • 71. Han JY, et al. A genome-wide association study for irinotecan-related severe toxicities in patients with advanced non-small-cell lung cancer. Pharmacogenomics J. 2013;13(5):417–422. doi: 10.1038/tpj.2012.24.
  • 72. Sulaiman NB, et al. An azaspirane derivative suppresses growth and induces apoptosis of ER-positive and ER-negative breast cancer cells through the modulation of JAK2/STAT3 signaling pathway. Int J Oncol. 2016;49(3):1221–1229. doi: 10.3892/ijo.2016.3615.
  • 73. Wu K, et al. EYA1 phosphatase function is essential to drive breast cancer cell proliferation through cyclin D1. Cancer Res. 2013;73(14):4488–4499. doi: 10.1158/0008-5472.CAN-12-4078.
  • 74. Chaudhuri S, et al. Genetic susceptibility to breast cancer: HLA DQB*03032 and HLA DRB1*11 may represent protective alleles. Proc Natl Acad Sci USA. 2000;97(21):11451–11454. doi: 10.1073/pnas.97.21.11451.
  • 75. He Y, et al. Potentially functional polymorphisms in aminoacyl-tRNA synthetases genes are associated with breast cancer risk in a Chinese population. Mol Carcinog. 2015;54(7):577–583. doi: 10.1002/mc.22128.
  • 76. Kwon SK, et al. Trans-synaptic adhesions between netrin-G ligand-3 (NGL-3) and receptor tyrosine phosphatases LAR, protein-tyrosine phosphatase delta (PTPdelta), and PTPsigma via specific domains regulate excitatory synapse formation. J Biol Chem. 2010;285(18):13966–13978. doi: 10.1074/jbc.M109.061127.
  • 77. Aceto N, et al. Tyrosine phosphatase SHP2 promotes breast cancer progression and maintains tumor-initiating cells via activation of key transcription factors and a positive feedback signaling loop. Nat Med. 2012;18(4):529–537. doi: 10.1038/nm.2645.
  • 78. Wang Z, et al. Low expression of WWC1, a tumor suppressor gene, is associated with aggressive breast cancer and poor survival outcome. FEBS Open Bio. 2019;9(7):1270–1280. doi: 10.1002/2211-5463.12659.
  • 79. Kanojia D, et al. Proteomic profiling of cancer stem cells derived from primary tumors of HER2/Neu transgenic mice. Proteomics. 2012;12(22):3407–3415. doi: 10.1002/pmic.201200103.
  • 80. Beheshti Zavareh R, et al. Suppression of cancer progression by MGAT1 shRNA knockdown. PLoS ONE. 2012;7(9):e43721. doi: 10.1371/journal.pone.0043721.
  • 81. Mamelak AJ, et al. Downregulation of NDUFA1 and other oxidative phosphorylation-related genes is a consistent feature of basal cell carcinoma. Exp Dermatol. 2005;14(5):336–348. doi: 10.1111/j.0906-6705.2005.00278.x.
  • 82. Chi C, et al. RTN4/Nogo is an independent prognostic marker for gastric cancer: preliminary results. Eur Rev Med Pharmacol Sci. 2015;19(2):241–246.
  • 83. Morita R, et al. Olfactory receptor family 7 subfamily C member 1 is a novel marker of colon cancer-initiating cells and is a potent target of immunotherapy. Clin Cancer Res. 2016;22(13):3298–3309. doi: 10.1158/1078-0432.CCR-15-1709.
  • 84. Adamus A, et al. GCSH antisense regulation determines breast cancer cells’ viability. Sci Rep. 2018;8(1):15399. doi: 10.1038/s41598-018-33677-4.
  • 85. Masiero M, et al. A core human primary tumor angiogenesis signature identifies the endothelial orphan receptor ELTD1 as a key regulator of angiogenesis. Cancer Cell. 2013;24(2):229–241. doi: 10.1016/j.ccr.2013.06.004.
  • 86. Li WX, et al. Comprehensive tissue-specific gene set enrichment analysis and transcription factor analysis of breast cancer by integrating 14 gene expression datasets. Oncotarget. 2017;8(4):6775–6786.
  • 87. Garattini E, et al. Lipid-sensors, enigmatic-orphan and orphan nuclear receptors as therapeutic targets in breast-cancer. Oncotarget. 2016;7(27):42661–42682. doi: 10.18632/oncotarget.7410.
  • 88. Guo H, Sadoul R, Gibbings D. Autophagy-independent effects of autophagy-related-5 (Atg5) on exosome production and metastasis. Mol Cell Oncol. 2018;5(3):e1445941. doi: 10.1080/23723556.2018.1445941.
  • 89. Rai A, et al. Exosomes derived from human primary and metastatic colorectal cancer cells contribute to functional heterogeneity of activated fibroblasts by reprogramming their proteome. Proteomics. 2019;19(8):e1800148. doi: 10.1002/pmic.201800148.
  • 90. Liu Z, et al. PDZ and LIM domain protein 1 (PDLIM1)/CLP36 promotes breast cancer cell migration, invasion and metastasis through interaction with alpha-actinin. Oncogene. 2015;34(10):1300–1311. doi: 10.1038/onc.2014.64.
  • 91. Pangeni RP, et al. The GALNT9, BNC1 and CCDC8 genes are frequently epigenetically dysregulated in breast tumours that metastasise to the brain. Clin Epigenetics. 2015;7:57. doi: 10.1186/s13148-015-0089-x.
  • 92. Qi J, et al. Absent, small or homeotic 2-like protein (ASH2L) enhances the transcription of the estrogen receptor alpha gene through GATA-binding protein 3 (GATA3). J Biol Chem. 2014;289(45):31373–31381. doi: 10.1074/jbc.M114.579839.
  • 93. Selvanathan SP, et al. Oncogenic fusion protein EWS-FLI1 is a network hub that regulates alternative splicing. Proc Natl Acad Sci USA. 2015;112(11):E1307–E1316. doi: 10.1073/pnas.1500536112.
  • 94. Rumiato E, et al. Predictive markers in elderly patients with estrogen receptor-positive breast cancer treated with aromatase inhibitors: an array-based pharmacogenetic study. Pharmacogenomics J. 2016;16(6):525–529. doi: 10.1038/tpj.2015.73.
  • 95. Cheng TY, et al. Folate-mediated one-carbon metabolism genes and interactions with nutritional factors on colorectal cancer risk: Women’s Health Initiative Observational Study. Cancer. 2015;121(20):3684–3691. doi: 10.1002/cncr.29465.
  • 96. Peng M, et al. Intracellular citrate accumulation by oxidized ATM-mediated metabolism reprogramming via PFKP and CS enhances hypoxic breast cancer cell invasion and metastasis. Cell Death Dis. 2019;10(3):228. doi: 10.1038/s41419-019-1475-7.
  • 97. Bibens-Laulan N, St-Pierre Y. Intracellular galectin-7 expression in cancer cells results from an autocrine transcriptional mechanism and endocytosis of extracellular galectin-7. PLoS ONE. 2017;12(11):e0187194. doi: 10.1371/journal.pone.0187194.
  • 98. Gornati R, et al. Evaluation of SAT-1, SAT-2 and GalNAcT-1 mRNA in colon cancer by real-time PCR. Mol Cell Biochem. 2007;298(1–2):59–68. doi: 10.1007/s11010-006-9350-0.
  • 99. Megias-Vericat JE, et al. Pharmacogenetics of metabolic genes of anthracyclines in acute myeloid leukemia. Curr Drug Metab. 2018;19(1):55–74. doi: 10.2174/1389200218666171101124931.
  • 100. Sun Y, Li X. The canonical wnt signal restricts the glycogen synthase kinase 3/fbw7-dependent ubiquitination and degradation of eya1 phosphatase. Mol Cell Biol. 2014;34(13):2409–2417. doi: 10.1128/MCB.00104-14.
  • 101. Garrido P, et al. Loss of GLUT4 induces metabolic reprogramming and impairs viability of breast cancer cells. J Cell Physiol. 2015;230(1):191–198. doi: 10.1002/jcp.24698.
  • 102. Xie F, et al. MicroRNA-193a inhibits breast cancer proliferation and metastasis by downregulating WT1. PLoS ONE. 2017;12(10):e0185565. doi: 10.1371/journal.pone.0185565.
  • 103. Wu WS, et al. Human CCDC47 sandwich immunoassay development with electrochemiluminescence technology. J Immunol Methods. 2018;452:12–19. doi: 10.1016/j.jim.2017.09.011.
  • 104. Tan S, et al. Identification of miR-26 as a key mediator of estrogen stimulated cell proliferation by targeting CHD1, GREB1 and KPNA2. Breast Cancer Res. 2014;16(2):R40. doi: 10.1186/bcr3644.
  • 105. Lee HJ, et al. A novel E1A-like inhibitor of differentiation (EID) family member, EID-2, suppresses transforming growth factor (TGF)-beta signaling by blocking TGF-beta-induced formation of Smad3-Smad4 complexes. J Biol Chem. 2004;279(4):2666–2672. doi: 10.1074/jbc.M310591200.
  • 106. Li Y, et al. Cancer cells and normal cells differ in their requirements for Thoc1. Cancer Res. 2007;67(14):6657–6664. doi: 10.1158/0008-5472.CAN-06-3234.
  • 107. Liu C, et al. Elevated expression of Thoc1 is associated with aggressive phenotype and poor prognosis in colorectal cancer. Biochem Biophys Res Commun. 2015;468(1–2):53–58. doi: 10.1016/j.bbrc.2015.10.166.
  • 108. Oehler VG, et al. The derivation of diagnostic markers of chronic myeloid leukemia progression from microarray data. Blood. 2009;114(15):3292–3298. doi: 10.1182/blood-2009-03-212969.
  • 109. Chen WC, et al. Systematic analysis of gene expression alterations and clinical outcomes for long-chain acyl-coenzyme a synthetase family in cancer. PLoS ONE. 2016;11(5):e0155660. doi: 10.1371/journal.pone.0155660.
  • 110. Folgueira MA, et al. Gene expression profile associated with response to doxorubicin-based therapy in breast cancer. Clin Cancer Res. 2005;11(20):7434–7443. doi: 10.1158/1078-0432.CCR-04-0548.
  • 111. Buisson R, et al. Cooperation of breast cancer proteins PALB2 and piccolo BRCA2 in stimulating homologous recombination. Nat Struct Mol Biol. 2010;17(10):1247–1254. doi: 10.1038/nsmb.1915.
  • 112. Kim K, et al. Interleukin-22 promotes epithelial cell transformation and breast tumorigenesis via MAP3K8 activation. Carcinogenesis. 2014;35(6):1352–1361. doi: 10.1093/carcin/bgu044.
  • 113. van Roosmalen W, et al. Tumor cell migration screen identifies SRPK1 as breast cancer metastasis determinant. J Clin Invest. 2015;125(4):1648–1664. doi: 10.1172/JCI74440.
  • 114. Kwasnicka-Crawford DA, Carson AR, Scherer SW. IQCJ-SCHIP1, a novel fusion transcript encoding a calmodulin-binding IQ motif protein. Biochem Biophys Res Commun. 2006;350(4):890–899. doi: 10.1016/j.bbrc.2006.09.136.
  • 115. Chang H, et al. Identification of genes associated with chemosensitivity to SAHA/taxane combination treatment in taxane-resistant breast cancer cells. Breast Cancer Res Treat. 2011;125(1):55–63. doi: 10.1007/s10549-010-0825-z.
  • 116. Haiman CA, et al. Genome-wide testing of putative functional exonic variants in relationship with breast and prostate cancer risk in a multiethnic population. PLoS Genet. 2013;9(3):e1003419. doi: 10.1371/journal.pgen.1003419.
  • 117. Mewani RR, et al. Gene expression profile by inhibiting Raf-1 protein kinase in breast cancer cells. Int J Mol Med. 2006;17(3):457–463.