Longitudinal protein expression patterns in bronchiolitis obliterans syndrome

Background: Lung transplant is an effective therapy, however, long-term survival is currently limited due to the high rate of bronchiolitis obliterans syndrome (BOS). The longitudinal biological changes that occur as a patient progresses to BOS are not well understood. The objective of this study was to evaluate the bronchoalveolar lavage fluid (BALF) proteome, seeking to identify proteins and their representative biological functions that have coherent changes in the development of BOS. Methods: Isobaric tag for relative and absolute quantification(iTRAQ®) with LC MS/MS were performed on BALF enriched for medium and low abundance proteins from three time points (range 699 days pre-BOS to 83 days post-BOS) taken from four patients that developed BOS. Controls consisted of pooled samples from nine patients who did not develop BOS within five years of sample acquisition. We investigated how the protein profiles changed longitudinally as the patient progressed towards BOS using Short Time-series Expression Miner, (STEM), analysis and identified gene ontology terms associated with these protein clusters. Results: We identified 871 unique proteins at a 5% FDR. STEM analysis revealed three models that met statistical significance.Gene ontology revealed three biological processes that associated with these models and included: sequestering of actin monomers, cytoskeletal organization and an inflammatory or defense response. Conclusion: Longitudinal and gene cluster models reveal coherent changes in BALF proteins as patients approach BOS.


Background
Lung transplant has become a widely used treatment for many end-stage lung diseases. However, long-term outcomes are limited, as 43% of patients are reported to have chronic graft dysfunction in the form of bronchiolitis obliterans by five years [1]. In most cases, bronchiolitis obliterans with the clinical correlate of bronchiolitis obliterans syndrome (BOS) is a progressive process that is diagnosed by a largely irreversible decline in pulmonary function. Treatment is limited to immunosuppres-sant augmentation to which response is poor and five-year survival after diagnosis is only 30-40% [2]. While single proteins, such as human neutrophil defensin, MMP-9, surfactant protein A, and unique peptides have previously been identified that correlate with the development of BOS, the global longitudinal changes in proteins and their associated biological processes in the development of BOS have not been described [3][4][5][6][7][8][9].
Bronchoalveolar lavage fluid (BALF) provides an available source of protein for further study and can lend insight into Pulmonology and Respiratory Research ISSN 2053-6739 | Volume 3 | Article 5 CrossMark ← Click for updates doi: 10.7243/2053-6739-3-5 mechanism of disease. Surveillance bronchoscopy is performed on a routine basis following transplant, allowing access to this biological sample. This report describes a longitudinal, discovery-level proteomic study of BALF at various points leading to and at BOS diagnosis to assess global proteomic changes that associate with the development of BOS.

Study population
The samples studied were excess, cell-free BALF collected longitudinally between 2002-2009 from routine post-transplant surveillance bronchoscopies as previously described [6]. Written, informed consent was obtained from all subjects. Collection of bronchoalveolar lavage fluid was performed by instilling 150-200mL sterile 0.9% saline solution into the transplanted lung, either right middle or left upper lobe, followed by aspiration with a target goal of 60mL. Samples were immediately placed on ice, cells were removed by centrifugation and supernatants were stored at -80°C. For this casecontrolled study, four subjects were chosen, each providing three samples ( Table 1). Four subjects that developed BOS grade 2-3 [10] were chosen based on availability of at least 50mL BALF at three separate time points including one time point within 100 days of the clinical diagnosis of BOS. None of the subjects had evidence of active pulmonary infection or acute rejection at the time of bronchoscopy and all were on triple immunotherapy (tacrolimus, mycophenolate mofetil and prednisone). BOS was diagnosed by spirometry as previously defined [10]. Controls consisted of nine lung transplant recipients who were healthy, without evidence of acute rejection or infection, undergoing surveillance bronchoscopy and remained grade 0 BOS for at least 5 years following bronchos-copy ( Table 2). BALF from each control (50mL total/control) was pooled and run in duplicate in each iTRAQ run to serve both as an internal control and a control for the BOS samples across separate iTRAQ runs. Identical pooled controls were used for all three iTRAQ runs to provide a common reference point required for inter-run analysis and provide technical replicates needed for run metrics. This study was approved by the University of Minnesota Institutional Review Board Human Subjects Committee.

Sample preparation
Prior to sample preparation, protein extraction was optimized for protein recovery and measurement using Bradford reagent (Bio-Rad, Hercules, CA), sonication, spin filter concentrating and immunodepletion. Samples were vortexed one minute followed by 5 seconds of sonication for mucus dispersion. Cells were removed after 10 minutes centrifugation at 14,000xg at 4°C. Amicon 3K spin filters (EMD Millipore, Billerica, MA) were used for salt removal and protein concentration after washing with ddH2O (20 minutes at 4000xg at 4°C, one spin filter per subjectwith up to four uses of each spin filter). BALF typically contains highly-abundant serum proteins necessitating removal prior to effective detection by MS. Fourteen highly abundant proteins were removed by Seppro IgY14 immunodepletion spin columns (Sigma-Aldrich, St. Louis, MO, proteins removed: HAS, IgG, fibrinogen, transferrin, IgA, IgM, haptoglobin, alpha2, macroglobulin, alpha1-Acid Glycoprotein, alpha1-Antitrypsin, Apo A-I HDL, Apo A-II HDL, complement C3 and LDL (ApoB)) using 3-(N-morpholino)propanesulfonic acid(MOPS) buffer instead of manufacturer suggested TRIS buffer due to interference with iTRAQ labeling followed by concentrating using above methods.

iTRAQ labeling and 2D LC-Orbitrap MS
Samples (40μg, 34 μg, and 49 μg for each sample in runs 1-3 respectively) were digested by trypsin and labeled with iTRAQ® reagent (AB Sciex, Foster City, CA) [11] for quantitative mass spectrometric analysis. Each eight-plex iTRAQ run contained three longitudinal samples from 1-2 BOS patients and two pooled control samples, for internal control and standardized reference point across all three runs. The total peptide mixture was purified with an MCX Oasis cartridge (Waters, Milford, MA) before separation via two-dimensional liquid chromatography-mass spectrometry (2D LC-MS) as previously reported [11]. Proteins were separated and concentrated offline in the 1 st dimension into 15 peptide-containing fractions, collected in 2-minute intervals on a C18 Gemini column (Phenomenex, Torrance, CA) at pH 10, and in the 2 nd dimension by a C18 reversed-phase capillary LC with a nano-LC system (Eksigent, Dublin, CA). Data-dependent acquisition of the 6 most intense peaks per LC fraction was performed on an Orbitrap Velos system, with HCD (higher energy collision induced dissociation) as the activation type for peptide tandem MS.

Database searching and data processing
Each of the 15 RAW files generated from the Orbitrap Velos MS system were converted to mzML files by using msconvert, then converted to a ProteinPilot compatible Mascot Generic Format (MGF) with preselected iTRAQ reporter ions. The MGF files were searched against the Human UniProt database along with contaminant protein sequences (84,838 sequences in total; December 2012) using ProteinPilot version 4.5 with a Detected Protein Threshold (Unused Protscore (Conf )): 10%. The ProteinPilot searches and subsequent generation of PSPEP (FDR) reports and protein and peptide-level summaries were generated within Galaxy-P [12]. Protein Summary with iTRAQ ratios (with pooled control as the denominator for determining fold change) was processed through a workflow built within Galaxy-P so that it yielded UniProt accession numbers and gene names of differentially expressed proteins.

Short time-series expression miner (STEM)
We investigated how the protein profiles changed longitudinally as the patient progressed towards BOS using STEM analysis. The STEM algorithm and software identifies genes that have similar behavior over a short time series (3-8 timepoints) with all genes clustered into one of a set of pre-defined patterns based on transformation of gene profiles into "units of change" [13]. We grouped the BALF samples into 4 time periods for analysis (<-581 days, -580 to -451 days, -450 to -300 days, 0 to +100 relative to BOS diagnosis). The STEM clustering algorithm was used to create 50 model profiles, using the hyper-geometric distribution to compute the significance of overlap between genes from the experiment and model profiles as previously described [14]. The model profiles are defined independent of the data and the boundaries in expression space that they induce remain the same between experiments. Therefore, the data from the three separate iTRAQ ® LC-MS experiments were analyzed.

Gene ontology (GO) enrichment analysis
To identify functional significance of differentially expressed proteins identified by STEM analyses we used the Database for Annotation, Visualization, and Integrated Discovery (DAVID, http://david.abcc.ncifcrf.gov, search date 5/10/14). DAVID provided annotations to identify gene ontology (GO) terms associated with our protein list according to common biological  function. For a background gene list we used all of the proteins identified in our BALF at a 5% FDR. Functional annotation clustering analysis in DAVID was used to identify the combinations of genes according to common biological function.

Characteristics of study participants
We analyzed three longitudinal BALF samples from four lung transplant recipients that progressed to BOS grade 2. Each subject had 2 samples collected prior to the diagnosis of BOS grade 2 (range 307-699 days prior to BOS 2 diagnosis) and one sample collected after the diagnosis of BOS grade 2 (range 1-83 days after BOS diagnosis, Table 1). All of the subjects had an underlying diagnosis of COPD, one with Alpha-1 Antitrypsin deficiency, and the mean age was 57 (

Proteins identified by iTRAQ labeling with MS/MS
ProteinPilot software (version 4.5) reported 6,593, 19,930, and 22,212 spectra matched to 2,777, 7,369, and 8,421 distinct peptides at a 5% local false discovery rate (FDR) for a total of 294, 599, and 816 inferred proteins, respectively for runs 1-3 (Supplementary figure S1). Any of the 14 targeted proteins for removal by the IgY depletion column that were still present in the iTRAQ run data were not included in the statistical analysis since their quantitation would be unreliable. Suspected contaminants, such as trypsin, were also removed for a total of 871 proteins for final analysis.

Short time-series expression miner (STEM) analysis
Using an unsupervised approach (STEM analysis), we identified protein clusters that changed coherently over time as the subject progressed to BOS. The STEM algorithm can use up to eight time points. Since our time points spanned several hundred days we grouped the BALF samples into 4 time periods for analysis (<-581 days, -580 to -451 days, -450 to -300 days, 0 to +100 relative to BOS diagnosis) with 3 samples in each group. The STEM algorithm generated 50 model profiles (Supplementary figure S2), three of which (STEM 43, 2 and 41) demonstrated statistically significant higher number than expected proteins ( Figure 1A). Each model generated a gene list (Supplementary Table S1) with 41 proteins in Model 43, 50  T0  T0   T0  T0  T1   T1  T1   T1   T1  T1   41  2  43   STEM 43  STEM 2  STEM 41   T2   T2  T2   T2   T2  T2  T3   T3  T3   T3   T3  T3  T4   T4  T4   T4   T4  T4   2 Table 3 demonstrates the proteins represented in every time point of the three STEM models (weight of 1.0). Figure 1B demonstrates the protein expression patterns of these heavily weighted proteins. These proteins were used for Gene Ontology (GO) Enrichment analysis with DAVID ( Table 4). GO analysis for model 43 revealed 'sequestering of actin monomers" as a significant biological process. The proteins thymosin beta-4 and beta-10 were significant for this process. GO analysis for Model 2 proteins revealed 'cytoskeletal organization' as a biological process with actinin, keratin 9, myosin-9 and Ras-related C3 botulism toxin substrate 1 as significant proteins. GO analysis for Model 41 was significant for an inflammatory, defense response or response to wounding.

Discussion
In previous mass spectrometry studies, others and our group have reported an elevation in the small molecule defensins  and the proteins club cell secretory protein and surfactant [6,8,15]. These studies compared patients with BOS to controls and used limited coverage mass spectrometry techniques. In our current study we achieved deep coverage of the BALF proteome through sample preparation optimization in combination with high-resolution mass spectrometry in longitudinal samples. For comparison, we used a common pooled control group consisting of BALF from patients that were BOS-free for at least 5 years after the sample was obtained. This common pooled BALF control was used for the relative quantitative analysis within a run, technical replicate data metrics, and allowed comparisons between runs. One advantage of a global protein assessment is the use of Gene Ontology enrichment studies, which revealed processes related to the proteins that are differentially regulated asthe patient approached BOS.

GO biological process
To determine if clusters of proteins change in a coherent pattern over time we used STEM analysis, which is an unsupervised approach [13]. The STEM is a program for gene clustering, comparing, and visualizing expression data over a time series. Originally designed for microarray data, this program allows up to eight time points. Since our BALF samples were obtained for clinical indications our time points did not coincide, therefore we grouped the time points by relative time. This was somewhat arbitrary since we do not know if the tempo of developing BOS is identical among patients. Nevertheless, we were able to identify three patterns of protein expression that met statistical significance.
STEM Model 43 demonstrated a cluster of proteins that increased soon after transplant and then dropped as the patient approached BOS. GO determined by DAVID revealed 'sequestering of actin monomers' as an important biological function. The proteins driving this model are thymosin beta-4 and beta-10. Thymosins, originally named after their original discovery in the thymus, are small actin-sequestering proteins present in many tissues. They are required for cytoskeletal rearrangements and thus cell motility. Thymosin beta-4 also promotes angiogenesis by promoting endothelial cell migration and tubule formation [16]. Thymosin beta-4 enhances the survival of transplanted mesenchymal stem cells for cardiac repairby reducing apoptosis [17]. In addition, gene expression profiling from a murine model of obliterative bronchiolitis demonstrated an increase in thymosin expression [18]. These studies suggest that thymosin may promote cellular and tissue growth in the development of BOS. STEM model 2 depicted a different pattern of decreased protein expression relative to healthy controls throughout the time leading up to BOS that are involved in cytoskeletal organization by GO analysis. Among these proteins were actinrelated proteins, actinin, myosin-9, keratin-9, and Ras-related protein. One of the immediate events following successful lung transplantation is lung repair, especially repair of the epithelium [19]. Epithelial recovery and restitution requires mechanisms involving cell spreading and migration that require wellorchestrated signaling between key factors, extracellular matrix, and structural components [20,21]. Increased injury and poor repair, such as that seen with ischemia-reperfusion injury, results in increased early mortality and increases the risk of late bronchiolitis obliterans syndrome [22]. Therefore, a decrease in cytoskeleton organization may depict a loss in epithelial cell repair.
The last STEM model 41 illustrated proteins that markedly increased as the patient approached BOS. GO analysis targeted proteins to an inflammatory or defense response. CD44 antigen, a cell-surface glycoprotein involved in cell-cell interactions, cell adhesion, and migration, was a prominent protein. CD44 is a receptor for hyaluronic acid and is present on many cell types including mesenchymal stromal cells. An increase in CD44 positive mesenchymal cells, as measured by mesenchymal colony-formation units, is predictive of BOS [23]. CD44 is important in both inflammatory cells as well as fibroblast recruitment to sites of injury [24][25][26][27], both important in the development of BOS. Complement, specifically C3 and C5b-9 components, are increased in lung allograft rejection and blockade of their function inhibits obliterative bronchiolitis in a rat tracheal allograft model [28]. Lastly, peroxiredoxin belongs to a ubiquitous family of antioxidant enzymes. Its upregulation is associated with lung cancer and also acute cardiac allograft rejection in a murine model [29]. The exact role peroxiredoxin plays in BOS is unknown.
The limitations of this study mainly derive from the substrate with which we were working. BALF, while accessible, requires a procedure to obtain and is dilute initially. BALF dilution was mitigated in our study by protein concentration and using fixed amounts of protein for comparison between samples. The relatively small number of cases also limited this study. Ideally, we would have preferred more subjects, each with three time points, but were limited by the BALF concentration and narrow case definition. Several patients were excluded when even one time point was unable to provide enough protein for concentration, immunodepletion and labeling.
To mitigate the small sample size and large dataset we used a stringent statistical approach. Additionally, working with BALF provided another limitation as proteomic techniques are developed and optimized for plasma. To improve the dynamic range for protein detection it is essential to remove the common abundant proteins found in plasma that is also present in BALF. These techniques to remove abundant proteins, such as immunodepletion, had to be tested and optimized for BALF. Immunodepletion resulted in removal of over 95% of the proteins in BALF, thus requiring a large initial quantity of BALF to perform MS/MS with iTRAQ labeling.

Conclusions
In lung transplant patients coherent biological processes leading to the development of BOS can be identified in BALF. In longitudinal and cluster models, protein dysregulation is seen in response to sequestration of actin monomers, cytoskeletal organization and the inflammatory or defense response. Individual proteins are identified and may serve as a representative of a larger pathological process that contrib-