

2. Division of Mathematical Science, Graduate School of Engineering Science, Osaka University, Japan.
3. Medical Genetics Department, Oslo University Hospital (Ullevål), Kirkeveien 166, 0407 Oslo, Norway.
4. Department of Genetics, Institute for Cancer Research, Oslo University Hospital, Montebello, 0310 Oslo, Norway.
5. Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Norway.
6. Department of Clinical Molecular Biology and Laboratory Science (EpiGen), Division of Medicine, Akershus University Hospital, University of Oslo, Norway.
Background: We report the prevalence of copy number aberration events at various stages (subclasses) of breast cancer, as assessed by two statistical methods. The first is GISTIC, a well-known method for identifying significant somatic copy-number alterations within a given set of samples. The second is GISDIP (Genomic Identification of Significant Differences in Progression), a newly developed numerical algorithm, described here, for comparing different sets.
Methods: GISTIC assesses the significance of aberrations at each location across the whole genome for each stage separately, whereas GISDIP assesses the significance of differences at each location across the whole genome between two stages, pair-wise.
Results: We compare GISDIP with the original GISTIC in a simulation study. Whereas GISTIC reports focal/broad regions within a single set, GISDIP directly identifies regions that differ significantly between stages. We then analysed experimental data: using GISTIC, we identified all significant aberrations at each stage of breast cancer, together with the peaks significant for each stage/tumour category. We then applied GISDIP to identify significant differences between the tumour size categories in progression. Significance was assessed by permuting the data across all samples in all stages. The statistically significant copy number aberrations specific to the various stages are presented for both statistics on a set of 530 cases (validation set). The same analysis was performed in subsets of cases defined by ER status. The genes located in the identified regions were investigated using network analysis and a list of genomic loci related to allele-specific disparity.
Conclusions: Our method identifies specific regions in which copy number aberrations differ significantly between stages. The biological characteristics of the identified regions, as found by network analysis, are summarized.
Keywords: DNA copy number alteration, breast cancer, computational algorithm, permutation, false discovery rate, allele-specific disparity
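
The permutation-based significance assessment mentioned in the Results can be illustrated with a minimal sketch. This is not the GISDIP statistic itself, which is defined in the body of the paper. It is a generic per-locus permutation test under stated assumptions: the test statistic is taken to be the absolute difference in mean copy-number signal between two stages, the function names `permutation_difference_test` and `benjamini_hochberg` are hypothetical, and the Benjamini-Hochberg procedure is assumed here as one common way to control the false discovery rate.

```python
import numpy as np

def permutation_difference_test(group_a, group_b, n_perm=1000, seed=0):
    """Per-locus permutation p-values for a difference between two stages.

    group_a, group_b: arrays of shape (n_samples, n_loci) holding a
    copy-number signal (e.g. log-ratios). The statistic (absolute
    difference in means) is an illustrative assumption, not GISDIP.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    pooled = np.vstack([a, b])
    n_a = a.shape[0]

    observed = np.abs(a.mean(axis=0) - b.mean(axis=0))
    exceed = np.zeros(pooled.shape[1])
    for _ in range(n_perm):
        # Shuffle stage labels across all samples, keeping group sizes fixed.
        perm = rng.permutation(pooled.shape[0])
        perm_a = pooled[perm[:n_a]]
        perm_b = pooled[perm[n_a:]]
        stat = np.abs(perm_a.mean(axis=0) - perm_b.mean(axis=0))
        exceed += stat >= observed
    # Add-one correction so that no p-value is exactly zero.
    return (exceed + 1) / (n_perm + 1)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(scaled, 1.0)
    return q
```

Given two sample-by-locus matrices for a pair of stages, `benjamini_hochberg(permutation_difference_test(stage_1, stage_2))` would return one adjusted value per locus; loci below a chosen threshold are candidates for stage-specific differences. Shuffling whole samples rather than individual loci preserves the correlation between neighbouring loci within each sample.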