Migration and DNA methylation: a comparison of methylation patterns in type 2 diabetes susceptibility genes between Indians and Europeans

Background-Type 2 diabetes is a global problem that is increasingly prevalent in low- and middle-income countries including India, and is partly attributed to increased urbanisation. Genotype clearly plays a role in type 2 diabetes susceptibility. However, the role of DNA methylation and its interaction with genotype and metabolic measures is poorly understood. This study aimed to establish whether methylation patterns of type 2 diabetes genes differ between distinct Indian and European populations and/or change following rural to urban migration in India.
Methods-Quantitative DNA methylation analysis in Indians and Europeans using Sequenom® EpiTYPER® technology was undertaken in three genes: ADCY5, FTO and KCNJ11. Metabolic measures and genotype data were also analysed.
Results-Consistent differences in DNA methylation patterns were observed between Indian and European populations in ADCY5, FTO and KCNJ11. Associations were demonstrated between FTO rs9939609 and BMI and between ADCY5 rs17295401 and HDL levels in Europeans. However, these observations were not linked to local variation in DNA methylation levels. No differences in methylation patterns were observed in urban-dwelling migrants compared to their non-migrant rural-dwelling siblings in India.
Conclusions-Analysis of DNA methylation at three type 2 diabetes susceptibility loci highlighted geographical and ethnic differences in methylation patterns. These differences may be attributed to genetic and/or region-specific environmental factors.


Keywords
Type 2 diabetes; DNA methylation; ethnicity

Background
Type 2 diabetes represents a major global health burden that is particularly increasing in prevalence in low- and middle-income countries, including India. Urbanisation is one reason attributed to this increase. Comparison of rural and urban dwelling Indians has uncovered increases in body mass index (BMI) and other metabolic traits associated with type 2 diabetes in urban dwellers [1,2].
Epigenetic variation, including DNA methylation, is involved in the mechanisms of gene regulation and has become a focus of interest in the domain of common complex disease [3]. There is increasing evidence that methylation levels change with age and with environmental exposure(s) [4][5][6]. Additionally, underlying genetic variation also plays a role in determining patterns of DNA methylation [7][8][9][10][11]. Population-level differences in genetic architecture may therefore reflect differences in DNA methylation patterns between populations. Furthermore, these patterns may also relate to differential disease susceptibility [12,13].
DNA methylation was analysed in three genes, ADCY5, FTO and KCNJ11. These genes were selected as they are established type 2 diabetes susceptibility genes that are plausibly regulated by epigenetic mechanisms: each locus contains dense regions of methylation (CpG) sites, known as CpG islands, close to the gene promoter and binding sites for methylation sensitive regulators of transcription [14]. Increased DNA methylation on the FTO obesity susceptibility haplotype [15], hypomethylation of FTO in type 2 diabetes cases compared to controls [16] and evidence (although not demonstrating a causal relationship) that the effect of the FTO rs9939609 risk allele may be mediated by methylation changes in other genes [17] have all been described in human peripheral blood DNA. DNA methylation in ADCY5 is associated with decreased gene expression in leukemia patients [18], providing further evidence for regulation of this gene by methylation. Evidence for variable KCNJ11 methylation has not yet been published, although methylation of other KATP channel subunits has recently been described in mice and has been postulated to regulate gene expression [19].
Additionally, these genes were selected on the basis that they are associated with type 2 diabetes or related traits in both Indian and European populations. FTO has well replicated associations with type 2 diabetes in numerous populations including those in India and Europe [20][21][22][23]. The association between type 2 diabetes and genetic variants in KCNJ11 has also been identified in both populations [24][25][26][27]. Variants in ADCY5 associate with fasting and 2-hour plasma glucose levels following oral glucose tolerance tests in Europeans [28,29] and in Indians [30].
This study sought to identify differences in DNA methylation between Indians and Europeans and to assess the relationship between DNA methylation and underlying SNP architecture at three type 2 diabetes susceptibility loci. It also aimed to identify rural-urban migration-related differences in DNA methylation patterns at the loci analysed and to assess the relationship between DNA methylation and migration-induced shifts in metabolic traits in Indian subjects. Use of geographically distinct populations from North and South India allowed differences in methylation patterns within India to be determined.

Study populations
Indian Migration Study (IMS)-The IMS was conducted in four factories located in cities across India: Lucknow (Northern), Nagpur (Central), Hyderabad (Southern) and Bangalore (Southern). Factory workers and their co-resident spouses were recruited to the study if they were rural-urban migrants. Non-migrant siblings living in the rural village of origin were also invited to join the study. If migrants had multiple siblings, they were asked to invite the one closest to them in age and of the same sex. The fieldwork for this study took place between 2005 and 2007. In-depth study design details and preliminary findings from the IMS have been reported elsewhere [1][31][32][33]. This report studies a subset of IMS participants: 92 rural-urban sibling pairs recruited from Hyderabad and 92 rural-urban sibling pairs recruited from Lucknow. Criteria for selecting sibling pairs were that they were of the same sex, age matched (within 5 years) and had provided DNA samples. Siblings are referred to throughout as 'urban' and 'rural' samples depending on their migration status. Participants in this sub-group were aged between 22 and 66 years, with an average age of 45.8 (SD= 7.3). The mean time since migration was 26.5 years.

Relationship between insulin sensitivity and cardiovascular disease (RISC) cohort-This cohort is a collection of healthy Europeans (white Caucasian) aged between 30 and 60 years, recruited from 19 centres in 14 countries across Europe. Data were collected between 2002 and 2004. The recruitment methods of the RISC study have been described previously [34]. This report studies a subset of 351 RISC participants with a mean age of 44.6 years (SD= 8.3) who were selected randomly from those who had fasting glucose levels of ≤7 mmol/L, provided DNA samples and underwent euglycaemic-hyperinsulinaemic clamp measurements.

Measurements
Metabolic and anthropometric measures-In RISC, fasting blood samples collected at each study centre were separated into plasma and serum, aliquoted and stored (-20°C for plasma glucose, -80°C for lipids and insulin). Samples were transported to central laboratories for analysis. Height was measured using a clinical stadiometer. Body weight was measured using a bioimpedance balance (Tanita International Division, UK).
In IMS, fasting blood samples were collected and separated into plasma and serum. Fasting plasma glucose was measured on the day of collection at each of the four study centres. Serum insulin and lipids were analysed at the All India Institute of Medical Sciences, New Delhi, India. Height was measured using a portable plastic stadiometer. Body weight was measured using digital weighing scales with 100g accuracy.
DNA methylation-In RISC, DNA was extracted from whole blood using a Nucleon® BACC2 kit according to manufacturer's instructions. In IMS, DNA was extracted from whole blood using a salt precipitation method. Methylation analysis was performed separately at each centre, with both following identical protocols.
Genomic DNA (1 µg) was bisulphite modified using an EZ DNA Methylation kit (Zymo Research, CA, USA). The manufacturer's protocol was followed with modifications (as recommended by Sequenom®). Cycling conditions were adjusted to 20 cycles of 95°C for 30 seconds and 50°C for 15 minutes. Column spin times were doubled, and final elution volume was 100 µL.
Assays were designed using EpiDesigner software (Sequenom®) and methylation analysis was conducted using the Sequenom® EpiTYPER® according to the Sequenom® protocol. Amplicons were designed to capture the largest number of CpG sites possible within or close to CpG islands at each of the three loci investigated. Assays for ADCY5 and FTO loci were located within CpG islands while the amplicon designed for KCNJ11 was 221 base pairs upstream of the nearest CpG island. Further details regarding the CpG sites measured are shown in Figures 1A-C. Oligonucleotide sequences are available from the authors on request. Methylation data were generated as β values between 0 and 1, indicating the proportion of the original template that was methylated (reported here as percentage methylation).
Methylation data were generated at least in duplicate. Methylation for a small number of CpG sites at each locus could not be measured as the fragment masses were beyond the range for MALDI-TOF detection (ADCY5: n= 3; FTO: n= 9; KCNJ11: n= 1). CpG sites with overlapping or duplicated mass fragments following MALDI-TOF were excluded from analysis (ADCY5: n= 9; FTO: n= 7; KCNJ11: n= 4). Replicates returning methylation values that were more than 10% discordant were also excluded. Average methylation values were calculated from concordant measures. CpG sites with concordant data for ≤80% of individuals were excluded (ADCY5: n= 4; FTO: n= 3; KCNJ11: n= 2). CpG sites with median methylation levels in both IMS and RISC cohorts with values of ≤2% or ≥98% were also excluded from the analysis (FTO: n= 6; KCNJ11: n= 7).
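The replicate and site-level exclusion rules described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline. The function names are hypothetical, and three readings of the text are assumptions: "10% discordant" is taken as an absolute difference of 0.10 on the β scale, the ≤80% rule is applied to individuals pooled across cohorts, and a site is dropped for extreme methylation only when the median is ≤2% (or ≥98%) in every cohort.

```python
from statistics import mean, median


def concordant_mean(replicates, max_diff=0.10):
    """Average replicate beta values for one CpG site in one individual.

    Returns None if fewer than two replicates were measured or if the
    replicates differ by more than max_diff on the 0-1 beta scale
    (assumed meaning of "more than 10% discordant").
    """
    values = [v for v in replicates if v is not None]
    if len(values) < 2:
        return None
    if max(values) - min(values) > max_diff:
        return None
    return mean(values)


def site_passes_qc(cohorts, min_call_rate=0.80, low=0.02, high=0.98):
    """Decide whether one CpG site is retained for analysis.

    cohorts maps a cohort name to the per-individual values returned by
    concordant_mean (None marks an individual without concordant data).
    """
    pooled = [v for values in cohorts.values() for v in values]
    called = [v for v in pooled if v is not None]
    # Excluded when concordant data exist for <=80% of individuals.
    if not pooled or len(called) / len(pooled) <= min_call_rate:
        return False
    medians = []
    for values in cohorts.values():
        present = [v for v in values if v is not None]
        if not present:
            return False
        medians.append(median(present))
    # Excluded when the median is extreme in every cohort (assumed reading).
    if all(m <= low for m in medians) or all(m >= high for m in medians):
        return False
    return True
```

In this sketch a site with replicates of 0.40 and 0.55 in one individual would contribute no value for that individual, and the site as a whole would still be kept provided enough other individuals have concordant data.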
Genotyping-GWAS data had previously been generated from the RISC participants using an Affymetrix Genome-wide Human SNP Array 6.0 (Affymetrix, USA). Genotypes for SNPs within 50 kb of the regions analysed for methylation were extracted from the GWAS data set (FTO: n= 17; ADCY5: n= 36; KCNJ11: n= 42). Additionally, Affymetrix FTO genotype data for SNP rs9939609 were included in the analysis as this SNP is associated with BMI and diabetes in GWAS. This SNP lies approximately 82 kb from the region in FTO where methylation was measured. SNPs that were not in Hardy-Weinberg equilibrium (p<0.05) or had a minor allele frequency of <5% were excluded. SNPs that were adequately tagged by other SNPs were also excluded. Tagging SNPs were identified via the tagger function in Haploview using pairwise tagging and an r² threshold of 0.8 [35]. Nine FTO, seven ADCY5 and 23 KCNJ11 SNPs were included in the final analysis.
In the IMS, genotyping of FTO SNP rs9939609 had previously been conducted using the Sequenom® MassARRAY™.
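The two per-SNP exclusion criteria stated above (Hardy-Weinberg p<0.05 and minor allele frequency <5%) can be illustrated with a short Python sketch. The text does not say which Hardy-Weinberg test was used, so the one-degree-of-freedom chi-square goodness-of-fit test below is an assumption, as are the function names. The LD-based tagging step performed in Haploview is not reproduced here.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts AA, AB, BB."""
    n_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / n_alleles
    return min(freq_a, 1.0 - freq_a)


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions (assumed test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no departure can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(n_aa, n_ab, n_bb, min_maf=0.05, hwe_alpha=0.05):
    """Keep a SNP only if MAF >= 5% and the HWE test gives p >= 0.05."""
    return (minor_allele_frequency(n_aa, n_ab, n_bb) >= min_maf
            and hwe_p_value(n_aa, n_ab, n_bb) >= hwe_alpha)
```

An exact Hardy-Weinberg test is often preferred when genotype counts are small; the chi-square version is used here only because it keeps the sketch short.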

Ethical approval
Ethical approval for the IMS was obtained from an Indian central institutional review board (All India Institute of Medical Sciences (AIIMS), New Delhi, India; Reference Number: A-60/4/8/2004), as well as institutional review boards at each of the study sites. Written informed consent (witnessed thumbprint if illiterate) was obtained from the participants.
For the RISC study, local ethics committee approval was obtained by each recruitment centre. Written informed consent was obtained from all participants.

Statistical Analysis
Methylation and metabolic data were not normally distributed in the data sets; therefore, nonparametric tests for association between groups were used. When measuring any effect of rural to urban migration, paired analysis was utilised. When comparing groups within India (Hyderabad and Lucknow) and between Indians and Europeans, analyses were performed separately across rural and urban individuals using unpaired tests. Europeans, who were considered to be urban dwellers, were compared to urban Indian sub-groups only. Age was not different between analysis groups and stratification by sex did not alter results reported.
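The paired analysis of urban and rural siblings uses the Wilcoxon signed-rank test (named in the Results). A minimal pure-Python sketch of that test is given below. It is an illustration under stated assumptions rather than the software the authors used: the function name is hypothetical, zero differences are dropped, and the p-value comes from a normal approximation with a tie correction and no continuity correction, which is only reasonable for moderately large numbers of pairs.

```python
import math
from collections import Counter


def wilcoxon_signed_rank(urban, rural):
    """Two-sided Wilcoxon signed-rank test for paired samples.

    Returns (w_plus, p_value), where w_plus is the sum of the ranks of the
    positive urban-minus-rural differences.
    """
    # Pairs with no difference carry no information and are dropped.
    diffs = [u - r for u, r in zip(urban, rural) if u != r]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    abs_diffs = [abs(d) for d in diffs]
    # Average ranks: tied absolute differences share the mean of their positions.
    first_pos = {}
    for pos, value in enumerate(sorted(abs_diffs), start=1):
        first_pos.setdefault(value, pos)
    tie_sizes = Counter(abs_diffs)
    rank_of = {v: first_pos[v] + (tie_sizes[v] - 1) / 2 for v in tie_sizes}
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    # Null mean and variance of w_plus, with the standard tie correction.
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24
    var_w -= sum(t ** 3 - t for t in tie_sizes.values()) / 48
    if var_w <= 0:
        return w_plus, 1.0
    z = (w_plus - mean_w) / math.sqrt(var_w)
    return w_plus, math.erfc(abs(z) / math.sqrt(2))
```

Each sibling pair contributes one difference, so the test compares siblings with each other rather than comparing the urban and rural groups as independent samples, which is the point of the paired design.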
The RISC cohort included healthy individuals with fasting glucose levels of ≤7 mmol/L. However, the IMS cohort included a number of individuals with type 2 diabetes (n= 34) and individuals without diagnosed type 2 diabetes but fasting glucose levels of >7 mmol/L (n= 2). Removal of these 36 individuals from the IMS group had no appreciable effect on any of the analyses presented; they are included in the analyses described here.
Linear regression analysis was conducted to investigate the relationship between methylation and metabolic measures. Diagnostic tests indicated that residuals were normally distributed and homoscedastic.
Where multiple SNPs were analysed, data were adjusted for multiple testing using a Bonferroni correction of unadjusted p-values. The denominator in the Bonferroni correction was determined as the number of LD blocks plus the number of singleton SNPs (SNPs not in a block). Blocks were defined from genotype data in RISC participants using the Solid Spine of LD (SSLD) method as implemented in the program Haploview [35]. In this method SNPs that have contiguous pairwise D' values of ≥0.8 are included in a block. For both FTO and ADCY5, the SNPs considered in the analyses formed two blocks with two additional singleton SNPs. Consequently, the denominator in the Bonferroni correction was n= 4 at each locus. For KCNJ11, SNPs formed six LD blocks and one singleton SNP, giving a denominator of n= 7.
Where multiple metabolic measures were tested, data were also adjusted for multiple testing using a Bonferroni correction of unadjusted p-values (n= 8).
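The multiple-testing rule described in this section amounts to simple arithmetic, sketched below in Python. The function names are hypothetical. The block and singleton counts are those given in the text, and the example p-value of 0.012 is the value reported in the Results for rs1112274, treated here as an unadjusted p-value (an assumption).

```python
def bonferroni_denominator(n_ld_blocks, n_singleton_snps):
    """Number of effectively independent tests at a locus."""
    return n_ld_blocks + n_singleton_snps


def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def bonferroni_threshold(alpha, n_tests):
    """Equivalent per-test significance threshold."""
    return alpha / n_tests


# LD block and singleton counts as stated in the text.
denominators = {
    "FTO": bonferroni_denominator(2, 2),
    "ADCY5": bonferroni_denominator(2, 2),
    "KCNJ11": bonferroni_denominator(6, 1),
}

# An unadjusted p of 0.012 at ADCY5 is multiplied by the locus denominator.
adjusted_adcy5 = bonferroni_adjust(0.012, denominators["ADCY5"])

# Eight metabolic measures were tested, so the per-test threshold shrinks.
metabolic_threshold = bonferroni_threshold(0.05, 8)
```

Counting LD blocks rather than individual SNPs makes the correction less conservative than dividing by the raw number of SNPs, because SNPs within a block are strongly correlated and do not represent independent tests.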

Results
An overview of DNA methylation, metabolic measures and genotype in each of the study sub-groups can be found in Table 1 and Supplementary Table 1.

Rural to urban migration
Methylation-Wilcoxon signed-rank tests were conducted to identify potential changes in methylation in response to urban migration. Each place (Lucknow and Hyderabad) in this paired analysis was treated independently; urban dwellers were compared pair-wise to their non-migrant rural-dwelling siblings. No differences in methylation were observed between urban and rural siblings at any of the three loci.
Metabolic measures-Rural to urban migration in the IMS population has previously been shown to elicit changes in a range of metabolic measurements. Analysis of metabolic measures from this sub-group of the IMS population also demonstrated these changes, which included increased BMI and HOMA-IR in urban groups (Table 2).

Differences between geographical locations
Analyses of differences between IMS urban dwelling migrants (Hyderabad and Lucknow) and RISC were performed. An overview of methylation, metabolic measures and group sizes is shown in Table 1. Statistical comparisons are described below.
Methylation-Marked differences in methylation levels were noted between the geographic locations. ADCY5, FTO and KCNJ11 methylation was 6.3, 2.7 and 1.5% lower, respectively, in the European group compared to the urban Indian group (Figure 2; see Table 1 for methylation values and Table 3 for test statistics). FTO methylation was 0.5% and 0.7% higher in rural and urban Lucknow dwellers compared to respective Hyderabad dwellers. Conversely, KCNJ11 methylation was 0.5% lower in Lucknow than Hyderabad, in both rural and urban dwelling comparisons. In ADCY5, methylation differences between Lucknow and Hyderabad were less marked. No robust regional differences in RISC methylation were identified between study centres (ADCY5, p= 0.362; FTO, p= 0.946; KCNJ11, p= 0.505). Furthermore, no differences in methylation were observed in RISC when Northern dwelling Europeans were compared to Southern dwelling Europeans (defined as living above or below 50° latitude) (ADCY5, p= 0.958; FTO, p= 0.745; KCNJ11, p= 0.637).
[…] 19-1.65)). As expected, triglycerides showed the inverse associations to HDL and were on average 0.53 mmol/L higher in urban dwelling Indians (1.46 (1.13-2.05) vs 0.93 (0.68-1.25)). Fasting glucose was on average 0.40 mmol/L higher in Lucknow than Hyderabad, in both urban (5.55 (4.94-6.27) vs 5.05 (4.72-5.27)) and rural groups (5.22 (4.76-5.80) vs 4.94 (4.66-5.16)). No differences were observed in total cholesterol. In general, it was noted that metabolic differences between Hyderabad and RISC were smaller than between Lucknow and RISC groups (see Tables 1 and 3 for details). Stratification by age or sex did not alter the outcome of these analyses.

SNP architecture: Europeans
SNP genotypes were available for the European RISC cohort. Additionally, rs9939609 genotype data were available for the IMS cohort. SNPs were analysed to investigate whether: (i) local underlying genetic architecture is associated with local (cis) DNA methylation and (ii) local underlying genetic architecture is associated with any of the metabolic measures assessed. Selected SNPs in each gene were located within 50 kb of the region in which methylation levels had been measured.

Genetic association with cis DNA methylation
Genotypes from a total of seven SNPs in ADCY5 were analysed (Figure 1A). When comparing ADCY5 methylation with respect to genotype using an additive model, rs17361324 and rs1112274 were associated with methylation levels. The association of SNP rs1112274 with mean ADCY5 methylation withstood correction for multiple testing (trend: z= 2.52, p= 0.012). These changes represent an approximate 0.5% increase in methylation per minor allele.
In FTO, genotypes from a total of nine SNPs in the 50 kb region were analysed in the RISC cohort (Figure 1B). SNP rs9939609 was also analysed. When comparing FTO methylation with respect to genotype in RISC, none of the SNPs showed an association with methylation.
Genotypes from a total of 23 SNPs in KCNJ11 were analysed (Figure 1C). None of the KCNJ11 SNPs were associated with methylation at this locus.

Genetic association with metabolic measures
When testing for association between genotype and metabolic factors, SNPs that showed association using an initial Kruskal-Wallis test were further analysed for an additive effect using a non-parametric trend test.
In ADCY5, following correction for multiple testing, one robust association remained. An increase in HDL of 0.07 mmol/L was observed when the minor allele of rs17295401 was present (trend test: z= 2.66, p= 0.008). This HDL associated SNP was not associated with DNA methylation, suggesting that the observed influence of SNP on phenotype was independent of possible effects mediated by DNA methylation.
Following correction for multiple testing, one robust association remained at the KCNJ11 locus. HDL showed a 0.16 mmol/L average decrease per minor allele of rs1800467 (trend: z= -3.33, p= 0.001).

SNP architecture: IMS
In agreement with data generated on the RISC population, no association was seen between rs9939609 and methylation in the sub-group of the IMS used in this study. Robust association of rs9939609 with insulin or BMI was also not observed in the sub-group of the IMS. We also analysed the allele distribution of rs9939609 between the groups included in this study. When comparing urban Lucknow with RISC, there was little difference in allele frequency (Fisher's exact p= 0.859). However, there was a robust difference in allele frequency between urban Hyderabad and RISC groups (Fisher's exact p= 0.002). When comparing Lucknow with Hyderabad there was no difference in allele frequency between rural groups (Fisher's exact p= 0.438) but there was a difference between urban groups (Fisher's exact p= 0.008). This difference was driven by an excess of minor allele homozygotes in the urban Lucknow group.
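The allele-frequency comparisons above use Fisher's exact test. A minimal Python sketch of a two-sided Fisher's exact test on a 2×2 table is shown below. It assumes the comparison is made on counts of minor and major alleles in two groups; the text does not give the counts or state whether allele or genotype tables were used, so both the table layout and the function name are assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    For example, rows could be two study groups and columns the counts of
    minor and major alleles (assumed layout).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over every table that is no more probable than the observed one;
    # the small relative tolerance guards against floating-point rounding.
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p)
```

A call such as `fisher_exact_two_sided(60, 124, 90, 94)` would compare two groups of 92 individuals (184 alleles each); these counts are hypothetical and chosen only to show the call shape.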

Association between methylation and metabolic measures in Indians and Europeans
To test the hypothesis that methylation and metabolic measures might be related, regression models were constructed. Data were analysed separately for urban Hyderabad, urban Lucknow and RISC groups. Univariate regression analysis indicated that methylation in ADCY5, FTO or KCNJ11 was not robustly associated with any metabolic measure that was assessed (BMI, fasting glucose, fasting insulin, HOMA-IR, LDL, HDL, total cholesterol and triglycerides).

Rural to urban migration in India
Metabolic differences between urban migrant and rural non-migrant siblings were observed, in concordance with previous observations in the whole IMS sample [1,2]. No differences in average DNA methylation were observed in any locus tested that could be ascribed to urban living. A number of factors might explain this lack of association. The loci studied may not be appropriate or sensitive biomarkers of methylation changes. Alternatively, subtle shifts in DNA methylation which might collectively exert large physiological shifts in metabolic measures may have been too small to detect with the available sample size. Post-hoc power calculations indicated an ability to detect changes in methylation of 1% (with 75% power) between rural and urban dwelling siblings at p= 0.05 and n= 46; the power to detect smaller differences was limited.
The metabolic transition following migration is rapid [2] and if this is fuelled by epigenetic changes then they too must be responsive to the exogenous influences within a short period. The mean time since urban siblings had migrated was 26.5 years. It is therefore unlikely that the timing of sample collection explains the absence of observed changes in DNA methylation following migration. It is reasonable to postulate that methylation for these genes could have been set in early life, before migration occurred.

Regional differences in DNA methylation within India
Methylation of FTO was consistently higher in Lucknow compared to Hyderabad samples; methylation of KCNJ11 was lower. The mean percentage difference was small (around 0.6%). These regional differences could be caused by differences in dietary habits or other lifestyle influences specific to each region. Genetic differences between groups could also be causing the differences observed. Geographical differences in methylation could therefore be explained partly by genetic population stratification and partly by population specific environmental differences.

Geographical differences in methylation between Indian and European groups
At all loci, average methylation was markedly lower in Europeans compared to Indians. The difference in ADCY5 methylation was the largest (6.3%). Although it is recognised these differences may be caused by systematic differences in analysis between the two cohorts, any potential variability was minimised by following identical analysis protocols conducted largely by the same researchers. Furthermore, systematic differences of similar magnitude would be expected across all of the loci studied and this was not observed.

SNP effects on metabolic traits were not mediated through change in methylation
A number of recent studies have shown that methylation patterns are to some extent linked to underlying genetic architecture [7,36,37]. We therefore tested the association between methylation levels and genotype in the European RISC cohort. Tentative evidence supporting a relationship between methylation and SNP genotype in ADCY5 was observed. Further work would allow this relationship to be confirmed both in a larger European cohort and in the IMS.
Linguistic, mitochondrial and Y-chromosome studies of North and South Indians show similarities between North Indian populations and western Eurasians [38][39][40][41]. We observed differences in rs9939609 allele frequency between RISC and Hyderabad, but not between RISC and Lucknow, which is consistent with these findings. For this reason, we had initially postulated that the Lucknow population would be more similar to RISC than Hyderabad in its metabolic and methylation profile. However, we observed that differences between Lucknow and RISC were more marked than differences between Hyderabad and RISC. This suggests that metabolic and methylation profiles might be influenced predominantly by regional environmental factors rather than by genetic factors.

The relationship between methylation and metabolic measures
We postulated that differences in DNA methylation in the loci tested might explain a proportion of the variation in metabolic risk. Although we identified that DNA methylation did vary between Indian and European groups, as did metabolic risk profiles, we were not able to confirm a direct relationship between DNA methylation at these loci and the metabolic measures tested. Thus the hypothesis that environmentally driven changes in epigenetic patterns following migration underlie shifts in cardiometabolic disease risk remains unresolved.

Comparison with other studies
The difference in DNA methylation between Indians and Europeans at these loci is of similar magnitude to differences described in other studies. Heijmans and colleagues [6] described altered methylation of 5.2% between periconceptional famine exposure groups in IGF2. Methylation differences in the same gene between neonates of small or appropriate birth weight for gestational age were 0.2% [42]. A study investigating methylation of pancreatic islets in type 2 diabetes patients and controls identified methylation differences of 4.4-9.3% between groups [43]. Methylation changes of this magnitude are known to elicit alterations in gene expression [43,44].
Other studies have also looked at the role of ethnicity in determining DNA methylation patterns. Global methylation differences of 2% have been observed between different racial groups living in North Texas, USA [45]. Another American study also described ethnic differences in DNA methylation, in this instance measuring genomic hypomethylation [46]. At the level of individual CpG sites, differences between ethnic groups have also been observed at birth [47]. Our data therefore support the limited available evidence that methylation patterns do differ between ethnic groups.
Emerging data point to the complex overlap between genotype, DNA methylation and gene expression [7,36,48]. Recent data from a Moroccan cohort identified that up to 25% of differential gene expression was explained by dwelling location even when genetic differences were small [49]. However, methylation analysis in a linked study suggested limited involvement of DNA methylation in driving these differences [50]. More work is therefore needed to further delineate these relationships.

Further work
This pilot study identified methylation differences between two ethnically distinct cohorts at three loci. One strategy to further investigate DNA methylation differences between these cohorts would be to employ a more comprehensive genome-wide approach. To establish the relative contributions of environment and genotype, analysis in a further group of Indians who have migrated to Europe could also be of value. Assuming methylation differences exist, these approaches may identify causally related shifts in metabolic profiles and associated disease risk.

Conclusions
This research described the analysis of DNA methylation at three distinct type 2 diabetes susceptibility loci. It highlights geographical and ethnic differences in methylation patterns, particularly between Indians and Europeans. These differences may be attributed to genetic and/or region-specific environmental factors and may help to explain the excess incidence of type 2 diabetes observed in South Asian populations.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.

Table: Paired analysis of metabolic measure differences between urban and rural siblings.