Department of Bio-informational Pharmacology, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan; Life Science and Bioethics Research Center, Tokyo Medical and Dental University, Tokyo, Japan
Department of Cardiovascular Medicine, Graduate School of Medicine, Tokyo Medical and Dental University, Tokyo, Japan; Heart Rhythm Center, Tokyo Medical and Dental University, Tokyo, Japan
Laboratory for Cardiovascular Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Department of Human Genetics and Disease Diversity, Graduate School of Medicine and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
Chromosome 4q25 has been repeatedly identified as an atrial fibrillation (AF)-sensitive locus in multiple genome-wide association studies (GWAS) and is thought to hold clues to AF pathogenesis. We aimed to investigate its clinical utility in a Japanese population and to clarify how the 4q25 locus affects transcription of adjacent genes.
Methods
We conducted an AF GWAS in a Japanese population (1382 AF cases and 1478 controls) and a replication study (1666 AF cases and 1229 controls) with detailed clinical information, which was used to assess the acceleration of AF onset. Stepwise investigations of the 4q25 locus were performed with linkage disequilibrium analysis, histone code patterns, and a reporter assay.
Results
The AF GWAS confirmed a significant association of rs4611994 and rs1906617 in chromosome 4q25 with AF. In the clinical analysis, AF onset in individuals with the risk allele was 2.5 years earlier than in those with the protective allele (p = 0.00012). In the functional analysis, three single nucleotide polymorphisms (SNPs) in the variant group selected by linkage disequilibrium analysis were identified by chromatin immunoprecipitation assay as candidate cis-regulatory elements for adjacent genes. Among them, the rs4611994 and rs72900144 regions showed higher transcriptional activity of the luciferase gene with the risk alleles than with the protective alleles (p < 0.0001 and p < 0.005, respectively).
Conclusions
The AF GWAS in a Japanese population confirmed the association with the 4q25 locus and indicated that its SNP was associated with earlier AF onset. The candidate regions of the causative SNPs, rs4611994 and rs72900144, could alter the expression level of adjacent genes.
AF has long been regarded as a lifestyle-related disease; it develops in association with other cardiovascular diseases and habitual exposure to alcohol and cigarettes. Recent multiple lines of evidence, however, implicate genetic factors in AF pathogenesis. A previous report highlighted the importance of genetic factors in AF development by observing a higher percentage of positive family history among patients with lone AF (15%) than among all AF patients (5%) who visited an arrhythmia clinic.
The closest gene to the AF-associated 4q25 single nucleotide polymorphisms (SNPs) is a transcription factor, PITX2, which plays a pivotal role in left/right determination of the heart in the early stages of embryonic development.
Thus, the AF-associated 4q25 SNPs have been proposed to facilitate AF by affecting PITX2 transcription and thereby the characteristics of pulmonary vein myocardium. In this study, we performed functional studies of the 4q25 locus with re-sequencing, linkage disequilibrium (LD) analysis, and histone analysis. We identified the SNPs that regulate the expression of adjacent genes.
Materials and methods
Study population
The participants in the GWAS screening were enrolled in Biobank Japan (Panel 1). Participants in the replication studies were collected from two independent cohorts: subjects enrolled at Tokyo Medical and Dental University (TMDU) and TMDU's associated facilities (Panel 2), and subjects from Biobank Japan and the Health Science Research Resource Bank (Panel 3). This protocol was approved by the research ethics committees of The University of Tokyo, Biobank Japan, and TMDU, respectively. Control DNA samples in Panel 3, extracted from B-cell lines of 1380 healthy volunteers, were purchased from the Health Science Research Resource Bank, Osaka, Japan. Written informed consent for participation in the study was obtained from all subjects. AF was diagnosed from the records of a twelve-lead electrocardiogram or an ambulatory electrocardiogram. The clinical background of the participants in the three panels is summarized in Table 1. The data for healthy volunteers in Panel 3 were unavailable.
Table 1. Clinical background of GWAS discovery and replication study.
First, we genotyped 195 AF cases and 1480 controls for 280 K SNPs from autosomal chromosomes. For the second screening, the top 7216 SNPs were validated in 1384 AF cases and 1483 controls using the Affymetrix GeneChip SNP array (Santa Clara, CA, USA), as described previously. We evaluated data quality for all SNPs under the criteria that the call rate was more than 90% and that the p-value for Hardy–Weinberg equilibrium was more than 10⁻⁶ (Supplementary Table 1A and B). Genotyping of SNPs in the replication study was performed with the Multiplex PCR/Invader assay of Third Wave Technologies (Madison, WI, USA), and genotyping of insertions/deletions (Indels) was performed by direct sequencing of PCR products using a capillary sequencer (ABI3700, Applied Biosystems, Foster City, CA, USA). We selected rs1906617 because it had a higher genotyping success rate (96.1%) than rs4611994 (91.7%). Protocols for PCR primer design, PCR experiments, DNA extraction, DNA sequencing, and SNP discovery were as previously described.
Haplotype blocks were constructed as previously described using the Haploview software (Broad Institute, Cambridge, MA, USA, https://www.broadinstitute.org/haploview/haploview) with minor modifications. Briefly, we first clustered SNPs that were ultimately linked to each other into a group, and then selected a single representative SNP from each group by estimating the haplotype phasing with the Expectation–Maximization algorithm. Second, we chose the set of common haplotypes with frequencies greater than 1%, so that the sum of the selected common haplotypes would cover more than 95% of the population. We then compared the frequency of each haplotype between cases and controls.
Chromatin immunoprecipitation assay
The chromatin immunoprecipitation (ChIP) assay was performed with ChIP-IT Express® (Active Motif, Carlsbad, CA, USA) according to the protocol described elsewhere. Mesenchymal stem cells (MSCs) were washed with phosphate-buffered saline (PBS) three times and cross-linked with 1% formaldehyde for 10 min at room temperature (25 ± 2 °C). According to the manufacturer's protocol, chromatin was isolated from cross-linked cells and sheared enzymatically for 10 min at 37 °C. Sheared chromatin was incubated overnight with a 1:1000-diluted anti-monomethylated histone H3 lysine 4 (H3K4me1) antibody, a 1:1000-diluted anti-trimethylated histone H3 lysine 27 (H3K27me3) antibody, or 1:1000-diluted negative control IgG, together with protein G magnetic beads. After adding the Reverse Cross-linking Buffer, DNA samples were incubated with Proteinase K (20 ng/ml) and purified by phenol–chloroform–isoamyl alcohol extraction and ethanol precipitation. ChIP-PCR was carried out with one of 22 pairs of primers (Supplementary Table 2). The expected PCR product size ranged from 70 bp to 200 bp. Since rs4611994 and rs4540107, and rs1906595 and rs1906596, are closely located, each of these two pairs of SNPs was analyzed with a single pair of primers. All other PCR products contain a single variant.
Luciferase reporter assay
Reporter assays were performed using the Dual-Luciferase Reporter Assay System (Promega, Madison, WI, USA). We sub-cloned DNA fragments from human samples with risk or protective homozygous genotype into the pGL4.23 vector with the minimal promoter (Promega, Madison, WI, USA). Supplementary Table 3 shows the chromosomal ranges used in the constructs for each variant. Twenty-four hours after transfection into HEK293T cells, we lysed the cells in passive lysis buffer (Promega, Madison, WI, USA) and measured luciferase activity with the Centro BL 960 system (BERTHOLD TECHNOLOGIES, Bad Wildbad, Germany). The same assay was carried out in triplicate.
Cell culture
MSCs were purchased from TAKARA Inc. (Shiga, Japan). For the ChIP assay, about 5 × 10⁶ cells were seeded in a 15 cm dish and grown for 4–5 days in special medium (TAKARA Inc.). For the luciferase assay, 5 × 10⁷ cells were prepared in a 15 cm dish and seeded in a 24-well plate after electroporation. All experiments were performed at passages 3–5.
Statistical analysis
We carried out statistical analysis for the association study, haplotype frequencies, Hardy–Weinberg equilibrium, and calculation of r² as described previously. In brief, the association between cases and controls was evaluated with χ² tests; in each case, the odds ratio and 95% confidence interval (95% C.I.) were calculated. LD coefficients D′ = D/Dmax were calculated. Hardy–Weinberg equilibrium of alleles at each SNP was assessed with the χ² statistic using SPSS (IBM Inc., Chicago, IL, USA). For survival analysis, the Kaplan–Meier method was employed. For the reporter assay, differences in continuous variables were analyzed with Student's t-test. The p-value threshold for statistical significance was adjusted with Bonferroni's correction where necessary.
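The association and Hardy–Weinberg tests described above can be sketched in a few lines. This is a minimal illustration, not the authors' SPSS workflow: the function names are hypothetical, the 95% confidence interval uses Woolf's log-odds method (an assumption, since the text does not name the method), and the χ² tests are the uncorrected 1-d.f. forms. The genotype counts are those listed for rs1906617 in Table 4.

```python
import math

def allele_counts(genotypes):
    """Convert (hom1, het, hom2) genotype counts to (allele1, allele2) counts."""
    hom1, het, hom2 = genotypes
    return 2 * hom1 + het, 2 * hom2 + het

def chi2_p_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

def allelic_association(case_genotypes, control_genotypes):
    """Allelic 2x2 test: odds ratio, 95% CI (Woolf, assumed), chi-square, p."""
    a, b = allele_counts(case_genotypes)      # cases: allele 1, allele 2
    c, d = allele_counts(control_genotypes)   # controls: allele 1, allele 2
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (b * c) / (a * d)            # odds of allele 2, cases vs controls
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se),
          math.exp(math.log(odds_ratio) + 1.96 * se))
    return odds_ratio, ci, chi2, chi2_p_1df(chi2)

def hardy_weinberg(genotypes):
    """Chi-square goodness-of-fit test against Hardy-Weinberg proportions."""
    hom1, het, hom2 = genotypes
    n = hom1 + het + hom2
    p = (2 * hom1 + het) / (2 * n)            # frequency of allele 1
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(genotypes, expected))
    return chi2, chi2_p_1df(chi2)

# rs1906617 genotype counts (AA, AG, GG) as listed in Table 4
af_cases = (184, 371, 286)
controls = (447, 668, 267)

odds_ratio, ci, chi2, p = allelic_association(af_cases, controls)
hwe_chi2, hwe_p = hardy_weinberg(controls)
print(f"OR = {odds_ratio:.3f}, 95% CI = {ci[0]:.2f}-{ci[1]:.2f}")
print(f"chi2 = {chi2:.2f}, p = {p:.1e}; control HWE p = {hwe_p:.2f}")
```

With these counts the sketch gives an odds ratio of about 1.658 and a χ² of about 66.24 for the G allele, in line with the values reported for rs1906617 in Table 4, and the controls pass the Hardy–Weinberg criterion of p > 10⁻⁶.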
Results
Genome-wide association study of AF in Japanese population
To identify loci associated with AF in the Japanese population, we conducted a GWAS. First, we used gene chips loaded with approximately 280 K SNPs for the screening study of 195 AF cases and 1483 controls. For the top 7216 SNPs ranked as AF candidate SNPs, we performed an association study, adding 1189 AF cases to the screening population (1384 AF cases and 1483 controls; Panel 1, Supplementary Table 1B). The clinical background is shown in Table 1. In this discovery study, chromosome 4q25 was identified as the most significant locus, which is consistent with previous studies (Fig. 1, Table 2). We collected samples for the first replication study from TMDU and its associated facilities (Table 1, Panel 2), and for the second replication study from Biobank Japan and the Health Science Research Resource Bank (Panel 3). The replication study confirmed the association of chromosome 4q25 with AF (p = 3.2 × 10⁻³⁸; Table 2). In the combined study, the 4q25 locus was significantly associated with AF (p = 3.3 × 10⁻⁸⁴; Table 2).
Fig. 1. Manhattan plot of GWAS for AF in Japanese. All the SNPs used in the screening (1384 AF cases and 1483 controls; Panel 1) are plotted in order of chromosomal position on the x-axis. The y-axis indicates −log10 of the p-value in the association study. GWAS: genome-wide association study; SNP: single nucleotide polymorphism; AF: atrial fibrillation.
Acceleration of AF onset in individuals with the risk allele of rs1906617
We investigated the relationship of 4q25 rs1906617 with clinical characteristics and parameters in Panel 2. Using 1322 individuals with detailed clinical data recruited from TMDU and its associated facilities (Table 1), we compared the distribution of AF onset age among individuals with different rs1906617 genotypes by survival analysis and observed a significant, dose-dependent difference (Fig. 2). Homozygous risk allele carriers were diagnosed with AF 2.5 years earlier than homozygous protective allele carriers (p = 0.00012). There were no differences among the three groups in clinical characteristics or parameters, including the history of diabetes mellitus or hypertension, except for the proportion of females.
Fig. 2. Survival analysis of AF onset according to SNP genotype. We compared the AF onset age of all the individuals collected from TMDU and its associated facilities according to the rs1906617 genotype of chromosome 4q25. The Kaplan–Meier method was employed and the p-value was calculated with the log-rank (Mantel–Cox) and generalized Wilcoxon (Breslow) tests. AF: atrial fibrillation; SNP: single nucleotide polymorphism; TMDU: Tokyo Medical and Dental University.
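For readers unfamiliar with the Kaplan–Meier method used for the onset-age comparison, the product-limit estimator can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the onset ages are invented placeholder values rather than study data, and the log-rank and Wilcoxon tests are not reproduced here. Exact fractions are used so that the median lookup is not affected by floating-point rounding.

```python
from fractions import Fraction

def kaplan_meier(onset_ages, observed=None):
    """Return [(age, fraction still AF-free)] by the product-limit estimator."""
    if observed is None:
        observed = [True] * len(onset_ages)   # assumption: every onset is observed
    data = sorted(zip(onset_ages, observed))
    at_risk = len(data)
    survival = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        age = data[i][0]
        events = 0    # AF onsets at this age
        leaving = 0   # individuals leaving the risk set at this age
        while i < len(data) and data[i][0] == age:
            if data[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events:
            survival *= Fraction(at_risk - events, at_risk)
            curve.append((age, survival))
        at_risk -= leaving
    return curve

def median_onset(curve):
    """First age at which the AF-free fraction drops to 50% or below."""
    for age, survival in curve:
        if survival <= Fraction(1, 2):
            return age
    return None

# HYPOTHETICAL onset ages (years) for two genotype groups, for illustration only
risk_homozygotes = [48, 52, 55, 58, 61, 63]
protective_homozygotes = [51, 55, 57, 60, 64, 66]

for label, ages in [("risk", risk_homozygotes), ("protective", protective_homozygotes)]:
    print(label, "median onset age:", median_onset(kaplan_meier(ages)))
```

Each step of the curve multiplies the running AF-free fraction by the share of at-risk individuals who remain AF-free at that age; comparing such curves between genotype groups is what the log-rank and generalized Wilcoxon tests formalize.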
Identification of functional variants through LD analysis in the 4q25
Next, to identify functional variants, we performed SNP discovery by direct re-sequencing of the 4q25 locus and categorized all the identified variants into groups according to the strength of linkage between variants. With reference to the HapMap data (http://www.hapmap.org), we targeted the 94 kb LD block between rs3853444 and rs60133733, which includes the GWAS marker SNPs. Re-sequencing of 24 individuals revealed a total of 155 variants, including 150 common SNPs, 4 Indels, and one short tandem repeat polymorphism. By genotyping 94 AF cases and 94 controls for all the identified variants, we calculated the strength of linkage with Haploview software 4.2, which provided the detailed LD structure (Fig. 3) and 22 tag-SNPs (Table 3). These 22 variant groups were ranked according to the magnitude of association in a case-control study [850 AF cases and 1373 healthy volunteers (Panel 3 in Table 2)] for all the tag-SNPs (Table 3). Thirteen of the 22 tag-SNPs were significantly associated with AF (corrected p-value < 0.0023); among them, the group including rs1906617 (referred to as variant group 1) was the most significantly associated with AF. Variant group 1 included 23 SNPs, among them rs2200733, whose association with AF was reported in the initial AF GWAS. Evaluation of combinations of the 13 associated tag-SNPs (p < 0.0023) revealed that the most significant combination consisted of variant groups 1, 2, 3, and 19 (Supplementary Table 4), and the combination of these 4 variants produced five haplotypes (referred to as Haplo1-5). Although Haplo1 was the most significantly associated with AF (p = 7.6 × 10⁻¹⁷), the frequency of Haplo1 mostly depended on the risk allele frequency of variant group 1 (p = 4.0 × 10⁻¹⁶), suggesting that the significance of Haplo1 was provided almost exclusively by the effect of variant group 1. These data suggest that one or more variants in variant group 1 might contribute independently and predominantly to AF onset. Therefore, we focused on variant group 1 to further examine the epigenetic properties of the AF-associated 4q25 variants.
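The significance threshold of 0.0023 quoted above is consistent with a Bonferroni correction of a 0.05 significance level over 22 tests; the 0.05 level is an assumption here, since the text states only the corrected threshold. A minimal sketch, using the p-values listed in Table 4, shows how the count of 13 significant tag-SNPs follows:

```python
# p-values for the 22 representative SNPs as listed in Table 4
p_values = {
    "rs1906617": 4.0e-16, "rs6854111": 1.5e-15, "rs2634073": 1.2e-13,
    "rs1906618": 1.2e-13, "rs60409120": 1.6e-12, "rs17042144": 1.0e-11,
    "rs10019689": 4.2e-11, "rs10024267": 7.4e-10, "rs6533531": 1.2e-9,
    "rs11931959": 2.9e-6, "rs3866832": 4.0e-6, "rs3855819": 3.9e-4,
    "rs2723320": 1.7e-3, "4q25_38": 0.03, "rs3866830": 0.09,
    "rs60133733": 0.14, "rs3853444": 0.28, "rs10027473": 0.29,
    "rs4400058": 0.31, "rs1906606": 0.37, "rs9998815": 0.90,
    "rs10007382": 0.96,
}

ALPHA = 0.05                          # assumed family-wise significance level
threshold = ALPHA / len(p_values)     # Bonferroni-corrected per-test threshold

significant = [snp for snp, p in p_values.items() if p < threshold]

print(f"threshold = {threshold:.4f}")
print(f"{len(significant)} of {len(p_values)} SNPs pass:", ", ".join(significant))
```

The weakest SNP that still passes is rs2723320 (p = 1.7 × 10⁻³), just under the corrected threshold of about 0.0023.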
Fig. 3. Fine mapping and restructure of the LD block of the target area in 4q25. The LD block was constructed from genotyping data of 188 individuals. The r² method was employed. A standard color scheme is used to display LD, with black for very strong LD (r² = 1), white for no LD (r² = 0), and gray for intermediate LD. The solid black triangle in the block indicates an LD structure formed after fine mapping that includes all the SNPs in variant group 1 (black arrows) and variant group 2 (red arrows). LOC729065 is a hypothetical gene, which has not been annotated as a protein-coding gene. LD: linkage disequilibrium; SNP: single nucleotide polymorphism.
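The pairwise LD measures referred to in this study, D′ = D/Dmax and r², can be computed from haplotype and allele frequencies as sketched below. The function name is hypothetical and the frequencies in the example are invented for illustration; in the study itself these values were obtained with Haploview from phased genotype data.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# HYPOTHETICAL frequencies, for illustration only
d, d_prime, r2 = ld_measures(p_ab=0.40, p_a=0.50, p_b=0.40)
print(f"D = {d:.3f}, |D'| = {d_prime:.2f}, r2 = {r2:.2f}")
```

In this example allele B occurs only on haplotypes that carry allele A, so |D′| reaches 1 while r² stays below 1 because the two allele frequencies differ. This is why r² is the stricter criterion when grouping variants into tag-SNP groups.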
Since there was no protein-coding gene or microRNA in the 94 kb region around the AF-associated SNPs, we hypothesized that this region acts through a long-range cis-regulatory mechanism, and thus investigated the histone code pattern at each of the 23 SNPs in this group. Monomethylated histone H3 lysine 4 (H3K4me1) has been used as an enhancer mark. We assessed the regions flanking each of the 23 SNPs in variant group 1 with ChIP followed by semi-quantitative PCR (Supplementary Table 2), and identified the three regions containing rs4611994, rs2200732, and rs72900144 as candidate cis-regulatory elements because they gave a positive enhancer and/or silencer signal by ChIP-PCR (Fig. 4). While the regions with rs2200732 and rs72900144 were enriched only in H3K4me1, the rs4611994 region was enriched in both H3K4me1 and H3K27me3, and was therefore thought to be in the bivalent state.
Fig. 4. Histone code pattern of all the variants in variant group 1. The top row is the ChIP result for the H3K4me1 antibody, the middle row for H3K27me3, and the bottom row for IgG as a negative control. The numbers of the SNPs in variant group 1 follow the same order as the chromosomal mapping in Fig. 3 and Supplementary Table 2. Arrows indicate positive ChIP-PCR results for H3K4me1 and H3K27me3. Positive bands that were also found with the negative control IgG were considered pseudo-positive. M indicates the DNA size marker. ChIP: chromatin immunoprecipitation; SNP: single nucleotide polymorphism; PCR: polymerase chain reaction; DNA: deoxyribonucleic acid.
To test whether each candidate region affected the transcriptional activity of the neighboring genes, we conducted a reporter assay in HEK293T cells with a luciferase vector in which either the risk allele or the protective allele of rs4611994, rs2200732, or rs72900144 was inserted upstream of the minimal promoter. Comparison of the cis-regulatory effect between the risk and protective alleles showed higher promoter activity with the risk allele than with the protective allele for rs4611994 and rs72900144 (p < 0.0001 and p < 0.005, respectively), whereas there was no difference in promoter activity between the risk and protective alleles of rs2200732 (Fig. 5).
Fig. 5. Enhancer reporter assay. (A) The effects of SNPs in 4q25 on luciferase activity. Enhancer reporter plasmids were co-transfected with pRL-TK as described in the Materials and methods section. Luciferase activity was measured 24 h after transfection by using the Dual-Glo Luciferase assay system. (B) The targeted sequences with the risk or protective allele were cloned in front of a minimal promoter-driven luciferase reporter. The single (*) and double (**) asterisks indicate p < 0.0001 and p < 0.005, respectively. SNP: single nucleotide polymorphism.
In this study, we conducted a comprehensive investigation linking clinical data to experimental analysis. Several GWAS of AF have repeatedly shown the association of the 4q25 region with AF and implied the potential use of the 4q25 SNPs in future clinical practice. Our data replicated the previous findings in a Japanese population and shed light on new aspects of their clinical properties. Histone code analysis and the reporter assay suggested that the 4q25 region includes cis-regulatory elements related to the expression of neighboring genes.
As for clinical implications, our data suggest the possibility of evaluating the genetic predisposition to early onset of AF and recurrence after pulmonary vein isolation. Homozygous risk allele carriers were diagnosed with AF 2.5 years earlier than homozygous protective allele carriers. Combining this finding with clinical parameters may facilitate decision-making in the management of patients, although the clinical application of genotype alone may be limited at present. To detect functional variants and elucidate the function of 4q25, we performed LD analysis, which enabled us to categorize all variants into 22 groups. When the 22 variant groups were ranked according to the significance of their association with AF, the top 13 variant groups were independently relevant under Bonferroni's correction (Table 4). To avoid underestimating the independence among variants, we also re-evaluated LD with a threshold r² value of 0.5. This still depicted at least four independent AF genetic risk groups, suggesting the presence of multiple independent genetic risks facilitating AF in the 4q25 locus. Among them, we focused on the variant group with the lowest p-value: the histone code analysis showed that the sequence surrounding rs4611994 is likely to be in a bivalent state, with high enrichment of both H3K4me1, an enhancer mark, and H3K27me3, a silencer mark, in MSCs. The subsequent reporter assay indicated that rs4611994 had a cis-regulatory function to suppress promoter activity. By comparing the strength of the cis-regulatory function between the risk and protective alleles in three regions, we observed a higher effect of the risk alleles of rs4611994 and rs72900144 on the transcriptional activity of adjacent genes. We used the TFBIND software (http://tfbind.hgc.jp/), which searches for transcription factor DNA binding sequences. It predicted that TFAP4 would bind to the rs4611994 sequence and RUNX1 to the rs72900144 sequence. These predictions need to be validated in wet experiments in the future. Recently, ChIP-Seq analysis has shown that a large cluster of multiple enhancers can control gene expression in disease development (e.g., super-enhancers). Our target LD block includes a hypothetical gene, LOC729065, whose expression we could not confirm in MSCs. The aggregate of enhancers might affect the expression of the adjacent genes PITX2 or ENPEP, which are located 150 kbp and 250 kbp, respectively, from our target LD block. Our findings might suggest that the risk alleles of this cluster of AF-associated SNPs could contribute to higher gene expression.
Table 4. Analysis of representative 22 SNPs in 4q25.

| SNP | Genotypes (1/2/3) | r² (a) | AF (1/2/3) | AF total | Control (1/2/3) | Control total | OR | χ² | p-value |
|---|---|---|---|---|---|---|---|---|---|
| rs1906617 | AA/AG/GG | – | 184/371/286 | 841 | 447/668/267 | 1382 | 1.658 | 66.24 | 4.0 × 10⁻¹⁶ |
| rs6854111 | AA/AT/TT | 0.34 | 48/271/529 | 848 | 156/584/627 | 1367 | 1.765 | 63.65 | 1.5 × 10⁻¹⁵ |
| rs2634073 | AA/AG/GG | 0.34 | 534/261/50 | 845 | 648/559/150 | 1357 | 1.705 | 55.06 | 1.2 × 10⁻¹³ |
| rs1906618 | TT/TC/CC | 0.68 | 226/366/241 | 833 | 499/621/216 | 1336 | 1.594 | 55.05 | 1.2 × 10⁻¹³ |
| rs60409120 | TT/TC/CC | 0.71 | 231/375/236 | 842 | 214/629/517 | 1360 | 1.555 | 49.89 | 1.6 × 10⁻¹² |
| rs17042144 | TT/TC/CC | 0.58 | 279/384/183 | 846 | 591/590/163 | 1344 | 1.54 | 46.28 | 1.0 × 10⁻¹¹ |
| rs10019689 | AA/AC/CC | 0.54 | 193/370/286 | 849 | 167/610/587 | 1364 | 1.517 | 43.53 | 4.2 × 10⁻¹¹ |
| rs10024267 | TT/TC/CC | 0.47 | 321/356/141 | 818 | 665/564/122 | 1351 | 1.499 | 37.92 | 7.4 × 10⁻¹⁰ |
| rs6533531 | TT/TG/GG | 0.1 | 7/120/718 | 845 | 21/340/1006 | 1367 | 1.889 | 37.02 | 1.2 × 10⁻⁹ |
| rs11931959 | AA/AG/GG | 0.45 | 89/339/409 | 837 | 187/660/520 | 1367 | 1.361 | 21.90 | 2.9 × 10⁻⁶ |
| rs3866832 | GG/CG/CC | 0.06 | 1/98/746 | 845 | 11/249/1114 | 1374 | 1.74 | 21.25 | 4.0 × 10⁻⁶ |
| rs3855819 | GG/CG/CC | 0.13 | 559/180/23 | 762 | 884/392/60 | 1336 | 1.361 | 12.57 | 3.9 × 10⁻⁴ |
| rs2723320 | AA/AG/GG | 0.13 | 114/344/362 | 820 | 123/566/651 | 1340 | 1.232 | 9.80 | 1.7 × 10⁻³ |
| 4q25_38 | AA/AG/GG | 0 | 682/160/7 | 849 | 1158/208/11 | 1377 | 1.253 | 4.58 | 0.03 |
| rs3866830 | CC/CG/GG | 0.25 | 513/269/50 | 832 | 759/470/86 | 1315 | 1.133 | 2.83 | 0.09 |
| rs60133733 | TT/TG/GG | 0.01 | 420/348/77 | 845 | 714/525/110 | 1349 | 1.108 | 2.23 | 0.14 |
| rs3853444 | GG/AG/AA | 0.01 | 65/326/434 | 825 | 103/587/670 | 1360 | 1.078 | 1.16 | 0.28 |
| rs10027473 | TT/TC/CC | 0.05 | 748/87/3 | 838 | 1220/122/4 | 1346 | 1.158 | 1.10 | 0.29 |
| rs4400058 | TT/TC/CC | 0.28 | 50/278/517 | 845 | 77/492/794 | 1363 | 1.078 | 1.04 | 0.31 |
| rs1906606 | AA/AC/CC | 0.26 | 504/274/52 | 830 | 781/485/80 | 1346 | 1.069 | 0.81 | 0.37 |
| rs9998815 | CC/CG/GG | 0.14 | 430/350/68 | 848 | 697/556/117 | 1370 | 1.009 | 0.02 | 0.90 |
| rs10007382 | AA/AG/GG | 0 | 305/371/125 | 801 | 521/617/215 | 1353 | 1.003 | 0.00 | 0.96 |

(a) r² indicates the value obtained from LD analysis with rs1906617. Genotype counts are given in the order of the genotypes listed for each SNP. AF: atrial fibrillation; SNP: single nucleotide polymorphism; OR: odds ratio.
In summary, our findings provide a clue linking the 4q25 SNPs with AF development. The clinical analysis showed that the 4q25 genotype could be a predictive marker for early AF onset. The combined analysis of linkage and histone modification identified a candidate cluster of cis-regulatory regions, rs4611994 and rs72900144. We propose that the rs4611994 and rs72900144 genotypes could alter the expression level of adjacent genes in MSCs and could distinguish AF onset age and recurrence rate after pulmonary vein isolation (PVI). In conclusion, the 4q25 region could accelerate AF onset or recurrence after PVI through cis-regulatory effects on the expression of adjacent genes.
Funding
This work was supported by a Grant from the Tailor-made Medical Treatment Program (1K157), a Grant-in-Aid (26293052) from MEXT of Japan, the Practical Research Project for Life-Style Related Disease Including Cardiovascular Disease and Diabetes Mellitus from the Japan Agency for Medical Research and Development, AMED (I5656344), Translational Research Funds from the Japanese Circulation Society, and the Joint Usage/Research Program of the Medical Research Institute, Tokyo Medical and Dental University.
Conflict of interest
None.
Acknowledgments
We thank all the participants in this study. We also thank all the members and technical staff of our laboratory.