
Whole-exome sequencing in familial atrial fibrillation

Peter Weeke, Raafia Muhammad, Jessica T. Delaney, Christian Shaffer, Jonathan D. Mosley, Marcia Blair, Laura Short, Tanya Stubblefield, Dan M. Roden, Dawood Darbar, National Heart, Lung, and Blood Institute (NHLBI) GO Exome Sequencing Project
DOI: http://dx.doi.org/10.1093/eurheartj/ehu156 First published online: 11 April 2014


Aims: Positional cloning and candidate gene approaches have shown that atrial fibrillation (AF) is a complex disease with familial aggregation. Here, we employed whole-exome sequencing (WES) in AF kindreds to identify variants associated with familial AF.

Methods and results: WES was performed on 18 individuals in six modestly sized familial AF kindreds. After filtering very rare variants by multiple metrics, we identified 39 very rare and potentially pathogenic variants [minor allele frequency (MAF) ≤0.04%] in genes not previously associated with AF. Despite stringent filtering, more than one very rare variant was identified in five of the six kindreds, whereas no plausible variants contributing to familial AF were found in the remaining kindred. Two candidate AF variants in calcium channel subunit genes (CACNB2 and CACNA2D4) were identified in two separate families using expression data and predicted function.

Conclusion: By coupling family data with exome sequencing, we identified multiple very rare potentially pathogenic variants in five of six families, suggestive of a complex disease mechanism, whereas none were identified in the remaining AF pedigree. This study highlights some important limitations and challenges associated with performing WES in AF, including the importance of having large well-curated multi-generational pedigrees, the issue of potential AF misclassification, and limitations of WES technology when applied to a complex disease.

  • Atrial fibrillation
  • Genetics
  • Exome
  • Family study
  • Calcium signalling


Atrial fibrillation (AF), the most common sustained cardiac arrhythmia, is associated with impaired functional status, reduced quality of life and increased mortality. While the etiology of AF is multifactorial, the mechanisms underlying susceptibility to most forms of AF remain unknown.1 There is increasing evidence that AF is a heritable disorder, especially ‘lone AF’.2,3

Over the last decade, positional cloning and candidate gene approaches have shown that AF is a complex disease associated with familial aggregation. Rare genetic variants encoding cardiac ion channels and signalling proteins have been identified in familial AF kindreds.4,5 Conversely, genome-wide association studies (GWAS) have identified common AF susceptibility risk alleles and implicated genes encoding transcription factors related to cardiopulmonary development and cell signalling molecules.3 However, GWAS cannot determine the effects of rare variants, which may contribute to larger effect sizes in individual families and thus may collectively be responsible for substantial genetic variability of AF.6

In the present study, we report the use of whole-exome sequencing (WES) in six familial AF kindreds. This approach has identified rare candidate variants in genes not previously linked to other types of Mendelian disease7–9 and thus may offer new insights into AF pathogenesis and disease pathways that could ultimately provide novel therapeutic targets for this common condition.


The Vanderbilt AF registry

Since 2002, patients with AF have prospectively been enrolled in the Vanderbilt AF Registry, a clinical and DNA repository.10 Patients must have documented AF or atrial flutter, concurrent use of one or more anti-arrhythmic drugs or atrio-ventricular nodal blockers to control ventricular rates, provide informed consent, report for follow-up, and be >18 years old. We excluded patients with a history of AF related to cardiac surgery. In all instances, AF was documented with an ECG, a rhythm strip, or a Holter monitor.

Study subjects and family data

Figure 1 depicts the process we used to select families for this study. We included families in which the proband, defined as the first member of the kindred encountered in clinic, had early onset (age ≤65 years) lone AF and did not have evidence of underlying structural (including left ventricular hypertrophy and valvular disease) or systemic disease (including hypertension, renal disease, and diabetes) as determined by clinical examination, including laboratory values, echocardiography, and thyroid function tests. Additional affected family members did not have to be lone AF patients. The proband in the AF5 family (III-2) had symptomatic paroxysmal AF documented at 67 years old, although symptoms started at 65 years (see the Supplementary material online, Appendix S1).

Figure 1

Flowchart of family selection process.

Whole-exome and Sanger sequencing

WES targets the coding regions of the human genome (∼3%). In brief, we extracted DNA from peripheral blood leukocytes or saliva. Sequencing was performed on the HiSeq2000 platform (Illumina, San Diego, CA, USA) after in-solution enrichment of exonic and adjacent intronic sequences [SureSelect Human All Exon 44 Mb kit v2 (AF3, AF4, AF5, and AF6) and SureSelect Human All Exon 50 Mb kit (AF1 and AF2); Agilent, Santa Clara, CA, USA]. Families AF3, AF4, AF5, and AF6 were sequenced at the Broad Institute and families AF1 and AF2 at the Johns Hopkins Center for Inherited Disease Research. This generated 76 bp paired-end reads per run, which yielded 9.2 Gb of sequence on average with ∼15 000 genes being covered. All samples had a transition/transversion ratio of >3.0 in coding regions and ∼85% of the target was covered at ≥20× on average. The full methods used for both Sanger sequencing and WES, variant analysis, and sequencing statistics are described in the Supplementary material online, Appendix S1.

Variant selection and filtering

The variant filtering process was designed to identify very rare amino acid changing (AAC) variants (i.e. missense, nonsense, structural variants, or splice site variants) shared by individual members of the same kindred. Synonymous variants were not considered. All variants were screened against dbSNP build 134, the 1000 Genomes Project, and the Exome Sequencing Project (ESP) of 5400 individuals. Assuming that both functionally important and pathogenic AF variants may be present in the ESP data but selected against, we only included very rare variants with a minor allele frequency (MAF) ≤0.04%; variants with an MAF >0.04% have previously been associated with less certain pathogenicity.11 In addition, we screened all of the identified very rare AAC variants against a set of 286 internal reference exome sequences of European American ancestry that were drawn from studies on adverse drug reactions following predominantly antiarrhythmic pharmacotherapy. Hence, these exomes may be enriched for pathogenic variants, which means that defining the optimal cut-off threshold is challenging. Here, we excluded variants present in >5 of 286 internal reference exomes, as we have done in previous experiments searching for very rare variants associated with cardiovascular phenotypes.7 This additional filtering step reduces the likelihood of identifying false-positive variants driven by population stratification. All identified very rare AAC variants were reanalysed by Sanger sequencing. Variant confirmation analysis was performed by Sanger sequencing in additional family members who were affected with AF and had available DNA. As AF is a highly heterogeneous disease with variable age-dependent penetrance, it is challenging to confidently determine whether a patient is truly unaffected, because individuals who carry a susceptibility variant may manifest AF at an older age.
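The filtering steps described above can be sketched in code. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, and the example variants are hypothetical, and only the thresholds (MAF ≤0.04%, present in no more than 5 of 286 internal reference exomes, AAC consequences only) are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the text; everything else is an illustrative assumption.
MAX_MAF = 0.0004            # 0.04% expressed as a fraction
MAX_INTERNAL_CARRIERS = 5   # out of 286 internal reference exomes
AAC_CONSEQUENCES = {"missense", "nonsense", "structural", "splice_site"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one variant shared by the sequenced relatives."""
    gene: str
    consequence: str
    in_dbsnp_or_1000g: bool
    esp_maf: Optional[float]   # None means absent from ESP5400 (novel)
    internal_carriers: int     # carriers among the internal reference exomes


def passes_filters(v: Variant) -> bool:
    """Apply the consequence, public-database, and internal-exome filters in order."""
    if v.consequence not in AAC_CONSEQUENCES:
        return False  # synonymous variants were not considered
    if v.in_dbsnp_or_1000g:
        return False  # must be absent from dbSNP and the 1000 Genomes Project
    if v.esp_maf is not None and v.esp_maf > MAX_MAF:
        return False  # present in ESP but too common
    return v.internal_carriers <= MAX_INTERNAL_CARRIERS


# Made-up example variants (placeholder gene names, not study results).
candidates = [
    Variant("GENE_A", "splice_site", False, None, 0),
    Variant("GENE_B", "missense", False, 0.0003, 2),
    Variant("GENE_C", "missense", False, 0.002, 0),
    Variant("GENE_D", "synonymous", False, None, 0),
    Variant("GENE_E", "nonsense", False, None, 9),
    Variant("GENE_F", "missense", True, None, 0),
]
kept = [v.gene for v in candidates if passes_filters(v)]
print(kept)
```

In this toy input only the first two variants survive; the others fail on ESP frequency, consequence type, internal-exome count, and public-database presence, respectively.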

We accessed microarray data from the University of California Santa Cruz to evaluate expression in cardiac tissue (http://genome.ucsc.edu).


Approval of the AF Registry protocol was obtained from the Vanderbilt Institutional Review Board (IRB) and a written informed consent was obtained from each study subject prior to enrolment.


We identified a total of 33 individuals with lone AF in the Vanderbilt AF registry who also had ≥2 affected family members. Of these, we studied six families of European-Caucasian ancestry (Figure 1). Nineteen subjects with documented AF from these six families were selected for WES; where possible, non-first degree relatives were analysed. The clinical characteristics of affected family members who were selected for WES or variant confirmation are listed in Supplementary material online, Table S1. The median age at time of diagnosis for the lone AF probands was 53.5 years (range 18–67 years). Among individuals undergoing WES, the median age at time of diagnosis was 61 years (range 18–76 years), with paroxysmal AF being the most common type of AF (n = 13, 68%). Individual level information is available in Supplementary material online, Appendix S1 and Table S1.

Sequencing was successful in 18 patients; it was unsuccessful in the AF3 proband (III-9). Assuming that any pathogenic variant would segregate among all affected family members, all very rare AAC variants were identified and validated (Table 1) among the two other affected members of this kindred (AF3) who also had WES done (II-7 and III-2). These were then verified in the proband (III-9) by Sanger sequencing. This approach generated a final list of shared very rare variants that could be tested among additional affected family members.
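The sharing logic used for the AF3 kindred amounts to a set intersection followed by a confirmation step. The sketch below is illustrative only: the variant identifiers and genotyping results are hypothetical placeholders, and only the individual labels (II-7, III-2, and the proband III-9) come from the text.

```python
# Hypothetical variant identifiers called in each exome-sequenced affected relative.
exome_calls = {
    "II-7": {"var1", "var2", "var3", "var5"},
    "III-2": {"var2", "var3", "var4", "var5"},
}

# Hypothetical Sanger results for the proband (III-9), whose exome failed.
sanger_confirmed_in_proband = {"var2", "var5", "var9"}

# Variants shared by every exome-sequenced affected relative.
shared = set.intersection(*exome_calls.values())

# Keep only the shared variants that were also confirmed in the proband.
final_candidates = sorted(shared & sanger_confirmed_in_proband)
print(sorted(shared), final_candidates)
```

Adding another affected relative to `exome_calls` can only shrink `shared`, which is why sequencing more, and more distantly related, affected individuals narrows the candidate list.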

Table 1

Filtering strategy for shared variants in six atrial fibrillation families identified by whole-exome sequencing

Filtering step | AF1 | AF2 | AF3a | AF4 | AF5 | AF6
Shared nonsense, missense, structural variants and splice-site variants | 5711 | 5318 | 7020 | 5546 | 5561 | 4992
Absent from dbSNP build 134 and 1000 Genomes Project | 17 | 9 | 66 | 33 | 26 | 14
Rare: present in ESP5400 with an MAF ≤0.04% | 2 | 1 | 9b | 8 | 9 | 3
Novel: absent from ESP5400 | 9 | 7 | 1c | 7 | 5 | 4
Total rare and novel by WES and database filtering | 11 | 8 | 10 | 15 | 14 | 7
Present in ≤5 of 286 internal reference exomes | 10 | 7 | 10 | 15 | 14 | 6
Rare and novel variants that were confirmed by Sanger sequencing to be present in ≥1 affected family member | 10 | 7 | 5 | 10 | 7 | 0

  • ESP5400, Exome Sequencing Project of 5400 individuals; MAF, minor allele frequency.

  • a Exome sequencing failed in the proband (III-9) of the AF3 family.

  • b 9/40 variants identified between II-7 and III-2 were confirmed in the proband of the AF3 family by Sanger sequencing.

  • c 1/9 variants identified between II-7 and III-2 was confirmed in the proband of the AF3 family by Sanger sequencing.

On average, WES identified 11 shared very rare AAC variants per family which were validated by Sanger sequencing; the range in individual families was 7–15. Detailed variant information including in silico conservation scores and functional predictions for all variants that were also confirmed in affected family members are listed in Supplementary material online, Table S2 (n = 39).
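The summary figures quoted above can be checked against the per-kindred counts in Table 1. The short sketch below assumes the table columns run AF1 to AF6 in order; the variable names are ours.

```python
from statistics import mean

# Table 1 counts per kindred, assumed to be in the order AF1..AF6.
rare_in_esp = [2, 1, 9, 8, 9, 3]          # rare: present in ESP5400
novel = [9, 7, 1, 7, 5, 4]                # novel: absent from ESP5400
total_rare_and_novel = [11, 8, 10, 15, 14, 7]
sanger_confirmed = [10, 7, 5, 10, 7, 0]   # confirmed in >=1 affected relative

# The "total" row should equal rare + novel for every kindred.
row_sums = [r + n for r, n in zip(rare_in_esp, novel)]
totals_consistent = row_sums == total_rare_and_novel

average_total = mean(total_rare_and_novel)
total_confirmed = sum(sanger_confirmed)

print(totals_consistent)
print(round(average_total, 1), min(total_rare_and_novel), max(total_rare_and_novel))
print(total_confirmed)
```

Under this reading the per-kindred totals average 10.8 (reported as 11) with a range of 7–15, and the Sanger-confirmed variants sum to the 39 listed in Table S2.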

Results by individual kindreds

AF1: WES was performed on III-1 (proband), II-3, and II-5. One additional affected family member with available DNA (II-2) was used for variant confirmation analysis (Figure 2). Of the 11 very rare AAC variants identified by WES, 10 were present in ≤5 of 286 internal reference exomes and confirmed by Sanger sequencing in one additional affected family member in the following genes: CACNB2, CHRNA10, ADAM17, MCM8, TLN1, DONSON, IGFN1, AIFM1, SLC15A4, and PTCH2 (Table 1). All of the very rare AAC variants identified by WES, including functional predictions and conservation scores, are listed in Supplementary material online, Table S2. Of these genes, CACNB2, TLN1, IGFN1, and ADAM17 have previously been associated with cardiac conduction and arrhythmia (CACNB2), cardiovascular function (IGFN1 and TLN1), and cardiac development (CACNB2 and ADAM17) (Supplementary material online, Table S3). Supplementary material online, Table S3 lists information on cardiac expression, putative gene function, and known associations for each gene harbouring a very rare AAC variant identified by WES. In brief, CACNB2 encodes the β2-subunit of the L-type voltage-gated calcium channel encoded by CACNA1C; IGFN1 is involved in adaptive responses to mechanical inputs by bringing together structural and signalling proteins; TLN1 and ADAM17 are both implicated in cell–cell and cell–matrix interactions. In the AF1 kindred, the variant identified in CACNB2 is in a canonical acceptor splice site (Supplementary material online, Table S2). For the remaining five lone AF kindreds (AF2, AF3, AF4, AF5, and AF6), we applied a filtering strategy identical to the one outlined above. A detailed description of the AAC variant filtering steps for each family is given in the supplement.

Figure 2

Pedigrees of atrial fibrillation families. Clinically affected AF family members are denoted by filled squares or circles. Separate symbols (images not reproduced here) mark affected males, affected females, males, females, deceased individuals, individuals who underwent whole-exome sequencing, individuals included in segregation analysis by Sanger sequencing, and probands; F = failed whole-exome sequencing.

AF2: Seven very rare AAC variants passed filtering (Table 1) in the following genes: CACNA2D4, HSF1, GPR20, C3orf33, and KIF21BI. Of these, CACNA2D4 and HSF1 have previously been associated with cardiac conduction (CACNA2D4), arrhythmias and cardiovascular function (HSF1). In brief, CACNA2D4 encodes a regulatory subunit that alters the properties of pore-forming alpha-1 subunits including CACNA1A; HSF1 encodes a molecular chaperone enhancing refolding of denatured proteins and degradation of damaged proteins (Supplementary material online, Table S3).

AF3: Five very rare AAC variants passed filtering (Table 1) in the following genes: SCNN1D, OTUD7B, PARP2, PIP5K1A, and GALNT6 (Supplementary material online, Table S2). SCNN1D has previously been associated with cardiovascular function. In brief, SCNN1D encodes a non-voltage-gated sodium ion channel that is expressed in both atria (Supplementary material online, Table S3).

AF4: 15 very rare AAC variants passed filtering (Table 1) in the following genes: Ellis-van Creveld 2 (EVC2), OVOS, COMMD3, HOXA9, NECAB3, ZFP64, BTC, KCMF1, and TNFSF15. EVC2 and HOXA9 have previously been associated with cardiac development. In brief, EVC2 is associated with the Mendelian disorder EVC syndrome and congenital cardiac disease; HOXA9 is part of a developmental regulatory system that provides cells with specific positional identities (Supplementary material online, Table S3).

AF5: Seven very rare AAC variants passed filtering (Table 1) in the following genes: ATP6V1A, CD244, ITLN1, DOK3, ZFP2, ADAMTS2, and INPP5D (Supplementary material online, Table S2). INPP5D has previously been associated with cardiac conduction and arrhythmia. In brief, INPP5D is involved in the control of cell–cell junctions and exerts important physiological and pathological functions in the heart (Supplementary material online, Table S3).

AF6: Zero very rare AAC variants passed filtering (Table 1).

Based on predicted function, involvement in cardiac development, tissue expression, and previous disease associations, we designated 7 of the 39 variants as high-priority candidate AF variants; these were in the following genes: EVC2, CACNB2, CACNA2D4, HSF1, and SCNN1D (Supplementary material online, Table S3). However, in all instances, the high-priority variant was present in ≥1 unaffected individual in each family.


The goal of this study was to use WES to identify novel genes linked with familial AF in families with a complex inheritance pattern. We employed a rigorous filtering strategy aimed at identifying a single disease-causing variant in each family, but identified multiple very rare AAC variants in five of the six familial AF kindreds. Further, in the remaining familial AF kindred, we did not identify any rare variant shared among all affected individuals. Collectively, these findings (i) support the notion that AF is a complex disease with familial aggregation; (ii) highlight some important challenges and limitations associated with applying WES to this common arrhythmia; and (iii) suggest that causative variants may not be in the coding regions of the human genome.

Our familial AF kindreds are suggestive of a monogenic inheritance pattern (i.e. multiple affected individuals in one generation or linearly affected individuals in multiple generations), justifying the use of WES for genetic discovery (Figure 1).7,12 However, we did not conclusively identify one pathogenic variant in any of the families, highlighting some methodological difficulties and limitations associated with this genetic approach to AF. One of the major challenges of WES is identifying disease-causing variants from a large pool of variants.13 We employed a stringent filtering strategy that excluded variants that were common according to publicly available databases and 286 internal reference exomes.7–9 The inclusion of local internal reference exomes ensured that the final list of variants was indeed very rare; unlike most public databases, local samples are more likely to represent regional ancestry and allele frequency patterns, thereby reducing the risk of identifying variants that are driven by population stratification.14 While some studies have relied on in silico prediction of variant function, we did not employ such a filter, as it is prone to miss causative variants due to inaccurate functional prediction and incongruent functional prediction between various prediction schemes and, more importantly, does not provide any evidence of causality.7,11

The success of WES studies in Mendelian disease hinges on several important premises, including the size of the pedigree, the accuracy of phenotyping, and the disease itself. The advantages of having well-curated large multi-generational pedigrees include easier recognition of inheritance patterns, the ability to select remotely affected relatives for WES, which in turn facilitates the identification of a causal variant, and the ability to perform segregation analysis. While we identified well-curated pedigrees, the families were only modest in size, which hindered the selection of distantly related individuals that would have considerably reduced the number of shared potential causative variants for each family. Wells et al. recently showed that distantly chosen subjects for WES were predicted to share <0.1% of their genomes. However, despite the fact that they evaluated dilated cardiomyopathy, a highly penetrant monogenic disorder caused by a single rare variant, the authors faced similar issues to the ones encountered here, with failure to identify a single causative variant using a filtering strategy only; a rare RBM20 variant was identified by segregation analysis.7 Hence, using WES among first degree relatives does not seem fruitful in the setting of non-recessive disorders or disorders not caused by a de novo mutation. While segregation analysis can provide strong evidence of variant pathogenicity, performing such an analysis in the setting of AF is complicated by variable age of onset and penetrance; individuals who present with AF at older age have a greater likelihood of AF being induced by environmental or epigenetic factors, which adds another layer of complexity.15 Lending support to this idea is the fact that when we genotyped unaffected family members at the seven high-priority candidate variants, many unaffected family members carried the rare variants. To reduce the risk of misclassification associated with late age of AF onset, we performed an ‘affected-only’ analysis.
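Why distantly related affected individuals narrow the candidate list can be illustrated with the standard coefficient of relationship, which approximates the expected fraction of one relative's rare heterozygous variants that the other also carries by descent. This is a textbook approximation for non-inbred pedigrees, not a calculation from the study; the count of 200 rare variants per person is a hypothetical figure chosen only for illustration.

```python
def relationship_coefficient(paths):
    """Sum (1/2)**meioses over each path linking two relatives through a
    common ancestor (non-inbred pedigree assumed)."""
    return sum(0.5 ** meioses for meioses in paths)


# Each list gives the number of meioses on every connecting path.
relationships = {
    "parent-child": [1],
    "full siblings": [2, 2],
    "first cousins": [4, 4],
    "second cousins": [6, 6],
}

n_rare = 200  # hypothetical number of very rare variants carried by one relative

for name, paths in relationships.items():
    r = relationship_coefficient(paths)
    print(f"{name}: r = {r:.4f}, expected shared by descent = {r * n_rare:.1f}")
```

Under these assumptions, first degree relatives are expected to share half of their rare variants, whereas second cousins share about 3%, so far fewer incidental variants survive a sharing filter.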

The biology and genetic underpinnings of AF are poorly understood. To date, GWAS have successfully tested the common disease-common variant hypothesis and identified multiple novel AF risk loci, although much of the overall AF variability remains unaccounted for.3 Here, we tested the common disease-rare variant hypothesis by applying WES to AF. The rationale for performing WES is based on the notion that exonic variants are more likely to be pathogenic compared with those located in introns or intergenic regions. In the context of the common disease-rare variant hypothesis, the 4q25 locus is of considerable interest, as common variants in this locus have been identified as modifiers of the clinical expression of rare latent cardiac ion channel and signalling molecule gene mutations associated with familial AF.16 While these findings represent a specific example of the more general ‘two-hit’ hypothesis, wherein the combination of a common and a rare genetic variant modifies the risk of AF, it is possible that other ‘two-hit’ models may impose a similar risk of AF (e.g. the presence of two rare variants in the same pathway).17

Notably, we identified such an overlap in an important pathway between two families: AF1 and AF2. In the AF1 family, we identified a rare variant in CACNB2 in a canonical splice site, which was confirmed in all affected family members. CACNB2, which encodes the β2-subunit of the L-type calcium channel (Cav1.2, encoded by CACNA1C), is strongly expressed in the heart and has been linked with Brugada syndrome, cardiac conduction disease, sudden death,18 and the long QT syndrome (LQTS).19 Overexpression studies of CACNB2 in transgenic mice induced chronic heart failure and aggravated the development of arrhythmias and fibrosis.20 CACNB2 functions as a partner for the α-subunit of Cav1.2, ensuring its transport to the plasma membrane, with loss-of-function mutations showing accelerated inactivation of Cav1.2 responsible for the Brugada syndrome phenotype.21 CACNA1C gain-of-function mutations prolong the calcium current, which delays cardiomyocyte repolarization, thereby increasing the risk of arrhythmias.22 However, none of the family members in AF1 had the Brugada syndrome ECG pattern. Interestingly, another segregating rare variant that also affected the Cav1.2 current was identified and confirmed in CACNA2D4 in AF2. Expression studies of CACNA2D4 have identified high mRNA levels in both heart and skeletal muscle.23 Studies in transfected HEK293 cells have demonstrated that CACNA2D4 mediates the influx of calcium into the cell via a protein complex formed when it is co-expressed with Cav1.2. Collectively, the identification of two disease-segregating rare variants in two different genes (CACNB2 and CACNA2D4) with overlapping effects on the Cav1.2 current in two separate AF families suggests that these variants could identify an important pathway modulating AF susceptibility. This is consistent with a large body of evidence implicating abnormal calcium signalling in paroxysmal AF.24–27 One approach to further testing this hypothesis is to extend WES to more AF families to identify additional defects in calcium control, or other functionally linked genes (‘pathways’) whose dysfunction may cause AF.

In the present study, we applied WES to familial AF and identified multiple potentially pathogenic and very rare AAC variants, although such variants need to be explored further to confirm causality with AF. Collectively, this study has highlighted some important challenges and limitations when performing WES in a complex phenotype like AF. First, WES appears to be most suitable for large AF families, as the number of very rare AAC variants identified after a thorough filtering process still remains too large to conclusively determine one likely causative variant. However, familial AF kindreds tend to be small, in part due to variable age-dependent penetrance and the paroxysmal nature and variable symptoms associated with the arrhythmia; this can make assignment of clinical phenotype difficult given the high prevalence of AF in the general population. In addition, rare or pathogenic variants may not segregate with AF in the absence of a ‘second hit’, which may be a common modifier AF risk allele or a clinical risk factor like hypertension. Second, it is critical that family members are correctly classified as affected or unaffected. The phenotypic complexity of AF makes correct classification of family members challenging, which may have affected our study findings. While one approach to tackle this issue is to perform an ‘affected-only’ analysis, the use of unaffected family members would increase the power of the study. One proposed approach to aid gene mapping is the use of well-defined endophenotypes that co-segregate with AF. An endophenotype should not only co-segregate with AF but also be present in an individual whether or not AF is present. Potential endophenotypes that we and others have identified include signal-averaged P-wave duration28 and biomarker profiles.29 Third, using publicly and locally available exomes to identify potentially pathogenic variants (MAF ≤0.04%), a threshold based on previous findings from the ESP showing that variants with allele frequencies >0.04% are of less certain pathogenicity, does increase the likelihood of identifying variants associated with AF,11 although such filtering may be too stringent. While it is generally accepted that variant effect size is inversely correlated with MAF,9 it is not clear what the optimal threshold cut-off is for AF, although we have previously used ≤0.04%;5 MAF-guided thresholds for inferring putative functionality are moving targets likely to vary by phenotype. Hence, we acknowledge that excluding low-frequency variants may have resulted in inadvertently excluding variants with low, intermediate, or strong effects on AF (i.e. false negatives). However, in the absence of functional assays capable of determining function on a large scale, using MAF to infer the likelihood of putative function seems like the most pragmatic approach. Notably, we used a set of internal exomes to reduce the risk of identifying variants driven by population stratification. However, under optimal conditions, the reference set would be larger than 286 exomes and free of an arrhythmia phenotype, which should be taken into account when interpreting the derived variant frequencies in the present study. Fourth, we used WES based on the notion that the majority of functionally important variants are harboured in the exonic (i.e. coding) parts of the genome. However, WES is unable to evaluate variants in potentially important introns, intergenic regions, and other regions not covered by the capture kits used, which may have influenced our results. The latter notion is highlighted by a previous linkage study performed on the AF4 family, which revealed an association between prolonged signal-averaged P-wave duration, an AF endophenotype, and a chromosome 5p15 genetic locus.28 However, in this WES experiment, none of the identified variants mapped to the 5p15 locus, which suggests that the 5p15 functional variant may reside in a non-coding region. Linkage analysis may provide important clues as to which variants are more likely to cause AF, and one obvious next step is to expand the families and phenotype additional family members. However, as our familial AF kindreds are small, the expected LOD scores, even for perfectly linked markers, are limited (data not shown). In addition, while we used the same QC and variant calling pipeline on all exomes to reduce site and platform bias, we acknowledge that differences in sequencing centres and capture kits used may have influenced our results. Another limitation is that we did not use the newest versions of various databases (e.g. dbSNP138), as these were not available when the data were initially analysed.
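The point about limited LOD scores in small kindreds can be illustrated with the textbook two-point LOD formula for phase-known, fully informative meioses. This is a generic sketch, not the authors' calculation (which is not shown); the meiosis counts below are hypothetical, and the function name `lod` is ours.

```python
import math


def lod(theta: float, n_meioses: int, n_recombinants: int = 0) -> float:
    """Two-point LOD score, assuming phase-known, fully informative meioses.

    LOD(theta) = log10( (1 - theta)^(n - k) * theta^k / 0.5^n )
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    k = n_recombinants
    if theta == 0.0:
        # With theta = 0, any observed recombinant makes the likelihood zero.
        log_likelihood = 0.0 if k == 0 else -math.inf
    else:
        log_likelihood = (n_meioses - k) * math.log10(1 - theta) + k * math.log10(theta)
    log_null = n_meioses * math.log10(0.5)
    return log_likelihood - log_null


# Maximum attainable LOD (perfect linkage, no recombinants) for small pedigrees.
for n in (3, 5, 10):
    print(f"{n} informative meioses: max LOD = {lod(0.0, n):.2f}")
```

Because each informative meiosis contributes at most log10(2) ≈ 0.30, at least ten are needed to reach the conventional significance threshold of 3, which a single modestly sized kindred rarely provides.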


In this study, we employed WES in familial AF kindreds with evidence of a Mendelian inheritance pattern. By coupling family data with exome sequencing, we identified multiple very rare and putatively pathogenic variants in five of six families, whereas none were identified in the remaining AF family. The present study highlights some important difficulties and limitations associated with performing WES in AF, including the importance of having large well-curated multi-generational pedigrees, the issue of potential AF misclassification, and limitations associated with WES itself. Extending the family structures, conducting WES analysis in larger numbers of modest-sized families, extending the analysis to non-coding regions, and considering gene–gene interactions are future approaches that may prove fruitful.


This work was supported by NIH grants U19 HL65962, R01 HL092217 and an American Heart Association Established Investigator Award (0940116N). Exome sequencing was performed through grants from the NHLBI (grant number RC2 HL-102925). P.W. was funded by an unrestricted research grant from the Tryg Foundation (J.nr. 7343-09, TrygFonden, Denmark).


