In human patients, mutations in the gene encoding filamin C (FLNC) are associated with Myofibrillar Myopathy 5 (MFM5). In a prior blog post, we reviewed the human mutations in FLNC associated with MFM5 and various cardiomyopathies. We also presented the specific changes associated with the equine P3 allele of FLNC, a haplotype that bears two different missense alleles of FLNC: E753K in Ig-like domain 6 and A1270T in Ig-like domain 11. In this blog post, we review bioinformatic evidence associating the P3 allele of FLNC with Myofibrillar Myopathy in horses.
The association of the P3 allele of FLNC with Myofibrillar Myopathy in horses was first made in a horse diagnosed with Myofibrillar Myopathy (MFM) on the basis of desmin-positive inclusions observed in muscle tissue. The horse displayed symptoms of exercise intolerance. A genetic test for GYS1-R309H (P1) ruled out Polysaccharide Storage Myopathy type 1 (PSSM1). The desmin staining technique was relatively new at the time of the muscle biopsy on this horse; the presence of desmin-positive inclusions was diagnostic of MFM, distinguishing it from other types of Polysaccharide Storage Myopathy type 2 (PSSM2).
The association of the P3 allele of FLNC with Myofibrillar Myopathy in this horse was made following analysis by whole-genome sequencing. The P3 allele of FLNC was observed, and the horse was free of pathogenic variants in other genes that are associated with Myofibrillar Myopathy in humans: DES, CRYAB, MYOT, LDB3, BAG3, KY, and PYROXD1.
The two different missense mutations of FLNC that make up the P3 haplotype, E753K (chr4:83,837,774 G/A in EquCab 3.0) and A1270T (chr4:83,840,299 G/A in EquCab 3.0), can be assessed using two different methods. First, and simplest, is to ask what type of amino acid substitutions they cause. E753K substitutes lysine (K) for glutamic acid (E). Glutamic acid is acidic, while lysine is basic. This is a nonconservative substitution of a chemically dissimilar amino acid. A1270T substitutes threonine (T) for alanine (A). Threonine is an uncharged polar amino acid, while alanine (A) is nonpolar. This is also a nonconservative substitution of a chemically dissimilar amino acid.
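The first method can be sketched in a few lines of Python. This is only an illustration: the four-way grouping of side chains below is a common textbook simplification that we assume here, and the function name is hypothetical. It is not the procedure used to produce the results in this post.

```python
# Coarse side-chain classes (an assumed, simplified textbook grouping).
SIDE_CHAIN_CLASS = {
    **dict.fromkeys("DE", "acidic"),
    **dict.fromkeys("KRH", "basic"),
    **dict.fromkeys("STNQCY", "polar uncharged"),
    **dict.fromkeys("AVLIMFWPG", "nonpolar"),
}


def classify_substitution(ref: str, alt: str) -> str:
    """Describe a substitution as conservative or nonconservative by class."""
    ref_class = SIDE_CHAIN_CLASS[ref]
    alt_class = SIDE_CHAIN_CLASS[alt]
    kind = "conservative" if ref_class == alt_class else "nonconservative"
    return f"{ref}->{alt}: {ref_class} to {alt_class} ({kind})"


# The two P3 variants, plus E->D as a same-class contrast case.
for label, ref, alt in [("E753K", "E", "K"), ("A1270T", "A", "T"), ("E to D", "E", "D")]:
    print(label, "|", classify_substitution(ref, alt))
```

A class-based check like this is deliberately crude. It says nothing about position in the protein, which is why the second method is needed.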
The second method of assessing whether E753K and A1270T are damaging is to compare the amino acid sequence of the filamin C protein from a wide variety of organisms over a broad range of evolutionary distances to see what kind of amino acid substitutions have occurred at these positions. This lets us see whether the amino acids in those positions are widely conserved, indicating that they are of functional importance, or whether chemically dissimilar amino acids can occupy those positions in other species.
To carry out this analysis, we retrieved the amino acid sequences of human FLNC Ig-like domains 6 and 11 from UniProt. The domain 6 sequence is residues 759-861 of UniProt Q14315, while the domain 11 sequence is residues 1245-1344 of UniProt Q14315. These sequences were used as query sequences to retrieve sequences from 125 species of vertebrates using BLASTp. The species compared included 13 primates (including human), 38 non-primate mammals (including three marsupials and a monotreme), 28 birds, 8 reptiles and an amphibian, 34 bony fishes, and 4 cartilaginous fishes (sharks). The species, their accession IDs, and the percentage of similarity and identity are presented in a downloadable spreadsheet (XLSX).
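For readers who want to reproduce the query sequences, the sketch below shows how 1-based, inclusive UniProt coordinates map onto a Python string slice. The helper name and the placeholder protein string are assumptions for illustration. In practice the full Q14315 sequence would be loaded from a FASTA file.

```python
# Domain coordinates from the text (1-based, inclusive, UniProt Q14315).
DOMAINS = {
    "Ig-like domain 6": (759, 861),
    "Ig-like domain 11": (1245, 1344),
}


def domain_slice(protein: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based inclusive coordinates."""
    if not (1 <= start <= end <= len(protein)):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return protein[start - 1:end]


# Placeholder only: stands in for the real protein sequence.
placeholder_protein = "X" * 1400

for name, (start, end) in DOMAINS.items():
    query = domain_slice(placeholder_protein, start, end)
    print(f"{name}: residues {start}-{end}, {len(query)} amino acids")
```

The resulting lengths, 103 and 100 amino acids, follow directly from the coordinates.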
The table below summarizes the median percent amino acid identity and similarity compared to the human query sequences across the 103 amino acids of Ig-like repeat 6 and the 100 amino acids of Ig-like repeat 11 for these species. It is evident from the table that evolutionary conservation of Ig-like repeat 6 is higher than that of Ig-like repeat 11.
|Comparison of FLNC Protein Sequences Among Species [a]|
| | |Ig-like repeat 6 [b]| |Ig-like repeat 11 [c]| |
|Species Group [d]|# Species|% AA Identity [e]|% AA Similarity [f]|% AA Identity [e]|% AA Similarity [f]|
[a] Protein sequences for human FLNC immunoglobulin-like domains 6 and 11 were used as query sequences in BLASTp to retrieve FLNC sequences from other species as described in the text.
[b] Immunoglobulin-like domain 6 of human FLNC is amino acids 759-861 of UniProt Q14315.
[c] Immunoglobulin-like domain 11 of human FLNC is amino acids 1245-1344 of UniProt Q14315.
[d] Species groups are primates, non-primate mammals, birds, reptiles and one amphibian, bony fishes, and cartilaginous fishes (sharks). The species, their accession IDs, and the percentage of similarity and identity are presented in a downloadable spreadsheet (XLSX).
[e] % Amino acid identity is the median percentage of amino acids identical between the human sequence and the non-human sequences for the indicated species group.
[f] % Amino acid similarity is the median percentage of amino acids identical or similar (BLOSUM62 scoring matrix) between the human sequence and the non-human sequences for the indicated species group.
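The per-group medians described in the footnotes can be computed along the following lines. This is a minimal sketch under stated assumptions: the sequences are short made-up strings, not FLNC data, the sequences are assumed to be ungapped and of equal length, and only identity is computed. Similarity would additionally require a substitution matrix such as BLOSUM62, which is not implemented here.

```python
from statistics import median


def percent_identity(query: str, subject: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(query) != len(subject):
        raise ValueError("sequences must be the same length (ungapped)")
    matches = sum(q == s for q, s in zip(query, subject))
    return 100.0 * matches / len(query)


def median_identity_by_group(query: str, groups: dict[str, list[str]]) -> dict[str, float]:
    """Median percent identity to the query for each species group."""
    return {
        group: median(percent_identity(query, seq) for seq in seqs)
        for group, seqs in groups.items()
    }


# Hypothetical toy data for illustration only.
toy_query = "ACDEFGHIKL"
toy_groups = {
    "group_1": ["ACDEFGHIKL", "ACDEFGHIKM"],
    "group_2": ["ACDEYGHIRL", "GCDEYGHIRL", "ACDEFGHIRL"],
}

for group, value in median_identity_by_group(toy_query, toy_groups).items():
    print(f"{group}: median identity {value:.1f}%")
```

The median is used rather than the mean so that a single divergent sequence in a group does not dominate the summary.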
The overall percentage of amino acid identity and similarity over evolutionary time for these two domains gives a picture of how rapidly filamin C is evolving. For the sake of comparison, consider the amino acid sequence of the alpha chain of hemoglobin, a moderately conserved protein. The amino acid sequence identity between human and shark alpha hemoglobin is about 51%, between human and bony fish it is about 55%, between human and birds it is about 70%, and between human and a marsupial it is about 80%. Both filamin C domains have evolved more slowly than alpha hemoglobin.
More informative for assessing whether E753K and A1270T are expected to be pathogenic is the analysis of amino acid substitutions seen at specific positions. The amino acid sequences of filamin C domains 6 and 11 from all 125 species were first clustered into groups with identical sequences, then all unique sequences were used to generate multiple alignments using Clustal Omega. The complete multiple alignments are shown in supplemental figures for Ig-like domain 6 (PDF) and Ig-like domain 11 (PDF). Portions of the multiple alignments including 21 amino acids around E753K and A1270T are shown below.
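The two preprocessing steps described above, grouping identical sequences and cutting out a 21-residue window, can be sketched as follows. The function names and the toy sequences are hypothetical. In a real alignment, the window position would be an alignment column, which need not equal the residue number in the protein.

```python
from collections import defaultdict


def cluster_identical(sequences: dict[str, str]) -> dict[str, list[str]]:
    """Group species names by identical sequence (sequence -> species list)."""
    clusters: dict[str, list[str]] = defaultdict(list)
    for species, seq in sequences.items():
        clusters[seq].append(species)
    return dict(clusters)


def window(seq: str, position: int, flank: int = 10) -> str:
    """Return up to 2 * flank + 1 residues centered on a 1-based position."""
    start = max(0, position - 1 - flank)
    end = min(len(seq), position + flank)
    return seq[start:end]


# Hypothetical toy input: three species, two distinct sequences.
toy_sequences = {
    "species_a": "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "species_b": "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "species_c": "ABCDEFGHIJKLXNOPQRSTUVWXYZ",
}

clusters = cluster_identical(toy_sequences)
print(f"{len(toy_sequences)} species, {len(clusters)} unique sequences")
for seq, species in clusters.items():
    print(window(seq, position=13), species)
```

Only the unique sequences need to be passed to the aligner. The species lists are kept so that each aligned sequence can be traced back to every species that carries it.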
Considering only the 21 amino acids around the position of the E753K variant, there are only seven unique sequences in the FLNC protein sequences from the 125 species in this analysis. All species except anole (Anolis carolinensis, XP_008122353.2) and ghostshark (Callorhinchus milii, XP_007908811.1) have a glutamic acid (E) at position 753; the two exceptions have the conservative substitution of aspartic acid (D). The nonconservative substitution of lysine (K) for glutamic acid (E) is seen only in the horse variant, and not in the wild-type allele from 125 different species across over 450 million years of evolutionary distance. This does not say that the E753K variant has never occurred in any of the species examined, but rather that whenever it has occurred, it has been cleared from the population by natural selection. This is strong evidence that the E753K mutation is pathogenic.
Considering only the 21 amino acids around the position of the A1270T variant, there are 37 unique sequences in the FLNC protein sequences from the 125 species in this analysis. All mammals, birds, reptiles, and amphibians have an alanine (A) at this position, while six different bony fishes [arowana (Scleropages formosus, XP_029102968.1), kanglang (Anabarilius grahami, ROL45089.1), zebrafish (Danio rerio, XP_017209728.1), killifish (Nothobranchius furzeri, XP_015819642.1), fighting fish (Betta splendens, XP_029008468.1), and zebra mbuna (Maylandia zebra, XP_004567805.1)] have the nonconservative substitution of a serine (S). This is considered a nonconservative substitution using the BLOSUM62 substitution matrix, the default setting on BLASTp, although the default settings on Clustal Omega for this group of 37 sequences still score this position as conserved. The nonconservative substitution of threonine (T) for alanine (A) is seen only in the horse variant, and not in the wild-type allele from 125 different species across over 450 million years of evolutionary distance. This does not say that the A1270T variant has never occurred in any of the species examined, but rather that whenever it has occurred, it has been cleared from the population by natural selection. This is evidence that the A1270T mutation is pathogenic, although it is clear that there is less sequence conservation in this portion of the protein than in the portion around the position of the E753K variant.
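The position-specific comparison in the last two paragraphs amounts to tallying which residues occupy one alignment column and noting which sequences carry each one. A minimal sketch is shown here. The five-residue sequences and their labels are invented for illustration and are not taken from the FLNC alignments.

```python
from collections import Counter


def residues_at_column(aligned: dict[str, str], column: int) -> Counter:
    """Count the residues found at a 1-based alignment column."""
    return Counter(seq[column - 1] for seq in aligned.values())


def carriers(aligned: dict[str, str], column: int, residue: str) -> list[str]:
    """List the sequence labels that have the given residue at the column."""
    return sorted(name for name, seq in aligned.items() if seq[column - 1] == residue)


# Hypothetical aligned toy sequences of equal length.
toy_alignment = {
    "species_1": "GTEVK",
    "species_2": "GTEVK",
    "species_3": "GTDVK",
    "variant_allele": "GTKVK",
}

counts = residues_at_column(toy_alignment, column=3)
for residue, n in counts.most_common():
    print(residue, n, carriers(toy_alignment, 3, residue))
```

A residue that appears only in the variant allele and in no wild-type sequence is the pattern the text describes for both P3 substitutions.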
 Vorgerd M et al. (2005). “A mutation in the dimerization domain of filamin C causes a novel type of autosomal dominant myofibrillar myopathy.” Am J Hum Genet. 77(2):297-304. PMID: 15929027.
 Park KY et al. (2000). “Desmin splice variants causing cardiac and skeletal myopathy.” J Med Genet. 37(11):851-857. PMID: 11073539.
 Vicart P et al. (1998). “A missense mutation in the alphaB-crystallin chaperone gene causes a desmin-related myopathy.” Nature Genet. 20(1):92-95. PMID: 9731540.
 Gilchrist JM et al. (1988). “Clinical and genetic investigation in autosomal dominant limb-girdle muscular dystrophy.” Neurology. 38(1):5-9. PMID: 3275904.
 Selcen D and Engel AG (2005). “Mutations in ZASP define a novel form of muscular dystrophy in humans.” Ann Neurol. 57(2):269-276. PMID: 15668942.
 Jaffer F et al. (2012). “BAG3 mutations: another cause of giant axonal neuropathy.” J Peripher Nerv Syst. 17(2):210-216. PMID: 22734908.
 Straussberg R et al. (2016). “Kyphoscoliosis peptidase (KY) mutation causes a novel congenital myopathy with core targetoid defects.” Acta Neuropathol. 132(3):475-478. PMID: 27484770.
 O’Grady GL et al. (2016). “Variants in the Oxidoreductase PYROXD1 Cause Early-Onset Myopathy with Internalized Nuclei and Myofibrillar Disorganization.” Am J Hum Genet. 99(5):1086-1105. PMID: 27745833.
 Madeira F et al. (2019). “The EMBL-EBI search and sequence analysis tools APIs in 2019.” Nucleic Acids Res. 47(W1):W636-W641. PMID: 30976793.