What percentage of protein isoforms have different functions?

I am looking for studies on how many protein isoforms have different functions, preferably in humans. We know that a great many, if not most, human genes are alternatively spliced and that many produce different protein isoforms. Has anyone looked at how many of these isoforms have different cellular functions? If someone could point me to a published paper, that would be great.

If no such study has been made, can anyone recommend a database from which this information could be extracted? Gene Ontology is gene-based, so the information cannot be found there: genes are annotated to specific terms, not their protein isoforms. Also, I would need to be able to do this in a high-throughput manner; I am not interested in specific proteins but in what percentage of all isoforms have different functions.

Ideally, I would like to be able to extract, for every human gene, the list of the different protein isoforms it encodes and whether their functions differ, or at least what those functions are.

It seems that most functional annotation these days is inferred from sequence similarity to previously annotated genes/proteins: this is certainly true of high-throughput functional annotation. It's hard to know how many layers of inference there are between your query sequence and an actual experiment verifying the (possibly tissue- or condition-specific) function of a gene/protein. But alas, I digress…

One confounding factor is that alternatively spliced isoforms have a lot of sequence in common, which means that they might share best hits when searched against a database of annotated transcripts and end up being assigned the same GO terms.

But as others have mentioned, I think the biggest limiting factor you will encounter is the low-throughput step: in many cases the experimental work required to characterize the functions of alternative isoforms simply has not yet been done.

I would take a look at UniProt to begin with. It's my go-to whenever I need to look for isoforms or functional data, so it should help you as well. You could start by comparing the information in Swiss-Prot to TrEMBL, which are manually and automatically annotated, respectively. One issue that I've come across in my research on isoforms is that there often isn't a great deal known about them, as studies may have focused on just one or a few out of many possible. One way to get around this could be to use structure/function predictions based on sequence, then try to predict variations by looking at what's added or missing in the various isoforms.

I hope this helps…

I'd tend to agree there's very little data on this, especially given how many isoforms there are. I believe that even some pseudogenes may function as proteins. I'd bet that most isoforms have some unique functional aspects.

I think we'd all agree we can't prove that they do in all cases, but let me sketch out how these different functions might arise or are known.

Protein isoforms usually arise in one of two ways. The first is gene duplication, which leaves two very similar copies of the same gene. The second is alternative splicing.

Alternative splice isoforms often lack or gain a few amino acids, so the sequence is different. The difference can also be the truncation of a large portion of the protein, as in the leptin receptor. The leptin receptor has two full-length variants that have different expression patterns at different temperatures and import leptin into the cell at different efficiencies. It also has several short isoforms, in which only a short portion that faces the cell interior is produced. These variants compete with the full receptor to modulate the cell's sensitivity to leptin.

As you can imagine, mapping all this out in just one case took years of work. Complete characterization of all human isoforms would take a long time.

Isoforms from gene duplication have many of the same properties as splice-variant isoforms: they can have completely different regulatory machinery, popping up in different biological states than the main gene, and they can vary in sequence a little bit or a lot. It has been suggested that these gene duplicates persist in the genome because they quickly find a new niche role. This paper shows that 30% of 195 new genes that have shown up since two strains of fly have diverged are necessary for viability.

Evolution doesn't let genes sit around if they are completely identical; they can be removed as easily as they show up. Only differential function will keep an isoform in the mRNA repertoire for a prolonged time.

Protein kinase C (PKC) isoforms have crucial roles in cutaneous signaling. Interestingly, we lack information about their involvement in human sebaceous gland biology. Therefore, in this current study, we investigated the functions of the PKC system in human immortalized SZ95 sebocytes. Using molecular biological approaches, imaging, and functional assays, we report that SZ95 sebocytes express the conventional cPKCα; the novel nPKCδ, ε, and η; and the atypical aPKCζ. Activation of the PKC system by phorbol 12-myristate 13-acetate (PMA) stimulated lipid synthesis (a hallmark of differentiation) and resulted in translocation and then downregulation of cPKCα and nPKCδ. In good accord with these findings, the effect of PMA was effectively abrogated by inhibitors and short interfering RNA-mediated “silencing” of cPKCα and nPKCδ. Of further importance, molecular or pharmacological inhibition of nPKCδ also prevented the lipogenic and apoptosis-promoting action of arachidonic acid. Finally, we also found that “knockdown” of the endogenous aPKCζ activity markedly increased basal lipid synthesis and apoptosis, suggesting its constitutive activity in suppressing these processes. Collectively, our findings strongly argue for the fact that certain PKCs have pivotal, isoform-specific, differential, and antagonistic roles in the regulation of human sebaceous gland–derived sebocyte biology.

Same gene can encode proteins with divergent functions

It’s not unusual for siblings to seem more dissimilar than similar: one becoming a florist, for example, another becoming a flutist, and another becoming a physicist.

Something of the same diversity applies to the “brood” of proteins produced from any single gene in human cells, a new study led by scientists at Dana-Farber Cancer Institute, University of California, San Diego School of Medicine, and McGill University has found. In a first large-scale systematic study, the researchers found that most sibling proteins – known as “protein isoforms” encoded by the same gene – often play radically different roles within tissues and cells, however alike they may be structurally.

The research, published online today by the journal Cell, stands to have a powerful effect on the understanding of human biology and the direction of future research. For one, it may help explain how the mere 20,000 protein-coding genes in the human genome – fewer than are found in the genome of a grape – can give rise to creatures of such enormous complexity. Scientists know that the number of different proteins in human cells, thought to be upwards of 100,000, far exceeds the number of genes, but many questions have remained. Do most of those proteins have a unique function in the cell, or do their roles sometimes overlap? The discovery that different protein isoforms encoded by the same gene may have divergent functions on a larger scale than realized suggests that they vastly multiply what our genes are capable of.

Diversity, clue to disease

This diversity also suggests that each protein isoform needs to be studied individually to understand its normal role and its potential involvement in disease, the study authors state.

“Research into cancer-related proteins, for example, often focuses on the most prevalent isoforms in a given cell, tissue, or organ,” said co-senior author David E. Hill, PhD, associate director of the Center for Cancer Systems Biology (CCSB) at Dana-Farber. “Since less-prevalent protein isoforms may also contribute to disease, and may prove to be valuable targets for drug therapy, their role should be examined as well, and to do that properly, we also need comprehensive clone collections covering all expressed isoforms.”

Previous functional studies of protein isoforms have generally been done on a gene-by-gene basis. Furthermore, researchers frequently compared the activity of a gene’s “minor” isoforms to that of its predominant isoform in a particular tissue. The new study approached the functional question from a larger perspective – by gathering multiple protein isoforms of hundreds of genes and comparing how they specifically interact with any other human protein.

Alternative splicing

One of the ways that cells produce multiple protein isoforms from individual genes is a process called alternative splicing. Most human genes contain multiple segments called exons, separated by intervening non-coding sequences called introns. In the cell, different combinations of these individual exons are “glued” or spliced together to generate a final expressed gene product; thus, a single gene can encode a set of distinct but related protein isoforms, depending on the specific exons that are spliced. One isoform, for example, may result from splicing exons A-B-C-D of a particular gene. Another may arise from the skipping of exon C, resulting in a product with only exons A-B-D.
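
The exon-skipping example can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not a model of the splicing machinery; the function name `splice` and the single-letter exon labels are assumptions made for the example.

```python
def splice(exons, skipped=()):
    """Return the exons kept in the final transcript, in their original order."""
    skipped = set(skipped)
    return [exon for exon in exons if exon not in skipped]


gene_exons = ["A", "B", "C", "D"]  # hypothetical gene with four exons

full_isoform = splice(gene_exons)                    # every exon retained
skipped_isoform = splice(gene_exons, skipped=["C"])  # exon C skipped

print("-".join(full_isoform))     # A-B-C-D
print("-".join(skipped_isoform))  # A-B-D
```

Both products come from the same list of exons; only the choice of which exons to skip differs.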

For the new study, researchers devised a technique called “ORF-Seq” that allowed them to identify and clone large numbers of alternatively spliced gene products in the form of open reading frames (ORFs), and use them to produce multiple protein isoforms for hundreds of genes.

Of the roughly 20,000 genes in the human genome that code for proteins, researchers concentrated on about eight percent. Using ORF-Seq, they ultimately created a collection of 1,423 protein isoforms for 506 genes, of which more than 50 percent were entirely novel gene products. They subjected 1,035 of these protein isoforms to a mass screening test that paired them with 15,000 human proteins to see which would interact.

“The exciting discovery was that isoforms coming from the same gene often interacted with different protein partners,” remarked Gloria Sheynkman, PhD, of Dana-Farber and one of the lead authors. “This suggests that the isoforms play very different roles within the cell” – much as siblings with different careers often interact with different sets of friends and co-workers.

The researchers found that in most cases, related isoforms shared less than half of their protein partners. Sixteen percent of related isoforms share absolutely no protein partners. “From the perspective of all the protein interactions within a cell, related isoforms behave more like distinct proteins than minor variants of one another,” Tong Hao, of Dana-Farber and one of the lead authors, asserted.
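
One way to put a number on "shared partners" for a pair of isoforms is the Jaccard index: partners found for both isoforms divided by partners found for either. The sketch below assumes that measure, which may not be the exact statistic used in the study, and the partner labels are made up for illustration.

```python
def shared_fraction(partners_a, partners_b):
    """Jaccard index: partners in common divided by partners of either isoform."""
    a, b = set(partners_a), set(partners_b)
    union = a | b
    if not union:
        return None  # no partner detected for either isoform, so the fraction is undefined
    return len(a & b) / len(union)


# Hypothetical interaction profiles (labels are placeholders, not real proteins)
iso1 = {"P1", "P2", "P3", "P4"}
iso2 = {"P3", "P4", "P5"}
iso3 = {"P6"}

print(shared_fraction(iso1, iso2))  # 2 shared out of 5 partners in total: 0.4
print(shared_fraction(iso1, iso3))  # nothing shared: 0.0
```

With this measure, a pair that shares "less than half" of its partners scores below 0.5, and a pair with no partners in common scores 0.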

Intriguingly, isoforms that stem from a minuscule difference in DNA – a difference of just one letter of the genetic code – sometimes had starkly different roles within the cell, researchers found. At the same time, related isoforms that are structurally quite different may have very similar roles.

Quite often, the interaction partners of related isoforms vary from tissue to tissue, the researchers found. In the liver, for example, an isoform may interact with one set of proteins. In the brain, a relative of that isoform may interact with a largely different set of protein partners.

“A more detailed view at protein interaction networks, as presented in our paper, is especially important in relation to human diseases,” said co-senior author Lilia Iakoucheva of UC San Diego. “Drastic differences in interaction partners among splicing isoforms strongly suggest that identification of the disease-relevant pathways at the gene level is not sufficient. This is because different variants could participate in different pathways leading to the same disease or even to different diseases. It’s time to take a deeper dive into the networks that we are building and analyzing.”

"Cells from different tissues in our body share the same genome," said co-senior author Yu Xia of McGill University. "Yet their molecular wiring diagrams are far more divergent than previously thought. This information is crucial for understanding biology and combating disease."

Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing, Cell (2016), by Xinping Yang, Jasmin Coulombe-Huntington, Shuli Kang, ... Lilia M. Iakoucheva, Yu Xia, Marc Vidal.

The work was supported by the National Human Genome Research Institute (grants P50HG004233 and U01HG001715); the Ellison Foundation; the National Cancer Institute (grant R33CA132073); the Krembil Foundation; a Canada Excellence Research Chair Award; an Ontario Research Fund-Research Excellence Award; the Eunice Kennedy Shriver National Institute of Child Health and Human Development (grant R01HD065288); the National Institute of Mental Health (grants R01MH091350, R01MH105524, and R21MH104766); the National Science Foundation (grant CCF-1219007); the Natural Sciences and Engineering Research Council of Canada (NSERC) (grant RGPIN-2014-03892); the Canada Foundation for Innovation (grant JELF-33732); the Canada Research Chairs Program; the National Institutes of Health (training grant T32CA009361); an NSERC fellowship; the National Institute of General Medical Sciences (grant R01GM105431); and a Swedish Research Council International Postdoc Grant.


In this study we identified hundreds of genes regulated downstream of Fru M isoform activity. Fru MA, Fru MB and Fru MC differ in the gene sets induced or repressed when they are over-expressed, demonstrating that each isoform has distinct biochemical activities (Figure 2). Consistent with this observation, each Fru M isoform has a different DNA binding specificity (Figure 3). Our results suggest that there are sex-specific factors that influence Fru M isoform activity, as over-expression of the Fru MA, Fru MB and Fru MC isoforms in males and females resulted in different genes being induced and repressed by each isoform (Figure 2 and Additional file 2: Figure S1). The gene sets identified as induced downstream of Fru M isoforms in males are enriched with genes with nervous system function, based on GO annotations (Additional file 5: Table S4).

Additionally, it is worth noting that there may be other possible sources for the differences observed in the gene expression levels in these experiments. First, the Fru M proteins contain a BTB domain that in previous work has been shown to contain a dimerization domain that can mediate homodimeric or heterodimeric interactions. Thus, some of the effects we observe could be due to 1) differences in the stoichiometric ratios of Fru M with each other, and/or 2) differences in the stoichiometric ratios of Fru M with other potential dimerization partners. However, based on immunofluorescence results we do not observe substantially different levels of over-expression of each isoform, nor is there substantial expression, if any, outside of the normal fru P1 expression pattern (see Additional file 8: Figure S2). Second, there is a significant association between the genes that are either induced or repressed when Fru M is over-expressed and those genes identified in loss-of-function fru M mutant analyses, demonstrating the physiological relevance of the genes identified by over-expression. Third, there is significant enrichment of the binding site sequences identified for each isoform within the genes that are induced by each isoform. Fourth, while our criteria were stringent (significant and substantial differences from two different wild type strains), strain differences may account for some of the differences between wild type male and female and Fru M over-expressor male and female strains, respectively. However, such strain differences are not likely to account for the differences we observe between the Fru M isoforms, which are in the same genetic background, nor can strain differences account for differences observed across sex.
Taken together, these results demonstrate that in the context of the over-expression experiments each of the three Fru M isoforms examined has different activities with respect to genes that are induced or repressed in males, with many more genes having induced rather than repressed expression in males.

In previous studies, production of Fru M in females, by expression of a tra-2 RNAi transgene in fru P1-expressing neurons, was sufficient to endow females with the potential to perform the first four sub-steps of the male courtship ritual (following, tapping, wing extension and proboscis extension), but not attempted copulation [14]. In contrast, overexpression of Fru MA or Fru MC in fru P1-expressing neurons resulted in flies that displayed only following and tapping (reviewed in [4]), suggesting that overexpression of Fru M in females is not sufficient to endow females with the potential to perform courtship behaviors. The 42 genes identified as induced by all three Fru M isoforms in females will be interesting to examine with respect to their role in establishing the potential for these early courtship steps. Interestingly, one of these genes is Ir54a, which encodes a member of a diverse family of ionotropic receptors, some of which are expressed in the adult antenna and underlie chemosensory functions [35]. It is also known that Dsx M plays a role in establishing the potential for courtship behaviors [26, 27, 36–38]. Dsx M would not have been present in females in which Fru M was produced, though Dsx M is not present in all fru P1-expressing neurons, so it is unlikely to account for all the differences between males and females observed here [26, 37]. Our results may further explain why there was not a complete rescue of male courtship behavior. It is clear that the sex of the fly in which Fru M is produced has an impact on the genes that are induced and repressed. These results suggest that there are additional sex-specific factors that influence Fru M activity, which may include Dsx M. Further biochemical characterization of Fru M protein interactions will be important to understand Fru M activities.

While fru has been predicted to be a transcription factor based on the observation that fru encodes BTB-zinc finger products, no direct transcriptional targets of fru have been identified, leaving this an open question. A recent study has shown that Fru M associates with a cofactor, Bonus, and subsequently associates with two chromatin modifying proteins, HP1a and HDAC1; however, it was not clear if the association of Fru M with chromatin was direct [20]. The results presented here demonstrate that Fru M can bind DNA and that the three Fru M isoforms examined have different binding activities. Our observation that the binding sites are significantly enriched in all gene sets identified as induced, but not repressed, by Fru M suggests that Fru M may function by binding enhancer DNA directly, but acts in an indirect manner to repress gene expression (Table 1).

In previous studies we and others have shown that genes with male-biased expression were enriched on the X chromosome in the adult head [31, 39] and brain [40]. There was also a significant enrichment of genes with male-biased expression that reside near dosage compensation entry sites [39, 40]. Here, we observe significant enrichment of genes that reside on the X chromosome that are induced by Fru M in males. This observation supports the idea that over evolutionary time there may have been a selection for genes with male-specific functions to reside on the X chromosome, and in particular those regulated by Fru M. Perhaps Fru M isoforms and their gene targets have evolved to take advantage of the unique properties of the male nucleus. These differences include the dosage compensation complex that is bound to the male X chromosome and leads to less compact chromatin (reviewed in [41]), the presence of the Y chromosome that affects chromatin architecture throughout the nucleus [42], or other differences in the chromatin and three-dimensional architecture of the nucleus in males (for example, see [43]). It is possible that there are more interconnections in the sex hierarchy model between chromosomal sex, the sex hierarchy branches and sexual development that is downstream of Fru M than shown in the model (Figure 1).


We have found that the Dlc1 gene has four transcriptional isoforms expressed under the influence of three alternative promoters. The alternative promoters are differentially methylated in mouse tissues, and this relates to the differential expression of the proteins in the tissues examined. We also identified a 127 kDa Dlc1 protein that may originate from the 6.2 kb alternative transcript. We generated a Dlc1 gt mouse that resulted in hypomorphic expression of the 6.1 kb transcript and the 123 kDa Dlc1 protein. The homozygous gt Dlc1 gt/gt mouse embryos showed significant abnormalities in yolk sac and placental labyrinth vasculature, suggesting an important developmental role for the 6.1 kb isoform of Dlc1 during early embryonic development. A deficiency of this isoform resulted in increased GTP-Rho levels, leading to more stress fibre and vinculin-associated focal adhesion formation. This in turn resulted in heightened cellular mobility.

Alternative Splicing

Three decades ago, it was discovered that the coding function of some eukaryotic genes was broken up into smaller, discrete coding segments called exons, separated from one another by long stretches of non-coding DNA called introns. Both intron and exon nucleotides are transcribed, yielding a primary mRNA transcript, or pre-mRNA molecule, from which RNA nucleotides complementary to intron nucleotides are excised; this constitutes RNA splicing. RNA nucleotides complementary to some exon nucleotides can also be spliced out of a primary mRNA molecule. Which of these RNA nucleotides are removed, though, can vary, depending on, for instance, cell type; this constitutes alternative splicing (AS). Primary-transcript exons ending up in all the functional mRNAs derivable from a particular primary transcript are commonly referred to as constitutive exons. Excisable exons are commonly referred to as alternative exons. Evidence suggests 95 percent of human genes are alternatively spliced (Chen & Manley, 2009, p. 741). The end result of RNA processing, the RNA that leaves the nucleus to undergo translation, is called mature, or functional, mRNA.

To illustrate RNA splicing and AS, imagine a eukaryotic structural gene A comprised of 10 exons and 9 introns. (From this point on, I will be using “intron” and “exon” in the context of RNA molecules, also.) In one cell type, intron removal and selective exon removal might yield a functional mRNA containing exons 1, 2, 3, 4, 7, 8, and 10, all the introns and exons 5, 6, and 9 having been removed from the primary transcript. In a different cell type, the functional mRNA might contain exons 1, 4, 5, 6, 7, and 9. Somewhat obviously, these two different mRNAs are going to be translated into two different protein isoforms (hereafter referred to as isoforms), which might have different functions, as suggested by Yang et al. (2016). These authors determined the number of binding partners shared between members of an isoform pair. In a majority of cases, members of a pair shared fewer than 50 percent of binding partners, Yang et al. concluding, “In the global context of interactome network maps, alternative isoforms tend to behave like distinct proteins rather than minor variants of each other” (p. 805).

AS, then, to the extent that isoforms are functionally different, allows for an expanded proteome relative to a fixed genome. The outline of AS provided so far, however, might already provoke doubt that the information for protein synthesis resides solely in genes. A structural gene should certainly be considered a template for the associated primary mRNA. Should a structural gene, though, also be considered the template for a functional mRNA arising only after considerable processing of the primary transcript, if this processing involves variable excision of exons? The primary transcript does contain structural cues identifying excisable exons, and this information is necessary for functional mRNA synthesis. Additionally, every exon found in any of the functional mRNAs derivable from a particular primary transcript was, originally, part of that primary transcript. However, there is one category of information, critically important for splicing outcomes, that the primary transcript, and by extension, the gene, does not appear to contain. The primary transcript, alone, does not appear to contain the information for which excisable exons to leave in and which excisable exons to splice out during a particular splicing event. The information stored in a structural gene, so necessary for a protein's primary amino acid sequence, appears at the same time insufficient as regards specifying that sequence to the extent that more than one such sequence can result from splicing of the primary transcript. The information in a primary transcript, alone, can only indicate which exons might be excised, not which exons will be excised, as the result of a splicing event, when more than one isoform can result. Next, evidence that factors other than the associated gene can influence splicing outcomes is presented, but that such factors must exist seems implicit, already, in the very fact of AS. Any influence not due to a direct product of a gene will be termed non-genetic.

Evidence for the Role of Context, Especially Non-Genetic Context, in the Regulation of AS

That AS can lead to synthesis of isoforms with opposite functions is illustrated by an example from Yang et al. (2016): one isoform promotes apoptosis, while the other inhibits it (p. 806). An article cited by Yang et al., Schwerk and Schulze-Osthoff (2005), provides other examples of isoform pairs exhibiting opposite functions, but these authors also write, “Accumulating data have shown that splicing patterns can already be determined at the promoter of a gene, evidencing a coupling between transcription and alternative splicing” (p. 8, citing Goldstrohm et al., 2001). This statement appears to be saying that a gene is capable of wholly influencing the outcome of a splicing event. Schwerk and Schulze-Osthoff go on to summarize research, however, showing that a gene for which the aforementioned connection between transcription and splicing outcome is established has five promoters, and that selection of one of them is associated with a particular splicing outcome, but that the trigger for this is a steroid hormone (p. 8). It would appear that the information for this splicing outcome, its “blueprint,” due to this splicing outcome's dependence on a non-genetic factor, must lie in a realm wider than that of just the gene.

Pre-mRNA secondary structure can influence alternative-splicing outcomes. Buratti and Baralle (2004) discuss several examples of this, one (p. 5) being a pre-mRNA's-secondary-structure's influence on expression of two mutually exclusive exons, the tissue-specific manner of exon expression here suggesting a role in secondary-structure determination, and hence, splicing outcome, for factors other than just the relevant gene. The Drosophila gene Dscam, which can generate over 38,000 isoforms, all or most of which may be required for normal fruit fly central nervous system development (Yue et al., 2013, p. 1822), has 115 exons, 20 of which are not alternatively spliced. The remaining 95 are bundled into 4 groups or clusters (exon clusters 4, 6, 9, and 17 containing 12, 48, 33, and 2 exons, respectively), with a single exon from each cluster ending up in a particular functional mRNA (May et al., 2011, p. 222). Of Dscam's 115 exons, then, only 24 will be represented in any of 38,000 possible functional-mRNA transcripts and, hence, only 24 in any one isoform. Graveley (2005) proposed a mechanism for splicing of cluster 6, containing 48 exons, which relied on different secondary structures for each AS outcome. May et al. provided experimental evidence for the existence of these structures, and their importance, but concluded that other factors were also important (p. 227). Whether one focuses on the large number of different secondary structures involved in the many different alternative-splicing outcomes possible or on the “…larger integrative system” within which these authors believe the secondary-structure influence is embedded, it should still be apparent that the information required for splicing outcomes here must far exceed that capable of being supplied by Dscam itself.
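
The Dscam figures quoted above can be checked with a short calculation. The sketch below uses only the exon counts given in the paragraph (from May et al., 2011); the variable names are chosen for illustration.

```python
from math import prod

# Exon counts as quoted in the text: cluster number -> alternative exons in it
cluster_sizes = {4: 12, 6: 48, 9: 33, 17: 2}
constitutive_exons = 20  # exons that are not alternatively spliced

alternative_exons = sum(cluster_sizes.values())
total_exons = constitutive_exons + alternative_exons

# One exon is chosen from each cluster, so every mRNA carries the
# constitutive exons plus one exon per cluster.
exons_per_mrna = constitutive_exons + len(cluster_sizes)

# Independent choices multiply: 12 * 48 * 33 * 2 combinations.
possible_isoforms = prod(cluster_sizes.values())

print(alternative_exons)  # 95
print(total_exons)        # 115
print(exons_per_mrna)     # 24
print(possible_isoforms)  # 38016
```

The product, 38,016, is the "over 38,000" figure cited for Dscam, and the per-transcript count of 24 follows from the 20 constitutive exons plus one exon from each of the four clusters.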

The capacity of endogenous zinc levels, in Arabidopsis, to skew splicing of the primary transcript of a gene coding for a zinc-sequestration protein toward an alternative transcript with enhanced “translation efficiency” (Remy et al., 2014, p. 1) is another example of the influence of a non-genetic factor on AS.

A second example of a steroid-hormone's influence on a splicing outcome and, at the same time, a second example of transcription-AS linkage is provided by Dowhan et al. (2005). This article summarizes research showing that binding of progesterone to its receptor, in addition to influencing transcription, triggers recruitment of co-regulatory molecules that influence downstream splicing events. This second example of a steroid-hormone's influence on a splicing outcome provides somewhat explicit evidence of the capacity of a non-genetic factor, progesterone, to regulate a key splicing regulator molecule, the splicing factor. (AS regulation is generally attributed to the interaction of splicing factors [proteins and RNA molecules] with primary-mRNA nucleotide sequences, splicing factors thus filling a role in AS analogous to that filled by specific transcription factors in transcription. Importantly, because splicing factors are proteins or RNA molecules, their genesis, too, can involve AS.) This is noteworthy because implicit in the notion of a genetic program driving development is that regulator-gene products, such as the lac operon's repressor protein, control transcriptional and post-transcriptional events to an extent that obviates the need to look for factors outside of the genome to explain control of cellular processes (Keller, 2000, pp. 56–57). Thus, one might agree that the discovery of AS should provoke movement away from the characterization of single genes as blueprints for proteins but insist that because splicing factors are direct products of genes, the information for splicing outcomes still lies within the genome as a whole. However, in this splicing outcome, we see regulation by a non-genetic factor, not of transcription but, still, of the identity of the final product of a regulator gene.

An example of an external-environmental influence on AS also manifesting as an influence on the same category of regulator gene as in the previous example is, in A. thaliana, the effect of long-term exposure to cold on which of two splicing factors predominates (Shang et al., 2017, p. 2 of online version).

Syed et al. (2012), in examining the importance of AS to plant physiology, stress the role of splicing factors in AS but then point out the importance of non-genetic factors in the splicing outcomes that dictate the identity of the splicing factors themselves, these non-genetic factors including “temperature, light, salt, hormones, etc.” (p. 3 of online version).

Implications of AS for Conceptualizing the Gene

Prior to the discovery of AS, it seemed reasonable to conceive of structural genes as blueprints for proteins to the extent that genes appeared to be straightforward templates for mRNA molecules, with these appearing to be straightforward templates for the amino-acid sequence of proteins. However, the nature of the processing that primary mRNA undergoes on the way to becoming functional mRNA ensures that there is no one-to-one correspondence between DNA nucleotides and translatable mRNA nucleotides in eukaryotes. Additionally, there is the problem, already discussed, of trying to imagine how the information in the gene/primary transcript, alone, is sufficient to dictate which excisable exons are to be spliced out when these can vary in a situation-dependent manner. Genes appear very much to be blueprints for primary mRNA molecules but not blueprints for the functional mRNA molecules resulting from AS. Information of a different sort seems required in going from primary to functional mRNA. Furthermore, that this information, whether manifesting as cell-specific secondary structure or hormone or mineral levels, or temperature or light intensity, is not always of a merely ancillary nature is indicated, again, by the fact that AS can result in isoforms with opposite functions.

According to Tress et al. (2017), there is scant evidence, in mammals at least, that anywhere near the same number of isoforms is being generated as are alternative transcripts (functional mRNAs), and when multiple isoforms do result, there is not always accompanying evidence that each is functionally distinct from the other(s). However, a splicing outcome that is non-productive, in the sense that one of a pair of alternative transcripts is never translated, can still exert a profound regulatory influence on protein synthesis via the orchestrated linkage of nonsense-mediated decay (NMD), a process that results in the degradation of functional mRNA molecules, to AS (Soergel et al., 2000, p. 13). Makeyev et al. (2007) provide an example of this. In mouse non-neuronal cells, a splicing factor influences splicing of a second splicing factor's primary transcript in such a way that the resulting alternative transcript is shunted into the NMD pathway and degraded. However, in nervous system (NS) cells, where the second splicing factor is needed, a small regulatory RNA represses the first splicing factor's synthesis to a degree that allows for synthesis of the second splicing factor. Thus, splicing of the second splicing factor's primary transcript, in non-NS tissue, is non-productive: the alternative transcript is never translated. Splicing in NS tissue, however, is productive: the alternative transcript is translated.

Intron retention (IR) is a type of AS. One IR variant results in the retention of an intron in the 5′ un-translated region of a functional mRNA, this altering the ease of translation of the transcript and thus providing a means for regulating gene expression. We saw an example of this in Remy et al. (2014) as regards Arabidopsis. Translation of each of two alternative transcripts results in synthesis of exactly the same protein. However, there is a greater need for this protein in root cells. In these cells, zinc-mediated IR results in an alternative transcript with enhanced translation efficiency.

Regulation of gene expression by shuttling alternative transcripts down the NMD pathway, prior to translation, and intron retention, thus, are two mechanisms by which AS can regulate protein synthesis, and in a tissue-specific manner, even though neither results in the synthesis of more than one protein per gene.

The starting point for this article's analysis of PTM was an already “complete” protein, from where PTM's role in protein synthesis was seen to lie in its capacity to influence protein functioning. The starting point for this article's analysis of AS was the primary transcript, from where AS's influence on protein synthesis was seen to lie in its capacity to provide for the generation of more than one functional mRNA and, presumably, more than one functionally distinct protein. To the extent that this happens and to the extent that non-genetic factors are involved in regulating AS, the gene-as-blueprint and genome-as-developmental-program metaphors seem called into question. However, as just mentioned, there are doubts in some circles about the degree to which alternative transcripts are translated or, when translated, translated into functionally distinct proteins. We have just seen, though, that AS can influence protein synthesis even when two different functional mRNAs, derived from the same primary transcript, do not lead to synthesis of two different proteins. AS can influence protein synthesis, not just by providing the basis for generating more than one functionally distinct protein from the same gene but also by regulating the extent of synthesis of a protein that might be the only one associated with a particular gene. (This latter capacity of AS does not challenge the gene-as-blueprint metaphor; it does challenge the genome-as-developmental-program metaphor.)


ATF3

Activating transcription factor 3 (Atf3) is a known RAG with numerous promoters. Atf3 expression increases after nerve injury, and overexpression of a constitutively active form of Atf3 increases the rate of peripheral nerve regeneration. [7] Four Atf3 isoforms have been identified in dorsal root ganglia (DRG) so far. These four isoforms differ in TSS, and one differs in the CDS. However, it is unclear which promoters are in use in regenerating DRG neurons. [8]


Phosphatase and tensin homolog (Pten) was originally identified as a tumor suppressor gene. [9] Recent studies found that Pten also suppresses axon regeneration in retinal ganglion cells, corticospinal tract, and DRG neurons. [10] [11] [12] So far, 3 Pten isoforms (Pten, Pten J1, and Pten J2) have been identified and analyzed. Pten J1 is identical in sequence to the conventional Pten isoform except for a difference in TSS and a small shift in the CDS. Pten J2 has a truncated CDS, an alternative transcription start site, and a longer 3′ UTR compared to the conventional Pten isoform expressed within neurons. The truncated CDS encodes a protein that lacks a phosphatase domain. Also, overexpression of Pten J2 and Pten in primary cortical neurons does not influence axon regeneration. It is therefore hypothesized that Pten J2 works as a regulatory RNA to inhibit the activity of Pten. [8]

Same Gene, Different Functions

Catherine Offord
Feb 11, 2016

CD46, a type I membrane protein, has at least 14 different isoforms. WIKIMEDIA, EMW

The human genome contains roughly 20,000 protein-coding genes, yet the number of proteins in human cells is thought to be more like 100,000. Researchers from three institutions in North America have now shown that at least some of the diversity of proteins' functions in the cell may be due to the widely diverging roles of protein isoforms, structurally similar variants produced as a result of slight differences during the translation of a single gene. The findings were published yesterday (February 11) in Cell.

“The exciting discovery was that isoforms coming from the same gene often interacted with different protein partners,” study coauthor Gloria Sheynkman of the Dana-Farber Cancer Institute said in a statement. “This suggests that the isoforms play very different roles within the cell.”

Unlike previous functional studies of isoforms, which have generally focused.

The researchers found that, on average, two related isoforms shared less than 50 percent of interacting proteins; 16 percent of related isoforms shared none at all. These differences in interaction partners were often associated with only tiny alterations in DNA sequence, sometimes just a single base pair.

“From the perspective of all the protein interactions within a cell, related isoforms behave more like distinct proteins than minor variants of one another,” study coauthor Tong Hao of Dana-Farber said in the statement.

“A more detailed view at protein interaction networks, as presented in our paper, is especially important in relation to human diseases,” added study coauthor Lilia Iakoucheva of the University of California, San Diego. “Drastic differences in interaction partners among splicing isoforms strongly suggest that identification of the disease-relevant pathways at the gene level is not sufficient. . . . It’s time to take a deeper dive into the networks that we are building and analyzing.”


DOPC (1,2-dioleoyl-sn-glycero-3-phosphatidylcholine), DOPS (1,2-dioleoyl-sn-glycero-3-phospho-L-serine), DOPE (1,2-dioleoyl-sn-glycero-3-phosphatidylethanolamine), cholesterol (cholest-5-en-3ß-ol), PI(3)P (1,2-dioleoyl-sn-glycero-3-phospho-(1′-myo-inositol-3′-phosphate)), PI(3,5)P2 (1,2-dioleoyl-sn-glycero-3-phospho-(1′-myo-inositol-3′,5′-bisphosphate)), PI(4)P (L-α-phosphatidylinositol-4-phosphate), PI(4,5)P2 (L-α-phosphatidylinositol-4,5-bisphosphate), BODIPY TMR-PtdIns(4,5)P2, C16 (red PI(4,5)P2), 1-oleoyl-2-6-[4-(dipyrrometheneboron difluoride) butanoyl] amino hexanoyl-sn-glycero-3-phosphoinositol-4,5-bisphosphate (TopFluor PI(4,5)P2), and Egg Rhod PE (L-α-phosphatidylethanolamine-N-lissamine rhodamine B sulfonyl) were purchased from Avanti Polar Lipids, Inc. (Avanti Polar Lipids, USA). Stock solutions of lipids were solubilized in chloroform at a concentration of 10 mg mL⁻¹, except for cholesterol and Egg Rhod PE, which were dissolved at concentrations of 20 mg mL⁻¹ and 0.5 mg mL⁻¹, respectively, and PIPs, which were solubilized in a mixture of chloroform/methanol (70:30) (v/v) at a concentration of 1 mg mL⁻¹. All stock solutions were kept under argon and stored at −20 °C in amber vials (Sigma-Aldrich, France).

Expression, purification, and labeling of proteins

Expression and purification of MBP-CHMP2A-ΔC (residues 9–161) and CHMP3-FL (residues 1–122) were performed as described in [18]. A final gel filtration chromatography step on a superdex200 column was performed in a buffer containing 20 mM Hepes pH 7.6, 150 mM NaCl.

CHMP2B-FL (residues 1–222) and CHMP2B-ΔC (residues 1–154) were expressed and purified as previously described [32]. Both constructs contain a C-terminal SGSC linker for cysteine-specific labeling. Cells were lysed by sonication in 50 mM Tris pH 7.4, 1 M NaCl, 10 mM DTT, complete EDTA free, and the soluble fraction was discarded after centrifugation. The pellet was washed three times in a buffer containing 50 mM Tris pH 7.4, 2 M urea, 2% Triton X-100, 2 mM β-mercaptoethanol, followed by a final wash in 50 mM Tris pH 7.4, 2 mM β-mercaptoethanol. CHMP2B (-FL and -ΔC) was extracted from the pellet using a buffer composed of 50 mM Tris pH 7.4, 8 M guanidine, 2 mM β-mercaptoethanol overnight at 4 °C. Further purification of solubilized CHMP2B included Ni²⁺ chromatography in 50 mM Tris pH 7.4, 8 M urea, and refolding by rapid dilution into a buffer containing 50 mM Tris pH 7.4, 200 mM NaCl, 2 mM DTT, 50 mM L-glutamate, 50 mM L-arginine at a final concentration of 2 μM. Refolded CHMP2B was concentrated by Ni²⁺ chromatography in a buffer containing 50 mM Tris pH 7.4, 200 mM NaCl. A final gel filtration chromatography step was performed on a superdex75 column in a buffer containing 50 mM Tris pH 7.4, 100 mM NaCl.

For MBP-CHMP2B-ΔC production, Escherichia coli BL21 cells were transformed with plasmids and grown at 37 °C in Luria broth medium to an OD600 of 0.6. Protein expression was induced by the addition of 1 mM arabinose for 3 h at 37 °C. Cells were harvested by centrifugation and the bacterial pellet was re-suspended in 50 mL of binding buffer (50 mM Hepes pH 7.6, 300 mM NaCl, 300 mM KCl). The bacteria were lysed by sonication for 5 min and cell debris was pelleted by centrifugation at 20,000 rpm for 30 min. The MBP-CHMP2B-ΔC protein was purified on an amylose affinity column in binding buffer.

Following expression, CHMP proteins were concentrated and labeled overnight at 4 °C with a ratio of Alexa labeling dye per protein of 2 to 1. MBP-CHMP2A-∆C, CHMP3-FL, and CHMP2B (-∆C and -FL) were labeled with Alexa 488 succinimidyl ester, Alexa 633 succinimidyl ester, and Alexa 488 C5 maleimide (Thermo Fisher Scientific), respectively. The excess of free dyes was removed by salt exchange chromatography, except for MBP-CHMP2B-ΔC, where a final gel filtration chromatography (superdex 200) step was performed in a buffer containing 50 mM Hepes pH 7.6, 150 mM NaCl. Immediately after labeling, all aliquots were frozen in liquid nitrogen with 0.1% of methyl cellulose (Sigma-Aldrich) as cryoprotectant. All aliquots were kept at −80 °C prior to experiments.

GUV preparation for confocal, spinning disk, and FACS experiments

GUVs were prepared by spontaneous swelling on polyvinyl alcohol (PVA)-based gels [84]. A thin lipid solution is deposited on a PVA gel (5% PVA, 50 mM sucrose, 25 mM NaCl and 25 mM Tris, at pH 7.5), dried under vacuum for 20 min at room temperature and rehydrated with the growth buffer at room temperature. Vesicles form within 45 min and are extracted by pipetting directly from the slides on top of the PVA gel.

Composition 1

For confocal and spinning disk microscopy experiments, lipid stock solutions were mixed to obtain DOPC/DOPS/DOPE/Cholesterol/PI(4,5)P2/PE-Rhodamine (54.2:10:10:15:10:0.8) (molar ratio) at a concentration of 3 mg mL⁻¹ in chloroform. In the following, this GUV composition will be referred to as 10% PIP2-GUV. In order to detect the PI(4,5)P2 lipid signal, PE-Rhodamine in the PIP2-GUV lipid stock solution was replaced by TopFluor PI(4,5)P2 with a molar ratio of PI(4,5)P2/TopFluor PI(4,5)P2 of (8:0.5), referred to as FluoPIP2-GUV.

Composition 2

For FACS microscopy experiments, lipid stock solutions were mixed to obtain DOPC/DOPE/Cholesterol/PI(4,5)P2/PE-Rhodamine (72.2:10:15:2:0.8) (molar ratio) at a concentration of 3 mg mL⁻¹ in chloroform. In the following, this GUV composition will be referred to as 2% PIP2-GUV. To compare CHMP protein binding to different PIP species, we replaced PI(4,5)P2 lipids at equal molar ratio by PI(3)P, PI(4)P, and PI(3,5)P2 lipids, respectively. In the following, these GUV compositions will be referred to as 2% PI(3)P-GUV, 2% PI(4)P-GUV, and 2% PI(3,5)P2-GUV.
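As a quick sanity check on the two GUV compositions, the molar ratios can be summed to confirm that each adds up to 100 mol%. The sketch below is only illustrative: the function names are hypothetical, and the first ratio is read as 54.2:10:10:15:10:0.8, which is an assumption about the intended separators. The `swap_lipid` helper mirrors the equal-molar-ratio replacement of PI(4,5)P2 by another PIP species described above.

```python
# Molar ratios (mol %) as listed in the text for the two GUV compositions.
# The dictionary keys and function names are illustrative, not from the source.
COMPOSITIONS = {
    "10% PIP2-GUV": {
        "DOPC": 54.2, "DOPS": 10, "DOPE": 10,
        "Cholesterol": 15, "PI(4,5)P2": 10, "PE-Rhodamine": 0.8,
    },
    "2% PIP2-GUV": {
        "DOPC": 72.2, "DOPE": 10,
        "Cholesterol": 15, "PI(4,5)P2": 2, "PE-Rhodamine": 0.8,
    },
}


def total_mol_percent(composition):
    """Sum the molar percentages, rounded to absorb floating-point noise."""
    return round(sum(composition.values()), 6)


def swap_lipid(composition, old, new):
    """Return a copy in which lipid `old` is replaced by `new` at equal molar ratio."""
    return {(new if name == old else name): value
            for name, value in composition.items()}


for name, composition in COMPOSITIONS.items():
    print(name, total_mol_percent(composition))

# Example: derive the 2% PI(3)P-GUV composition from the 2% PIP2-GUV one.
pi3p_guv = swap_lipid(COMPOSITIONS["2% PIP2-GUV"], "PI(4,5)P2", "PI(3)P")
print(pi3p_guv)
```

Both compositions sum to 100 mol% under this reading, and the swapped composition keeps the same total by construction.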

SUV preparation for QCM and AFM experiments

After preparation of lipid composition 1, at 3 mg mL⁻¹, in chloroform, the solvent was evaporated by rotating the vial under a gentle stream of nitrogen at room temperature, and the vial was then placed under vacuum for 20 min at room temperature. The dried lipid film was rehydrated in the appropriate growth buffer solution to obtain a final concentration of 1 mg mL⁻¹. The solution was vortexed for 2 min and then extruded 11 times through a polycarbonate track-etched membrane with pore sizes of 100 nm [85] or sonicated for 5 min until obtaining a clear colorless solution for small unilamellar vesicle (SUV) formation. Produced SUVs were either used freshly for QCM-D experiments and for HS-AFM indentation experiments or stored at −20 °C in amber vials (Sigma-Aldrich, France) for further use. In the following, this SUV composition will be referred to as 10% PIP2-SUV.

To compare CHMP2B protein binding in the absence of PI(4,5)P2 and to increase the net negative charge of the membrane for QCM-D experiments, SUVs were produced containing DOPC/DOPS/DOPE/Cholesterol/PE-Rhodamine (44.2:30:10:15:0.8) (molar ratio) or (34.2:40:10:15:0.8), at a concentration of 3 mg mL⁻¹ in chloroform, referred to as 30% DOPS-SUV and 40% DOPS-SUV, respectively. Moreover, to compare CHMP2B protein binding to a membrane incorporating a higher amount of negative charges as well as PIP lipids, we replaced the 10% molar ratio of PI(4,5)P2 in the PIP2-SUV by 10% molar ratio of PI(3-5)P3 lipids. In all QCM-D experiments, quartz crystal resonance frequency shifts were measured at the overtone 5 of the oscillating crystal and therefore defined as ∆ϑ5.

CHMP supramolecular assembly on GUVs observed by fluorescence microscopy

Freshly produced 10% PIP2-GUVs were incubated with CHMP proteins at concentrations ranging from 50 nM to 2 μM in BP buffer (Tris 25 mM, NaCl 50 mM, pH 7.5) in isotonic conditions for 15 to 30 min. Then, CHMP-coated GUVs were diluted 20 times and transferred to the observation chamber, previously passivated with the β-casein solution and rinsed twice with BP buffer.

Supramolecular assembly of CHMP proteins on GUVs was visualized on an inverted Spinning Disk Confocal Roper/Nikon. The spinning disk is equipped with an EMCCD camera (512 × 512, Andor Technology; pixel size 16 μm), an objective (× 100 CFI Plan Apo VC oil, NA 1.4, WD 0.13), and 3 lasers (491, 561, and 633 nm; 100 mW). The exposure time for all images was 50 ms.

To further characterize and compare the interaction of CHMP proteins on GUVs, we measured the total intensity of the protein on the vesicle and normalized this value by the GUV area.

Image acquisition for protein quantification was performed using a confocal microscope composed of an inverted microscope (Eclipse TE2000 from Nikon), two objectives (× 60 water immersion and × 100 oil immersion), a C1 confocal head from Nikon, three lasers (λ = 488 nm, λ = 561 nm, and λ = 633 nm). One confocal plane image was taken for each set tension.

FACS experiment for protein-lipid binding assay

2% PI-GUV and CHMP fluorescence intensity was measured with a BD LSRFORTESSA flow cytometry instrument. Data analysis was performed with BD FACS Diva software.

The collected GUVs were transferred to BP buffer and incubated 30 min with CHMP proteins at 500 nM. The vesicle concentration was adjusted in order to count about 10,000 events per condition every 60 s at high speed.

2% PI-GUVs were labeled with Egg Rhod PE (0.8% w/w), CHMP2B labeled with Alexa 488, CHMP2A labeled with Alexa 488, and CHMP3 labeled with Alexa 633. Alexa 488 was excited with a 488-nm laser, and the emission was detected through a 530/30 standard bandpass filter. Alexa 633 was excited with a 633-nm laser, and the emission was detected through a 670/30 bandpass filter. Egg Rhod PE was excited with a 532-nm laser, and the emission was detected through a 610/20 bandpass filter. Two signals were closely analyzed: the protein fluorescent signal and the lipid fluorescent signal. Thus, the fluorescence intensity of the membrane and the fluorescence intensity of the proteins are respectively proportional to the amount of fluorophores in the vesicle and proteins bound to it or present in the detection zone and unbound. The intensity plot displaying the protein fluorescence signal as a function of the lipid fluorescent signal presents 3 regions: (i) unbound proteins (single-positive for proteins only in the top left quadrant), (ii) CHMP proteins bound to GUVs (double-positive for proteins and lipids in the top right quadrant), and (iii) GUVs free of proteins (single-positive for lipids only in the lower right quadrant).
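The three regions of the intensity plot can be expressed as a simple threshold gate on the two fluorescence channels. The following is a minimal sketch, not the gating used in BD FACS Diva: the function names, the threshold values, and the "bound fraction" summary are all hypothetical and chosen only to illustrate the quadrant logic.

```python
def classify_event(lipid_signal, protein_signal, lipid_threshold, protein_threshold):
    """Assign one flow-cytometry event to a quadrant of the protein-vs-lipid plot.

    Thresholds are hypothetical gate positions; in practice they would be set
    from control samples (GUVs alone, protein alone).
    """
    lipid_positive = lipid_signal >= lipid_threshold
    protein_positive = protein_signal >= protein_threshold
    if protein_positive and lipid_positive:
        return "bound"            # top right: CHMP proteins bound to GUVs
    if protein_positive:
        return "unbound_protein"  # top left: protein signal only
    if lipid_positive:
        return "free_guv"         # lower right: GUVs free of protein
    return "background"           # lower left: neither signal


def bound_fraction(events, lipid_threshold, protein_threshold):
    """Fraction of lipid-positive events (GUVs) that are also protein-positive."""
    labels = [classify_event(lipid, protein, lipid_threshold, protein_threshold)
              for lipid, protein in events]
    n_guvs = labels.count("bound") + labels.count("free_guv")
    return labels.count("bound") / n_guvs if n_guvs else float("nan")


# Made-up (lipid, protein) intensities in arbitrary units, for illustration only.
example_events = [(10, 500), (800, 600), (900, 20), (5, 5)]
print(bound_fraction(example_events, lipid_threshold=100, protein_threshold=100))
```

Comparing such a fraction across the 2% PI(3)P, PI(4)P, PI(3,5)P2, and PI(4,5)P2 compositions would be one way to summarize binding, although the source does not state which summary statistic was used.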

QCM-D experiments

Supported lipid bilayers (SLBs) were generated with or without PIP lipids. In the absence of PI(4,5)P2, SLBs made of 30% and 40% DOPS-SUV composition were produced with a buffer containing Ca²⁺ (150 mM NaCl, 10 mM Tris pH 7.5, 2 mM Ca²⁺) [41]. After SLB formation, the bilayer was rinsed with the same buffer but supplemented with EDTA (150 mM NaCl, 10 mM Tris pH 7.5, 10 mM EDTA) to remove excess Ca²⁺. SLBs were also produced in the presence of PIP lipids (PI(4,5)P2 or PI(3-5)P3), with PIP2-SUV or PIP3-SUV lipid compositions, respectively. SLB formation was achieved in a buffer containing 150 mM KCl, 20 mM citrate pH 4.8 [42]. Following SLB formation, CHMP proteins were injected at a concentration of 200 nM in BP buffer. The interaction between the proteins and the lipid bilayer was directly measured from the fifth overtone of the frequency shift (Δϑ5).

QCM-D measurements were performed using a Q-Sense E4 system (Q-Sense, Gothenburg, Sweden). The mass sensor is a silicon dioxide (SiO2)-coated quartz crystal microbalance (QSX-303, Lot Quantum Design, France) with a fundamental frequency of 4.95 MHz. The liquid flow was controlled using a high-precision multichannel dispenser (IPC, ISMATEC, Germany). All experiments were performed at room temperature with a flow rate of 50 μL min⁻¹.

Micropipette experiments

The experimental chamber and the micropipette, made of a borosilicate capillary (1-mm outer diameter and 0.58-mm inner diameter (Harvard Apparatus, UK)) and introduced into the chamber, are passivated with a β-casein solution at 5 mg mL⁻¹ in sucrose 25 mM, NaCl 50 mM, and Tris 25 mM (pH 7.5) for 15 min. The chamber is rinsed twice with BP buffer. Then, PIP2-GUVs pre-incubated with CHMP proteins are added to the chamber. Once the chamber is sealed with mineral oil, the zero pressure is measured and the aspiration assay can begin by decreasing the water height gradually, thus increasing the applied tension on the vesicle.

The explored tensions for the aspiration experiments with the different CHMP proteins range up to 1.6 mN m⁻¹ (corresponding to the membrane enthalpic regime). The software EZ-C1 was used for the acquisition of the confocal images.

At high tension, in the enthalpic regime, an apparent elastic stretching modulus of the membrane χ can be deduced from the linear variation of the fractional excess area Δα, given by \( \Delta\alpha = 2\pi R_p \left(1 - R_p/R_v\right) \Delta L_p / A_0 \), where ΔLp is the variation of the tongue length and A0 the initial area of the GUV, as a function of the applied tension σ using \( \Delta\alpha = \Delta\alpha_0 + \frac{1}{\chi}\sigma \) [86], with ∆α0 being the initial excess area for the reference tension σ0. According to the Young-Laplace equation, the membrane tension is equal to \( \sigma = \Delta P \times R_p / \left(2 \times \left(1 - \frac{R_p}{R_v}\right)\right) \), where ΔP is the difference of pressure between the interior of the micropipette and the chamber and Rp and Rv are respectively the pipette and vesicle radius [63].
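The two steps of this analysis, converting the applied pressure difference into a membrane tension and then extracting χ from the slope of Δα versus σ, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, SI units are assumed throughout, and an ordinary least-squares line fit stands in for whatever fitting procedure was actually used.

```python
def membrane_tension(delta_p, r_pipette, r_vesicle):
    """Young-Laplace tension: sigma = dP * Rp / (2 * (1 - Rp / Rv)).

    delta_p in Pa, radii in m, result in N/m.
    """
    return delta_p * r_pipette / (2.0 * (1.0 - r_pipette / r_vesicle))


def fit_stretching_modulus(tensions, excess_areas):
    """Least-squares fit of d_alpha = d_alpha0 + sigma / chi.

    Returns (chi, d_alpha0): chi is the inverse of the fitted slope and
    d_alpha0 is the intercept of the line.
    """
    n = len(tensions)
    mean_sigma = sum(tensions) / n
    mean_alpha = sum(excess_areas) / n
    covariance = sum((s - mean_sigma) * (a - mean_alpha)
                     for s, a in zip(tensions, excess_areas))
    variance = sum((s - mean_sigma) ** 2 for s in tensions)
    slope = covariance / variance
    intercept = mean_alpha - slope * mean_sigma
    return 1.0 / slope, intercept


# Illustrative numbers only: a 100 Pa suction on a 2 um pipette, 10 um vesicle.
sigma = membrane_tension(delta_p=100.0, r_pipette=2e-6, r_vesicle=10e-6)
print(sigma)  # tension in N/m

# Synthetic data generated with chi = 0.2 N/m and d_alpha0 = 0.01.
sigmas = [0.5e-3, 1.0e-3, 1.5e-3]
alphas = [0.01 + s / 0.2 for s in sigmas]
chi, alpha0 = fit_stretching_modulus(sigmas, alphas)
print(chi, alpha0)
```

Only data points in the high-tension (enthalpic) regime should enter such a fit, since the linear relation is stated to hold there.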

Osmotic shock on GUVs

10% PIP2-GUVs were either co-incubated with 500 nM CHMP2B-∆C in 50 mM NaCl and 25 mM Tris, at pH 7.4 (CHMP protein binding buffer, referred to as BP buffer) or transferred to the same buffer free of protein (osmolarity equal to 125 mOsm L⁻¹). CHMP2B-coated GUVs and CHMP2B-free GUVs were then transferred to a hyperosmotic buffer with increasing sodium chloride concentrations up to 250 mM NaCl. The effect of the osmotic shock was visualized using confocal microscopy.
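The quoted osmolarity of BP buffer is consistent with a simple ideal estimate in which NaCl contributes two particles per formula unit and Tris one. The sketch below makes that arithmetic explicit; the function name is hypothetical, and full dissociation with no counter-ion contribution from the Tris buffer is an assumption of this estimate rather than something stated in the source.

```python
def ideal_osmolarity_mosm(solutes):
    """Ideal osmolarity in mOsm/L.

    `solutes` is an iterable of (concentration_mM, particles_per_formula_unit);
    non-ideality (osmotic coefficients) is ignored.
    """
    return sum(concentration * particles for concentration, particles in solutes)


# BP buffer: 50 mM NaCl (2 particles) + 25 mM Tris (counted as 1 particle).
bp_buffer = [(50, 2), (25, 1)]
print(ideal_osmolarity_mosm(bp_buffer))

# Hyperosmotic end point: same Tris, NaCl raised to 250 mM.
shock_buffer = [(250, 2), (25, 1)]
print(ideal_osmolarity_mosm(shock_buffer))
```

Under these assumptions, the protein-free buffer comes out at 125 mOsm/L and the 250 mM NaCl buffer at 525 mOsm/L, roughly a fourfold increase in external osmolarity.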

HS-AFM imaging-based deformation experiment

PIP2-SUVs were immobilized on a freshly cleaved mica surface and placed into the AFM chamber with BP buffer. For studying the vesicles with proteins, prior to immobilization to the surface, the PIP2-SUVs were pre-incubated with either 1 μM of CHMP2B or 1 μM of CHMP2B + 2 μM of CHMP3 or 1 μM of CHMP2A + 2 μM of CHMP3 for 30 min to allow full protein coverage on the SUV surface. A high-speed amplitude modulation tapping mode AFM (RIBM, Japan) was used for imaging [87,88,89] and deformation experiments, with ultra-short cantilevers (spring constant 0.15 N/m, Nanoworld). Initial imaging (at minimum force) was performed at a free cantilever oscillation amplitude of 5.4 nm and a set-point amplitude of 4.3 nm. The imaging rate was 0.5 frame/s. We regulated the set-point amplitude in a stepwise manner, while keeping the free amplitude constant, in order to increase the imaging force. The imaging force can be estimated in the first approximation as F = kΔz, where k is the spring constant of the cantilever and Δz is the difference between free and set-point amplitude of the cantilever oscillation. It follows that the images were acquired with an estimated minimal force of ~150 pN. For the measurement of membrane mechanics with and without proteins, image acquisition was first performed at minimal force (~150 pN). Next, step by step, the imaging force was increased with 9% increments, by decreasing the set-point amplitude. After reaching the maximal force, after ~8 steps and an estimated final imaging force of ~270 pN, the tapping force was reduced again to its lowest value (~150 pN), and the height recovery was recorded. Only those vesicles that exhibited a height recovery of at least 90% of their initial height were considered to be elastically deformed and were included in the analysis. Errors in the relative stiffness are given as standard error of the mean (SEM). Images were analyzed using IgorPro scripts of the AFM manufacturer (RIBM) and ImageJ scripts.
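The first-approximation force estimate F = kΔz can be checked directly from the stated cantilever parameters. The sketch below is only a unit-conversion aid with a hypothetical function name; it reproduces the order of magnitude of the minimal imaging force quoted above.

```python
def imaging_force_pn(spring_constant_n_per_m, free_amplitude_nm, setpoint_amplitude_nm):
    """Estimate the tapping force F = k * dz in piconewtons.

    k is in N/m and the amplitudes are in nm, so k * dz is in nN;
    multiplying by 1000 converts to pN.
    """
    delta_z_nm = free_amplitude_nm - setpoint_amplitude_nm
    return spring_constant_n_per_m * delta_z_nm * 1000.0


# Parameters stated in the text: k = 0.15 N/m, free 5.4 nm, set-point 4.3 nm.
print(imaging_force_pn(0.15, 5.4, 4.3))
```

With these values, Δz is 1.1 nm and the estimate is about 165 pN, in line with the quoted minimal force of roughly 150 pN. Lowering the set-point amplitude while keeping the free amplitude fixed increases Δz and therefore the force, which is how the stepwise force ramp is implemented.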