A common type of alternative splicing involves "cassette" exons: internal exons that can either be included in or excluded from the mature mRNA. We are studying the Ultrabithorax (Ubx) transcripts of Drosophila as a model for developmental regulation of cassette exons. As a consequence of stage- and tissue-specific alternative splicing, Ubx encodes a family of six transcription factor isoforms that control segmental identity in the epidermis, mesoderm and nervous system of Drosophila (Figure 1). It has been assumed that exclusion of cassette exons always results from exon skipping, i.e. the splice sites flanking the exon are ignored and the exon is removed along with the two flanking introns. However, we have found that the two cassette exons of Ubx are spliced constitutively to the upstream exons regardless of developmental stage or tissue type. Exclusion of the cassette exons results from subsequent recursive splicing at 5' splice sites that are regenerated at the exon-exon junctions; this removes the cassette exon during splicing of the next intron (Figure 2). This strategy avoids the production of a very long (75 kb) precursor transcript in CNS neurons, where the cassette exons are excluded from Ubx mRNAs.
We are using genetic modifier screens, site-directed mutagenesis and biochemical methods to investigate the mechanism of regulation of recursive splicing for cassette exon mI of Ubx. These approaches have identified both negative (e.g. hnRNP A1 family protein Hrb87F) and positive (e.g. SR family protein 9G8) effects on mI resplicing, apparently mediated by binding to partially overlapping regulatory sites within exon mI, immediately downstream of the regenerated 5' splice site. We are analyzing additional features of the regulatory mechanism, including interactions between consecutive cassette exons.
We are studying mechanisms involved in the processing of very large introns (>10 kb), which are found in many genes with important roles in development and disease (e.g. dystrophin, CFTR, Rb). We have found that at least a subset of large introns in Drosophila are processed by a recursive splicing mechanism that allows sequential removal of smaller subfragments as the intron is transcribed (Figure 4). Sequence analysis and preliminary experimental data suggest that this is also the case in mammals. Conservation of the sequences and relative positions of recursive splice sites in orthologous introns of D. melanogaster and D. pseudoobscura indicates that recursive splicing plays an important role in expression of the corresponding genes. Possible functions under investigation include maintenance of splicing fidelity across large introns and prevention of premature transcriptional termination. Recursive splicing may also have facilitated changes in gene structure during evolution.
Recursive splicing is mediated by special elements ("ratchetting points") that consist of juxtaposed 3' and 5' splice site consensus sequences, with no exon between them (Figure 5). When an upstream exon is spliced to such a site, a functional 5' splice site is regenerated at the junction and can be used to splice the next segment of intron. Several ratchetting points can subdivide an intron into multiple subfragments. Because a ratchetting point is essentially a 0-nucleotide exon, recursive splicing leaves no trace in the final mRNA and can only be detected by analysis of processing intermediates or the phenotypes of mutations in ratchetting points.
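As a rough illustration of how candidate ratchetting points might be located in an intron sequence, the Python sketch below scans for a 3' splice site-like motif immediately followed by a 5' splice site-like motif and then lists the subfragments that would be removed in sequence. The motif used here (a run of at least six pyrimidines, any base, CAG, then GTRAGT) is a simplified assumption chosen for illustration and is not a consensus taken from the text above; the function names and the example sequence are likewise hypothetical.

```python
import re

# ASSUMPTION: simplified motif for illustration only.
# 3' splice site-like part: >= 6 pyrimidines, any base, then CAG.
# 5' splice site-like part (checked by lookahead): GT, A or G, AGT.
RATCHET_RE = re.compile(r"[CT]{6,}[ACGT]CAG(?=GT[AG]AGT)")


def find_ratchetting_points(intron_seq):
    """Return 0-based junction positions between the 3' and 5' motifs.

    Each position is the index of the first base of the regenerated
    5' splice site (the G of GT), i.e. where the "0-nucleotide exon" sits.
    """
    seq = intron_seq.upper()
    return [m.end() for m in RATCHET_RE.finditer(seq)]


def subdivide(intron_length, points):
    """Split an intron into (start, end) subfragments at the given points.

    Subfragments are half-open intervals and are listed in the order in
    which they would be removed, from upstream to downstream.
    """
    bounds = [0] + sorted(points) + [intron_length]
    return list(zip(bounds, bounds[1:]))


if __name__ == "__main__":
    # Hypothetical toy intron: one internal candidate ratchetting point,
    # followed by a conventional 3' splice site at the very end.
    intron = (
        "GTAAGT" + "A" * 20      # 5' splice site at the start of the intron
        + "TTTTTTTTCCAG"         # 3' splice site-like motif
        + "GTAAGT" + "A" * 30    # regenerated 5' splice site-like motif
        + "TTTTTTTTCCAG"         # 3' splice site at the end of the intron
    )
    points = find_ratchetting_points(intron)
    print(points)
    print(subdivide(len(intron), points))
```

A scan of this kind only nominates candidates; as noted above, actual use of a site has to be confirmed by analysis of processing intermediates or of mutations in the site.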
Ratchetting points are also found at the 5' ends of some alternatively spliced cassette exons, such as those in Ultrabithorax. In this case, use of the regenerated 5' splice site is regulated developmentally so that the cassette exon is retained in some mRNAs and removed in others (Figure 5).
SR proteins are a family of RNA-binding proteins characterized by an RRM-type RNA-binding domain and an arginine/serine-rich domain that mediates protein-protein interactions (Figure 3). They are essential splicing factors that help recruit spliceosome components to the 5' and 3' splice sites. They are also believed to function as concentration-dependent regulators of alternative splicing, with tissue-specific variations in levels of specific SR proteins determining the tissue-specific pattern of utilization of particular alternative splice sites. How does the cell maintain appropriate levels of specific SR proteins? We are studying the feedback regulatory mechanisms that control the function of SR proteins RBP1 and CG1987 in Drosophila. These proteins are orthologs of human SRp20; they are encoded by separate genes but are very closely related and are at least partially redundant in function (Kumar and Lopez, 2005). RBP1 is normally expressed at higher levels than CG1987, but either protein can exert feedback regulation on both genes by inducing alternative splicing of the corresponding transcripts (Kumar and Lopez, 2005). The alternatively spliced transcripts encode proteins with Ser- and Thr-rich domains in place of the normal RS domains, and these proteins antagonize the function of the RS forms.
In collaboration with Dr. Mayte Saenz-Robles and Dr. James Pipas (U. Pittsburgh), we are investigating changes in alternative splicing that accompany transformation. We are using wild-type and mutant derivatives of SV40 T antigen to probe relevant cellular pathways. In addition to standard methods for analysis of differential splicing of specific transcripts, we are developing tools and strategies for analysis of splicing on a genome-wide scale that do not depend on preconceptions about relevant genes or prior information about their splicing patterns. In collaboration with other experimental and computational biologists at Carnegie Mellon (Berget, Durand, Jarvik, Minden, Murphy, Rule), we are incorporating data on splicing into the development of multi-factor databases and analysis tools for cancer classification.