Investigating the Prevalence of an Unusual Form of Alternative Splicing

Date

2014

Authors

Wilson, Laurence Oscar

Abstract

This study identified 2567 potential novel protein isoforms, across seven different species, which have been overlooked by previous large-scale studies. These arise from an unusual outcome of alternative splicing, which combines a frame-shift with an alternate start codon in order to shift the reading frame of the protein's amino-terminus. First encountered by the Fahrer laboratory in the mouse Ncaph2 gene, only four instances of this splicing are reported in the literature, none of which were identified through the automated pipelines used for protein annotation. The first aim of this thesis was therefore to investigate the prevalence of amino-terminus frame-shifting splicing in a number of different genomes. To do this, a bioinformatics pipeline was developed that analyzed publicly available Expressed Sequence Tag (EST) data to identify these splice events. The pipeline identified 3045 protein isoforms across the human, mouse, Arabidopsis, C. elegans, Drosophila, rat and zebrafish genomes. Of these, over 2500 are completely novel. A small number of splice events were found to be conserved between multiple species, but only among the most closely related genomes: human, mouse and rat. In silico analysis of the predicted protein isoforms suggests they are functionally important, with the majority predicted to change their sub-cellular localization, alter their domain structure, or both. Translation of these isoforms is ultimately reliant on the cell selecting the correct start codon. Consistent with this, characteristics known to promote translation (such as Kozak sequences and structural elements within the mRNA transcript) were identified in the predicted variants. A number of the variants were identified at the protein level using publicly available mass spectrometry data.
The strongest translational evidence, however, comes from a recent ribosome profiling study, which provides direct evidence for the selection of a number of the alternate start codons predicted in this thesis. A more in-depth study of the expression of the Ncaph2 splice variants was also undertaken. The first exon of Ncaph2 can be transcribed in one of three forms, and all of the variants were shown to be transcribed at all stages of the cell cycle. Fusion proteins of all possible protein isoforms were also constructed and transfected into a cell line, and all were shown to be capable of being translated. While no functional differences were observed, the tools developed here can be used for future in-depth study. The 2567 novel proteins described in this thesis are involved in a variety of biological processes, and therefore open many new avenues of research.
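
The splicing outcome described in the abstract can be illustrated with a small toy example. The sketch below is not the pipeline developed in the thesis. The sequences, the start positions and the helper names (`translate`, `common_suffix`) are invented for illustration, and the standard genetic code is assumed. It shows how removing a segment whose length is not a multiple of three, combined with an alternate start codon in a different frame, can change the amino-terminus while leaving the downstream protein sequence unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, start):
    """Translate from index `start` until a stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def common_suffix(a, b):
    """Return the longest suffix shared by strings a and b."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]


# Hypothetical toy transcript pieces (not real Ncaph2 sequence).
UPSTREAM = "ATGCATGGCTGC"        # contains ATG at index 0 and, in another frame, at index 4
SKIPPED = "AGAA"                 # segment missing from the variant; 4 nt, so it shifts the frame
DOWNSTREAM = "CTGAAAGATTGGTAA"   # shared 3' region ending in a stop codon

REFERENCE = UPSTREAM + SKIPPED + DOWNSTREAM
VARIANT = UPSTREAM + DOWNSTREAM

assert len(SKIPPED) % 3 != 0     # the splice event is frame-shifting

ref_protein = translate(REFERENCE, 4)       # canonical start codon
alt_protein = translate(VARIANT, 0)         # alternate start codon, different frame
shifted_protein = translate(VARIANT, 4)     # canonical start codon on the variant

print(ref_protein)      # MAAELKDW
print(alt_protein)      # MHGCLKDW
print(shifted_protein)  # MAA (frame-shift runs into a premature stop)
print(common_suffix(ref_protein, alt_protein))  # LKDW
```

In this toy case the canonical start codon on the variant transcript runs into a premature stop, whereas the alternate start codon restores the downstream frame, so the two isoforms differ only at the amino-terminus. Which start codon the cell actually selects is the question the abstract addresses with Kozak sequence, mass spectrometry and ribosome profiling evidence.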

Type

Thesis (PhD)
