Gavin Wilson

Supervisor: Lincoln Stein

Proposed collaborative internship with: Benjamin Blencowe

Registered in: Department of Molecular Genetics


My doctoral research project aims to better understand the role of pre-mRNA splicing in cancer development. There is precedent in the literature demonstrating that perturbations of the splicing patterns within a cell can promote cellular transformation and cancer development. This leads to the primary hypothesis of my project: "Perturbations in the splicing patterns within cancer cells contribute to the promotion of cellular transformation."

I will be using paired-end RNA-seq data with a minimum read length of 100 bp, produced at the Ontario Institute for Cancer Research (OICR) using Illumina sequencing technology. The samples will primarily be derived from pancreatic ductal adenocarcinoma primary tumours, mouse xenografts, or cell lines that are being sequenced for the International Cancer Genome Consortium by OICR.

To analyze this data, I first surveyed existing RNA-seq alignment tools such as TopHat. However, I found that these tools produced alignment artifacts that had deleterious effects on my downstream analysis. I attempted to develop post-processing tools to fix these artifacts, but residual problems with the alignments always remained. To remedy this, I have begun to develop a tiered RNA-seq alignment pipeline with two major steps.

The first step attempts to align each read of a read-pair individually to known splice junctions and the reference genome. I will be using Novoalign for the alignment, together with a junction sequence database tailored specifically to the read length produced by the sequencing reaction. After the alignment, a post-processing step will remove redundant and ambiguous alignments and resolve the read-pairs. The second step attempts to find novel junctions within the reads that did not align in the first step. I will be using a splice-aware aligner such as BLAT or a de novo assembler for this step. In initial testing against TopHat, my pipeline showed a significant increase in sensitivity for known splice-site alignments and typically mapped more reads to known splice sites.

Finally, after the alignment steps, I will develop a pipeline that uses custom and/or published tools to analyze the samples' splicing patterns. This step will require transcript assembly and abundance calculations, as well as tools to normalize these values for comparison with other samples.
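
As an illustration of the read-length-specific junction database used in the first alignment step, the following Python sketch shows one way such junction sequences could be built. It is not the actual pipeline code. The function names, the 0-based coordinate convention, and the `min_overhang` parameter are all assumptions made for this example. The idea is that for a read of length L, taking L minus the minimum overhang bases from each side of a junction yields a sequence that any junction-spanning read can align to, while a read cannot align to it without crossing the junction.

```python
from typing import Dict, List, Tuple


def junction_flank(read_length: int, min_overhang: int = 1) -> int:
    """Number of bases to take from each side of a junction.

    A read of length `read_length` that crosses the junction with at least
    `min_overhang` bases on the shorter side needs at most
    `read_length - min_overhang` bases on the longer side.
    """
    if not 1 <= min_overhang < read_length:
        raise ValueError("min_overhang must be in [1, read_length)")
    return read_length - min_overhang


def build_junction_sequences(
    genome: Dict[str, str],
    junctions: List[Tuple[str, int, int]],
    read_length: int,
    min_overhang: int = 1,
) -> Dict[str, str]:
    """Build one spliced sequence per known junction.

    Assumed (hypothetical) conventions:
      - `genome` maps chromosome name to its sequence.
      - each junction is (chrom, donor_end, acceptor_start), 0-based, where
        `donor_end` is the index of the first intronic base and
        `acceptor_start` is the index of the first base of the next exon.
    Simplification: flanks are cut straight from the genome, so exons
    shorter than the flank are not handled specially here.
    """
    flank = junction_flank(read_length, min_overhang)
    sequences = {}
    for chrom, donor_end, acceptor_start in junctions:
        seq = genome[chrom]
        left = seq[max(0, donor_end - flank):donor_end]
        right = seq[acceptor_start:acceptor_start + flank]
        sequences[f"{chrom}:{donor_end}-{acceptor_start}"] = left + right
    return sequences
```

With 100 bp reads and a minimum overhang of 1, each junction entry would carry 99 bases from either side. A real database would also need to respect exon boundaries and strand, which this sketch leaves out.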
