MIG Seminar - Heejung Shim - riboHMM*: Comprehensive annotation of translated regions using ribosome footprint profiling data
Understanding the functional effects of gene expression depends critically on accurate and comprehensive annotation of the sequence elements that are translated in each gene. Ribosome profiling provides direct, genome-wide measurements of translation levels in a given cell type. In this talk, I will first introduce a method, riboHMM, that 1) models the codon periodicity structure in ribosome profiling data, and 2) integrates RNA sequence information and transcript expression to identify translated coding sequences in a transcript. Applying riboHMM to ribosome profiling data collected from human lymphoblastoid cell lines, we identified 7273 novel coding sequences, including 2442 translated upstream open reading frames (uORFs) and 2551 coding sequences from transcripts that were previously annotated as non-coding. We observed that more than 60% of the novel coding sequences use non-canonical start codons. We also observed that ~40% of the 2442 translated uORFs are likely to regulate the translation of their downstream coding regions. Motivated by this observation, I will briefly introduce another method, riboHMM2, for annotating a comprehensive set of translated uORFs by jointly modelling the fine-scale structure in ribosome profiling data around translated uORFs and downstream coding regions. Whereas riboHMM could search for translated uORFs only in transcripts with translated downstream coding regions, riboHMM2 enables annotation of translated uORFs in all transcripts. It also allows us to infer the regulatory impact of uORFs on downstream coding regions (e.g., suppression), which is useful for gene regulation studies that aim to understand the mechanisms of uORF action.
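As a rough illustration of what "codon periodicity" means in ribosome profiling data, the sketch below tallies footprint counts by reading frame within a candidate coding region. It is not riboHMM's model, only a toy summary statistic: the function name `frame_fractions`, the per-position count layout, and the numbers are all made up for the example. In a translated region one would expect the counts to concentrate in one of the three frames.

```python
def frame_fractions(counts, start, end):
    """Return the fraction of footprint counts in each codon frame (0, 1, 2).

    `counts` is a hypothetical list of footprint counts per nucleotide
    position; frames are defined relative to `start`, over [start, end).
    """
    totals = [0, 0, 0]
    for pos in range(start, end):
        totals[(pos - start) % 3] += counts[pos]
    total = sum(totals)
    if total == 0:
        # No footprints in the region: no periodicity signal to report.
        return [0.0, 0.0, 0.0]
    return [t / total for t in totals]


# Made-up counts for a 12-nt stretch; the candidate region is positions 2..10.
counts = [0, 0, 9, 1, 2, 8, 2, 1, 10, 1, 1, 0]
fractions = frame_fractions(counts, 2, 11)
print(fractions)
```

In this toy input most of the counts fall in frame 0 of the candidate region, which is the kind of three-nucleotide pattern a periodicity-aware model can exploit.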
Dr Heejung Shim, Group Leader, Melbourne Integrative Genomics
The University of Melbourne
I am a Group Leader at Melbourne Integrative Genomics (MIG) and a Lecturer in the School of Mathematics and Statistics at the University of Melbourne. I completed my BS in Mathematics (with a double major in Computer Science and Engineering) at POSTECH, and my PhD in Statistics at the University of Wisconsin at Madison, advised by Dr. Bret Larget. I did a postdoc at the University of Chicago, working with Dr. Matthew Stephens. Before taking up my current position, I was a tenure-track Assistant Professor in the Department of Statistics at Purdue University for two years. I currently retain an affiliation with Purdue as an Adjunct Assistant Professor.