CSE549: Introduction to Computational Biology (Fall 2017)

Welcome to the course webpage for CSE549: Introduction to Computational Biology


  • (10/14) Some brief examples of the types of short-answer and algorithm-design questions you can expect on the midterm have been posted.
  • (9/21) The course projects have been assigned to groups. The list of assignments is posted in this spreadsheet. The vast majority of groups were assigned either their first or second choice project. If you do not have an assigned project, and have not been contacted by us, please let us know ASAP.
  • (9/14) The list of available course projects has been posted. The group signup sheet has also been posted. You should finalize your groups and your ranking of projects (i.e., fill in your group's entry on the sheet) no later than 1pm on Tues. 9/19.
  • (9/12) A Piazza site has been created for the course. You can sign in here.
  • (9/07) The first homework set of 6 problems is posted on http://rosalind.info/classes/437/. They are due Thurs. (9/14) by midnight.

This website will contain relevant course announcements and news, as well as links to presentation slides, which will be posted after each lecture has taken place.

A tentative list of topics

Computational Biology is a huge field that touches upon many distinct algorithmic and biological areas of study. What we are able to cover in this course will depend, in part, on the pace at which we move, which I will attempt to adjust as appropriate. However, here is a tentative list of topics I hope to cover this semester (not necessarily in order).

  • Optimal sequence alignment (global, local, and glocal alignment, with constant and affine gap penalties)
  • Algorithms and data structures for efficient text indexing and exact search
  • Heuristics for read alignment and mapping — mapping DNA-seq and RNA-seq reads
  • Genome assembly — k-mers, de Bruijn graph construction and representation, long-read technology and read-overlap graph assembly
  • Motif finding via Gibbs sampling
  • Gene finding — statistical models for ab initio and evidence-guided prediction of genes
  • RNA-seq and transcriptomics — transcript assembly, abundance estimation and differential expression testing
  • Phylogenetics — the small and large phylogeny problems; parsimony, maximum likelihood and Bayesian methods
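
To give a flavor of the first topic in the list, here is a minimal sketch of optimal global alignment scoring by dynamic programming (the Needleman-Wunsch recurrence) with a constant per-character gap penalty. The function name and the scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions, not conventions fixed by the course, and affine gap penalties are not handled here.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b.

    Scoring values are illustrative assumptions; every gap character
    costs the same constant penalty (no affine gaps).
    """
    # prev[j] is the best score for aligning the first i-1 characters
    # of a with the first j characters of b. Row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Aligning a[:i] against the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = prev[j] + gap
            # Align b[j-1] with a gap in a.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Only two rows of the table are kept at a time, so the sketch uses memory linear in the length of the second string. Recovering the alignment itself, rather than just its score, would require keeping the full table or traceback pointers.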