Staff
Luay K. Nakhleh
Instructor
nakhleh@cs.rice.edu
Office Hours
by appointment, DH 3119
Leo Elworth
Teaching Assistant
ryan.a.leo.elworth@rice.edu
Office Hours
by appointment, DH 3121
Discussion Forum
We will be using Piazza for discussions. Feel free to post questions and responses about the material covered in the lecture, homework problems, etc. However, please do not give any information that relates to your own solution for a homework problem. To go to the discussion forum, click here.
Course Information
- MEETING PLACE AND TIME: HRZ 210, Tuesdays and Thursdays, 9:25-10:40 AM.
- TEXTBOOKS (none required, but highly recommended, especially the first two as they cover most of the material in the course):
- “Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids”, by Durbin et al., Cambridge University Press.
- “Understanding Bioinformatics”, by M. Zvelebil and J.O. Baum. Published by Garland Science, 2008.
- “Algorithms on Strings, Trees, and Sequences”, by Gusfield. Cambridge University Press.
- “Genome-scale Algorithm Design”, by V. Makinen et al., Cambridge University Press.
- “Population Genetics”, by M.B. Hamilton. Published by Wiley-Blackwell, 2009.
- “An Introduction to Population Genetics Theory”, by Crow and Kimura.
- “Inferring Phylogenies”, by Felsenstein.
- “Fundamentals of Molecular Evolution”, by Graur and Li.
- “Theoretical Evolutionary Genetics”, by Felsenstein (PDF available online).
- “Evolutionary Theory: Mathematical and Conceptual Foundations”, by Rice.
- SOFTWARE: Here are links to programs (MATLAB toolboxes included) that may be useful for homework assignments and beyond:
- For a comprehensive list of phylogeny programs, please see the website maintained by Prof. Joe Felsenstein here.
- The Molecular Biology and Evolution toolbox. Click here for a paper that describes the tool.
- The Population Genetics and Evolution toolbox.
- The ms tool for generating samples under neutral models.
- INTENDED AUDIENCE: Anyone interested in learning about algorithms and their use in biological sequence analysis. A solid background in algorithms and good knowledge of probability are essential (without these two, students might struggle in the course). Knowledge of biology is a plus, but is not required. This is not a “programming for biologists” course, nor is it a course on how to use bioinformatics tools and databases.
- TOPICS TO BE COVERED (tentative, time permitting):
- Pairwise sequence alignment
- Markov chains and HMMs
- Pairwise alignment using HMMs
- Profile HMMs for sequence families
- Multiple sequence alignment
- Phylogenetic tree inference
- Phylogenomics
- Genome-scale index structures (suffix trees, Burrows-Wheeler indexes, ...)
- Genome-scale algorithms (read mapping, genome comparison, genome compression, ...)
- Genomics, transcriptomics, and metagenomics
- GRADING:
- In-class midterm 1: 25% (February 23, 2017)
- In-class midterm 2: 25% (April 20, 2017)
- Homework assignments (5-6 in total): 50%
- RICE HONOR CODE: In this course, all students will be held to the standards of the Rice Honor Code, a code that you pledged to honor when you matriculated at this institution. If you are unfamiliar with the details of this code and how it is administered, you should consult the Honor System Handbook. This handbook outlines the University’s expectations for the integrity of your academic work, the procedures for resolving alleged violations of those expectations, and the rights and responsibilities of students and faculty members throughout the process.
Students from other institutions will also be held to the same standards of the Rice Honor Code.
- STUDENTS WITH DISABILITY: If you have a documented disability or other condition that may affect academic performance, you should: 1) make sure this documentation is on file with Disability Support Services (Allen Center, Room 111 / adarice@rice.edu / x5841) to determine the accommodations you need; and 2) talk with me to discuss your accommodation needs.
Course Material
Course material (homework assignments, schedule of topics, slides, etc.) will be posted in this section.
Slides Set # | Topic | Slides | Materials |
1 | Administrivia and background material | Slides (full), Slides (handout) | syllabus |
2 | Sequence alignment: General overview | Slides (full), Slides (handout) | |
3 | Sequence alignment: Scoring schemes | Slides (full), Slides (handout) | |
4 | Sequence alignment: Dynamic programming algorithms for pairwise alignment | Slides (full), Slides (handout) | |
5 | Significance of estimated sequence alignments | Slides (full), Slides (handout) | |
6 | Markov chains and hidden Markov models | Slides (full), Slides (handout) | homework 1 (due Feb 7) |
7 | Pairwise HMMs and sequence alignment | Slides (full), Slides (handout) | homework 2 (due Feb 21) |
8 | Profiles and multiple sequence alignments | Slides (full), Slides (handout) | |
9 | Phylogenetics: Recovering Evolutionary History | Slides (full), Slides (handout) | |
10 | Phylogenetics: Building Phylogenetic Trees | Slides (full), Slides (handout) | |
11 | Phylogenetics: Parsimony | Slides (full), Slides (handout) | homework 3 (due Apr 6) |
12 | Phylogenetics: Distance-based Methods | Slides (full), Slides (handout) | homework 4 (due Apr 18) |
13 | Phylogenetics: Likelihood | Slides (full), Slides (handout) | |
14 | Phylogenetics: Bayesian Inference | Slides (full), Slides (handout) |