BIOS 668, Spring 2026

Course Title: Statistical Methods for High-throughput Genomics Data II

Instructor: Mikhail Dozmorov
Department: Biostatistics, VCU
Credits: 3
Duration: 15 weeks (2 lectures per week, 1 hour 20 minutes each)

Course Overview

Welcome to BIOS 668. This course is the second part of the Genomics curriculum and introduces core principles of Data Science, Genomics, Bioinformatics, and Biostatistics. It is a blended course that combines in-class learning (lectures, labs) with self-directed activities.

The study of genomics and the use of next-generation sequencing are at the forefront of biomedical research. The sequencing market is constantly evolving, stimulating the development of new genomics technologies, biostatistical approaches, and software tools. As a result, it may not be possible to maintain a stable analysis pipeline throughout a project, because software tools are often updated or superseded within the months or years that a project spans. To analyze and interpret genomic sequencing data effectively, it is crucial to (1) understand the technologies and the statistical properties of the data they produce and (2) develop strong computational skills that are competitive and broadly applicable.

This course is a continuation of BIOS 658 and will introduce high-throughput genomic assays, including single-cell RNA-seq and spatial transcriptomics; whole-genome sequencing, alignment, and genome variation analysis; miRNA-seq; metagenomics; epigenomic analysis, including ChIP-seq and methylation assays; and chromatin conformation capture technologies. The course will develop applied skills for multi-omics data analysis using Unix, high-performance computing, and R environments. The course focuses primarily on human genomics; however, the knowledge and skills gained extend to the genomics of model organisms.

The class will be conducted in person and will include lecture and coding components. Course material will be publicly available. The syllabus is subject to change. Observe the VCU Honor Pledge in all class and homework activities.

Prerequisites

BIOS 658 (see Bulletin), or as permitted by instructor.

Course Objectives

  • Understand the core principles, strengths, and limitations of high-throughput technologies
  • Learn statistical models and algorithms used in sequencing data analysis
  • Gain practical experience in high-throughput Exploratory Data Analysis, visualization, and quality control using the Unix and R environments
  • Critically evaluate and interpret statistical methods used in flagship tools for sequencing data analysis
  • Interpret biological findings provided by different sequencing technologies, and integrate different layers of omics data

At the conclusion of the course, students will be able to collect, analyze, and interpret multi-omics data using the Unix and R programming environments.

Tentative schedule

  • Unix overview
  • Genomic technologies and Bioconductor
  • Genome sequencing and alignment
  • Single nucleotide polymorphism (SNP) analysis
  • Copy number variant (CNV) analysis
  • Single-cell sequencing
  • Chromatin immunoprecipitation (ChIP)-seq, DNase-seq, and ATAC-seq
  • Metagenomics
  • Methylation
  • Epigenomics
  • Chromatin conformation capture analysis
  • Integrative analysis, TCGA

Homework format

  • Assignments will be posted on, and should be submitted via, VCU Canvas: https://learningsystems.vcu.edu/canvas/
  • Assignments should be submitted as reproducible reports in RMarkdown
  • Short summaries of reading assignments should be organized in an RMarkdown report maintained on GitHub
  • The final project should be submitted as a fully reproducible GitHub repository

Class rules

  • Attendance is required
  • Read all assignments before class
  • Bring your laptop and the book to every class

Grading Rubric

General Expectations (applies to all assignments)

When evaluating any assignment, the following points will be considered:

  • Punctuality and Completeness – Was the assignment submitted on time and fully completed, addressing all required components?

  • Organization and Presentation – Is the material presented clearly, logically, and in a format suitable for a professional/academic audience?

  • Effort and Application – Does the work demonstrate serious engagement with the material and effective use of the required tools (e.g., R, RMarkdown, GitHub)?

Discussions

  • Preparedness – Student demonstrates they have completed the assigned preparation (videos, readings, tutorials) before class.

  • Participation – Student is actively engaged during discussion (listening attentively, asking questions, contributing ideas, responding to peers).

In-class Exercises

  • Effort – Student makes a genuine attempt to complete the exercise and uses it as an opportunity to deepen understanding of the topic.

  • Collaboration – When working with peers, the student contributes meaningfully to group progress.

Final project

  • In addition to the above grading considerations, each student will grade one randomly assigned peer’s project. The goal is to learn from a peer’s work while assessing its quality. Student grades will be averaged with the instructor’s grade.

In addition to the criteria above, the final project will be evaluated on:

  • Reproducibility – The project must be reproducible, with clear instructions, well-documented code, and an organized GitHub repository.

  • Depth and Rigor – The project should demonstrate mastery of statistical and computational methods in genomics, with thoughtful application to a real dataset or problem.

  • Communication – Results and interpretations should be presented clearly, with effective use of figures, tables, and narrative.

  • Peer Review – Each student will evaluate one randomly assigned peer’s project. Peer grades will be combined with the instructor’s grade. Peer evaluation should be constructive and based on the same rubric.

Assignment Values

Assignments Grade Overall

Your final grade reflects your performance throughout the semester. It includes readings, participation, in-class exercises (attendance is required), and assignments, weighted as follows:

Assignment                        Percentage Value
In-class participation            20%
Reading and homework assignments  50%
Final project                     30%
TOTAL                             100%
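
As an illustration of how the weights above combine, the following Python sketch computes an overall percentage from three component scores. The function name, dictionary keys, and example scores are hypothetical, and any rounding or adjustments applied by the instructor are not modeled here.

```python
# Weights taken from the table above; all names and example scores are illustrative.
WEIGHTS = {
    "participation": 0.20,  # In-class participation
    "homework": 0.50,       # Reading and homework assignments
    "project": 0.30,        # Final project
}


def overall_percentage(scores):
    """Return the weighted overall percentage for component scores on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 90 for participation, 80 for homework, 100 for the project.
example = overall_percentage({"participation": 90, "homework": 80, "project": 100})
print(round(example, 1))  # 0.2*90 + 0.5*80 + 0.3*100 = 88.0
```

Because homework carries half of the weight, a 10-point change in the homework score moves the overall percentage by 5 points, while the same change in participation moves it by only 2.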

Grading model

Grading for individual assignments reflects the overall quality of the completed work:

  • A – Excellent work demonstrating high quality, accuracy, and minimal need for revision.
  • B – Good work showing solid understanding and competence, with minor issues.
  • C – Satisfactory but below expectations; work has notable errors or omissions.
  • D – Poor work that fails to meet basic standards or demonstrates significant misunderstandings.
  • F – Unacceptable work with major deficiencies, showing little effort or understanding.

Grade  Percentage     Performance
A+     97% to 100%    Excellent
A      93% to 96%     Excellent
A-     90% to 92%     Excellent
B+     87% to 89%     Good
B      83% to 86%     Good
B-     79% to 82%     Good
C+     76% to 78%     Unsatisfactory
C      73% to 75%     Unsatisfactory
C-     70% to 72%     Unsatisfactory
D+     67% to 69%     More than unsatisfactory
D      64% to 66%     More than unsatisfactory
D-     61% to 63%     More than unsatisfactory
F      60% and below  Unacceptable
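
The table above can be read as a simple lookup, sketched here in Python. This is an illustration only: the function name is hypothetical, and the treatment of fractional percentages (a score counts toward a band once it reaches that band's lower bound, with no rounding) is an assumption the syllabus does not state.

```python
# Lower bound of each band from the table above, highest first.
CUTOFFS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (79, "B-"),
    (76, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (64, "D"), (61, "D-"),
]


def letter_grade(percentage):
    """Map a percentage to a letter grade.

    Assumption: a percentage belongs to the highest band whose lower
    bound it meets; anything below 61 is an F.
    """
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    return "F"


print(letter_grade(88))  # B+
print(letter_grade(60))  # F
```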

Deadlines policy

Deadlines are mandatory. Homework and reading assignment reviews are due two weeks from the assignment date, unless otherwise specified. Late submissions will receive no credit unless you discuss the situation with the instructor in advance.

Assignments must be submitted before the stated deadline. For each day an assignment is late, the grade will drop by half a letter grade. Plan ahead and remember: done is better than perfect—it is always better to submit something than nothing. If you encounter difficulties, notify the instructor immediately rather than waiting until it is too late. In return, you can expect feedback from the instructor within a reasonable timeframe.