BIOS 668, Spring 2026

Course Title: Statistical Methods for High-throughput Genomics Data II

Instructor: Mikhail Dozmorov
Department: Biostatistics, VCU
Credits: 3
Duration: 15 weeks (2 lectures per week, 1 hour 20 minutes each)

Course Overview

Welcome to BIOS 668. This course is the second part of the Genomics curriculum and aims to introduce core principles of Data Science, Genomics, Bioinformatics, and Biostatistics. This is a blended course that combines in-class learning (lectures, labs) with self-directed activities.

The study of genomics and the use of next-generation sequencing are at the forefront of biomedical research. The sequencing market is constantly evolving, stimulating the development of new genomics technologies, biostatistical approaches, and software tools. As a result, it may not be possible to maintain a stable analysis pipeline throughout a project, because tools are often updated or superseded within the months or years that a project lasts. To analyze and interpret genomic sequencing data effectively, it is crucial to (1) understand the technologies and the statistical properties of the data they produce and (2) develop strong computational skills that are competitive and broadly applicable.

This course is a continuation of BIOS 658 and will introduce high-throughput genomic assays, including single-cell RNA-seq and spatial transcriptomics; whole-genome sequencing, alignment, and genome variation analysis; miRNA-seq; metagenomics; epigenomic analysis, including ChIP-seq and methylation assays; and chromatin conformation capture technologies. The course will develop applied skills for multi-omics data analysis using Unix, high-performance computing, and R environments. The course focuses primarily on human genomics; however, the knowledge and skills gained extend to the genomics of model organisms.

The class will be conducted in person and will include lecture and coding components. Course material will be publicly available. The syllabus is subject to change. Observe the VCU Honor Pledge in all class and homework activities.

Prerequisites

BIOS 658 (see Bulletin), or as permitted by instructor.

Hardware
- A laptop. macOS or Linux is recommended.
Software
- Unix environment. Windows users should install WSL (Windows Subsystem for Linux).
- Git and GitHub
- R for Windows or Mac. Review the book "Getting Used to R, RStudio, and RMarkdown" if necessary.
- RStudio Desktop

Course Objectives

Understand the core principles, strengths, and limitations of high-throughput technologies
Learn statistical models and algorithms used in sequencing data analysis
Gain practical experience in high-throughput Exploratory Data Analysis, visualization, and quality control in Unix and R environments
Critically evaluate and interpret statistical methods used in flagship tools for sequencing data analysis
Interpret biological findings provided by different sequencing technologies, and be able to integrate different layers of -omics data

At the conclusion of the course, students will be able to collect, analyze, and interpret multi-omics data using Unix and R programming environments.

Tentative schedule

Unix overview
Genomic technologies and Bioconductor
Genome sequencing and alignment
Single nucleotide polymorphism (SNP) analysis
Copy Number Variant (CNV) analysis
Single-cell sequencing
Chromatin Immunoprecipitation sequencing (ChIP-seq), DNase-seq, and ATAC-seq
Metagenomics
Methylation
Epigenomics
Chromatin Conformation Capture analysis
Integrative analysis, TCGA

Homework format

Assignments will be posted and should be submitted via VCU Canvas, https://learningsystems.vcu.edu/canvas/
Assignments should be submitted as reproducible reports in RMarkdown
Short summaries of Reading assignments should be organized in an RMarkdown report, maintained on GitHub
Final project should be submitted as a fully reproducible GitHub repository

Class rules

Attendance is required
Read all assignments before class
Bring your laptop and the book to every class

Grading Rubric

General Expectations (applies to all assignments)

When evaluating any assignment, the following points will be considered:

Punctuality and Completeness – Was the assignment submitted on time and fully completed, addressing all required components?
Organization and Presentation – Is the material presented clearly, logically, and in a format suitable for a professional/academic audience?
Effort and Application – Does the work demonstrate serious engagement with the material and effective use of the required tools (e.g., R, RMarkdown, GitHub)?

Discussions

Preparedness – Student demonstrates they have completed the assigned preparation (videos, readings, tutorials) before class.
Participation – Student is actively engaged during discussion (listening attentively, asking questions, contributing ideas, responding to peers).

In-class Exercises

Effort – Student makes a genuine attempt to complete the exercise and uses it as an opportunity to deepen understanding of the topic.
Collaboration – When working with peers, the student contributes meaningfully to group progress.

Final project

Each student will grade one randomly assigned peer’s project. The goal is to learn from a peer’s work while assessing its quality. Student grades will be averaged with the instructor’s grade.

In addition to the criteria above, the final project will be evaluated on:

Reproducibility – The project must be reproducible, with clear instructions, well-documented code, and an organized GitHub repository.
Depth and Rigor – The project should demonstrate mastery of statistical and computational methods in genomics, with thoughtful application to a real dataset or problem.
Communication – Results and interpretations should be presented clearly, with effective use of figures, tables, and narrative.
Peer Review – Each student will evaluate one randomly assigned peer’s project. Peer grades will be combined with the instructor’s grade. Peer evaluation should be constructive and based on the same rubric.

Assignment Values

Assignments Grade Overall

Your final grade reflects your performance throughout the semester. It includes readings, participation, in-class exercises (attendance is required), and assignments, weighted as follows:

Assignment	Percentage Value
In-class participation	20%
Reading and homework assignments	50%
Final project	30%
TOTAL	100%
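
As a rough illustration of how the weights in the table above combine, the following Python sketch computes a weighted final percentage. The function name, the dictionary keys, and the example scores are hypothetical and chosen only for illustration; it assumes each component is scored on a 0-100 scale, which the syllabus does not state explicitly.

```python
# Component weights taken from the table above, in whole percent.
WEIGHTS = {
    "participation": 20,      # In-class participation
    "reading_homework": 50,   # Reading and homework assignments
    "final_project": 30,      # Final project
}


def final_percentage(scores):
    """Return the weighted final percentage.

    `scores` maps each component name to a score on a 0-100 scale
    (an assumption made for this sketch).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    # Sum weight * score, then divide by the total weight (100).
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / sum(WEIGHTS.values())


# Hypothetical scores, not real data.
example = {"participation": 90, "reading_homework": 85, "final_project": 80}
print(final_percentage(example))
```

Dividing by the sum of the weights rather than a hard-coded 100 keeps the sketch correct if the weights are ever adjusted.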

Grading model

Grading for individual assignments reflects the overall quality of the completed work:

A – Excellent work demonstrating high quality, accuracy, and minimal need for revision.
B – Good work showing solid understanding and competence, with minor issues.
C – Satisfactory but below expectations; work has notable errors or omissions.
D – Poor work that fails to meet basic standards or demonstrates significant misunderstandings.
F – Unacceptable work with major deficiencies, showing little effort or understanding.

Grade	Percentage	Performance
A+	97% to 100%	Excellent
A	93% to 96%	Excellent
A-	90% to 92%	Excellent
B+	87% to 89%	Good
B	83% to 86%	Good
B-	79% to 82%	Good
C+	76% to 78%	Unsatisfactory
C	73% to 75%	Unsatisfactory
C-	70% to 72%	Unsatisfactory
D+	67% to 69%	Poor
D	64% to 66%	Poor
D-	61% to 63%	Poor
F	60% and below	Unacceptable
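
The table above can be read as a lookup from a percentage to a letter grade. The Python sketch below shows one way to express it. The table lists whole-percent ranges only, so the sketch assumes a fractional percentage is first rounded to the nearest whole percent (halves rounded up); that rounding rule and the function name are assumptions, not part of the stated policy.

```python
import math

# Lower bound (whole percent) of each grade band, copied from the table above.
# Anything below the D- bound falls into F ("60% and below").
GRADE_FLOORS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (79, "B-"),
    (76, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (64, "D"), (61, "D-"),
]


def letter_grade(percentage):
    """Map a final percentage (0-100) to a letter grade."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # Assumption: round half up to a whole percent before the lookup.
    whole = math.floor(percentage + 0.5)
    for floor, letter in GRADE_FLOORS:
        if whole >= floor:
            return letter
    return "F"


for pct in (98, 84.5, 78.4, 61, 60):
    print(pct, letter_grade(pct))
```

Scanning the bands from highest to lowest means the first lower bound that the rounded percentage meets determines the grade.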

Deadlines policy

Deadlines are mandatory. Homework and reading assignment reviews are due two weeks from the assignment date, unless otherwise specified. Late submissions will receive no credit unless you discuss the situation with the instructor in advance.

Assignments must be submitted before the stated deadline. For each day an assignment is late, the grade will drop by half a letter grade. Plan ahead and remember that done is better than perfect: it is always better to submit something than nothing. If you encounter difficulties, notify the instructor immediately rather than waiting until it is too late. In return, you can expect feedback from the instructor within a reasonable timeframe.

Plagiarism and Copyright

It is a serious ethical violation to take any material created by another person and represent it as your own original work. Any such plagiarism will result in serious disciplinary action, possibly including dismissal from VCU. Plagiarism can involve copying text from a book or magazine without proper attribution, or lifting words, code, photographs, videos, or other materials from the Internet and presenting them as your own. Please ask the instructor if you have any questions about distinguishing acceptable research from plagiarism.

In addition to being a serious academic issue, copyright is also a legal matter.

Never “lift,” “borrow,” “appropriate,” or “repurpose” graphics, audio, or code without both permission and proper attribution. This guidance applies to scripts, audio, video clips, programs, photos, drawings, and other images, including those found online and in books.

Create your own graphics, seek out images that are in the public domain or shared via a Creative Commons license that allows derivative works, or use images from the AP Photo Bank or other sources for which the school has obtained licensing.

If you’re repurposing code, keep the original licensing intact. If you are unsure how to credit code, ask the instructor.

The exception is fair use: if your work analyzes or comments on the image itself, reproducing it may be acceptable. For more guidance on fair use, the Citizen Media Law Project is an excellent resource.

When in doubt: ask.

Observe the VCU Honor Pledge in all class and homework activities.

University-wide policies