CS 561

Data Systems Architectures

Class at a glance

Class: Mon/Wed 5:00-6:15pm (HAR 306)
Instructors: Tarikul Islam Papon, Zichen Zhu

Lab: Fri 5:00-5:50pm (HAR 306)

OH: Mon 9 - 10 AM (@CCDS 925)
OH: Thu 10 - 11 AM (@CCDS 925)

  • Midway Report due date: March 29th (previously March 22nd)
  • Project 1 is released. Due date: Feb 16.
  • Project 0 is released. Due date: Feb 2.
  • Register for presentation by Feb 2.
  • No lab on Jan 19! First class on Jan 22.
  • Semester starts on Jan 18 - stay tuned for updates.

Class Milestones - Important Dates

Keep in mind the Official Semester Dates.

  • February 22, last day to drop (without a "W")
  • February 23, submit your project proposal
  • March 29, last day to drop (with a "W")

Class Schedule (tentative)

Here you can find the tentative schedule of the class (which might change as the semester progresses).

Class : Introduction to Data Systems and CS561

In this class we will discuss the basics of data systems and the goals and structure of the course.


Class : Data Systems Architectures Essentials – Part 1

In this class we discuss the fundamental components of a database system. We will see the commonalities and differences among the main database system architectures and discuss why several different ones exist.


Class : Data Systems Architectures Essentials – Part 2

In this class we continue discussing data systems architectures and the basics of modern systems, focusing on relational row-stores and column-stores.


Class : Class Project Overview

In this class, students are introduced to the semester project. Along the way, we describe LSM-trees in detail and highlight open research problems in data management.


A. Storage Layouts

Class : Log-Structured Merge (LSM) Trees


Class : Row-Stores vs. Column-Stores (student presentation)

Concepts: column-stores, row-stores, vertical partitioning, index-only plans, materialized views, tuple reconstruction, late/early materialization, vectorized execution (block iteration), compression (run-length encoding), hash joins, index joins, sort-merge joins, invisible joins, star schema


Class : Compaction in LSM Trees


Class : HTAP Systems (student presentation)

Concepts: key-value stores, point queries, blind updates, read-modify-write, on-line transaction processing (OLTP), on-line analytical processing (OLAP), locality, immutable file, mutable file, append-only systems, in-place updates


B. Indexing

Class : Introduction to Indexing, Trees & Tries

In this class the instructor will provide the necessary background on indexing. We will describe the most common design principles and decisions behind index structures, as preparation for diving into the details of cutting-edge indexing papers.


Class : Guest Lecture on Database Tuning: Andy Huynh


Class : Adaptive Radix Trees (student presentation)

Concepts: tree indexing, tries, radix, adaptive radix trees


Class : Guest Lecture on Sortedness-Aware Indexing: Aneesh Raman


Class : Adaptive Indexing & Cracking (student presentation)

Concepts: adaptive indexing, cracking, stochastic cracking, hybrid cracking, scan, sort and binary search, adaptive adaptive indexing, radix partitioning, TLB, software managed buffers, non-temporal streaming stores, partitioning fanout, skew, adaptive indexing convergence rate, simulated annealing, uniform/normal/zipfian distribution


Class : Guest Lecture on Table Discovery and Integration in Data Lakes: Aamod Khatiwada


C. Modern Hardware

Class : Modern Hardware Trends

In this class the instructor will discuss modern hardware trends that drive system and index design with respect to storage, memories, and processing.


Class : Data Processing with GPUs (student presentation)

Concepts: GPUs


Class : Guest Lecture on Relational Memory: JuHyoung Mun

Class : SSD-Aware Data Systems


D. Query Evaluation

Class : Join Optimization

Concepts: query processing, join optimization, instance-optimal algorithms


Class : BMI-based Query Optimization (student presentation)

Concepts: query processing, query evaluation, bit manipulation instructions, predicate pushdown


Class : Guest Lecture on Delete-Aware LSMs: Subhadeep Sarkar

E. ML for Data Systems

Class : Learned Query Evaluation (student presentation)


Class : Learned Indexes (student presentation)


Class : Learning Data Layouts (student presentation)


Project Presentation

Class : Project Presentations A

Project Presentation - I

Class : Project Presentations B

Project Presentation - II

Project Awards (by popular vote)


  • Most Engaging Presentation: “Benchmark Compression With Near Sortedness” by Harshitha Tumkur Kailasa Murthy, Vishwas Bhaktavatsala
  • Project with Highest Technical Depth: “Query-Driven Compaction in LSM-Trees” by Karatsenidis Konstantinos, Shubham Kaushik, Nishil Agrawal
  • Best Overall Project: “Range Deletes in LSM-Trees” by Jingyi Li, Ming-Han Hsieh, Yu-Cheng Huang
  • Honorable Mention: “Exploring the Performance of Data Compression Algorithms with Varying Data Sortedness” by Shivangi and Vani Singhal