Project


Getting hands-on experience with state-of-the-art data systems!

Project Deadlines


Title                       Due Date            Material
Project 0                   02/05 (extended)    Project 0 Doc
Project 1                   02/20               Project 1 Doc
Project Proposal            03/12
Mid-semester Report         03/26               Report Template
Preliminary Project Report  04/26
Final Project Report        05/05

Project 0

A quick development project to sharpen your C++ skills and prepare for the upcoming research or development project.

Project 1

A small-scale implementation project on data systems.

Class Project

Every student should complete a semester-long class project. Students can choose between a systems project and a research project.

Systems Projects

A systems project sharpens your systems skills and provides background on state-of-the-art systems, data structures, and algorithms. For a successful systems project, you will design and implement a systems component in C or C++. You will deal with low-level implementation details such as memory allocation and management, cache-aware processing, and parallel and concurrent processing, and you will develop a deeper understanding of read/write performance trade-offs and performance scalability. Systems projects can be carried out by one student or by a group of two students.

This year we will have two topics for a systems project.

Project

Implementation of LSM-Trees

Implementation of a Bufferpool

Research Projects

A research project, on the other hand, aims at challenging the state of the art. The goal is either (i) to better understand an open research problem through analysis and benchmarking, or (ii) to solve open problems through new designs and proof-of-concept implementations. The ultimate goal of a research project is to give students a taste of research and, ideally, to lead to publications. When working on a research project, students will interact closely with the instructor and the teaching assistants. Students will work in groups of three.

A number of possible research topics are listed below. Students can also propose their own project (subject to the instructor's approval).

Subjects

Quantifying Write Amplification in LSM-based Key-Value Stores on SSDs

Range Deletes in LSM-Trees

Query-Driven Compaction in LSM-Trees

Exploring the Optimal Compaction Strategy for a Given Workload

Finding the Optimal Granularity of Index

Evaluating Sorting Algorithms with Varying Data Sortedness

Measuring the Robustness of Modern Key-Value Stores

Boosting Join Implementation for Skew Correlation in Postgres

Benchmarking Large Graph Processing Systems

Benchmark Compression With Near Sortedness