If you are a CS561 student, please complete the CS561 LfA Attendance Survey.
Class: Tu/Th 12:30-1:45pm (CAS 116)
Instructor: Manos Athanassoulis
Lab: Fri 9:05-9:55am (CAS 213)
Teaching Fellows: Tarikul Islam Papon / JuHyoung Mun / Aneesh Raman
Office: MCS 106
OH: Tue/Thu 2pm - 3pm ET
Discussion on Piazza
TFs Office Hours: Posted on Piazza
Grades on Gradescope
Keep in mind the Official Semester Dates.
Here you can find the tentative schedule of the class (which might change as the semester progresses).
In this class we will discuss the basics of data systems and the goals and structure of the course.
In this class we discuss the fundamental components that make up a database system. We will see the commonalities and differences of the main database system architectures and discuss why several different ones exist.
In this class we continue discussing data systems architectures and the basics of modern systems, focusing on relational row-stores and column-stores.
In this class the students will be introduced to the semester project. In the process, we describe LSM-trees in detail and highlight open research problems in data management.
Concepts: column-stores, row-stores, vertical partitioning, index-only plans, materialized views, tuple reconstruction, late/early materialization, vectorized execution (block iteration), compression (run-length encoding), hash joins, index joins, sort-merge joins, invisible joins, star schema
Concepts: on-line transaction processing (OLTP), on-line analytical processing (OLAP), n-ary storage model (NSM), decomposition storage model (DSM), partition attributes across (PAX), flexible storage model (FSM), projectivity, selectivity, concurrency control, multi-version concurrency control (MVCC), two-phase locking (2PL)
In this class the instructor will provide the necessary background on indexing. We will describe the most common design principles and decisions of index structures and provide the background needed for diving into the details of cutting-edge indexing papers.
Concepts: tree indexing, tries, radix, adaptive radix trees
Concepts: partitioning, horizontal partitioning, vertical partitioning, hybrid partitioning, zonemaps, tuple reconstruction, normalized schema, denormalized schema, clustering, use of clustering and feature extraction for partitioning
Concepts: adaptive indexing, cracking, stochastic cracking, hybrid cracking, scan, sort and binary search, adaptive adaptive indexing, radix partitioning, TLB, software managed buffers, non-temporal streaming stores, partitioning fanout, skew, adaptive indexing convergence rate, simulated annealing, uniform/normal/zipfian distribution
Concepts: bitmap indexing, bitvectors, fence pointers, out-of-place updates, query-driven merging, bitmap encoding schemes (RLE, BBC)
In this class the instructor will discuss modern hardware trends that drive system and index design with respect to storage, memories, and processing.
Concepts: multi-core, many-core, multi-socket, load balancing, skew resistance, context switching, non-uniform memory architectures (NUMA), pipeline breaker, elasticity, thread pool, just-in-time (JIT) code compilation, lock-free data structures, hyper-threading, translation lookaside buffer (TLB), open addressing, morsel-driven parallelism, dynamic hashing, outer join, semi-join, anti-join, radix join
Concepts: GPUs
Concepts: storage, asymmetry, concurrency, bufferpool, eviction policy
Concepts: non-volatile memories, indexing
Concepts: in-situ query processing, raw data files, adaptive partitioning, fine-grained indexing, query-based vs. homogeneous partitioning, implicit clustering, eviction policy, workload shift, memory consumption
Concepts: array management systems, multi-dimensional arrays, storage manager, tiles, thread-safe, process-safe, atomicity, dense vs. sparse arrays, global cell order, fragments, dense vs. sparse fragments, consolidation
Concepts: physical design, machine learning, tuning knobs, database administrator (DBA), OtterTune, workload characterization, k-means clustering, knob identification, automatic tuner, feature selection, linear regression model, ordinary least squares, workload mapping (dynamic vs. static), configuration recommendation
Project Presentations
Project Presentations