The course covers the architecture and design of modern big data systems from data modeling and data management perspectives. Topics include centralized vs. distributed data systems, NoSQL (in particular wide-column database systems and Cassandra), storage strategies, denormalized data modeling, stream processing, data warehousing, and more.
The goal of the course is to provide both the theoretical and the practical, hands-on knowledge required for designing and developing internet-scale data applications.
The class meets once a week for a 3-hour lecture.
There will be 2-3 homework assignments, done in pairs, some of which will involve programming - 45% of the final grade.
Final exam - 55% of the final grade.
* tentative due to Iron Swords war
# | Date | Topics | Material | Notes |
---|---|---|---|---|
1 | 28.10.2025 | Introduction | | |
2 | 04.11.2025 | Relational DB | | |
3 | 11.11.2025 | Relational data modeling | HW#1 distributed | |
4 | 18.11.2025 | Distributed DB, CAP theorem, NoSQL | | |
5 | 25.11.2025 | Dynamo | HW#1 due | |
6 | 02.12.2025 | Bigtable | | |
- | 09.12.2025 | No class this week | | |
7 | 16.12.2025 | Cassandra - Intro | | |
8 | 23.12.2025 | Cassandra - Advanced | | |
9 | 30.12.2025 | Cassandra - Hands-on | HW#2 distributed | |
10 | 06.01.2026 | Data modeling in NoSQL | | |
11 | 13.01.2026 | Data modeling in NoSQL - Advanced | HW#2 due, HW#3 distributed | |
12 | 20.01.2026 | TBD - Advanced topics | | |
- | 27.01.2026 | HW#3 due | | |
- | ? | Final Test | | |