Big Data Systems

2022-2023, Semester A

School of Computer Science
Tel Aviv University

Course Information

Course Staff

Course Goal

The course covers the architecture and design of modern big data systems from data modeling and data management perspectives. Topics include centralized vs. distributed data systems, NoSQL (in particular wide-column database systems and Cassandra), storage strategies, denormalized data modeling, stream processing, data warehousing, and more.
The goal of the course is to provide both the theoretical and the practical, hands-on knowledge required for designing and developing internet-scale data applications.

Course Format

The class meets once a week for a 3-hour lecture.
There will be 2-3 homework assignments (done in pairs; some will involve programming) - 45% of the final grade.
Final exam - 55% of the final grade.

Course Requirements

  • Data Structures (0368-2158) or Data Structures and Algorithms (0512-2510)

Course Schedule and Slides

| # | Date | Topics | Material | Notes |
|---|------|--------|----------|-------|
| 1 | 25.10.2022 | Introduction | Hello, World!; Introduction to Big Data; Introduction to Relational DB | |
| - | 01.11.2022 | No class this week (Election Day) | | |
| 2 | 08.11.2022 | Relational DB | SQL; Relational Data Integrity; MySQL CLI | |
| 3 | 15.11.2022 | Relational data modeling | Relational modeling; MySQL Workbench | HW#1 distributed |
| 4 | 22.11.2022 | Distributed DB, CAP theorem, NoSQL | Introduction to Distributed DB; CAP theorem; NoSQL | |
| 5 | 29.11.2022 | Dynamo | Dynamo; Dynamo (Extra) | HW#1 due |
| 6 | 06.12.2022 | Bigtable | Bigtable | |
| 7 | 13.12.2022 | Cassandra - Intro | Cassandra - Intro | |
| 8 | 20.12.2022 | Cassandra - Advanced | Cassandra - CQL; Cassandra - Advanced | |
| 9 | 27.12.2022 | Cassandra - Hands on | Astra DB; Cassandra - Java Driver | HW#2 distributed |
| 10 | 03.01.2023 | Data modeling in NoSQL | Denormalization; Data Modeling in NoSQL - Intro | |
| 11 | 10.01.2023 | Data modeling in NoSQL - Advanced | Data Modeling in NoSQL - Advanced; Data Modeling in NoSQL - Examples | HW#2 due; HW#3 distributed |
| 12 | 17.01.2023 | Data warehouse (BigQuery) | BigQuery (Google) | |
| - | 24.01.2023 | | | HW#3 due |
| - | 14.02.2023 | Final Test | | |