Learn what a data lake is – an alternative to a data warehouse, designed to store structured, semi-structured, and unstructured data at any scale.
Data lakes are optimized for big data processing and analytics, often using file formats like Parquet and table formats like Iceberg on top of cloud object storage (S3, GCS, Azure Blob Storage). They are well suited to data science, machine learning, and storage of raw event data before transformation.
This video covers when to use a data lake versus a data warehouse, the rise of the lakehouse architecture, which combines the two, and how modern platforms blur the line between them.
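
To make the "store raw event data before transformation" idea concrete, here is a minimal sketch, not taken from the video. It assumes a local directory standing in for an object store bucket and JSON Lines standing in for Parquet, so that it runs with the standard library alone. The function names, the `events/date=.../part-0000.jsonl` layout, and the `ts` and `type` fields are all hypothetical choices for illustration.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def land_raw_events(lake_root: Path, events: list[dict]) -> list[Path]:
    """Append events, untransformed, to date-partitioned JSON Lines files."""
    written = set()
    for event in events:
        # Assumption: every event carries an ISO 8601 "ts" field;
        # its first 10 characters (YYYY-MM-DD) pick the partition.
        day = event["ts"][:10]
        part_dir = lake_root / "events" / f"date={day}"
        part_dir.mkdir(parents=True, exist_ok=True)
        path = part_dir / "part-0000.jsonl"
        with path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
        written.add(path)
    return sorted(written)


def count_event_types(lake_root: Path, day: str) -> Counter:
    """Schema-on-read: the records are interpreted only at query time."""
    counts = Counter()
    for path in (lake_root / "events" / f"date={day}").glob("*.jsonl"):
        with path.open(encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                # Records with no "type" field are still stored and counted.
                counts[record.get("type", "unknown")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"ts": "2024-01-01T10:00:00Z", "type": "click", "user": 1},
        {"ts": "2024-01-01T11:00:00Z", "type": "view"},
        {"ts": "2024-01-02T09:00:00Z", "page": "/home"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        land_raw_events(root, sample)
        print(count_event_types(root, "2024-01-01"))
```

The point of the sketch is the ordering: events with differing shapes are written as they arrive, and structure is imposed only when a question is asked of them. A warehouse would typically enforce a schema at load time instead.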