Matthias Niehoff
Modern Data Architectures need Software Engineering
#1 about 2 minutes
The evolution from data warehouses to data lakes
Data architectures evolved from centralized data warehouses for BI reporting to data lakes that accommodate unstructured data for data science and machine learning.
#2 about 2 minutes
Understanding the modern cloud data platform
Cloud data platforms like Snowflake and Databricks enabled the shift from ETL to ELT and introduced the data lakehouse concept, which builds on open table formats like Apache Iceberg.
#3 about 3 minutes
Solving centralization bottlenecks with Data Mesh
Data Mesh applies domain-driven design principles to data, promoting decentralized ownership, data as a product, a self-serve platform, and federated governance to avoid central team bottlenecks.
#4 about 1 minute
Why data engineering needs software engineering discipline
As data systems become production-critical, the Python-heavy data ecosystem requires rigorous software engineering practices beyond simple scripting to build reliable, maintainable software.
#5 about 1 minute
Implementing unit, integration, and data quality tests
Effective data pipelines require a multi-layered testing strategy, including unit tests for logic, integration tests for system connections, and runtime tests to validate data content and quality.
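
As a rough illustration of two of these layers, the sketch below pairs a unit test for a piece of transformation logic with a runtime check on data content. The function names, the column names (`order_id`, `amount`) and the rules themselves are assumptions made up for this example, not taken from the talk. An integration test would additionally run the pipeline against a real connection and is left out here.

```python
# Hypothetical transformation and checks; all names are illustrative only.

def normalize_country(code):
    """Pure transformation logic: trim whitespace and upper-case a country code."""
    return code.strip().upper()


def test_normalize_country():
    """Unit test: exercises the logic in isolation, without any data store."""
    assert normalize_country(" de ") == "DE"


def check_quality(rows):
    """Runtime data quality check: return a list of human-readable violations."""
    violations = []
    seen_ids = set()
    for i, row in enumerate(rows):
        order_id = row.get("order_id")
        if order_id is None:
            violations.append(f"row {i}: order_id is null")
        elif order_id in seen_ids:
            violations.append(f"row {i}: duplicate order_id {order_id}")
        else:
            seen_ids.add(order_id)
        amount = row.get("amount")
        if amount is None or amount < 0:
            violations.append(f"row {i}: amount must be non-negative")
    return violations
```

The unit test would run in CI before deployment, while a check like `check_quality` would run inside the pipeline on each batch, where a non-empty result could stop the load or raise an alert.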
#6 about 3 minutes
Managing complex data environments for development and testing
Creating separate dev, test, and prod environments for data is challenging because development often requires access to production-like data, raising issues of data replication, cost, and anonymization.
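
One way to approach the anonymization problem is to replace sensitive values with keyed hashes before copying data into a dev or test environment. The sketch below is a minimal example under assumptions: the helper names, the `email` column and the secret are hypothetical, and keyed hashing is strictly pseudonymization, which may not be sufficient for every compliance requirement.

```python
import hashlib
import hmac


def pseudonymize(value, secret):
    """Replace a value with a keyed SHA-256 digest (HMAC).

    The same input always maps to the same token, so joins across
    tables keep working on the masked data.
    """
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_rows(rows, pii_columns, secret):
    """Return copies of the rows with the listed columns pseudonymized."""
    masked = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            if column in pii_columns and value is not None:
                new_row[column] = pseudonymize(str(value), secret)
            else:
                new_row[column] = value
        masked.append(new_row)
    return masked


# Hypothetical production-like sample; the secret would come from a vault.
prod_rows = [
    {"email": "a@example.com", "amount": 10},
    {"email": "a@example.com", "amount": 20},
]
dev_rows = mask_rows(prod_rows, {"email"}, b"example-secret")
```

Because the mapping is deterministic, both rows in `dev_rows` carry the same token in `email`, while non-sensitive columns such as `amount` pass through unchanged.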
#7 about 5 minutes
Using the Modern Data Stack and dbt for transformations
The Modern Data Stack applies DevOps principles to data, with tools like dbt (data build tool) enabling engineers to manage data transformations with version-controlled SQL, automated testing, and CI/CD.
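
The idea of version-controlled SQL with automated tests can be sketched without dbt itself. The example below is not dbt and does not use its API; it only imitates the pattern with Python's built-in `sqlite3` module, keeping a transformation as SQL text and expressing each test as a query that selects violating rows. The table and column names are assumptions for illustration.

```python
import sqlite3

# Transformation kept as plain SQL text, so it can live in version control.
MODEL_SQL = """
CREATE TABLE customer_revenue AS
SELECT customer_id, SUM(amount) AS revenue
FROM raw_orders
GROUP BY customer_id
"""

# Each test is a query that selects violating rows; zero rows means it passes.
TESTS = {
    "customer_id_not_null":
        "SELECT * FROM customer_revenue WHERE customer_id IS NULL",
    "customer_id_unique":
        "SELECT customer_id FROM customer_revenue "
        "GROUP BY customer_id HAVING COUNT(*) > 1",
}


def build_and_test(conn):
    """Run the transformation, then return the violation count per test."""
    conn.execute(MODEL_SQL)
    return {name: len(conn.execute(sql).fetchall()) for name, sql in TESTS.items()}


# Hypothetical raw data loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 7.5)],
)
failures = build_and_test(conn)
```

In a CI/CD setup, a step like this would run on every change to the SQL and fail the build if any count in `failures` is greater than zero.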
#8 about 4 minutes
Using data contracts to stabilize data integration
Data contracts act as a formal API-like agreement between data producers and consumers, ensuring schema stability and data quality by making breaking changes explicit and enforceable in CI/CD pipelines.
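
To make the enforcement idea concrete, here is a minimal sketch of a contract check that a CI job could run when a producer proposes a schema change. The representation of a contract as a mapping from column name to type name, and the function name `find_breaking_changes`, are assumptions for this example and not a specific data contract standard.

```python
def find_breaking_changes(contract, proposed):
    """Compare a proposed schema against the agreed contract.

    Both arguments map column name -> type name. Removed columns and
    changed types count as breaking; added columns are allowed here.
    """
    problems = []
    for column, expected_type in contract.items():
        if column not in proposed:
            problems.append(f"removed column: {column}")
        elif proposed[column] != expected_type:
            problems.append(
                f"type change on {column}: {expected_type} -> {proposed[column]}"
            )
    return problems


# Hypothetical contract agreed between a producer and its consumers.
contract = {"order_id": "int", "amount": "decimal"}

# Hypothetical change proposed by the producer.
proposed = {"order_id": "string", "created_at": "timestamp"}

problems = find_breaking_changes(contract, proposed)
```

A pipeline step could fail whenever `problems` is non-empty, which forces the producer to negotiate the change with consumers instead of breaking them silently.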
#9 about 2 minutes
Building a company-wide data culture and literacy
Fostering a strong data culture through initiatives like data bootcamps helps all employees, including non-technical ones, understand the value of data and the importance of data quality.
#10 about 4 minutes
Modern data architectures and the reality of team size
Modern data architectures can range from simple setups using DuckDB to complex cloud platforms like Databricks, but it's crucial to remember that data teams are typically much smaller than software teams.