Simon Späti on Feedspot

Data Modeling - The Unsung Hero of Data Engineering: Architecture Pattern, Tools and the Future (Part 3)

by Simon Späti

11M ago

Welcome to the third and final installment of our series “Data Modeling: The Unsung Hero of Data Engineering.” If you’ve journeyed with us from Part 1, where we dove into the importance and history of data modeling, or joined us in Part 2 to explore various approaches and techniques, I’m delighted you’ve stuck around. In this third part, we’ll delve into data architecture patterns and their influence on data modeling. We’ll explore general and specialized patterns, debating the merits of various approaches like batch vs ..read more

Data Modeling – The Unsung Hero of Data Engineering: An Introduction to Data Modeling (Part 1)

by Simon Späti

11M ago

Amidst the excitement and hype surrounding artificial intelligence, the significance of data engineering and its critical foundation—data modeling—can often be overlooked. This article is the first in a three-part series that will shine a spotlight on the fascinating world of data modeling, delving into its crucial importance within the broader context of data engineering. We will explore the history of data modeling, pioneered by visionaries like Kimball and Inmon, and discuss the necessity for a comprehensive understanding of data architecture in today’s data-driven world ..read more

Modern Data Stack: The Struggle of Enterprise Adoption

by Simon Späti

11M ago

In Part I, The Open Data Stack Distilled into Four Core Tools, we discussed how to quickly set up a data stack, tackling end-to-end data analytics challenges. As a manager or developer working with data at a mid- to large-sized enterprise, you might ask why you aren’t using any of these tools. In this article, we dive into what mid- to large-sized companies are using instead, the struggle of setting up a Modern Data Stack (MDS) at enterprise scale, and the opportunities of a free-of-charge and open-source MDS ..read more

The Rise of the Semantic Layer

by Simon Späti

11M ago

A semantic layer is something we use every day. We build dashboards with yearly and monthly aggregations. We design dimensions for drilling down reports by region, product, or whatever metrics we are interested in. What has changed is that we no longer use a single business intelligence tool; different teams use different visualizations (BI, notebooks, and embedded analytics). Instead of re-creating siloed metrics in each app, we want to define them once, openly and in a version-controlled way, and sync them into each visualization tool ..read more
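The "define once, sync everywhere" idea can be sketched in a few lines of Python. This is only an illustration and not taken from the article: the `Metric` class, the `to_sql` helper, the `orders` table and its columns are all hypothetical names, and a real semantic layer would do far more (joins, access control, caching).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A metric defined once, in code that can live under version control."""
    name: str
    expression: str              # SQL aggregation, e.g. "SUM(amount)"
    table: str                   # hypothetical source table
    dimensions: tuple[str, ...]  # columns the metric may be grouped by


def to_sql(metric: Metric, group_by: str) -> str:
    """Render the shared definition as SQL for any consuming tool."""
    if group_by not in metric.dimensions:
        raise ValueError(f"{group_by!r} is not a dimension of {metric.name!r}")
    return (
        f"SELECT {group_by}, {metric.expression} AS {metric.name} "
        f"FROM {metric.table} GROUP BY {group_by}"
    )


# Hypothetical metric: every dashboard, notebook, or embedded chart asks for
# "revenue" through this one definition instead of re-implementing it.
REVENUE = Metric(
    name="revenue",
    expression="SUM(amount)",
    table="orders",
    dimensions=("region", "product", "order_month"),
)

print(to_sql(REVENUE, "region"))
```

Because each consumer renders its query from the same `Metric` object, a change to the aggregation is made in one place and picked up everywhere.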

Data Lake / Lakehouse Guide: Powered by Data Lake Table Formats (Delta Lake, Iceberg, Hudi)

by Simon Späti

11M ago

Ever wanted or been asked to build an open-source Data Lake offloading data for analytics? Asked yourself what components and features that would include? Didn’t know the difference between a Data Lakehouse and a Data Warehouse? Or did you just want to govern your hundreds to thousands of files and have more database-like features, but didn’t know how? This article explains the power of the data lake and which technologies can build one, so you avoid creating a Data Swamp with no structure and orphaned files ..read more

Data Orchestration Trends: The Shift From Data Pipelines to Data Products

by Simon Späti

11M ago

Data consumers, such as data analysts and business users, care mostly about the production of data assets. On the other hand, data engineers have historically focused on modeling the dependencies between tasks (instead of data assets) with an orchestrator tool. How can we reconcile both worlds? This article reviews open-source data orchestration tools (Airflow, Prefect, Dagster) and discusses how data orchestration tools introduce data assets as first-class objects. We also cover why a declarative approach with higher-level abstractions helps with faster developer cycles, stability, and a bet ..read more
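To make the contrast between task dependencies and data assets concrete, here is a toy sketch in plain Python. It does not use the API of Airflow, Prefect, or Dagster; the `asset` decorator, the `ASSETS` registry, `materialize`, and the sample data are all hypothetical, and the sketch assumes the dependency graph has no cycles.

```python
import inspect

# Hypothetical registry: asset name -> function that produces the asset.
ASSETS = {}


def asset(fn):
    """Register a function as a data asset, keyed by the function name."""
    ASSETS[fn.__name__] = fn
    return fn


@asset
def raw_orders():
    # Made-up sample rows standing in for an ingested source.
    return [
        {"region": "EU", "amount": 10},
        {"region": "US", "amount": 5},
        {"region": "EU", "amount": 7},
    ]


@asset
def revenue_by_region(raw_orders):
    # The parameter name declares the upstream asset; no task wiring needed.
    totals = {}
    for row in raw_orders:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals


def materialize(name, cache=None):
    """Build the requested asset, materializing its upstream assets first."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = ASSETS[name]
        upstream = inspect.signature(fn).parameters
        inputs = {dep: materialize(dep, cache) for dep in upstream}
        cache[name] = fn(**inputs)
    return cache[name]


print(materialize("revenue_by_region"))
```

The caller asks for the asset it cares about (`revenue_by_region`) and the execution order is derived from the declarations, rather than being spelled out as a chain of tasks.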

Building a Data Engineering Project in 20 Minutes

by Simon Späti

11M ago

This post focuses on practical data pipelines with examples from web-scraping real estate listings, uploading them to S3 with MinIO, Spark and Delta Lake, adding some Data Science magic with Jupyter Notebooks, ingesting into the Data Warehouse Apache Druid, visualising dashboards with Superset and managing everything with Dagster. The goal is to touch on common data engineering challenges while using promising new technologies, tools or frameworks, most of which I wrote about in Business Intelligence meets Data Engineering with Emerging Technologies ..read more

Data Modeling – The Unsung Hero of Data Engineering: Modeling Approaches and Techniques (Part 2)

by Simon Späti

1y ago

In case you missed Part 1, An Introduction to Data Modeling, make sure to check it out first. There we discussed the importance of data modeling in data engineering, its history, and the increasing complexity of data. We also touched upon the significance of understanding the data landscape, its challenges, and much more. As we delve deeper into this topic, Part 2 will focus on data modeling approaches and techniques. These methods play a vital role in effectively designing and structuring data models, allowing organizations to gain valuable insights from their data ..read more
