Pipeline Data Engineering Academy

The Data Janitor Letters - January 2022

Data engineering salon. News and interesting reads about the world of data.

We’ve only scratched the surface of the full potential for the data warehouse
Mikkel Dengsøe, Head of Data Science, Operations & Financial Crime, Monzo Bank

Why I think the data warehouse will become the control centre for modern companies


Git, SQL, CLI
Vicki Boykis, Machine Learning Engineer, Automattic

I’ve narrowed it down to three basic tools.


One Year of dbt
Adam Boscarino, Director of Data Engineering, Devoted Health

Over the last year, dbt has become a key piece of the data platform at Devoted and lived up to our wildest hopes and dreams. We have gone from a single proof-of-concept to 1,100+ models.


Why OpenMetadata is the Right Choice for you
Suresh Srinivas, Co-Founder, Collate

Is OpenMetadata pull-based, push-based, or hybrid? Again, all systems are mainly pull-based to integrate with metadata sources. We support push-based ingestion when it is possible.


The future history of Data Engineering
Matt Arderne, Product Engineering, Data & Analytics, focal

On Data Engineers and their place in a Data SaaS world.


Airflow Alternatives: A Look at Prefect and Dagster
Pedram Navid, Head of Data, Hightouch

We take a deep dive into Airflow, Prefect, and Dagster and the differences between the three!


Modern Data Stack: Which Place for Spark?
Furcy Pin, Lead Data Engineer, Younited Credit

This makes data-lineage more difficult, since dbt only lets us visualize the BigQuery part, while our internal “dbt for pySpark” tool only lets us see the pySpark part.


Why I Quit Data Science
Nirant Kasliwal, Analytics Engineer, Sundial

Question from a friend: I am interested in knowing how did you come to this decision of moving to SWE from DS/MLE.