The Data Janitor Letters - August 2021

Data engineering salon. News and interesting reads about the world of data.

From Data Driven to Driving Data — The dysfunctions of Data Engineering

Many “data driven” initiatives fail even though they have the best engineers on the task and pick the “best” stack of technologies.

What's an OLAP cube? 🎲
Claire Carroll, Analytics Engineer, analyticsengineers.club

OLAP cubes can seem like an intimidating concept: the more you read, the less you understand. It turns out that they aren’t that confusing in practice. This is a trend that I’ve seen a lot in data engineering / data modeling, where jargon is used as a gatekeeper.
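
To make the idea concrete, here is a minimal sketch (not taken from the article) of what a cube boils down to: a measure pre-aggregated over every combination of dimensions, so that any slice is a single lookup. The sales rows, the dimension names, and the helpers build_cube and lookup are all hypothetical, made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows, invented for this example.
SALES = [
    {"region": "EU", "product": "widget", "year": 2020, "revenue": 100},
    {"region": "EU", "product": "gadget", "year": 2020, "revenue": 50},
    {"region": "US", "product": "widget", "year": 2020, "revenue": 70},
    {"region": "US", "product": "widget", "year": 2021, "revenue": 30},
]
DIMENSIONS = ("region", "product", "year")


def build_cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (the roll-ups)."""
    cube = defaultdict(int)
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for row in rows:
                # Key = which dimensions are fixed, and to which values.
                key = (dims, tuple(row[d] for d in dims))
                cube[key] += row[measure]
    return dict(cube)


def lookup(cube, **filters):
    """Read one pre-aggregated cell; unspecified dimensions are rolled up."""
    dims = tuple(d for d in DIMENSIONS if d in filters)
    return cube.get((dims, tuple(filters[d] for d in dims)), 0)


cube = build_cube(SALES, DIMENSIONS, "revenue")

total = lookup(cube)                                   # all dimensions rolled up
eu_total = lookup(cube, region="EU")                   # slice on one dimension
widgets_2020 = lookup(cube, product="widget", year=2020)
```

With three dimensions there are only eight groupings to precompute, which is why the toy version fits in a few lines; real OLAP engines differ mainly in how they store and refresh these aggregates at scale.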

Expecting Great Quality with OpenLineage Facets
Michael Collado, Staff Software Engineer, Datakin

Good data is paramount to making good decisions, but how can you trust the quality of your data and its dependencies?

A future for SQL on the web
James Long, Software Engineer, Stripe

Absurd-sql is a persistent backend for SQLite on the web. That means it doesn’t have to load the whole db into memory, and writes persist.

MinIO: A Bare Metal Drop-In for AWS S3
Mark Litwintschik, Big Data Consultant

MinIO offers an S3 gateway service that lets you expose Hadoop's distributed file system (HDFS) through an AWS S3-compatible interface.

No, we don’t use Kubernetes
Maik Zumstrull, Site Reliability Engineer, Ably

Kubernetes also doesn’t make sense for a lot of companies that are currently going all-in on it.

Evolution of search engines architecture
Julien Lemoine, Co-founder & CTO, Algolia

We look at some key milestones in the evolution of search engine architecture. We also describe the challenges those architectures face today.