Pipeline Data Engineering Academy

The Data Janitor Letters - September 2021

Data engineering salon. News and interesting reads about the world of data.

Cloudflare’s Disruption
Ben Thompson, Stratechery

S3’s margin is R2’s opportunity.


Operations is not Developer IT
Mathew Duggan, DevOps Manager, GAN Integrity

It's not their fault, they were told this was easy.


How Big Tech Runs Tech Projects and the Curious Absence of Scrum
Gergely Orosz

A survey of how tech projects run across the industry highlights Scrum being absent from Big Tech. Why is this, and are there takeaways others should take note of?


Who are analysts, technically?
Bobby Pinero, CEO and Co-Founder, Equals

Rather than needing to be an impossible combination of statistician, developer, and business expert, analysts can simply be great critical thinkers.


Re-evaluating Kafka: issues and alternatives for real-time
Olivia Iannone, Technical Writer, Estuary

Kafka’s challenges have exhausted many an engineer on the path to successful data streaming. What if there was an easier way?


A very detailed comparison of Python stream processing libraries
Mike Rosam, Cofounder and CEO, Quix

“To successfully use Flink in production you must invest serious resources … estimate more than 18 months.”


The First Rule of Machine Learning: Start without Machine Learning
Eugene Yan, Applied Scientist, Amazon

Having robust data pipelines and high-quality data labels also suggests you’re ready for machine learning.


Reaching MLE (machine learning enlightenment)
Vicki Boykis, Machine Learning Engineer, Automattic

Once, on a crisp cloudless morning in early fall, a machine learning engineer left her home to seek the answers that she could not find, even in the newly-optimized Google results.