Data engineering salon. News and interesting reads about the world of data.
I advise a lot of people on how to build out their data stack, from tiny startups to enterprise companies that are moving to the cloud or away from legacy solutions. There are many choices out there, and navigating them all can be tricky.
If you choose not to use dbt, you’ll probably waste time building a less fully featured, buggier implementation of it yourself. Give it a serious look.
We prefer boring but battle-tested technologies over tech-fetishism.
The 2020 versions of both ClickHouse and Redshift show much better performance. However, open source ClickHouse continues to outperform Redshift on similarly sized hardware, and the difference increases as query complexity grows.
Going out of the regular DS & ML job scope helped with delivering more value, faster.
Backwards compatibility keeps systems alive and relevant for decades.
Machine learning is a tool, not a panacea. If our tools don’t immediately work for them, it’s not their fault.