Data engineering salon. News and interesting reads about the world of data.
There are 70% more open roles at companies in data engineering as compared to data science.
The relational data model was introduced in 1970 and has dominated for 50 years. What led to its success? Building on first principles and Bushnell's law.
Now, I've got a GPT-3 instance that takes a plain English question and translates it to SQL that really works on my database.
Structure the project to allow for multiple teams to collaborate in a scalable manner, have the additional tooling to address the edge cases, and the optimize the continuous integration process to provide fast feedback and reduce costs.
Data engineers need to have a broad knowledge about ways of storing and processing data. Most of this knowledge will come with practice. But is still important to understand general ideas behind all concepts I've explained here.
I personally have never used a Kafka log as the source-of-truth for my data. Software development is hard enough as it is, even when trying to go “by the book.”
The goal is to keep development speed as close to the early days of JSONMutexDB, when you could recompile and run locally in a fraction of a second and deploy ten times a day.
Modern disks are so fast that system performance bottleneck shifts to RAM access and CPU.