Data engineering salon. News and interesting reads about the world of data.
Code reading is the single largest expense in software development, and very few even talk about it.
Your ROI is suffering from an inability to hire properly.
You bet.
In essence, GPT-2 has been a monumental experiment in Locke's hypothesis, and so far it has failed.
Ultimately, machine learning is about finding things that are similar to things the machine learning system can already model.
Are you using #postgres via #docker for mac? Have you ever noticed EXPLAIN ANALYZE slowing down your queries by like 60x? The important takeaway is that our modern stacks are incredibly complex and fragile.
A single ClickHouse server can be used to collect and monitor temperature data from 1,000,000 homes, find temperature anomalies, provide data for real-time visualisation and much more. Since it is a single server, setting it up, loading 500B rows and running sample queries is very easy.
After running a simple script using Scikit-Learn, I noticed there are some latent vulnerabilities, not only in terms of objects but also in regard to having a proper security mindset when we’re developing ML models.
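To make the point about objects concrete, here is a minimal sketch. It assumes the "objects" in question are serialized model objects: Scikit-Learn models are commonly persisted with pickle, and loading a pickle can run code. The class name `Tampered` is hypothetical, and a harmless `eval` expression stands in for whatever an attacker might embed.

```python
import pickle


class Tampered:
    """Hypothetical object standing in for a tampered model file."""

    def __reduce__(self):
        # pickle stores (callable, args) and calls callable(*args) on load.
        # A harmless arithmetic expression stands in for attacker code.
        return (eval, ("6 * 7",))


# What the "model file" would contain on disk.
payload = pickle.dumps(Tampered())

# Loading runs eval("6 * 7"); the result is not a Tampered instance at all.
loaded = pickle.loads(payload)
print(loaded)  # 42
```

The practical consequence is that a pickled model should be loaded only when it comes from a trusted source.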
My interest here is in seeing the performance differences between using PostgreSQL with a B-Tree index versus ClickHouse and its MergeTree engine for this use case. The gap in hourly lookup rate is an order of magnitude in ClickHouse's favour.