Data engineering salon. News and interesting reads about the world of data.
An addition to the long list of unused ML projects: do the simplest solution first, modeling is never done, failure is okay.
Deploying is hard. Deep learning is deceptively easy. Go for prebuilt as much as possible. Understand networking and scale. Iterate quickly.
Common sense will emerge only when a connectionist-like system has a chance to develop the internal symbols to represent relationships in the physical world.
If you have to choose between engineering and ML, choose engineering. It’s easier for great engineers to pick up ML knowledge, but it’s a lot harder for ML experts to become great engineers.
Realize that there are tons of misconceptions in the world. Adapt solutions to your particular use case. Your solutions are not any worse than the ones on the internet.
A journey step-by-step.
The underlying hardware plays a significant role in the performance of an Elasticsearch cluster. Provisioning larger data nodes will yield better performance than the smaller default nodes currently used in production. Furthermore, a cluster with more shards will perform better on larger data sets.
For site owners who just need basic traffic numbers, GoatCounter and Plausible both seem like excellent options. Those who like more visual polish and documentation might prefer Plausible; those who value a more developer-friendly tool with easy self-hosting will probably prefer GoatCounter.