Pipeline Academy works with a number of selected organisations and supports them with in-house data engineering training and with guidance related to their data infrastructure. One of these companies is Ecosia, the search engine that plants trees. The (growing) data team of the Berlin-based startup is led by Zsofia Bognar, and we had the chance to interview her about the challenges at the intersection of data and sustainability.
Peter: What is your professional background and what drove you towards a career in data?
Zsofia: I’ve been working in the data field for the last 12 years, so it’s hard for me to imagine life WITHOUT data :). But jokes aside, I think what drew me to data most is how analysing it brings us closer to understanding the world around us. I studied Economic Sciences, a part of which was statistics. For a lot of people statistics is a really dry discipline, but I found it fascinating how statistical methods can give us fresh, concise insights into just about anything.
P: You are currently in charge of the data team at Ecosia. What kind of value is generated from user data for the organisation and what kind of team is dedicated to data initiatives?
Zs: At Ecosia we are on a mission to build a digital green companion that plants billions of trees. We want to help our users make sustainable choices, and be a role model in the transition to a sustainable society.
As a data team we contribute to this mission by delivering data products in three main areas:
1) We focus on helping product teams make data-informed choices in line with their KPIs and their own team mission. We help them monitor and interpret their results. Advising and educating teams plays a big role in ensuring team success.
2) We create reports to evaluate different marketing efforts and understand what the most informative way to reach our users is. We want not only to attract new users but also to make actionable information about the climate emergency available in an approachable way, and we as a data team help the marketing team identify these areas.
3) We build capabilities for the tree team to evaluate their tree planting projects, which in turn also delivers value to our tree planting partners, since our tree planting officers are able to quickly and transparently communicate the impact and possible areas of improvement for each project. Our goal is to know about the state of every single tree we help to plant.
Currently we are a four-person team: Arnaud, our tech lead, is a true data pipeline plumber who constantly improves our setup and finds better solutions. Nikki, our Product/Marketing Analyst, is the jack-of-all-trades of our team in the best sense of the term: she knows the company and business model inside out, identifies problems and works towards solving them. And we have our latest joiner, Elise, who is a really great data storyteller. She understands the needs of the stakeholders and how information can be meaningfully visualised for them.
P: What are your team's biggest technical and organisational challenges?
Zs: We are a small team with a lot of stakeholders: currently we have 8 stakeholder teams. This means in the past year we had to completely change our mindset of what a data team is. Previously we operated as a service team where we had ad-hoc requests from the teams. With the resource constraints our backlog grew, the team became frustrated and our stakeholders became under-served. Balancing our legacy infrastructure and soaring stakeholder needs was a real struggle.
We understood this was not sustainable, so we changed to a data product team mindset. Now we identify the product needs with user stories and build our products around them. We introduced agile methods and work in sprints. We also educate our users on how best to use these products and integrate their feedback into the data UX design. We have seen that this has not only increased the velocity of the team but also integrated data much better into the day-to-day of Ecosia.
Our main technical challenge is handling truly big data sets. There are almost no BI or ETL products or solutions that we can use out of the box. This means we regularly have to optimise solutions for our needs along the entire data pipeline. For the same reason, it’s hard for us to find partners who can usefully advise us on these challenges, which is why it was great for us to work with Pipeline Academy.
P: Ecosia acts very transparently when it comes to financial information and respects the privacy of users. How is this translated into the day-to-day operations of a data team?
Zs: We are transparent both inside the team and towards our stakeholders. We have a data-as-code infrastructure, so everything we produce is transparent and documented.
We also show things honestly, as they are. We don’t have user groups with different levels of data access; everyone in Ecosia can see everyone else’s dashboards. We find this very effective, as there is always some overlap in the data that different teams need, and sharing it openly increases trust in the data.
P: If you were starting out in data today, what kind of skills would you focus on first?
Zs: I think I would focus on understanding where one’s own strengths lie and how they fit into the data products that teams deliver. There are so many aspects of data products, and all of them are crucial in producing great output: from understanding stakeholder needs to planning, execution, building the pipeline, designing the UX for users and actually understanding the trends. To master them all is almost impossible, so understanding your own growth path and fitting it together with the team's needs is essential. Having good skills in SQL and Python helps a lot though, and understanding tools that have become industry standards, like dbt or Airflow, gives one a good advantage.
P: You've hired Pipeline Academy to support your team's move towards analytics engineering, a hot topic in the world of BI. What kind of outcomes do you expect from the upskilling training?
Zs: For us as a small and high-velocity team it has been essential that we make the right tooling decisions and avoid some of the pitfalls those tools come with. So for us, trial and error is rarely a path we can take. We started to work together with Pipeline Academy because we knew Daniel and Peter have extremely in-depth knowledge of the tooling landscape and are also really embedded in the data community, so they can advise us really well on data tools across the entire pipeline. I also like how they didn’t offer us a solution as the Holy Grail but gave us an evaluation framework by which we can make these decisions ourselves.
P: I understand your team is growing, and you are looking for data engineering talent. What kind of hard and soft skills do you think make a newcomer in data stand out?
Zs: We need someone who understands the principles of data infrastructure as code and can help us to be even more reliable towards our users. We’d like to have someone who sees data products in the holistic way we do in the team and doesn’t just work on engineering tickets. We would like to have someone who is excited about learning but can also mentor others in the team. It would be great to have someone who has experience with Airflow, dbt and different cloud providers.