This is the second part of the story about setting up a live remote coding workshop in the midst of an economic downturn. Part one is about the circumstances that led to all this; you can read it here. Part two is about how we rocked the workshop itself and about the feedback we received afterwards.
When I started reading the responses from the eight participants of the Data Engineering Summer Camp to the feedback questionnaire I had shared with them right after the workshop, the feeling this whole situation sparked in me reminded me of the effect named after the 1950 jidaigeki crime movie Rashomon by Akira Kurosawa. Karl G. Heider used the term "Rashomon effect" to refer to the effect of the subjectivity of perception on recollection, by which observers of an event are able to produce substantially different but equally plausible accounts of it. Even though the ten of us (including Daniel and myself) were not external observers of the Summer Camp but active participants in the event, when these ten individual recollections are put next to each other, the data becomes a physical manifestation of this particular gestalt.
My point: whoever is reading this should keep in mind that this summary is inherently biased and there is no one correct interpretation of the event in question. What I am looking for are shared patterns in the multitude of witness perspectives involved that indicate a direction for the path to our own improvement.
But then you see a tweet like this from one of our happy students and you instantly forget about the semi-scientific approach you were about to force upon yourself:
First, let's go back to where we left off last time.
Quick recap: from 18 to 22 May (Monday to Friday) we held the first Summer Camp by Pipeline Academy with eight participants in the virtual classroom. The purpose was to support people and businesses by teaching them a valuable competence through a hands-on project, and to learn about our own areas of improvement in the process.
The agenda was designed to cover the most important building blocks of setting up a simple data product: an introduction to data engineering, ETL basics, some SQL and deployment. Since we prefer actual building to just talking about building, we quickly decided to focus on real implementation. The schedule below reflects this attempt, with 2.5 days of classroom experience aka 'campsite' (lecture-style teaching, live coding, group discussions, presentations and demos, Q&A) and 2.5 days of individual or team adventures (in duos) for implementing the data pipeline.
The week before the workshop, the participants received homework. At first glance the task was fairly simple and straightforward, but it included some challenges that could be solved in various ways, depending on the tools and methods a student prefers. Imagine you're asked to travel from point A to point B within a city: there are plenty of ways to complete the task, with different routes and means of transportation, and the chosen way serves as a testament to the preferences and skill set of the traveler. We used this dry run to get a better understanding of what our cohort was comfortable with and which topics needed more emphasis to achieve the desired outcome for the week.
The first morning was about getting familiar with each other, with some chit-chat about our daily routines during corona and our different experiences of working with data. We then moved on to discussing what data engineering is and how the skills that make a data engineer relate to the participants' individual skill sets. Here's an example: a data scientist usually has more advanced mathematical know-how than a frontend engineer, but the latter is likely to have seen or written more maintainable code (especially collaboratively in larger teams). The afternoon was spent on an overview of ETL procedures (extract, transform, load).
The first day was very exciting but exhausting as well. I was relieved that we had managed to capture the attention of the participants, as they kept coming back on the following days without any measurable churn. Some of the students had to skip a couple of hours of classes during the week due to bureaucratic obligations and work emergencies, and sometimes people could not join the class right away due to issues with their internet provider at home (Berlin, du bist so wunderbar). Only one person pretty much gave up on delivering their solution (albeit staying on board until the end), as a result of unforeseen duties that hijacked their time and attention for the week.
Part of the plan was letting go of the students' hands so they could explore the newly learned methodologies and tools by themselves and combine them with their existing knowledge. Our aim was to enable independent work and push for applied creative problem-solving, while staying available to everyone for questions and support via chat. The student feedback confirmed our assumption: this was a highly productive segment of the week that accelerated the pace of learning.
It was remarkable how much interest there was in seemingly niche data engineering topics and obscure tools: some people were enthusiastic and some were a bit more skeptical about building a data pipeline with puzzle pieces they were not accustomed to (e.g. SQLite), but most students were open to exploration and experimentation. Five days passed by in an instant.
As a closing event, the participants who had worked in teams presented their data products on Friday. The solutions showed two examples of highly successful and productive collaborations, and two teams that had hit roadblocks, which could be clearly identified and discussed to pave the way for a late delivery. In addition to the course certificate, Daniel and I decided to gift Udemy courses to the students to make sure that they continue pushing themselves forward on the endless path of data engineering.
We asked the students to fill out a survey so we could get a better understanding of how they saw the Summer Camp; you can read some of their sentiments in italics below. The learnings were many:
General concept: there is a large demand for data engineering skills on the market, and even for more seasoned veterans, finding the right resources to access and learn this know-how in a structured and methodologically proven way is almost impossible. We're on the right track.
The quality of the content and the delivery was rated very highly across the board, and our students had fun. Oh, you are asking about an NPS, you crazy marketing researcher: it's excellent, baby. Good stuff.
"Overall, very satisfied with the course, Daniel is a great teacher and this course was very valuable to me."
In-person interaction in a classroom vs. live remote learning via Zoom is a polarising topic, but the majority of students prefer personal contact. Their reasons: higher productivity, more interaction with peers (aka camaraderie), and the sense that online education is just not as direct and immersive. At the same time, the remote live experience is the right fallback plan in case COVID-19 comes back for seconds in the winter.
Teamwork makes the dream work: we're strong believers in the value of working in teams, and this is something we'll keep as an integral part of our methodology.
"The team adventures helped me to see other ways to solve the same problem and I think this should be kept for the next courses for sure.", "They helped consolidate the learnings by rolling up your sleeves and solving challenges practically with fellow peers or by yourself.", "Yes, got unblocked a few times by my partner."
We need to focus more on defining clear expectations for the challenges and tasks: striking a balance between allowing room for creative problem-solving and defining clear outcomes is the name of the game for engineering exercises. This is something we're going to fine-tune before the launch of our first cohort in the fall.
"I was demotivated at times. The instructions were too vague and I got lost in useless details (flask issue on pythonanywhere)."
Assessing the difficulty of the course based on our small sample is tough: depending on their level of experience, general mindset, life circumstances outside of the Summer Camp and approach to self-assessment, students end up with a variety of sentiments. Fortunately, a proper bootcamp experience with 10+ weeks of full-time education allows for more personalised mentoring.
"The course delivered a good blend between technical hands-on and practical aspects about the Data Engineering profession and roles. The course has a modern approach to combine the pragmatism of the data activities (e.g. reliability, well thought changes) with modern approaches (e.g. online deployment, orchestration, monitoring, etc)."
Curricula and roadmaps for getting started with data science are plentiful. Data engineering, however, requires a different type of didactics and a learning environment that mimics the engineering and collaboration processes at companies leveraging technology.
Adam, a long-time data science teacher, has written up his experiences at the Summer Camp on his blog, focusing on how the week's learnings enabled him to move forward with his own projects. Make sure to check out his post for a more technical POV (full disclosure: I've had the pleasure of working with Adam for a while and consider him a friend):
"As someone who has taught data science for a while, the most impressive thing was the simplicity of the stack. It is very easy when teaching to complicate things for students, or to teach complex tools that confuse more than help.
It can’t be understated the power of leaving after five days with a working product. The value of being able to see and interact with your data is huge, for spotting problems with your data pipeline to showing off to customers (or employers!)."
The Summer Camp was my major highlight of the lockdown: it turned out to be a productive validation for Pipeline Academy, and we managed to support people and share valuable knowledge while staying true to our principles: transparency, collaboration, pragmatism and common sense.
I hope to see you at our campus in the fall.