Learnings from teaching data science

A few weeks ago I taught a 4-week Python for Data Science class (find the curriculum here) for Girl Develop It, an organization that provides opportunities for women to learn software development and to achieve their technology goals (though men are allowed to take classes too).

The idea was to teach people who had some basic knowledge of coding how to use Python for data analysis (and if there was time, data science too!). I'm passionate about teaching people new things (since I enjoy learning myself) and that includes taking the mystery out of data science. This particular class was the perfect opportunity to help beginner coders apply their existing knowledge to the world of data. And since learning is the name of the game, below are a few lessons I learned along the way:

  1. Don't be a perfectionist

    • It's so hard (impossible) to distill all you know into a few two-hour classes. Some things will fall through the cracks (sorry matplotlib and pandas, you missed the cut) and that's okay. Focus on only the most important things and be sure to explain the foundational nature of the material you cover. If your students are eager to learn more, you will have equipped them with the tools to pursue knowledge on their own (or sign up for Level 2 of your course!).
    • Be clear on what you want them to learn. And then be ready to throw it out the window if you need to pivot. Be flexible with the course material. As a result, don't spend too much time creating the perfect curriculum. You'll have time to work on it between classes. It's fun to come up with the most creative and beautiful code examples. What sucks, though, is realizing 15 minutes into your first class that you either won't cover that material at this pace, or if you do, it'll go completely over their heads.

  2. Encourage participation

    • Unless you want them to fall asleep in class, prepare many in-class exercises. And make some of them really easy to get their spirits up.
    • Group work is just the best, so encourage students to work on in-class exercises together. In the first session people worked individually on the exercises and it was crickets all class. In subsequent sessions, I put them into teams of 3, and there was lots of talking, laughter, and people helping each other out. Students who were lost were able to participate and not feel left out. Stronger students reinforced their own knowledge by teaching their partners. It was Python bliss. Added bonus: groups finished exercises earlier so we could cover more material.

  3. Use the real world

    • Working professionals couldn't care less about the theory. They want to solve problems. So it's best to motivate topics with case studies. Use real-world datasets, examples from work, or ask the students about problems they are trying to solve at work and incorporate those into class. This takes more effort than pulling stuff from online tutorials or fabricating data, but hey, that's why they pay you the big bucks.

  4. Pick your battles

    • Accept the fact that you can't teach it all, so pick your battles. I wanted to teach them about all kinds of Python iterables: lists, dicts, sets and tuples. But to perform most data science tasks, you need only understand lists and dicts. So I skipped the rest. And it turns out that in later classes we covered sets and tuples through examples anyway.
    • Figure out clever and efficient ways to teach material when you're pressed for time. Example: instead of teaching them what a list is, lightly introduce them to vectors in the context of machine learning algorithms. Then by manipulating vectors, you'll necessarily have to cover basic lists and list comprehensions. Digression - I love list comprehensions. The students weren't sure why I made such a big deal about them until we covered functional programming (map and reduce) and then string manipulations. You could see the light bulb go on. That's the best.
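
To make the vectors-first idea concrete, here is a minimal sketch of the kind of example that could be used. The vectors and the word list are made up for illustration and are not taken from the actual course material. It shows the same operations written as list comprehensions and with map and reduce, then a comprehension applied to strings.

```python
from functools import reduce

# Hypothetical "feature vectors", stored as plain Python lists
v = [1.0, 2.0, 3.0]
u = [4.0, 5.0, 6.0]

# Element-wise sum with a list comprehension
vector_sum = [a + b for a, b in zip(v, u)]

# The same sum written with map
vector_sum_map = list(map(lambda pair: pair[0] + pair[1], zip(v, u)))

# Dot product: a generator expression fed into sum
dot = sum(a * b for a, b in zip(v, u))

# The same dot product written with map and reduce
dot_reduce = reduce(
    lambda acc, x: acc + x,
    map(lambda pair: pair[0] * pair[1], zip(v, u)),
    0.0,
)

# Comprehensions carry over directly to string manipulation
words = ["  Python ", "DATA", " science"]
cleaned = [word.strip().lower() for word in words]

print(vector_sum, dot, cleaned)
```

Seeing the comprehension and the map/reduce versions side by side is one way to show students that they are the same idea in different clothes.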

  5. Work the crowd

    • Learn your audience quickly! If they don't know math, skip linear algebra (as much as it may pain you to do so) and just show them how they can use scikit-learn to run regressions. If they don't know regex, then write the regex code for them when doing string manipulation and point them to a resource later. The point is to solve problems, analyze outcomes and plant the seed that Python and data science make a nice partnership. It's not necessarily (in this course at least) about understanding all the nuances of how to do it and what goes on under the hood. That may be where your brain (i.e. my brain) goes but that's probably not what they want to learn. If they really want to, they'll do it on their own or in another class.
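
As a rough illustration of "write the regex for them", the sketch below hands students a ready-made pattern wrapped in a small helper, so they can get on with the analysis. The rows, the pattern, and the `parse_row` helper are all hypothetical and not from the course.

```python
import re

# Hypothetical messy text records (invented for illustration)
raw_rows = [
    "Order #1042 shipped 2015-03-14",
    "order #77 shipped 2015-04-02",
    "no order info here",
]

# Instructor-provided pattern: students use it without having to write it.
# Captures the digits after '#' and a YYYY-MM-DD date after 'shipped'.
ORDER_PATTERN = re.compile(r"#(\d+)\s+shipped\s+(\d{4}-\d{2}-\d{2})")


def parse_row(row):
    """Return a dict with the order id and ship date, or None if no match."""
    match = ORDER_PATTERN.search(row)
    if match is None:
        return None
    return {"order_id": int(match.group(1)), "ship_date": match.group(2)}


# Students only need a list comprehension to turn text into analyzable data
parsed = [parse_row(row) for row in raw_rows]
orders = [row for row in parsed if row is not None]

print(orders)
```

The pattern stays a black box for now, and the class time goes to what can be done with `orders` once it exists.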

  6. You can't control how they feel

    • This was the toughest part for me. After the first class of 18 students, we got down to a steady state of about 14. So it goes, right? There's a baseline level of churn and as instructors we shouldn't take that as a reflection of our teaching skills (how convenient, right?). I assume that some people just didn't know what to expect out of the class and it wasn't for them. Some people couldn't make the times work. Perhaps some people didn't take kindly to my sense of humor. All you can do is focus on your work and provide a great experience for those who stick around.

As cheesy as it sounds, knowledge truly is power. And it's an empowering feeling to be able to share and transfer that knowledge to a group of students eager to apply their skills to a new field, especially a field about which you yourself are passionate.


The ultimate cop-out.

So many options!
