Breaking The Jargons #4 August Edition

Parul Pandey
Machine learning pitfalls, cutecharts, GitHub features, and more…

Hi there!
Welcome to the fourth edition of the newsletter. In this edition, we’ll look at a cool visualization library, get some Kaggle tips from a Taiwanese Grandmaster, and look at ways to use GitHub more efficiently. There is also a guide to help newcomers avoid common mistakes when using machine learning in academic research. Finally, as always, there are a bunch of helpful resources.
📜 Articles
Cutecharts is a Python library that renders interactive charts in a hand-drawn style. It is essentially a Python take on chart.xkcd, a JavaScript charting library. You might want to use these charts in your presentations or blog posts to jazz them up a bit.
This article compiles valuable tips and hacks that I have gathered from various sources while using GitHub. To avoid repetition, I have left out the ones that are already widely known. I’m sure you’ll find the list useful in your day-to-day work.
This article compiles various techniques for acquiring datasets. For instance, it lists websites where you can find suitable datasets. I have also included ways to create your own custom datasets.
🎙️ Interviews
It was an absolute pleasure to interview Kunhao Yeh, a Kaggle Competitions Grandmaster from Taiwan. He shared how he initially wanted to become a professional Go player and later transitioned to machine learning and, ultimately, Kaggle. He also shared that the people who were once his mentors became his colleagues.
🔬 Research Paper Summary
If you are beginning your journey in machine learning research, there are a few salient points that, if kept in mind, will help you immensely. The following paper by Michael Lones is an excellent guide to the pitfalls of machine learning that academic researchers should be aware of. I have summarized the main points in a poster below, which you can download.
A summarization of the paper with permission from the author. Click on the image for a full resolution image.
💡 Concept corner
Chris Moffitt’s “Effectively Using Matplotlib” is an excellent crash course in Matplotlib terminology and usage. This graphic from the same post is also super useful.
 source: Effectively Using Matplotlib
🎁 Resource of the Month
The PyCon US 2021 conference tutorials, talks, workshops, summits, and more are available on their YouTube channel and contain some great sessions and presentations.
PyCon US 21 videos
That is all for this edition. See you with another roundup next month.
Until next month,
Breaking down data science jargon, one article at a time.

