
Data Science Resources et al. | Next - Issue #22

Harshvardhan
Hi there!
Resources for budding data scientists are always helpful. I still remember my MBA class which started with the Harvard Business Review article, “Data Scientist: The Sexiest Job of the 21st Century”, which inspired a generation of data scientists. I am collating more articles and listicles in today’s letter in the same spirit. Apart from that, there is also an exploratory analysis of words in Wordle. Instead of listing four packages, I’m presenting an article that shows classic and one-hit packages on CRAN.
Let’s dive in.

Five Stories
The tidyverse is a vital package ecosystem for R users. Not many people are aware of its smaller helpers: na_if(), select_if() and summarise_all(), among others. The scoped verbs among them, such as summarise_all(), have since been superseded by across(), and select_if() by select() with where(). This older article by Emily covers the predecessors, but you can easily carry the lessons over using the current documentation for across().
These functions are helpful in exploratory analysis before you jump into the actual application. Check it out!
Recently, the Ministry of Electronics and Information Technology, Government of India, launched a wide array of skill-based courses in collaboration with Nasscom. It aims to “bridge the huge talent gap at the entry-level and train existing workforce in the technologies of the future.”
The IT-related courses are prepared by organisations such as Microsoft, C-DAC and Accenture, among others. Some soft-skill courses come from providers such as Harappa. Some of the courses are free; the rest are reasonably priced. Check the catalogue here!
For beginners in data science, this listicle is a gold mine. It presents resources in three categories: programming, statistics and communication. Some notable mentions:
  1. The Unix Workbench by Sean Kross
  2. R Programming for Data Science by Roger Peng
  3. Section on People
I’m jumping to The Unix Workbench very soon. What are you picking up from the list?
You are likely familiar with Wordle. It is a simple game where you guess a five-letter word in six tries. That simple game hides some rich mathematics, such as information theory and entropy (see 3Blue1Brown’s video on it). Arthur Holtz reverse-engineered the list of 2,309 words to find the most common letters and improve his guesses.
This blog post contains the code and other details of his analysis of Wordle words in R. Check it out!
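To give a flavour of the letter-frequency idea, here is a minimal sketch. It is written in Python rather than the R used in the post, and it is not Arthur's code. The five-word list and the function names are made up for illustration; the real analysis runs on the full word list.

```python
from collections import Counter

# Hypothetical stand-in for the real word list (NOT the 2,309 words from the post).
SAMPLE_WORDS = ["crane", "slate", "pious", "arise", "trace"]


def letter_frequencies(words):
    """Count in how many words each letter appears (at most once per word)."""
    counts = Counter()
    for word in words:
        counts.update(set(word.lower()))
    return counts


def score_guess(guess, counts):
    """Score a guess by summing the frequencies of its distinct letters."""
    return sum(counts[letter] for letter in set(guess.lower()))


counts = letter_frequencies(SAMPLE_WORDS)

# Rank candidate guesses from most to least "common-letter heavy".
ranked = sorted(SAMPLE_WORDS, key=lambda w: score_guess(w, counts), reverse=True)
```

Counting each letter once per word keeps repeated letters from inflating a guess's score. On this toy list, `ranked[0]` is "arise", because it packs in the letters shared by the most words.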
Are you looking for resources to break into the field of data science? Look no further. This website is a one-stop shop for finding new content with reviews and bookmarking content you like. It is filled with curated lists of courses that people can follow to become data scientists.
Here’s their link to find content and a list of recommended resources for beginners.
Four Packages
Today, I’m writing about forty packages instead of four. Joseph Rickert writes a monthly blog post on R Views about the top 40 packages of that month. Last month, he published an article about CRAN’s “Golden Oldies” and “One Hit Wonders”. Golden Oldies are classics that continue to get a lot of air time. One Hit Wonders are old packages that were never updated but continue to be popular.
Three Jargons
Memorylessness: A distribution is memoryless when the time you have already waited tells you nothing about how much longer you will wait; history doesn’t affect the future. Formally, P(X > s + t | X > s) = P(X > t) for all s, t ≥ 0. The exponential distribution is the only continuous distribution with the memoryless property.
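A quick way to see memorylessness is to compare the chance of waiting another t units, given you have already waited s, with the chance of waiting t from scratch. The sketch below does this in Python for an exponential distribution and, as a contrast, a uniform distribution on [0, 1]. The function names and the values of s and t are arbitrary choices for illustration.

```python
import math


def exponential_survival(t, rate=1.0):
    """P(X > t) for an exponential random variable with the given rate."""
    return math.exp(-rate * t)


def uniform_survival(t):
    """P(X > t) for X uniform on [0, 1]."""
    return min(1.0, max(0.0, 1.0 - t))


def conditional_survival(survival, s, t):
    """P(X > s + t | X > s) = P(X > s + t) / P(X > s)."""
    return survival(s + t) / survival(s)


s, t = 0.3, 0.4  # arbitrary illustrative values

# Exponential: conditioning on the past changes nothing.
exp_conditional = conditional_survival(exponential_survival, s, t)
exp_unconditional = exponential_survival(t)

# Uniform: having already waited s makes a further wait of t less likely.
uni_conditional = conditional_survival(uniform_survival, s, t)
uni_unconditional = uniform_survival(t)
```

For the exponential case the two numbers agree up to floating-point error; for the uniform case they do not, so the uniform distribution "remembers" how long you have waited.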
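Convolutions of random variables: The distribution of a sum of n independent random variables is the convolution of their individual distributions; the term is often used loosely for the sum itself. Here’s more meat on it.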
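To make this concrete: for independent variables, the distribution of the sum comes from convolving the individual distributions. Here is a minimal Python sketch for the total of two fair dice. The helper name `convolve_pmfs` is made up for illustration, and exact fractions are used so the probabilities stay readable.

```python
from fractions import Fraction


def convolve_pmfs(pmf_x, pmf_y):
    """Return the pmf of X + Y for independent discrete X and Y.

    Each pmf is a dict mapping a value to its probability.
    """
    result = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            # Every pair (x, y) contributes px * py to the total x + y.
            result[x + y] = result.get(x + y, 0) + px * py
    return result


# A fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Distribution of the sum of two independent dice.
two_dice = convolve_pmfs(die, die)
```

Here `two_dice[7]` is 1/6, the most likely total, while `two_dice[2]` and `two_dice[12]` are each 1/36. Calling `convolve_pmfs(two_dice, die)` would extend this to three dice.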
Indicator functions: A function that takes the value 1 when a condition is satisfied and 0 otherwise is called an indicator function.
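Indicator functions are handy because averaging them gives a proportion, and their expected value is the probability of the condition. A small Python sketch, using a fair die and the condition "the face is even" as an arbitrary example:

```python
from fractions import Fraction


def indicator(condition):
    """Return 1 if the condition holds and 0 otherwise."""
    return 1 if condition else 0


faces = range(1, 7)  # outcomes of a fair six-sided die

# Indicator of the event "the face is even", evaluated on every outcome.
even_flags = [indicator(face % 2 == 0) for face in faces]

# With equally likely outcomes, the mean of the indicator is P(face is even).
prob_even = Fraction(sum(even_flags), len(even_flags))
```

Three of the six faces are even, so `prob_even` comes out to 1/2.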
Two Tweets
Michael Thomas
Listening to @VanishingData this morning, and can’t help but envision @heatherklus as the cat in this meme

#putrinprod https://t.co/MDaiVB0HLO
R-Ladies NYC
We're looking forward to welcoming @W_R_Chase this Thursday! 📊🗞️ There's still time to RSVP for his talk entitled "Behind the scenes of visual journalism": https://t.co/dGwIwi7idK
One Meme
That's a wrap!
Hope you enjoyed today’s letter. Share it with a friend or a colleague. See you next week!
Harsh
Harshvardhan @harshbutjust

A short and sweet curated collection of R-related works. Five stories. Four packages. Three jargons. Two tweets. One Meme.

Personal website: https://harsh17.in.

List of all packages covered in past issues: https://www.harsh17.in/nextpackages/.

Knoxville, Tennessee.