
Bring the best of ML according to your needs, with Pavel Dmitriev (VP Data Science @ Outreach.io)

By MirData.Report • Issue #2
In this issue, we bring you a range of ideas and practical tweaks for applying Machine Learning to your business. Our speaker for all five talks is Pavel Dmitriev.
Pavel is currently the VP of Data Science at Outreach.io. He also spent 8 years at Microsoft as a Principal Data Scientist, where he took on some impressive projects such as enabling and scaling experimentation.
During our interview, Pavel shared why Data Science (and machine learning frameworks in particular) plays such a critical role in the success of any Data-heavy project today.
For example, what is the playbook for connecting Marketing, HR, and Product teams with Data people?
And how can Data Science approaches be applied to any business area? He gives a great example of ML helping to understand the target audience and structure the customer development process.
Bottom line: I’m sure you will find the practical knowledge Pavel shares extremely valuable.

Setting goals and understanding the level of DS qualification your product needs can rapidly increase the development speed
- How to maximize the value of the role of Data Scientist in your product?
- How to determine what kind of Data Scientist you need and then hire the right one?
- What are the differences between a Researcher, an Applied Scientist, and a Practitioner?
What are the characteristics of the 3 different types of Data Scientists?
Data availability as one of the most uncertain problems in Machine Learning product development.
Handling Machine Learning projects is not only about building algorithms and getting your product to market on time. One of the hardest parts is getting your Data structured and ready to use.
There is a big difference between ‘data available in theory’ and ‘data available in practice’. Yes, building an ML algorithm is challenging, but if Data of the right quality is not available, the project can run over budget and fall behind schedule.
While all the Data may be available in theory, it may be raw and unstructured, scattered across disparate data sources. Bringing it together for practical use is what matters, says Pavel.
Is data availability really a big problem in Machine Learning?
Data Science approach as a way to better understand your target audience
To drive sales we need to know our audience, our competitors, our USPs, and our clients’ feedback. We need Data to make educated decisions that influence our business. To get it and be confident about its quality, we need to ask questions.
But what, if not Science, can teach us to ask the Right Questions? That is one of the topics we discussed with Pavel.
Why do you need to ask the right questions before building a Machine learning model?
Feedback Loops: Continuous Training Data influx in Machine Learning
User data input is what really keeps the model up to date. Without a feedback system that brings in fresh Data and refreshes the training set, the model may lose relevance fast.
Pavel provided insight into one of the least explored areas in data science: establishing feedback loops. Here are the major points he covered during the talk:
- How to organize a continuous input of training data to your ML model?
- How to get users’ information automatically?
- How and why can looping Data help tackle major problems, and why does training Data carry so much significance?
How do you gather the most training data by engineering the Product UX?
How to improve by more than 2x the Machine learning building lifecycle?
Building a feedback loop means that when you actually release your solution to users, maybe as part of the product, or internally, there is a way to collect the usage data and then turn that usage data into training data automatically.
And you always want to have more of it, not just to produce a higher quality model, but also because the world changes.
That’s why any business wants a continuous influx of fresh training data that can be used to keep updating the model, and the best way to get it is automatically.
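To make the idea concrete, here is a minimal Python sketch of turning usage data into training data. It is only an illustration under our own assumptions, not something Pavel described in detail: the `UsageEvent` fields, the action names ("accepted", "corrected", "ignored"), and both function names are hypothetical. The assumption is that the product logs what the model suggested and how the user reacted, and that an accepted or corrected suggestion implies a label.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UsageEvent:
    """One logged interaction (hypothetical schema)."""
    features: dict                          # inputs the model saw
    predicted_label: str                    # what the model suggested
    user_action: str                        # "accepted", "corrected", or "ignored"
    corrected_label: Optional[str] = None   # set only when the user corrected it


def to_training_example(event: UsageEvent) -> Optional[Tuple[dict, str]]:
    """Turn a usage event into a (features, label) pair, or None if no label is implied."""
    if event.user_action == "accepted":
        # The user kept the suggestion, so treat the prediction as the label.
        return event.features, event.predicted_label
    if event.user_action == "corrected" and event.corrected_label is not None:
        # The user fixed the suggestion, so the correction is the label.
        return event.features, event.corrected_label
    # Ignored (or unknown) actions carry no clear label and are skipped.
    return None


def collect_training_data(events: List[UsageEvent]) -> List[Tuple[dict, str]]:
    """Keep only the events that yield a usable training example."""
    examples = []
    for event in events:
        example = to_training_example(event)
        if example is not None:
            examples.append(example)
    return examples
```

In a real product, a job like this would run on a schedule over the usage logs, so the training set keeps growing without anyone labeling data by hand.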
How to improve by more than 2x the Machine learning building lifecycle?
Need some more Data Science insights?
In case you don’t want to wait for the next newsletter and would rather get all the interviews firsthand, subscribe to our LinkedIn page. We publish two to three new Data Talks every week!
MirData.Report

Latest information from the data science world
