Tips for getting a Data Scientist Job
Apr 24, 2022
My opinion on how to become a DS
If you're reading this, chances are you have reached out to me on {LinkedIn/Twitter/Github/Email} asking how to become a data scientist, or you have read my post on how to get a job.
Here are some important things to keep in mind:
- Understand that Data Science is a made-up term that has little consistent meaning
- Figure out what the specific "thing" you like about the DS job really is
- Become excellent at it in whatever way works best for you
- Do it in a way that sets you up to get the job
Let's elaborate on each of the points above.
What is a Data Scientist and what do they do?
This is seemingly an obvious question to start with; however, it is also the worst one. Hidden in the explanation of why this question is so bad lies a useful insight.
The honest truth is that the term Data Science is completely made up. Now you could say, "Well, Ravin, everything is made up." But let's dig further. Let's compare Data Science to my previous profession, mechanical engineering. Mechanical engineering is a formalized profession with long-running organizations that set standards for the field internationally, such as the American Society of Mechanical Engineers or ABET. While the idea of mechanical engineering is made up by people, there is a key group of influential people that defines the "central identity" of mechanical engineering. There is no such group for data science. Everyone is making it up, in whatever way suits their narrative.
On the abstract side, Drew Conway was one of the first to try to define Data Science with his popular Venn diagram.
While it's appealing in simplicity, it's quite unhelpful for newcomers, as it's not clear where to start or what any of these things included in the diagram actually mean.
Typical of data scientist writings, someone then took a simple framework and made it even more complex, unfortunately making it even less useful.
On the other end, some people try to strictly define data science by limiting its scope to exactly what they do. They then post their opinions on Twitter and LinkedIn and arguments ensue, which also doesn't help newcomers.
In my particular data science role, I don't currently use statistics. Or R. Or deep learning. But I also don't go around telling people those aren't useful in data science as though my job represents all data science work
— Ben Lindsay (@ben_j_lindsay) March 27, 2022
"But these are only individuals and their own views," you might say. "Companies will clearly be more consistent." Wrong again. Companies like Lyft write publicly about relabeling all data analysts as data scientists, and all data scientists as research scientists, with no change in expected responsibilities. DoorDash has various data scientist roles, but when you dig deeper you learn those are not alike at all. They are not interchangeable and have two separate organizational structures internally.
So how else can we frame the problem?
A subset of a set of skills
Because I have been knighted with the title Data Scientist, I'm legally obligated to provide my own take on this mess, or at least make it worse. Let's do that.
I tend to think of data science as a subset of a set of skills. This could include algorithm development, dashboarding, model fitting, programming in Python, programming in SQL, fitting black box models, data communication, deploying Spark cluster jobs, the emotional numbness of being asked to join two datasets with no common key, knowing how to not laugh or express shock when your stakeholders ask for models with 99% accuracy, and writing blog posts or tweet threads with advice on how to become a data scientist (half-joking while doing it).
Now some people with data scientist LinkedIn titles spend their time squeezing 0.001 more precision/recall/accuracy out of a black box model. Sometimes this role is titled Machine Learning Engineer. Some people find this kind of work fascinating. I think it's very boring, and when I was looking for a job, I avoided these roles entirely.
Instead, I like working with stakeholders to help them understand a complex business and use data to make informed, large-scale decisions. Sometimes this role is called Decision Scientist, and this is what I do at Google, where, surprise, my title is Data Scientist, which used to be called quantitative analyst. What matters to me though is that I get to do what I enjoy, which is learning things from data and sharing those learnings with others.
A suggested strategy
So let's focus back on you. It's no secret the junior data science field is completely saturated. Given that, here are my suggestions for concrete steps to get a DS job. They should help match you to the right company and put you ahead of many other candidates.
Step 1: Figure out what you like
Instead of trying to nail down a specific definition of DS, start by trying to understand what about DS is interesting to you. Once you find the things about DS that make you tick, start excelling at those. It's a more focused strategy than trying to be good at everything, which will ultimately prove impossible. If you go with the "good at everything" strategy, you won't be able to stand out in any interview, as each company tends to look for specific skills and knowledge.
Step 2: Get good at it
Now that you've narrowed down the part of DS that interests you the most, figure out how to be great at it.
If you're into black box predictions, places like Kaggle are the obvious answer. If you really like academic algorithm development, papers and algorithm competitions would be a good area to focus on.
For me that learning came in the form of Open Source contributions. By writing code and working with others I was able to learn the methods more deeply. It also helped because the networking and public display of value was built right in, which I write about in my How to get a job post.
Step 3: Identify the adjacent skills
Find people who are excellent at what you want to learn and find places where they share their experience, whether it be writings, podcasts, conference talks, etc. Write down all the skills and useful things they mention. For instance, if they work on big prediction models, are they also good at deploying them, or are they more focused on the internal mathematics? Do this again for several people in your field of interest, then see what other skills are commonly listed or mentioned and become proficient in those as well.
In my case I often noticed a stronger emphasis on communication, intuition, and explanation than on model accuracy metrics. I also saw these folks contributing examples publicly in blog posts, talks, and open source notebooks. I now spend much of my time honing those skills, writing this blog post for example.
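If you'd rather tally your Step 3 notes in code than on paper, here is a minimal sketch of one way to do it. Everything in it is made up for illustration: the people, the skills, the `common_skills` helper, and the `min_mentions` threshold are all assumptions, not a method I'm prescribing.

```python
from collections import Counter

# Hypothetical notes: each key is a person you follow, each value is the set
# of skills you wrote down from their talks, posts, or podcasts.
skills_by_person = {
    "person_a": {"communication", "sql", "bayesian modeling"},
    "person_b": {"communication", "python", "open source"},
    "person_c": {"communication", "sql", "dashboarding"},
}


def common_skills(notes, min_mentions=2):
    """Return (skill, count) pairs mentioned by at least `min_mentions` people."""
    # Count each skill once per person, even if it was noted more than once.
    counts = Counter(skill for skills in notes.values() for skill in set(skills))
    # Most-mentioned first; ties are broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(skill, n) for skill, n in ranked if n >= min_mentions]


if __name__ == "__main__":
    for skill, n in common_skills(skills_by_person):
        print(f"{skill}: mentioned by {n} people")
```

The skills that keep showing up across people are the adjacent ones worth getting proficient in. The threshold is a judgment call, so adjust it to however many people you end up taking notes on.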
Step 4: Figure out what organizations value your skills
By this point you'll have a good sense of the code words that appeal to you. Look for job postings that specifically contain the thing you're interested in. During the interview be sure to ask how DS provides value to the team you'd be joining. Even if you get rejected you'll have extra clues as to what to focus on for the next interview. Demetri Pananos provides additional advice on his blog.
Best of luck
Unfortunately the data scientist job search is more of a random search than a strict, precise process. Nonetheless, if you're dead set on getting a Data Scientist job I hope the advice you got here will help. And once you get a DS job be sure to send me a link to your blog post or tweet thread sharing your advice. At that time you too will become a true data scientist.