It can be hard to break into data science - let's talk tactics.
First of all, what companies mean by "data scientist" can differ widely, so before you even look at job postings or talk to recruiters, let's talk about how you should interpret the job description. Let's start with an example:
Let's say the data science team at Expedia wants to emulate Pinterest's deep learning technology to recognize pictures of different hotels that are being uploaded by customers. There are millions of photos uploaded to Expedia every day, making it hard to display the best photos in reviews for each hotel. Some users may upload very similar photos of the interior of the hotel or the food served at the restaurant. Expedia wants the images comprehensively evaluated to provide a full review of the hotel. The team also thinks they can use machine learning to classify the photos and automatically place each photo in a specified category. To achieve this goal, you need to help the computer use a training set to identify which photos show the exterior of the hotel and which photos are just food.
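To make the "training set" idea concrete, here is a minimal sketch of how a classifier learns categories from labeled examples. It is not Expedia's or Pinterest's method: a real system would learn from the image pixels with deep learning, while this toy version assumes each photo comes with a short user-written caption and applies a simple naive Bayes text classifier to it. The captions, the labels, and the function names (`train`, `classify`) are all made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training set: (caption, label) pairs invented for this example.
TRAINING_SET = [
    ("view of the hotel entrance from the street", "exterior"),
    ("building facade and front entrance at night", "exterior"),
    ("hotel building seen from the parking lot", "exterior"),
    ("breakfast buffet with eggs and fresh fruit", "food"),
    ("pasta and dessert at the hotel restaurant", "food"),
    ("room service dinner with fresh salad", "food"),
]


def tokenize(caption):
    """Lowercase a caption and split it into words."""
    return caption.lower().split()


def train(examples):
    """Count words per label and how many examples each label has."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for caption, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(caption))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def classify(caption, model):
    """Return the most likely label under naive Bayes with add-one smoothing."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior: how common this label is in the training set.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in tokenize(caption):
            # Add-one smoothing keeps unseen words from zeroing out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING_SET)
print(classify("fresh fruit at the breakfast buffet", model))  # food
print(classify("front entrance of the building", model))       # exterior
```

The shape is the same one the team would follow at scale: collect labeled examples, fit a model, then let it assign a category to photos it has never seen.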
The ⚙️data scientist is responsible for building the model, letting the machine create different picture categories, and extracting all relevant data types from the keywords of the photos and photo captions marked by the user. This is a senior role on the team, usually managing the full lifecycle of data products and solving data science problems from algorithm selection to engineering design.
The 🔩data engineer is responsible for building the system, acquiring and storing all picture data, and implementing the algorithm selected by the data scientist. This position requires superior technical skills, but does not require in-depth understanding of algorithm theory.
🎱Data analysts are responsible for querying data, showing the impact of business changes, and answering questions like "How much traffic has the recent revision brought to Expedia?" A good data analyst knows which questions they need to answer to measure the success of the product. In addition, the data analyst must communicate the results of the data analysis to multiple stakeholders. This is considered an entry-level position.
The interviews for each of these types of data science positions are completely different.
If you want to land an entry-level job in data science, applying to job postings on LinkedIn is often not the way to go. Many of these roles, even if posted, will be filled by returning interns. A strong portfolio of data science projects and a strong referral into the job are often the best way to become the ideal candidate. An emerging trend for junior data scientists is to also have your own website showcasing your projects.
Some tips on building a strong portfolio:
Your portfolio doesn't need to be its own website if you don't have the time or skills to develop it, especially early in your career path. A good place to start is by keeping a repository of all of the projects you may have done during your academic coursework, your thesis or dissertation (if you did one), and a thoughtfully selected group of 'learning' projects where the focus was almost entirely on learning a skill. Here are some examples of projects we really enjoy that we consider "portfolio worthy":
- Eluding Mass Surveillance: Adversarial Attacks on Facial Recognition Models (in an academic paper)
For the "ultimate" example of a portfolio, see Donne Martin's repo of Jupyter notebooks on GitHub and http://davidventuri.com/portfolio. Both are insanely well done, even scarily so. A portfolio of this quality is not required (or anticipated), but having a way to showcase practical achievements and learnings is critical. We will soon dedicate an entire post to how to nail this.
If you’re short on projects, you can participate in data science competitions. Platforms like DataKind and Kaggle let you work on real problems that you can later showcase to boost your standing for potential data scientist jobs and to expand your skill set.
Take time to meet people in the data science community: entry-level jobs with tech companies are all about referrals. Only a very small number of people get in straight from university recruitment. So get out there! Go to meetups, conferences and other local events.
For conferences, Strata takes place in different cities worldwide. Speakers come from academia and private industry, and the themes tend to be oriented around cutting-edge data science trends in action. Practical workshops are provided if you want to learn the technology behind data science, and there are plenty of networking events.