Pass the Uber Data Scientist Interview: Do you have what it takes?

If you're a data scientist, you want to work at a company that has interesting data — Uber definitely fits the bill! The ride-hailing company has plenty of data to crunch — with 100+ million users of the Uber app in 785 locations and a multi-billion dollar marketing budget.

On top of its core business, Uber also counts food delivery (Uber Eats), electric scooters (Lime) and helicopter flights (Uber Copter) among its services. With 14m rides completed per day and $50b in gross bookings, your analysis can affect decision-making at a gargantuan scale.

There have been signs that Uber's famous toxic work environment is improving since the departure of co-founder and previous CEO Travis Kalanick. Dara Khosrowshahi is confident Uber has the cash to ride through the recession, and though there are reports of a hiring freeze, you have to imagine Uber will resume aggressive expansion shortly after social distancing lifts.

So what does the job interview look like at Uber for a data scientist or analyst? We talked to someone who interviewed for a position on their marketing team to find out what the assessment looks like. Decide for yourself if you'd pass.

The Format

In our candidate's case, this was presented as a take-home assignment, and they were given about a week to put it together. This came after prior rounds of screening, and the aim is to assess the candidate's technical skills as a data scientist or data analyst.

The assessment works by posing a fictional scenario representing the type of work you'll do in a role at Uber, providing fictional or anonymized data for you to demonstrate your abilities. While not every company does something like this, it's extremely common practice and something you should get comfortable with.

We suspect the assessment will change over time and will differ between teams or locations (even within the United States, expect the San Francisco team to care about different things than the one in Seattle or New York City). So take this exercise as a general example of what to expect, and don't overfit to solving just this one problem!

These tests will be very different at other companies, like Google or LinkedIn, though the technical skills you're demonstrating in these exercises are widely applicable to any data science or analytics role.

Part 1: Marketing Channel Efficiencies

The candidate is provided with a data set from the fictional city of "Qarth" (hello Game of Thrones fans!), showing sign up and trip data of riders. You're told the average Lifetime Value (LTV) of a rider is $45, and asked the following:

1. Are we spending the right amount on paid marketing in this city?

2. Is the budget allocation across the sub-channels appropriate based on this data?

3. Propose a plan that will be put in place over the next 4 weeks, to help maximize marketing spend efficiency.

The Output: Please put together a max of 8 slides and be prepared to speak to them. Make sure to explain any assumptions you make.

Insider Tips

This is a pretty straightforward exploratory data task — you could even do it in Excel without cracking open a Jupyter Notebook. They're looking for a basic understanding of how to analyze data, and this would be a great place to demonstrate any domain experience you have working with marketing performance data. The key is segmenting cost per first trip by channel, channel group and platform, then seeing how performance is trending over time.
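As a rough sketch of that segmentation, here is the cost-per-first-trip calculation in plain Python. The rows, channel names and figures are invented for illustration (the real Qarth data set isn't reproduced here), and comparing the result to the $45 LTV assumes that a first trip is the conversion the LTV figure is based on.

```python
from collections import defaultdict

LTV = 45.0  # average rider lifetime value given in the brief

# Hypothetical rows: (week, channel, spend in dollars, first trips)
rows = [
    (1, "search", 1000.0, 40),
    (1, "social", 1500.0, 30),
    (2, "search", 1200.0, 40),
    (2, "social", 1500.0, 25),
]

def cost_per_first_trip(rows, key):
    """Sum spend and first trips per group, then divide spend by trips."""
    spend = defaultdict(float)
    trips = defaultdict(int)
    for row in rows:
        group = key(row)
        spend[group] += row[2]
        trips[group] += row[3]
    # Skip groups with no first trips to avoid dividing by zero.
    return {g: spend[g] / trips[g] for g in spend if trips[g] > 0}

# Segment by channel, then by (week, channel) to see the trend over time.
by_channel = cost_per_first_trip(rows, key=lambda r: r[1])
by_week_channel = cost_per_first_trip(rows, key=lambda r: (r[0], r[1]))

for channel, cpft in sorted(by_channel.items()):
    verdict = "below LTV" if cpft < LTV else "above LTV"
    print(f"{channel}: ${cpft:.2f} per first trip ({verdict})")
```

The same grouping is a pivot table in Excel or a groupby in pandas. The point is that a channel whose cost per first trip sits above LTV, or is drifting toward it week over week, is a candidate for a budget cut.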

Part 2: Data Analysis

In the second fictional scenario, Uber's Driver team (i.e. the supply side of the business) is interested in predicting which driver signups are most likely to start driving. They provide a sample dataset of a cohort of driver signups from January 2015. It includes background information gathered about the driver and their car.

The task is to identify what factors are best at predicting driver activation, so you can make suggestions as to how to operationalize these insights to improve Uber's business.

You're expected to follow this process:

1. Perform any cleaning, exploratory data analysis and/or data visualization, and answer "what fraction of driver signups took a first trip?"

2. Build a predictive model to help Uber determine if a driver signup will start driving. Discuss your approach and include key performance indicators.

3. Briefly discuss how Uber might leverage the insights gained in your model to generate more first trips by drivers.

The Output: Please put together a max of 8 slides and be prepared to speak to them. Make sure to explain any assumptions you make.

Insider Tips

Though they don't say so, this is a classification problem, and they're expecting you to recognize that, as well as have an opinion on which model is right to use: logistic regression, Naive Bayes, random forests, etc. Your decision here should be made based on the predictive performance of the model, with an eye to avoiding over- or underfitting the data. Pay particular attention to data cleaning: how do you handle scenarios where key attributes of a driver are missing?
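One way to approach the missing-data question, sketched below in plain Python, is to check whether missingness itself predicts activation before deciding to drop or impute anything. The field names (`channel`, `vehicle_year`, `took_first_trip`) and the eight records are hypothetical, not the columns of the actual assignment.

```python
from collections import defaultdict

# Hypothetical signups: vehicle_year may be missing (None);
# took_first_trip is the label we want to predict.
signups = [
    {"channel": "referral", "vehicle_year": 2014, "took_first_trip": True},
    {"channel": "referral", "vehicle_year": None, "took_first_trip": False},
    {"channel": "paid", "vehicle_year": 2010, "took_first_trip": False},
    {"channel": "paid", "vehicle_year": None, "took_first_trip": False},
    {"channel": "organic", "vehicle_year": 2012, "took_first_trip": True},
    {"channel": "organic", "vehicle_year": None, "took_first_trip": False},
    {"channel": "paid", "vehicle_year": 2013, "took_first_trip": True},
    {"channel": "referral", "vehicle_year": 2015, "took_first_trip": False},
]

def activation_rate(rows):
    """Fraction of signups that took a first trip."""
    return sum(r["took_first_trip"] for r in rows) / len(rows)

def rate_by(rows, group_fn):
    """Activation rate within each group produced by group_fn."""
    groups = defaultdict(list)
    for r in rows:
        groups[group_fn(r)].append(r)
    return {g: activation_rate(rs) for g, rs in groups.items()}

overall = activation_rate(signups)
by_missing = rate_by(
    signups,
    lambda r: "missing" if r["vehicle_year"] is None else "present",
)

print(f"overall activation: {overall:.1%}")
for status, rate in sorted(by_missing.items()):
    print(f"vehicle_year {status}: {rate:.1%}")
```

If the two rates differ sharply, as they do in this toy data, a "vehicle details missing" flag is probably worth keeping as a feature in its own right, since dropping those rows would throw away signal and bias the activation rate upward.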

Part 3: How can Uber best use data to drive marketing decisions

In this final section they're looking for a more general, bigger picture view on how the marketing team should be using data to make the best decisions. They're looking for the following information from you:

1. Provide a recommendation on monthly scorecard metrics that you believe we should review as a marketing team to measure performance.

2. Uber has been making good progress on understanding incrementality from marketing campaigns, please provide insights on how to measure incremental lift across channels.

3. Provide options for segmenting Uber's customer base. Provide 3 different segmentation approaches, explain their dimensions/attributes and what they would be most useful for?

The Output: Please put together a max of 10 slides and be prepared to speak to them. Make sure to explain any assumptions you make.

Insider Tips

The key question is on incrementality — this is a very sore issue for Uber after they sued their own ad agency for $40m when they discovered essentially no uplift from the activity (at odds with the reports they were getting from the agency). It's a sensitive subject so don't be the first to bring it up, but make sure you know all the technical aspects and can explain their importance.
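To make the incrementality idea concrete, here is a minimal sketch of a holdout comparison, assuming users were randomly split into a group that saw the campaign and a group that didn't. The function name and the numbers are invented for illustration; this is one common approach, not a description of how Uber measures lift.

```python
import math

def incremental_lift(treated_users, treated_conversions,
                     holdout_users, holdout_conversions):
    """Compare conversion rates between exposed users and a randomized holdout."""
    p_t = treated_conversions / treated_users
    p_c = holdout_conversions / holdout_users
    # Relative lift over the holdout baseline.
    lift = (p_t - p_c) / p_c
    # Conversions among treated users beyond what the baseline rate predicts.
    incremental = (p_t - p_c) * treated_users
    # Two-proportion z-test with a pooled conversion rate.
    pooled = (treated_conversions + holdout_conversions) / (treated_users + holdout_users)
    se = math.sqrt(pooled * (1 - pooled) * (1 / treated_users + 1 / holdout_users))
    z = (p_t - p_c) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return {"lift": lift, "incremental": incremental, "z": z, "p_value": p_value}

# Hypothetical campaign: 520 conversions among 10,000 exposed users,
# versus 500 among a 10,000-user holdout that saw no ads.
result = incremental_lift(10_000, 520, 10_000, 500)
print(f"relative lift: {result['lift']:.1%}")
print(f"incremental conversions: {result['incremental']:.0f}")
print(f"p-value: {result['p_value']:.2f}")
```

In this made-up example an attribution report could claim all 520 conversions, while the holdout suggests only about 20 were incremental, and even that difference is not statistically significant. That gap between attributed and incremental results is exactly what you should be ready to explain.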

How would you do?

Depending on how many years of experience you have as a data scientist, this could be trivial, or you might have no idea where to start. But at least now you know what to expect! There's something tangible about seeing real examples of a company's work that cuts through the buzzword jargon you get in job descriptions. If you read this and felt confident, it's worth applying! If you were out of your depth, now you have an example of exactly what you need to learn.

This role was closer to what a data analyst does than to what a data scientist does: it could be done without Python (you can do regression analysis in Excel!) or SQL, it doesn't need to be done in real time, and you don't need to be a software engineer, know any computer science or use machine learning to do this task.

The difficulty for even senior data scientists or data science managers in completing these assessments will be in getting the culture right. There will be specific topics that are extremely meaningful to the hiring manager or even recruiter, who in this case has probably been told "find me someone who understands attribution and incrementality" after the lawsuit debacle. It's looking out for these verbal or non-verbal cues in the interview process, and doing your own research, that can give you the edge here.
