How Duolingo Builds Its Data Science Methodology

With more than 300 million users completing more than seven billion language-learning exercises each month, Duolingo offers an example of data science methodology in action.

For business leaders and other powers that be within an organization, data science can be a mysterious, almost-magical tool: They don’t necessarily understand it, but in it they see the possibility of answering any question they have about their users, business, revenue or product.

So when a request lands on the desk of a data science team, it can sometimes betray the author’s limited understanding of whether the request is even viable.

Data science methodology is the practice of shaping a project that answers the needs of customers and colleagues alike, which requires a deep understanding of each request and the motivations behind it. Following a rigorously defined methodology, a well-run data science team crafts a project plan that defines and collects the data it needs, then prepares and models that data to draw out the insights it contains. Finally, the team must be ready to deploy the requested tool or feature, expecting that feedback will probably call for some post-production tweaks or updates.

This, of course, is your average data scientist’s bread and butter. But what does this process actually look like in practice?

With more than 300 million users completing more than 7 billion exercises each month, language-learning platform Duolingo offers an example of data science methodology in action. Not only do the company’s enormous databases inform tweaks to Duolingo’s user experience and underlying infrastructure all the time, but the company’s data science teams conduct regular research into everything from optimizing reminder notifications to theories on how to improve teaching practices and outcomes for learners of indigenous languages.

Duolingo’s data science methodology underpins much of this work. To learn more about the nuts and bolts of how a project moves from an amorphous idea to a usable tool or valuable insight, Lead Data Scientist Erin Gustafson — one of RE•WORK’s Top 30 Women Aiding AI Advancement back in 2019 — took us through her team’s best practices.

What are your team’s best practices when designing your data science methodology for a new project?

Our number one best practice is a project kickoff process that we’ve been honing over time. Most of our projects go through this process, which involves drafting a kickoff document and scheduling a meeting with key stakeholders to discuss the plan. We’ve found that both phases of this process add a ton of value.

At the doc phase, data scientists work with their managers and team leads to define the goals, requirements, key stakeholders, technical approach and timeline for the project. This phase forces us to do the important foundational thinking for a project so we can make sure we have the data we need — more than once, the kickoff process has helped us realize we don’t — and that the project has high ROI.

In the kickoff meeting, the data scientist talks through the plan and any areas that need further alignment with cross-functional stakeholders. The cross-functional nature of this meeting is really important because the success of a data science project is not solely determined by how well the technical approach is executed — success is also driven by the impact that the work has on the product or business more generally.

