How Duolingo Builds Its Data Science Methodology
- by 7wData
With more than 300 million users completing more than seven billion language-Learning exercises each month, Duolingo offers an example of data science methodology in action.
For business leaders and other powers that be within an organization, data science can be a mysterious, almost-magical tool: They don’t necessarily understand it, but in it they see the possibility of answering any question they have about their users, business, revenue or product.
So when a request lands on the desk of a data science team, it can sometimes betray the author’s lack of understanding around its viability — or lack thereof.
Data science methodology is the task of crafting a project that answers the needs of customers and colleagues alike and requires a deep understanding of a request and the motivations behind it. Following a rigorously defined methodology, a well-run data science team will then craft a project plan that defines and collects the type of data they need and then prepares and models the data to enhance their understanding of the insights contained therein. Finally, it must be ready to deploy the requested tool or feature with the expectation that feedback processes will probably require some post-production tweaks or updates.
This, of course, is your average data scientist’s bread and butter. But what does this process actually look like in practice?
With more than 300 million users completing more than 7 billion exercises each month, language-Learning platformDuolingooffersan example of data science methodology in action. Not only do the company’s enormous databases inform tweaks to Duolingo’s user experience and underlying infrastructure all the time, but the company’s data science teams conductregular researchinto everything fromoptimizing reminder notificationsto theories on how to improve teaching practices and outcomes for learners ofindigenous languages.
Duolingo’s data science methodology underpins much of this work. To learn more about the nuts and bolts of how a project moves from an amorphous idea to a usable tool or valuable insight, Lead Data Scientist Erin Gustafson — one of RE•WORK’s Top 30 Women Aiding AI Advancement back in 2019 — took us through her team’s best practices.
What are your team’s best practices when designing your data science methodology for a new project?
Our number one best practice is a project kickoff process that we’ve been honing over time. Most of our projects go through this process, which involves drafting a kickoff document and scheduling a meeting with key stakeholders to discuss the plan. We’ve found that both phases of this process add a ton of value.
At the doc phase, data scientists work with their managers and team leads to define the goals, requirements, key stakeholders, technical approach and timeline for the project. This phase forces us to do the important foundational thinking for a project so we can make sure we have the data we need — more than once, the kickoff process has helped us realize we don’t — and that the project has high ROI.
In the kickoff meeting, the data scientist talks through the plan and any areas that need further alignment with cross-functional stakeholders. The cross-functional nature of this meeting is really important because the success of a data science project is not solely determined by how well the technical approach is executed — success is also driven by the impact that the work has on the product or business more generally.
[Social9_Share class=”s9-widget-wrapper”]
Upcoming Events
From Text to Value: Pairing Text Analytics and Generative AI
21 May 2024
5 PM CET – 6 PM CET
Read More