Publication Date

Fall 2023

Document Type

Capstone Project

Degree Name

Master of Science

Department

Computer Science

First Advisor

Steve Shih

Second Advisor

Yunchuan Liu

Third Advisor

Xin Chen

Abstract

The rapid growth of the EdTech industry, coupled with increasing demand for online education, has led to a surge in the number of potential customers, or "leads," in the market. ExtraaLearn, an emerging startup in this sector, faces the challenge of efficiently identifying which leads are most likely to convert into paid customers (Howarth, 2023). This project uses machine learning techniques, specifically decision trees and random forests, to analyze leads data and predict the probability that a lead converts (Breiman, 2001; Couronne, Probst & Boulesteix, 2018).

The objective is to create a predictive model that not only distinguishes between high-conversion and low-conversion leads but also uncovers the underlying factors influencing the conversion process (James, Witten, Hastie, & Tibshirani, 2013). Through this analysis, a comprehensive profile of leads with a high likelihood of conversion will be developed. The findings of this project can help EdTech companies like ExtraaLearn allocate their resources more effectively, optimize their lead nurturing efforts, and ultimately improve their success in the competitive online education market (Howarth, 2023).

As the EdTech industry continues to grow rapidly, this project aims to provide ExtraaLearn with actionable insights and data-driven strategies for focusing on the leads most likely to become paying customers.
