Salesmanship Enabled MarTech with Reinforcement Learning

Written by Yunxiao He, edited by Yifei Xu, Carol Huang

This is a recap of a FocusKPI Analytics Leadership Forum event. Join our LinkedIn group to learn more.


Speaker: Yunxiao He

Chief Data Scientist at GrowingIO; Advisory Data Scientist at FocusKPI, Inc
Expertise: Data-driven advertising, blockchain-based ad tech, AI-powered supply chain management.


Over the past few years, Reinforcement Learning (RL) has drawn tremendous attention and investment from government, industry, and academia. In marketing, RL has the potential to play a pivotal role in consumer-centered, data-driven, and algorithm-mediated brand communication: it can orchestrate the customer journey with the goal of maximizing long-term gains in a dynamic environment.

“Advertising is salesmanship in print”

In 1905, John E. Kennedy famously described advertising as “salesmanship in print”. The media landscape has expanded dramatically since the early 20th century. However, the spirit of this definition still provides useful guidance on how marketers should engage customers: think about what a good salesperson does when talking to a prospect. For example, a good salesperson treats customer engagement as an interactive process with a two-way flow of influence. Each time the salesperson, equipped with prior knowledge about the customer, conveys a message, the customer replies with implicit or explicit signals. A smart salesperson takes those signals into account and adjusts his or her communication accordingly. The salesperson also keeps an eye on the final reward of eventually converting the customer, rather than pushing hard to “win” every step of the conversation. In other words, they balance short-term and long-term gains when talking to customers.

Marketing analytics as we know it

Many well-known modeling tools have been established to power various marketing campaigns over the years. Here are some examples.

Predicting the likelihood of customer actions (conversion, churn, etc.) or customer lifetime value

These types of models are typically used to narrow the audience of a campaign so that precious resources are directed to customers who are more likely to respond favorably. Generally, these models focus on the immediate return following a campaign and treat customers as static entities. For example, a model for predicting customer lifetime value normally does not consider further brand communications or later changes in customer status.

Personalization / next best action

Another potential issue with the prediction tools above is that they typically take a one-model-per-action approach, which is not feasible in many situations. For example, an e-commerce site needs to personalize its communications to a large customer base by picking recommendations from millions of products or a long list of ads (actions). This is where personalization tools, which are designed to make prediction solutions scalable, can help. However, typical personalization tools again focus only on the “immediate” effect, e.g., the likelihood that the customer will click what is shown on the current page, rather than the total value the platform can gain over a visitor session that includes a series of page views.
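To make the gap between the “immediate” effect and the value of a whole session concrete, here is a minimal sketch. The candidate names, click probabilities, and session values are all made up for illustration; they do not come from the talk or from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    p_click: float        # predicted chance of a click on the current page
    session_value: float  # assumed expected value over the whole session if shown


# Hypothetical numbers, chosen only to show that the two criteria can disagree.
CANDIDATES = [
    Candidate("flash_sale_banner", p_click=0.30, session_value=1.20),
    Candidate("buying_guide", p_click=0.12, session_value=3.50),
    Candidate("accessory_bundle", p_click=0.18, session_value=2.10),
]


def pick_by_immediate_click(candidates):
    """What a click-focused personalization tool would show."""
    return max(candidates, key=lambda c: c.p_click)


def pick_by_session_value(candidates):
    """What a session-level objective would show instead."""
    return max(candidates, key=lambda c: c.session_value)


if __name__ == "__main__":
    print(pick_by_immediate_click(CANDIDATES).name)
    print(pick_by_session_value(CANDIDATES).name)
```

In practice the hard part is estimating something like `session_value`, since it depends on what is shown on later pages. That dependency is what turns the problem into a sequential one.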

Multi-Touch Attribution for marketing effectiveness measurement

Marketers often ask what the best sequence of customer communications might be, or whether analytics can account for the synergy among various forms of ads. While this problem is very challenging to solve with traditional modeling tools, RL is a natural fit because it is designed for sequential decision-making problems.

How does Reinforcement Learning approach customer engagement?

Reinforcement learning is typically framed as a Markov Decision Process (MDP) problem. In simple terms, an MDP formalizes the process in which an agent interacts with its environment, and RL proposes strategies for the agent’s decision making to maximize its long-term gains. In the marketing context, the agent represents the marketer. The agent strategically employs various marketing tools (actions) given what it knows (states) about the customer (environment). It then observes the customer’s reactions (reward and state changes) and determines what the next best action is. Throughout, it balances exploration (further learning) and exploitation (of prior learning) to find the engagement strategy (policy) that is optimal in the sense of maximizing long-term returns.

This paradigm clearly mimics the behavior of a good salesperson as discussed earlier.
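The agent, state, action, reward, and exploration/exploitation loop described above can be sketched with tabular Q-learning. This is one common RL algorithm, chosen here for brevity, and not necessarily the method shown in the talk. Everything in the block is hypothetical: the three engagement levels, the two actions, and all probabilities and rewards of the simulated customer are assumptions made up for illustration.

```python
import random

STATES = ["cold", "warm", "hot"]    # hypothetical engagement levels (state)
ACTIONS = ["nurture", "hard_sell"]  # hypothetical marketing actions

# Assumed, made-up numbers for the simulated customer (environment).
P_WARM_UP = 0.8                                      # nurture moves the customer up a level
P_CONVERT = {"cold": 0.05, "warm": 0.3, "hot": 0.9}  # hard-sell success rate per state
CONTACT_COST = -1.0                                  # reward for one nurture touch
SALE_VALUE = 10.0                                    # reward for a conversion


def step(state, action, rng):
    """Simulate one customer reaction; returns (next_state, reward, done)."""
    if action == "hard_sell":
        # A hard sell ends the journey: either a sale or a lost customer.
        converted = rng.random() < P_CONVERT[state]
        return state, (SALE_VALUE if converted else 0.0), True
    # Nurturing costs a little now but may warm the customer up.
    level = STATES.index(state)
    if rng.random() < P_WARM_UP:
        level = min(level + 1, len(STATES) - 1)
    return STATES[level], CONTACT_COST, False


def train(episodes=50_000, alpha=0.02, gamma=0.9, epsilon=0.2,
          max_touches=10, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _episode in range(episodes):
        state = "cold"
        for _touch in range(max_touches):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                        # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward, done = step(state, action, rng)
            # Long-term target: immediate reward plus discounted future value.
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            if done:
                break
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for each state under the learned values."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Under these assumed numbers, the optimal policy is to nurture cold and warm customers and to hard-sell only hot ones, even though nurturing has a negative immediate reward. A model that scores each step only on its immediate return would never choose to nurture, which is the short-term versus long-term trade-off the salesperson analogy describes.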

If you are interested in learning more about how RL can help improve your customer-centric marketing, please watch the video below. It provides an overview of these topics, shows how an RL agent can be trained, and walks through the typical steps in an RL-powered project for customer engagement (including a demo) and Multi-Touch Attribution.

Analytics Leadership Forum on September 18th, 2020

View Video


If you are interested in this topic or have any questions about our events and services, please reach out to NEWS@FOCUSKPI.COM.