The best member data for your AI model depends on four things: the type of data, the amount of data, the recency of the data, and the responsiveness of the model itself. While there’s no silver bullet approach for all associations, considering these characteristics through focused Data Design Sprints will help you make better predictions on behalf of your organization.
When we think of AI, our gut instinct is often to think of robot brains that are infallible and maybe even a little bit creepy.
The reality is that an AI model is simply a program that makes educated guesses based on the data you feed it. This is why, at Tasio, we stress the importance of not just having data, but having the right data and knowing how to use it.
So what exactly is “the right data” to help train an AI model? In this article, we’ll talk about the characteristics of the data you need to build a machine learning model that can reliably make predictions on behalf of your association. We’ll specifically focus on the type, amount, and recency of your data, and on how responsive your model is to new data and feedback.
We’ll also discuss how you can run Data Design Sprints to continually develop prediction goals and recipes to test your hypotheses.
When you're teaching an AI system to predict membership loss or retention, there are four key areas to focus on to make sure that you have "the right data."
Some of the data you need to train your machine learning model, such as profile data, is relatively constant, while transactional and behavioral data are more variable.
Profile data includes data points like:
Transaction data may include things like:
Behavioral data may include:
You also can’t forget about master data (e.g., your different products and services), as well as reference data, which is typically governed by regulations and compliance standards. While we wouldn’t call master or reference data member data per se, both undoubtedly impact the member experience.
The amount of data you need to train an AI model depends on the type of problem you want to solve. But there are definitely some general rules of thumb.
If you’re trying to predict 12 months into the future, you should have at least 12 months’ worth of data (365 days or 52 weeks) to train your model. In other words, for valid results, your history should cover at least as many periods as you’re forecasting.
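As a rough illustration of this rule of thumb, the check below compares the span of your historical data against the forecast horizon. It is a minimal sketch, not a validation method from the article: the function names `months_between` and `has_enough_history` are hypothetical, and it only looks at the first and last record dates, not at gaps in between.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (0 if end is before start)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        # The final month is not complete yet, so do not count it.
        months -= 1
    return max(months, 0)


def has_enough_history(first_record: date, last_record: date,
                       horizon_months: int) -> bool:
    """Rule of thumb: history should span at least the forecast horizon."""
    return months_between(first_record, last_record) >= horizon_months


# Hypothetical example: records from Jan 2023 to Jan 2024, 12-month forecast.
enough = has_enough_history(date(2023, 1, 1), date(2024, 1, 1), 12)
too_short = has_enough_history(date(2023, 6, 15), date(2024, 1, 1), 12)
```

Here `enough` is true because the history spans a full 12 months, while `too_short` is false because only 6 complete months are available.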
Another key factor is gathering a good amount of data from a number of sources. For many associations we work with, data comes from their AMS, Google Analytics, and other third-party sources. Remember: the more reliable data you can draw on, and the more reliable sources it comes from, the better the predictions.
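To make the idea of pulling data from several sources concrete, here is a small sketch that combines per-member records keyed by a shared member ID. Everything in it is an assumption for illustration: the function name `combine_sources`, the member IDs, and the field names are hypothetical, and real AMS or analytics exports would need their own loading and ID-matching steps first.

```python
def combine_sources(*sources: dict) -> dict:
    """Merge per-member records from several sources into one record each.

    Each source maps a member ID to a dict of fields. If two sources
    provide the same field, the later source wins.
    """
    combined: dict = {}
    for source in sources:
        for member_id, fields in source.items():
            combined.setdefault(member_id, {}).update(fields)
    return combined


# Hypothetical extracts from two systems, keyed by the same member ID.
ams_records = {"m1": {"member_type": "student"}}
web_records = {"m1": {"page_views": 4}, "m2": {"page_views": 1}}

members = combine_sources(ams_records, web_records)
```

After this runs, `members["m1"]` holds fields from both systems, while `members["m2"]` only has web activity, which is itself a useful signal that a source is missing for that member.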
The recency of member data is the amount of time since a member’s last recorded activity (whether that’s their last visit to your site, the last webinar they viewed, etc.).
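Recency as defined above can be computed directly from activity dates. The sketch below is one simple way to do it, assuming you already have a list of dates for each member; the function name `recency_days` and the sample dates are hypothetical.

```python
from datetime import date
from typing import Iterable, Optional


def recency_days(activity_dates: Iterable[date], as_of: date) -> Optional[int]:
    """Days between a member's most recent activity and the as_of date.

    Activity dated after as_of is ignored. Returns None when the member
    has no recorded activity on or before as_of.
    """
    past = [d for d in activity_dates if d <= as_of]
    if not past:
        return None
    return (as_of - max(past)).days


# Hypothetical member with two recorded activities.
activity = [date(2024, 1, 10), date(2024, 3, 1)]
days_since_last = recency_days(activity, as_of=date(2024, 3, 31))
```

In this example the most recent activity is March 1, so `days_since_last` is 30. Returning `None` for members with no activity keeps "never active" distinct from "active a long time ago".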
Again, there’s no “one-size-fits-all” for determining how recent member data needs to be to feed your AI model. Keep in mind that many failed AI models produced inaccurate or unintended predictions not because of the data they had but because of the data they didn’t have. So the more recent, the better.
A good AI model makes predictions based on data, then learns from feedback and subsequent data to become more accurate over time. So you must consider how you’ll keep developing and using your AI model as circumstances change.
This is where that training data and feedback data really come into play.
A Data Design Sprint is a great way to select the right member data to feed your AI model.
The Data Design Sprint was created by Sam Chow, PhD, a top AI Product Manager, to compress months of work and debate over your AI model into a single hour.
Instead of launching your model only to realize later that it predicts member behavior poorly, you can walk your team through a member data brainstorm beforehand to get a tactical roadmap in place.
Building a machine learning model is all about experimentation. So you’ll need the right people and strategic brainstorming processes during your Data Design Sprint to capture and test hypotheses and prediction goals.
Here are the main requirements and best practices for a Data Design Sprint:
Here’s how Chow breaks down the Data Sprint process and time spent:
Member data is your most valuable asset for becoming indispensable to the professional community you serve. It helps you provide a more valuable member experience tailored to individual needs and interests.
But there’s no silver bullet approach to deciding what types, amount, and recency of member data are needed to train your AI. What works for one model may not be the best for another. It’s all about the process and continued experimentation through good training data and feedback data that you can gain by conducting focused brainstorming huddles and Data Design Sprints with your team.
Want more information on how your association can better identify and handle member data to increase retention? Download the free Association Retention Playbook to get a proven 5-step strategy you can start implementing right now.
Thomas Altman