Customer Lifetime Value Formula

How To Maximize ROI with Customer Lifetime Value & Segmentation (the right way)

Data Science is bullsh*t.

I say this a lot.

And I say it despite the fact that I run my own data science lab.

And it isn’t the science per se that is the issue.

It’s the way it’s often used.

There is an overabundance of useless projects, useless ideas, and useless analyses that do not align with business goals.

You see, the thing you must understand (especially if you’re a business owner) is that data science has mostly fallen short in providing executives with something of value to grow revenue, reduce cost, improve efficiency, or maximize profit.

There is a lot of useless research, a lot of circle jerking, and a lot of wasted money and time.

I’m not here to bullshit you though.

Data Science is hard.

This goes without saying.

But a lot of scholars who work in the science of data these days have a very weird yearning to appear super intelligent, and in doing so, often sprinkle in an excessive abundance of quantification in an effort to impress the masters.

The problem with doing this is that through this inundation of numerical manifestations they actually inadvertently appear far “less intelligent” as their analysis often comes across as an ostentatious display of statistical acrobatics and numerical intricacies, full of erudition and empty of meaning.

On one hand, a display of pure mathematical brilliance, and on the other, nonsensical ramblings and chaotic numeric sequences that nobody understands.

Which strikes me as weird, because despite having studied the art of mathematics and statistics for many years, many data scientists (even on the PhD level) still fail to mathematically determine the suboptimality of this reasoning choice.

Which also begs the question: “Is it possible to become so educated that you become an idiot?”

To properly solve for this dilemma, one must first come to understand the twofold nature betwixt quant and logic: the complementary aspect of Newton’s third law.

If one decides to venture down this path, this inescapable paradox, a duality of opposing forces between quant and logic, can become a repetitive contemplation, a chatter in the skull, where the size and scope of the very analysis itself requires one to spend entire days, weeks, years, even an entire career, pondering its meaning.

And soon, despite your best efforts to contain it, it will eventually consume you, driving you mad.

This is where the reasoning path begins to grow dark.

Where you have nothing to think about except thoughts.

Where you cannot see the forest for the trees.

Where you lose touch with reality and live in the world of illusions.

And the illusions become an uninterrupted chain of disquisitions, upon the very nature and qualities of your mind and its relation to all that constitutes the future.

Here, the reasoning path is unclear.

The road is dark.

More analysis is needed.

One must dive deeper.

So, you wander aimlessly in the forest of decision, into the black.

Each step more unsure than the last.

You go left when you should go right, up when you should go down.

And you wander into the world of confusion.

A total nightmare.

To avoid this suboptimality, I believe one must (at all costs) avoid trying to showcase theoretical prowess, and instead, focus on just showing results.

The results are all that matter.

And once you have those results, the data must align with and augment executive budgets and ad-hoc tactical decisions.

There is no time for circle jerking.

Nobody cares how smart you are if you can’t make them any money.

Or worse, you just cost them money.

A negative net.

That said, today, I’m going to show you how to avoid useless analysis and discuss two critical analyses that can actually help your business make money (that nearly every business owner overlooks).

And not only do most businesses overlook them, but the ones that do try either do it improperly or fail spectacularly.

These two critical elements are Customer Lifetime Value and Customer Segmentation.

If you’re a business owner, these are two critical metrics that you need to study and understand, especially if you want to maintain a competitive advantage in today’s hyper-tech and data-focused business environment.

But first, let us define what these things mean.

Customer Lifetime Value 

Customer Lifetime Value (CLV) is defined as: The present value of the expected sum of discounted cash flows of an individual customer or cohort/segment of customers.

[Figure: customer lifetime value formula]

The key variables are as follows:

  • ARPU (average revenue per user)
  • Avg. Cust. Lifetime, n (This is the inverse of the churn, n=1/[annual churn])
  • WACC (weighted average cost of capital)
  • Costs (annual costs to support the user in a given period)
  • SAC (subscriber acquisition costs, sometimes referred to as CAC = customer acquisition costs)
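One common way to combine these variables, assuming constant annual cash flows, SAC paid upfront, and a lifetime rounded to whole years, is CLV = sum over t = 1..n of (ARPU - Costs) / (1 + WACC)^t, minus SAC. Here is a minimal Python sketch under those assumptions. The function name, parameter names, and example numbers are illustrative, not part of any official definition.

```python
def customer_lifetime_value(arpu, costs, wacc, annual_churn, sac):
    """Discounted sum of (arpu - costs) over n = 1/annual_churn years, minus sac.

    arpu and costs are annual amounts; wacc and annual_churn are fractions.
    """
    if not 0 < annual_churn <= 1:
        raise ValueError("annual_churn must be in (0, 1]")
    n_years = round(1 / annual_churn)  # average customer lifetime, n (rounded: an assumption)
    margin = arpu - costs
    present_value = sum(margin / (1 + wacc) ** t for t in range(1, n_years + 1))
    return present_value - sac


# Illustrative numbers only (not taken from the article).
example = customer_lifetime_value(arpu=1200, costs=200, wacc=0.10, annual_churn=0.5, sac=500)
print(round(example, 2))
```

With 50% annual churn the lifetime is two years, so the example discounts two years of a 1,000 margin at 10% and then subtracts the 500 acquisition cost.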

In simpler terms, Customer Lifetime Value (CLV) is a multidimensional construct that quantifies the net present value of all future cash flows attributed to a customer throughout their entire lifecycle with a business. It serves as a key metric for understanding the long-term value a customer brings, as opposed to transactional value.

Understanding CLV allows a business to:

  • Allocate marketing budgets more efficiently
  • Tailor products and services to high-value customers
  • Make informed decisions on customer retention strategies

It’s a symbiotic relationship where various aspects of customer behavior (e.g. acquisition, retention, and monetization) are combined into a singular value that serves as an indicator of long-term customer profitability.

We usually calculate CLV using advanced probabilistic models (our latest build is actually 25 models in 1) that capture the stochastic elements of customer behavior. The resulting CLV then serves as a pivotal variable in strategic customer relationship management (CRM), resource allocation, and firm valuation.

If you calculate it correctly, the CLV can serve as a cornerstone metric within the interdisciplinary nexus of marketing, economics, and finance, providing a quantifiable measure of customer capital.

The key to LTV is predicting the future based on historical data.

This is where normal LTV models (including the generic ones typically found online in Google search) fail miserably:

[Figure: customer lifetime value, actual vs. regression]

The LTV model we typically use in The Lab is a Shifted Beta-Geometric Probability Model with two parameters alpha and beta.

This model is a combination of the geometric distribution and the beta distribution.

This distinction is key because the geometric distribution captures how long an individual customer sticks around given a constant per-period churn probability, while the beta distribution captures heterogeneity/difference in that churn probability across customers, segments, and cohorts.
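To make the two parameters concrete, here is a small sketch of the survival curve implied by a Shifted Beta-Geometric model. It uses the standard retention-rate recursion r_t = (beta + t - 1) / (alpha + beta + t - 1). The function name and the alpha/beta values are made up for illustration, and this is not the Lab's production code.

```python
def sbg_survival_curve(alpha, beta, periods):
    """Return [S(1), ..., S(periods)] under the shifted-beta-geometric model.

    Uses the retention-rate recursion r_t = (beta + t - 1) / (alpha + beta + t - 1),
    so S(t) = S(t - 1) * r_t with S(0) = 1.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    survival, alive = [], 1.0
    for t in range(1, periods + 1):
        retention = (beta + t - 1) / (alpha + beta + t - 1)
        alive *= retention
        survival.append(alive)
    return survival


# alpha and beta here are made-up values for illustration.
curve = sbg_survival_curve(alpha=1.0, beta=1.0, periods=4)
print([round(s, 3) for s in curve])  # [0.5, 0.333, 0.25, 0.2]
```

Notice that the implied retention rate climbs each period (0.5, 0.67, 0.75, 0.8). That is the heterogeneity at work: the customers most likely to churn leave first, so the survivors look stickier over time.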

Geometric distribution shown below:

[Figure: geometric distribution]

Beta distribution shown below:

[Figure: beta distribution]

Actual vs. sBG model-based estimates:

[Figure: actual vs. sBG model-based estimates]

Below is the implementation using Python:

[Figure: customer lifetime value Python implementation]

And also:

[Figure: customer lifetime value Python implementation]

We typically build and code these mathematical algorithms from the ground up, run the implementation, etc.
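A from-scratch fit does not need much machinery. As a rough sketch (not the Lab's actual implementation), the block below fits the two sBG parameters by maximum likelihood using a brute-force grid search. The cohort counts, the grid, and all function names are hypothetical. A real build would likely swap the grid for a proper numerical optimizer.

```python
import math


def sbg_churn_probabilities(alpha, beta, periods):
    """P(T = t) for t = 1..periods under the shifted-beta-geometric model."""
    probs = [alpha / (alpha + beta)]
    for t in range(2, periods + 1):
        probs.append(probs[-1] * (beta + t - 2) / (alpha + beta + t - 1))
    return probs


def sbg_log_likelihood(alpha, beta, lost_per_period, still_active):
    """Log-likelihood of cohort data: customers lost in each period plus survivors."""
    probs = sbg_churn_probabilities(alpha, beta, len(lost_per_period))
    survivors_prob = 1.0 - sum(probs)
    ll = sum(n * math.log(p) for n, p in zip(lost_per_period, probs))
    return ll + still_active * math.log(survivors_prob)


def fit_sbg_grid(lost_per_period, still_active, grid=None):
    """Crude maximum-likelihood fit by exhaustive search over a parameter grid."""
    grid = grid or [i / 10 for i in range(1, 51)]  # 0.1 .. 5.0 (an arbitrary range)
    return max(
        ((a, b) for a in grid for b in grid),
        key=lambda ab: sbg_log_likelihood(ab[0], ab[1], lost_per_period, still_active),
    )


# Hypothetical cohort of 1,200 customers observed for three periods.
alpha_hat, beta_hat = fit_sbg_grid(lost_per_period=[600, 200, 100], still_active=300)
print(alpha_hat, beta_hat)
```

The made-up counts were built from alpha = beta = 1, so the search should land on those values. It is a quick sanity check before pointing the fit at real cohort data.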

Coding the mathematical algorithms in Python allows for scaling to millions of customers across numerous customer segments, as shown below:

[Figure: customer lifetime value probability distribution curve]

Customer Segmentation

The next step would be segmenting customers into unique groups (aka customer segmentation).

This is a challenge for many businesses, and most fail to do it correctly.

But customer segmentation is key to producing an accurate and scalable LTV model.

So, what is customer segmentation exactly?

Customer Segmentation is the practice of dividing a customer base into groups of individuals that are similar in specific ways, such as age, spending habits, or behavioral traits.

This is a rather advanced “analytical framework” that incorporates statistical and machine learning algorithms to partition a heterogeneous customer base into homogeneous subsets based on a range of variables like behavioral patterns, demographic attributes, psychographic factors, and transactional histories.

Note for advanced users: you can also run cohort analysis that segments the customer base not just based on static attributes like age or income, but on a temporal or behavioral dimension, such as the time of first purchase or engagement with a service. You can analyze cohorts as a single unit over time to assess patterns, correlations, or causal relationships within the context of longitudinal studies. By tracking these cohorts over long timescales, you can begin to see interesting customer lifecycle patterns such as retention rates, changes in behavior, and lifetime value. 
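For a feel of what cohort analysis looks like in practice, here is a minimal sketch that groups customers by the period of their first purchase and tracks what share of each cohort is still active in later periods. The record layout, function name, and sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def cohort_retention(activity):
    """Return {cohort: [share of cohort active at offset 0, 1, 2, ...]}.

    `activity` is an iterable of (customer_id, first_period, active_period) tuples,
    with periods given as integers (e.g. months since launch).
    """
    cohort_members = defaultdict(set)
    active_by_offset = defaultdict(lambda: defaultdict(set))
    for customer, first_period, active_period in activity:
        cohort_members[first_period].add(customer)
        active_by_offset[first_period][active_period - first_period].add(customer)

    table = {}
    for cohort, members in sorted(cohort_members.items()):
        max_offset = max(active_by_offset[cohort])
        table[cohort] = [
            len(active_by_offset[cohort][offset]) / len(members)
            for offset in range(max_offset + 1)
        ]
    return table


# Hypothetical activity log: (customer, period of first purchase, period seen active).
records = [
    ("a", 0, 0), ("a", 0, 1), ("a", 0, 2),
    ("b", 0, 0), ("b", 0, 1),
    ("c", 1, 1), ("c", 1, 2),
    ("d", 1, 1),
]
retention_table = cohort_retention(records)
print(retention_table)  # {0: [1.0, 1.0, 0.5], 1: [1.0, 0.5]}
```

Each row reads left to right as "periods since first purchase", which makes cohorts that started at different times directly comparable.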

It’s worth noting that segmentation is not just a generic classification exercise.

It’s a strategic methodology.

There is a science to it.

And if you do it right, you’ll end up with optimized marketing campaigns, optimal resource allocation, and efficient product development.

But it’s often easier said than done, as segmentation analysis typically requires advanced knowledge of multidimensional scaling, cluster analysis, and predictive modeling to identify any sneaky patterns and correlations hiding deep in the data that are not readily apparent through traditional descriptive analytics.

There are three general types of segmentation: 

  1. Demographic Segmentation: Age, Gender, Income
  2. Behavioral Segmentation: Purchasing Habits, User Activity
  3. Psychographic Segmentation: Lifestyle, Values

And we typically use three different techniques to run segmentation analysis:

1. K-means Clustering: an unsupervised machine learning algorithm that partitions a dataset into K distinct, non-overlapping subsets (or clusters) based on feature similarity. The algorithm employs an iterative optimization technique to minimize the within-cluster sum of squares, effectively assigning each data point to the cluster whose centroid is nearest in the feature space. While computationally efficient, K-means assumes spherical clusters and is sensitive to the initial placement of centroids. It is widely used in various domains, including but not limited to, market segmentation, anomaly detection, and image segmentation, and serves as a foundational algorithm in vector quantization and pattern recognition.

2. Density-based Spatial Clustering (DBSCAN): an unsupervised clustering algorithm that identifies clusters as high-density regions in feature space, separated by regions of lower point density. Unlike partitioning methods like K-means, DBSCAN does not require the pre-specification of the number of clusters and can discover clusters of arbitrary shape. The algorithm operates by defining a neighborhood around each data point, and attempts to grow clusters by stacking together closely packed data points based on a distance metric and density parameters. DBSCAN is particularly effective for applications where the distribution of data is non-uniform and the cluster structure is complex.

3. Expectation-Maximization (EM) Clustering: a statistical technique that employs the Expectation-Maximization algorithm to find maximum likelihood estimates of parameters in probabilistic models, particularly Gaussian Mixture Models, where the data is assumed to be generated by a mixture of several Gaussian distributions. The algorithm iteratively performs two steps: the Expectation step (E-step), which computes the expected value of the log-likelihood function with respect to the conditional distribution of latent variables given observed data and current parameter estimates, and the Maximization step (M-step), which updates the parameter estimates to maximize the expected log-likelihood found in the E-step. EM Clustering is widely used in scenarios where the data is incomplete or has missing values, and serves as a robust method for soft clustering, where each data point can belong to multiple clusters with different probabilities.
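To show the first technique in action, here is a bare-bones K-means (Lloyd's algorithm) in plain Python. It is a teaching sketch, not a production implementation. The customer features (annual spend, orders per year), the seed, and the function name are hypothetical, and in practice you would standardize features and likely reach for an established library.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: returns (centroids, labels) for a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids picked from the data
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Hypothetical customers: (annual spend in dollars, orders per year).
customers = [(100, 2), (120, 3), (110, 2), (900, 20), (950, 22), (920, 21)]
centroids, labels = kmeans(customers, k=2)
print(labels)
print(centroids)
```

On this toy data the low spenders and the high spenders end up in separate clusters. On messier data the result depends on the initial centroids, which is the sensitivity mentioned above.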

And if you want to get fancy you can use decision trees to classify customers into different segments based on decision rules.

[Figure: game theory decision tree]

An artist’s depiction of a decision tree

Finding the optimal number of clusters is key.

The elbow method estimates the optimal number of clusters using the total within-cluster sum of squares (WCSS).

WCSS measures how tightly the points in each cluster are packed around their centroid (lower means more compact clusters). In this case, the K-means algorithm is evaluated for several values of K, and the WCSS is calculated for each value of K.

After this, we plot K versus the WCSS. After analyzing this graph, the number of clusters is selected so that adding another cluster no longer reduces the WCSS significantly.

[Figure: elbow method]

The area bounded by the rectangle is optimal (clusters 1-5).

Clusters 6-12 have very little variation.
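The elbow calculation itself can be sketched in a few lines: run K-means for a range of K values and record the within-cluster sum of squares for each. The version below is a simplified illustration on made-up 2D points. It uses a deterministic farthest-first initialization (an assumption chosen to keep the example reproducible, not necessarily what any given library does), and all names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def farthest_first(points, k):
    """Deterministic init: start at the first point, then keep adding the farthest one."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans_wcss(points, k, iterations=100):
    """Run a basic K-means and return the within-cluster sum of squares."""
    centroids = farthest_first(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:  # centroids stopped moving: converged
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Made-up data with three obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]

wcss_by_k = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 3))
```

On this toy data the WCSS drops sharply up to K = 3 and barely moves after that, which is the "elbow" you would look for on the plot.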

Next Steps

Business analytics can be tricky, but regardless of whether you’re selling autonomous rockets or have a lemonade stand, understanding your customer is paramount.

In today’s hyper-competitive business environment, understanding your customer is not just a marketing necessity but a strategic imperative.

Customer Lifetime Value and Customer Segmentation are two critical frameworks that can help your business optimize its marketing strategy by offering deep insights into customer behavior, maximizing profitability.

That said, it’s important to note the symbiotic relationship between Customer Lifetime Value (CLV) and Customer Segmentation. These are not isolated concepts but interconnected strategies that, when employed in tandem, can significantly elevate a business’s customer-centric approach, thereby driving long-term profitability and sustainable growth.

Here are a few ways to use CLV and Segmentation:

1. Financial optimization through Customer Lifetime Value (CLV): Here, you would analyze the net present value of profit a customer is expected to bring over their entire relationship with your company. Once you have this number, you’ll be able to more efficiently allocate marketing/operational spend, targeting the retention and acquisition of high-value customers and divesting from less profitable segments. This will allow you to forecast better and improve your decisions, leading to stronger customer relationships that are most likely to drive long-term growth and profitability. In simple terms, the benefits of CLV analysis include: enhanced resource allocation, more tailored product/service offerings, increased marketing ROI, and better overall company financial health.

2. Targeted Strategies Unlocked Through Customer Segmentation: The next step would be to run customer segmentation analysis, where you will be categorizing your customer base into distinct segments based on various attributes like demographics, buying behavior, purchase patterns, psychographics, and preferences. This segmentation will provide you with a granular view of your customer, giving you the ability to create more targeted marketing strategies and drill down deeper than ever before. From there, you can tailor campaigns to specific needs, optimizing price and offerings that focus on your most profitable segments. The benefits of doing this analysis are multifaceted: (1) it increases the efficacy of marketing efforts; (2) it delivers more relevant, personalized experiences; (3) it elevates customer satisfaction and loyalty; (4) it improves resource allocation; (5) it drives higher revenue and profit margins. By understanding the unique attributes and needs of each segment, companies can cultivate a more engaged customer base and a competitive edge in the marketplace.

3. Reap The Synergistic Benefits: By now you should be starting to see the powerful synergies that exist when CLV and Segmentation are aligned. Combining customer-level profitability (CLV) with the broader strategic insights you get from grouping customers with similar characteristics (segmentation) will not only allow you to identify your most valuable customer segments but also craft your marketing/sales/service strategy to perfectly allocate spend within these segments. The benefits of this dual analysis (when run correctly) include more precise targeting and personalization, more efficient resource allocation (to high-value segments), improved customer retention rates, and maximized overall profitability.

4. Plan Better and Mitigate Risk: It goes without saying but when you pinpoint which customer segments yield the highest CLV, you’ll be able to hedge risk by diversifying investment into multiple (stable) revenue streams, rather than running blind and relying on volatile or unsegmented mass markets. You’ll also be able to make better projections and anticipate market changes and/or customer trends much better. You’ll have a stable, more consistent revenue flow. You’ll be better protected against churn. You’ll have a more resilient and adaptable business model. Pivoting and adapting your strategy and/or resource allocation can happen much faster. Your business will be positioned to capitalize on growth opportunities and hedge against market uncertainties — and you will be able to double down on nurturing/retaining your most profitable segments.

Note: sometimes you may encounter a high-value customer segment that shows signs of high churn, but a solid understanding of CLV within different segments can be a solid strategy guide. Perhaps you need new product development, additional services/features, or even market expansion. If a high-CLV customer segment shows a preference for a specific product feature, it makes a ton of sense to allocate R&D resources to enhance that feature.

Case Study: Deimos-One Customer Lifetime Value and Segmentation Analysis in Stratospheric Observation Services

Mission Plan: To conduct a customer lifetime value (CLV) and segmentation analysis for Deimos-One, a company providing data collection and analysis via a stratospheric observation vehicle.

Mission Notes: Deimos-One designs and develops stratospheric observation vehicles for the purposes of data collection and analysis. With a clientele that spans military outfits to government agencies and mineral expeditions, understanding and maximizing the value derived from each customer segment is pivotal. This case study will delve into the intricate process of running a customer lifetime value (CLV) and segmentation analysis for a company operating within this niche.

To do this, we must first establish a framework for understanding the unique value each customer segment brings over the course of their relationship with the company.

Note: this case study is entirely speculative and hypothetical, meant solely for the academic discussion of illustrating the potential application of customer lifetime value and segmentation analysis in the near-space industry. 

Here’s a (watered down) Customer Lifetime Value (CLV) Framework:

  • ARPU (Average Revenue Per User): The average amount of revenue generated per user over a specific period, typically a month or year. The primary revenue stream is a monthly subscription fee paid by customers for data services. This fee may vary based on the level of service and data requirements.
  • Avg. Customer Lifetime, n: The average length of time a customer continues to subscribe to a service, calculated as the inverse of the annual churn rate (n = 1/[annual churn]). For the purposes of this sh*tty analysis, the base case below assumes a 10% annual churn rate, which puts the average customer lifespan at 10 years.
  • WACC (Weighted Average Cost of Capital): The average rate of return a company is expected to pay its security holders to finance its assets, weighted by the proportion of equity and debt financing.
  • Costs: The total annual expenses incurred to maintain and support a user, including service, support, and operational costs within the period.
  • SAC (Subscriber Acquisition Costs): The total costs associated with acquiring a new subscriber, including marketing and sales expenses, often synonymous with Customer Acquisition Costs (CAC).

This is a more financially nuanced approach than you’ll typically see if you conduct a general Google search for “Customer Lifetime Value Analysis”. The CLV will be calculated by considering the average revenue per user, the average customer lifetime, the weighted average cost of capital, the annual costs to support the user, and the subscriber acquisition costs.

Customer Lifetime Value (CLV) Calculation

The CLV for Deimos-One’s customers can be calculated using the following formula:

[Figure: customer lifetime value formula]

Where:

  • ARPU is the average revenue per user (per period)
  • Costs are the annual costs to support the user (in a given period)
  • WACC is the Weighted Average Cost of Capital, which represents the opportunity cost of the funds used for financing the customer relationship
  • Avg. Cust. Lifetime, n is the average customer lifetime, calculated as the inverse of the churn (n = 1/[annual churn])
  • SAC is the subscriber acquisition costs (sometimes referred to as CAC = customer acquisition costs)

Let us import the vars into our sh*tty bathroom formula, assuming the following for Deimos-One’s customer base:

  • ARPU = $13,000 per month
  • Annual churn rate = 10% (hence, n = 10 years)
  • WACC = 8%
  • Costs = $2,000 per month
  • SAC = $30,000 per customer

Applying the Formula

Given the assumptions, we will now apply our formula from above, plug in our numbers and calculate the CLV.

Our formula will yield the net present value of the profit that Deimos-One expects to earn from an average customer over the average customer lifetime, after accounting for the costs to support the user and the costs to acquire the customer.


The resulting CLV will inform Deimos-One of the value an average customer brings to the company. A positive CLV indicates a profitable customer relationship, while a negative CLV would suggest that the company is losing money on its customer acquisition and support efforts.

To illustrate the CLV calculation, we can create a chart that shows the projected net cash flows from a customer over the 10-year period.

The chart would display the initial negative cash flow due to SAC, followed by positive cash flows from ARPU minus Costs, discounted by the WACC each year.
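The numbers behind such a chart can be sketched as follows. This reads the assumptions above as ARPU = $13,000/month, Costs = $2,000/month, WACC = 8%, annual churn = 10% (so n = 10 years), and SAC = $30,000. It also assumes monthly figures are simply annualized (times 12) and discounted once per year, which is a simplification. The function name is hypothetical.

```python
def discounted_cash_flows(monthly_arpu, monthly_costs, wacc, annual_churn, sac):
    """Return [(year, cash_flow, discounted_cash_flow), ...] starting at year 0."""
    years = round(1 / annual_churn)  # average lifetime n, rounded to whole years
    annual_margin = 12 * (monthly_arpu - monthly_costs)
    schedule = [(0, -sac, -sac)]  # acquisition cost paid upfront, undiscounted
    for year in range(1, years + 1):
        schedule.append((year, annual_margin, annual_margin / (1 + wacc) ** year))
    return schedule


schedule = discounted_cash_flows(
    monthly_arpu=13_000, monthly_costs=2_000, wacc=0.08, annual_churn=0.10, sac=30_000
)
for year, cash_flow, discounted in schedule:
    print(f"year {year:2d}  cash flow {cash_flow:>10,.0f}  discounted {discounted:>12,.2f}")

clv = sum(discounted for _, _, discounted in schedule)
print(f"CLV: {clv:,.2f}")
```

Under these assumptions the CLV comes out to roughly $856,000 per customer. Different timing conventions (monthly discounting, mid-year cash flows) would shift that figure somewhat.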

Next, we will run a sh*tty bathroom segmentation analysis using the following criteria:

  1. Industry Verticals: Research institutions, U.S. military (and its allies), government agencies (e.g. NASA), fire and emergency management operations, and mining operations.
  2. Data Usage: Frequency of data access, volume of data used, real-time access vs. periodic reports.
  3. Service Tier: Basic data access, premium analytical services, or customized solutions.
  4. Geographical Location: Domestic vs. international operations, which could affect data collection logistics and legal considerations.

Our Segmentation nodes include: 

  1. Research Institutions: May require less frequent but highly detailed data for long-term studies.
  2. Military and Government: May require real-time data with high frequency for situational awareness and operational readiness.
  3. NASA, Air Force, DHS: May require continuous, real-time data for surveillance and monitoring.
  4. Fire and Emergency Management: May have seasonal usage patterns with high demand during certain periods.
  5. Gold and Mineral Miners: May have fluctuating needs based on exploration schedules and project lifecycles.

Next, we can expand the study parameters using scenario analysis:

  1. Research Contract: Pays $8,000/month, with a 5-year lifespan and 95% retention rate.
  2. Military Contract: Pays $15,000/month, with a 10-year lifespan and 98% retention rate.
  3. Govt Agency Contract: Pays $25,000/month, with a 10-year lifespan and 96% retention rate.
  4. Fire/Emergency Contract: Pays $7,000/month, with a 3-year lifespan and 87% retention rate.
  5. Gold/Mineral Contract: Pays $10,000/month, with a 1-year lifespan and 82% retention rate.

Using our formula from above, we will then calculate the CLV for each customer segment, adjusting the variables based on the scenario.
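Here is one possible way to run the per-segment numbers. The scenarios give a fee, a lifespan, and a retention rate, but not per-segment costs or acquisition spend, so this sketch borrows the base-case values ($2,000/month costs, $30,000 SAC, 8% WACC) as assumptions. It also treats the lifespan as the contract horizon and the retention rate as the yearly probability that the contract is still alive, which is only one reading of the scenario data.

```python
SEGMENTS = {
    # name: (monthly fee, contract length in years, annual retention rate)
    "Research": (8_000, 5, 0.95),
    "Military": (15_000, 10, 0.98),
    "Govt Agency": (25_000, 10, 0.96),
    "Fire/Emergency": (7_000, 3, 0.87),
    "Gold/Mineral": (10_000, 1, 0.82),
}

# Assumptions carried over from the base case; the scenarios do not specify them per segment.
MONTHLY_COSTS = 2_000
SAC = 30_000
WACC = 0.08


def segment_clv(monthly_fee, years, retention):
    """Retention-weighted, discounted margin over the contract length, minus SAC."""
    annual_margin = 12 * (monthly_fee - MONTHLY_COSTS)
    value = 0.0
    for year in range(1, years + 1):
        survival = retention ** (year - 1)  # assumed: every customer is active in year 1
        value += annual_margin * survival / (1 + WACC) ** year
    return value - SAC


clv_by_segment = {name: segment_clv(*params) for name, params in SEGMENTS.items()}
for name, value in sorted(clv_by_segment.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<15} {value:>14,.2f}")
```

Sorting by CLV gives a first-pass ranking of segments. It is only as good as the cost and acquisition assumptions, which would ideally be estimated per segment.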

Post-Analysis recommendations may include: 

  1. Customized Service Packages: Developing tiered service offerings to match the varying needs of each segment.
  2. Flexible Pricing Models: Considering volume discounts or seasonal pricing for segments with fluctuating demand.
  3. Customer Retention Programs: Implementing loyalty programs or long-term contracts for segments with higher CLVs.

At the end of the day, understanding the CLV will help Deimos-One make informed decisions about how much to spend on acquiring customers and how to optimize service costs.

It will also provide insights into which customer segments are the most valuable and therefore where to focus marketing and customer retention efforts.

This way, the company will be able to tailor its marketing, sales, and product development strategies to prioritize the most valuable segments and develop targeted offerings to meet their specific needs.

Note: this case study is entirely speculative and hypothetical, meant solely for the academic discussion of illustrating the potential application of customer lifetime value and segmentation analysis in the near-space industry. 

Final Thoughts

Today we discussed a comprehensive framework for calculating Customer Lifetime Value and conducting Segmentation Analysis.

If you made it this far, you should have a solid understanding of:

  • The methodology for calculating CLV in any business environment.
  • The segmentation techniques needed for a diverse customer base.
  • The post-analysis strategies for maximizing the value from each customer segment.

At the end of the day, Customer Lifetime Value (CLV) and Customer Segmentation are two sides of the same coin.

CLV offers a lens to view the financial value customers bring over time, while segmentation provides a nuanced understanding of the diverse customer base.

Together, they form a symbiotic relationship that enriches customer understanding, enabling businesses to craft strategies that are not just customer-focused but customer-optimized.

In a marketplace where customer expectations are continually evolving, the combined power of CLV and Segmentation is not just beneficial but essential for any business aiming for long-term success.

By leveraging the strengths of both CLV and Segmentation in dual-analysis, you’ll be able to create a nuanced, actionable strategy that speaks directly to the needs and potentials of your most important customer groups.

For example, segmenting customers based on CLV can reveal things you did not include in your initial assumptions, such as high-value and low-value customer groups.

This information can be invaluable for resource allocation.

High-value segments may require more personalized services or loyalty programs, while low-value segments might be targeted with cool promo offers to increase their lifetime value.

Of course, one simple blog post cannot cover every nuance and detail of a high-level dual-analysis like Customer Lifetime Value and Segmentation, but you should be able to take this skeleton analysis and use it to build your own models. And if you do it properly, you should begin to see how customer value can vary significantly across different segments.

By understanding these differences, you’ll have the ability to (1) hedge against risk; and (2) strategically allocate resources to maximize profitability and growth.

Good luck.

If you need more help figuring out how to use LTV and Segmentation in your business, our Lab can get you all set up to strategize, develop and design a custom solution for your business use case. Contact us and get a head start on your project.

