Customer Lifecycle Architecture

If you've gone through any "intro to cloud development" presentations, you've seen the argument for cloud architecture built around infrastructure scaling up and down with demand. Maybe you run a coffee shop bracing for the onslaught of pumpkin-spice-everything, or maybe you're just in a 9-5 office where activity dries up after hours. In any event, this scale-up / scale-down traffic shaping is well-known to most technologists at this point.

But, there's another type of activity shaping I'd like to explore. In this case, we're not looking at the aggregate effect of thousands of PSL-crazed customers -- we're going to look at the lifecycle of a single customer.

Consider these two types of customers -- very different from one another. The first profile is what you'd expect to see for a customer with low acquisition cost and low setup activity -- an application like Meta might have a profile like this, especially when you expect users to keep consuming at a steady rate.

Compare this to a customer with high acquisition cost and high setup activity -- this could be a customer with a long sales lead time or one with a lot of setup work, configuration, training, or the like. Typically, these are customers with a higher per-unit revenue model to support this work. Examples of a profile like this could include sales of high-dollar durable goods, financial instruments like loans or insurance policies, and so on.

So, What?

Why, exactly, would we care about a profile like this? I believe this sort of model facilitates some important conversations about how a company allocates software spend relative to customer activity, costs, and revenue. I also think this profile can help clarify the architectural needs of the services supporting this lifecycle.

Note that this view of customer activity has some resemblance to a discipline called Customer Lifecycle Management (CLM). Often associated with Customer Relationship Management (CRM), CLM is typically used to understand activities associated with acquiring and maintaining a customer using five distinct steps: reach, acquisition, conversion, retention, and loyalty.

Rather than just working to understand what's happening with your customer across the time axis, consider some alternative views. We can map costs and revenue per customer in a view like this, for example. Until recently, the cost to acquire a customer could be known only on an aggregate basis - computed from total costs across all customers and allocated as an estimate. But as FinOps practices improve in fidelity and adoption, information like this could become a reality:

FinOps promises to give us the tools to track the investment in software you've purchased, SaaS tools, and, of course, custom tools and workflows you've developed. Unless these tools are extremely mature in your organization, you'll likely need to allocate costs to compute these values. Also note that these costs are typically separated on an income statement - those occurring pre-sale tend to show up as Cost of Goods Sold (COGS), while those after the sale are operating expenses (OpEx). If you intend to allocate COGS per customer, those costs will need to be spread across the customers who actually close, since prospects who never close generate no revenue at all.
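As a rough illustration of that allocation - a sketch with entirely hypothetical numbers and names, not a prescription - pre-sale spend gets spread across the customers who closed, while post-sale costs can be attributed per customer:

# Hypothetical sketch: spread pre-sale (acquisition) spend across only the
# customers who actually closed, then add each customer's post-sale costs.
acquisition_spend = 60_000.0   # total pre-sale cost for the period (made-up number)
closed_customers = ["cust-001", "cust-002", "cust-003"]

# Prospects who never close generate no revenue, so their share of the
# pre-sale spend has to be absorbed by the customers who did close.
allocated_cogs = acquisition_spend / len(closed_customers)

# Post-sale, per-customer operating costs (e.g., metered cloud spend surfaced by FinOps tooling).
opex = {"cust-001": 1_200.0, "cust-002": 450.0, "cust-003": 2_300.0}

for customer in closed_customers:
    total = allocated_cogs + opex[customer]
    print(f"{customer}: acquisition={allocated_cogs:,.2f}  operating={opex[customer]:,.2f}  total={total:,.2f}")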

Less commonly considered is the architectural connection to a view like this. You might see a pattern like this if you're writing an insurance policy and then billing for it periodically:

Not only do these periods reflect different software needs, they also reflect different opportunities to learn about your customer. Writing a new insurance policy is critically important to an insurance company -- careful consideration goes into making sure the risk of the policy is worth the premium to be collected. For that reason, an insurance company will invest a lot of time and energy in this process, and much will be learned about the customer here:

On the other hand, the periodic billing cycle should be relatively uneventful for customer and carrier alike, and less useful information is found here -- at least when the process goes smoothly.

I believe the high-activity / high-learning area also suggests a high-touch approach to architecture. Specifically, I think this area is likely where highly-tailored software is appropriate for most enterprises, and it also likely yields opportunities for data capture and process improvement. Pay attention to the data gathered here, and avoid discarding it wherever possible. Considering these activity profiles temporally may also lead us to identify separation among services, so in this case, the origination / underwriting service(s) are very likely different from the services we'd use after that customer is onboarded:

What's next?

I'm early in exploring this idea, but I expect to use it as one signal to influence architectural decisions. An activity-over-time view of customers will likely help focus conversations about technologies, custom vs. purchased software, types of services and interfaces, and so on.

At a minimum, I think this idea is an important part of modeling services to match activity levels, as not all customers hit the same peaks at the same time.
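As a rough sketch of what I mean (hypothetical numbers, nothing more), imagine every customer follows the same activity profile but onboards at a different time; the load your services must absorb is the sum of those shifted profiles:

# Hypothetical sketch: one customer's activity profile (heavy onboarding, then
# light steady state), repeated for customers who onboard at different times.
profile = [10, 8, 5, 1, 1, 1, 1, 1]    # activity units per period for a single customer
onboarding_starts = [0, 2, 3, 3, 6]    # period in which each of five customers onboards

horizon = max(onboarding_starts) + len(profile)
aggregate = [0] * horizon
for start in onboarding_starts:
    for offset, activity in enumerate(profile):
        aggregate[start + offset] += activity

print(aggregate)   # the per-period load the supporting services actually see

Each customer's curve peaks hard during onboarding, but the aggregate only spikes when several customers onboard in the same period -- which is exactly the kind of signal I'd want feeding an architectural conversation.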

When is an event not an event?

Events, event buses, message buses, and the like are ubiquitous in any modern microservice architecture. Used correctly, events support scalability and allow services to evolve independently. Used poorly, though, events create a slow, messy monolith with none of those benefits. Getting this part of your architecture right is crucial.

The big, business-facing events are easiest to visualize and understand - an "order placed" event needs no introduction. This is the sort of event that keeps a microservice architecture flexible and decoupled. The order system, when it raises this event, knows nothing about any consumers that might be listening, nor what they might do with the event.

sequenceDiagram
    participant o as Order Origination
    participant b as Event Bus
    participant u as Unknown Consumer
    o ->> b : Order Placed
    b ->> u : Watching for Orders

Also note that this abstract event, having no details, doesn't have a ton of value, nor does it have much potential to create conflicts if it changes. This is the "hello, world" of events, and like "hello, world", it won't be too valuable until it grows up, but it represents an event in the truest sense -- it documents an action that has completed.

A more complete event will carry more information, but as you design events, try to keep this simple model in mind so you don't wind up with unintended system-to-system coupling when events become linked. This coupling sneaks into designs and becomes technical debt -- in many cases before you realize you've accrued it!
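Here's a minimal sketch of that simple model in code -- the bus, topic name, and payload fields are hypothetical stand-ins, not a real transport's API:

import json
import uuid
from datetime import datetime, timezone

class InMemoryBus:
    # Stand-in for a real transport (Kafka, SNS, RabbitMQ, ...); purely illustrative.
    def __init__(self):
        self.subscribers = {}

    def subscribe(self, topic, handler):
        self.subscribers.setdefault(topic, []).append(handler)

    def send(self, topic, message):
        # The producer never sees this list; consumers attach themselves independently.
        for handler in self.subscribers.get(topic, []):
            handler(message)

def publish(bus, topic, payload):
    # A small, self-describing fact: what happened and when -- nothing about recipients.
    envelope = {
        "event_id": str(uuid.uuid4()),
        "type": topic,
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "data": payload,  # keep this lean; every field is a potential coupling point
    }
    bus.send(topic, json.dumps(envelope))

bus = InMemoryBus()
bus.subscribe("order.placed", lambda message: print("unknown consumer saw:", message))
publish(bus, "order.placed", {"order_id": "o-123", "total": 42.50})

The important part is the shape of the conversation: the producer publishes a small, self-describing fact and never learns who reacted to it.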

The coupling of services or components is never as black & white as the "hello, world" order placed event - it's a continuum representing how much each service knows about the other(s) involved in a conversation. The characterizations below are a rough tool to help understand this continuum.

When I think about characterizing events in terms of coupling, the rule of thumb is simple: the less coupling you see from domain to domain, the better (in general).

Characteristics of low coupling:

  • Events produced with no knowledge of how or where they will be consumed. Note that literally not knowing where an event is used can become a big headache if you need to update or upgrade the event; typically, you'd like that tracking responsibility to fall to the event transport rather than the producing domain, though.
  • A single event may be consumed by more than one external domain. In fact, the converse is a sign of high coupling.

Characteristics of high coupling:

  • Events consumed by only one external domain. I consider this a message bus-style event - you're still getting some of the benefits of separating domains and asynchronous processing, but the domains are no longer completely separate -- at least one knows about the other.
  • Events that listen for responses. I consider these to be rpc-style events, and in this case, both ends of the conversation know about the other and depend to some extent on the other system. There are places where this pattern must be used, but take care to apply it sparingly!

If you see highly-coupled events - especially rpc-style events - consider why that coupling exists and whether it's appropriate. Typically, anything you can do to ease that coupling will allow domains to move more freely with respect to one another - one of the major reasons you're building microservices in the first place.
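To make that continuum concrete, here's a hypothetical sketch of its two ends; the bus methods below are stand-ins for whatever transport you actually use:

def on_policy_written(bus, policy_id):
    # Low coupling, fire-and-forget: state a fact and move on. Any number of
    # domains may react; the producer never hears back from them.
    bus.send("policy.written", {"policy_id": policy_id})

def check_credit(bus, applicant_id):
    # High coupling, rpc-style: this code names another domain, waits for its
    # answer, and now both sides depend on each other's availability and contract.
    # 'bus.request' is a hypothetical request/reply helper, not a real library call.
    return bus.request("underwriting.credit_check", {"applicant_id": applicant_id}, timeout=5)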


A little more on events: State vs Events

“Hello world” is always beautiful

Every programming language you've ever learned began with "Hello world".

"Hello World" is typically a couple lines long, and it's always simple and easy to understand. You probably moved on to something like a to-do list after that -- also enticingly simple and easy.

Now, stop for a moment and consider the last bit of production code you touched. Not so beautiful, right?

There are a couple of takeaways from this juxtaposition. First, any language / framework looks great in that introductory scope. Not only is a simple use case a godsend for showing how elegant a tool is, but a tool's authors will also pick a domain / scenario that suits their tool when they create that demo app.

The second takeaway is a little more reflective. Remember the joy of learning that new language, the freedom you felt looking at the untarnished simple code, and the optimism you felt when you started to extend it. Now, flip back to that production code again. What can you do to get some of that simplicity and usability back to your production code? Certainly, it'll never look like "hello, world", but I think it's worth a moment's reflection every now and again to see if you can get just a little closer.

Ten years of Nvidia

I haven't been writing actively for a couple of years, and now that I have a moment to catch my breath, I'm working on remedying that absence. I just took a quick scan through drafts I'd started, and I tripped over one from 2013 (yes, ten years ago). The draft was really an email I'd sent, pasted into the editor to be picked up later. At the time, it may have been only somewhat interesting, but in hindsight, I think it's more impactful. Here's the email - the original context was me trying to convince my ex-wife that a game development class my son was interested in wasn't a waste of time:

I sort of hinted at some reasons I thought the "intro to gaming" class [my son] was looking at next semester might be more useful than it sounds, but I probably wasn't very clear about it.

Here's an announcement from this morning from a graphics card mfg:

http://www.anandtech.com/show/7521/nvidia-launches-tesla-k40

http://www.engadget.com/2013/11/18/nvidia-unveils-tesla-k40-and-ibm-deal

http://www.anandtech.com/show/7522/ibm-and-nvidia-announce-data-analytics-supercomputer-partnership

If you actually look at the picture of the card, you'll notice there's nowhere to plug in a monitor, because this graphics card isn't really a graphics card in the same sense that we think of them.

As these cards have become crazy-specialized & powerful in the context of their original purpose, their insane number-crunching capabilities haven't been lost on people beyond game developers. In the same way that you see these cards supporting physics engines in games, they're now becoming more and more commonly used in scientific and engineering applications, because this same type of computing can be used for simulations and other scientific uses.

As I mentioned to [my son] when he was working in Matlab, the type of programming needed for this sort of processing is very different from the traditionally-linear do-one-thing-then-do-the-next-thing style of programming used for "normal" CPUs -- by its nature, it has to be asynchronous and massively parallel in order to take advantage of the type of computing resources offered by graphics-type processors.

Cutting to the chase, even though "game programming" probably sounds like a recreational activity, I think there's a decent chance that some of the skills touched on in the class might translate reasonably well into engineering applications -- even if he doesn't necessarily see that during the class.

Damn. Right on the money. Ten years on, Nvidia rules AI. Why? It's the chips and the programming model. All the reasons I cited to give game programming a chance have come to pass, and for what it's worth, the kid wound up using Nvidia GPUs to build neural networks in grad school.

Game programming, indeed.

To SaaS or not to SaaS

I just saw an announcement from 37signals about a “new” initiative to sell old-fashioned purchased, installed-on-premise software. Is this a legitimate reversal of the seemingly inevitable march to SaaS, or merely another option for customers?

My first reaction: this is a breath of fresh air. Personally, I’ve been miffed to see software I’d purchased when I wanted and upgraded when I wanted move to a subscription model. It messed with my sense of self-determination. But when I took five minutes to work out the math, the subscription model worked out to just about what I’d been spending on purchases and upgrades. As an individual consumer, that stigma was all in my head.

But 37signals sells enterprise software, and for enterprise software, there are some pretty interesting implications if the pendulum starts to swing back. I’m interested to see how some of these turn out.

One of the largest implications of the SaaS model for companies is the way these purchases impact accounting. While purchased software is typically treated as a capital investment that’s amortized, with expenses incurred over the life of the software, SaaS spending typically shows up directly as an expense. I’m curious whether this has factored into the shift back to purchase-once software.
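As a back-of-the-envelope illustration (entirely made-up numbers), the difference looks something like this:

# Hypothetical numbers: a $120,000 perpetual license amortized over five years
# vs. a $30,000/year subscription. Cash outlay may be similar, but the income
# statement sees them very differently.
license_cost = 120_000.0
useful_life_years = 5
annual_amortization = license_cost / useful_life_years   # 24,000 hits expenses each year;
                                                          # the rest sits on the balance sheet

saas_subscription = 30_000.0                              # the full amount is OpEx every year

print(f"On-prem: capitalize {license_cost:,.0f}, expense {annual_amortization:,.0f}/yr")
print(f"SaaS:    expense {saas_subscription:,.0f}/yr, nothing capitalized")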

Another defining characteristic of SaaS is its continual delivery model. Agile methodologies in general favor frequent deployment, and a true CI/CD model can see this happen very frequently — much more frequently than any enterprise would want to install on-prem software. So, an on-prem delivery model implies going back to distributing releases and hot-fixes that will be installed by customers. Anyone who’s ever supported that model will tell you it’s no piece of cake.

Finally, the announcement from 37signals triggered a mental model that I believe exists in a lot of business decision-makers. The model is rooted in that same CapEx accounting model, and suggests that software can be bought, installed, and left to run as you would treat a filing cabinet. While this model may be appropriate for some applications that change very infrequently, I think this idea can prove harmful when applied to software that needs to grow and change with a business. There’s a nuanced view of the care and feeding and evolution of an application that can be lost in the filing cabinet model, and I’m frankly nervous to see that model perpetuated.

I’ll be watching this development from 37signals as it’s rolled out. They’ve always been thought leaders in the industry, and I’m curious to see whether this is the beginning of a pendulum swing back toward a self-hosted model.