Ten years of Nvidia

I haven't been writing actively for a couple of years, and now that I have a chance to catch my breath, I'm working on remedying that absence. I just took a quick scan through drafts I'd started, and I tripped over one from 2013 (yes, ten years ago). The draft was really an email I'd sent, pasted into the editor to be picked up later. At the time, it was only somewhat interesting; in hindsight, I think it carries a lot more weight. Here's the email - the original context was me trying to convince my ex-wife that a game development class my son was interested in wasn't a waste of time:

I sort of hinted at some reasons I thought the "intro to gaming" class [my son] was looking at next semester might be more useful than it sounds, but I probably wasn't very clear about it.

Here's an announcement from this morning from a graphics card mfg:

http://www.anandtech.com/show/7521/nvidia-launches-tesla-k40

http://www.engadget.com/2013/11/18/nvidia-unveils-tesla-k40-and-ibm-deal

http://www.anandtech.com/show/7522/ibm-and-nvidia-announce-data-analytics-supercomputer-partnership

If you actually look at the picture of the card, you'll notice there's nowhere to plug in a monitor, because this graphics card isn't really a graphics card in the same sense that we think of them.

As these cards have become crazy-specialized & powerful in the context of their original purpose, their insane number-crunching capabilities haven't been lost on people beyond game developers.  In the same way that you see these cards supporting physics engines in games, they're now becoming more and more commonly used in scientific and engineering applications, because this same type of computing can be used for simulations and other scientific work.

As I mentioned to [my son] when he was working in Matlab, the type of programming needed for this sort of processing is very different from the traditionally-linear do-one-thing-then-do-the-next-thing style of programming used for "normal" CPUs -- by its nature, it has to be asynchronous and massively parallel in order to take advantage of the type of computing resources offered by graphics-type processors.

Cutting to the chase, even though "game programming" probably sounds like a recreational activity, I think there's a decent chance that some of the skills touched on in the class might translate reasonably well into engineering applications -- even if he doesn't necessarily see that during the class.

Damn. Right on the money. Ten years on, Nvidia rules AI. Why? It's the chips and the programming model. All the reasons I cited to give game programming a chance have come to pass, and for what it's worth, the kid wound up using Nvidia GPUs to build neural networks in grad school.

Game programming, indeed.
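
If that "asynchronous and massively parallel" line from the email sounds abstract, here's a toy sketch of the difference in plain TypeScript -- real GPU code would be written in something like CUDA, but the mental model carries over:

```typescript
// CPU-style thinking: do one thing, then do the next thing, in order.
function addSequential(a: number[], b: number[]): number[] {
  const out: number[] = [];
  for (let i = 0; i < a.length; i++) {
    out.push(a[i] + b[i]);
  }
  return out;
}

// GPU-style thinking: describe the work for ONE element, independent of all
// the others. A GPU applies that little function to thousands of elements
// at once; map() stands in for that massively-parallel dispatch here.
const addParallel = (a: number[], b: number[]): number[] =>
  a.map((x, i) => x + b[i]);
```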

To SaaS or not to SaaS

I just saw an announcement from 37signals about a “new” initiative to sell old-fashioned purchased, installed-on-premises software. Is this a legitimate reversal of the seemingly inevitable march to SaaS, or merely another option for customers?

My first reaction: this is a breath of fresh air. Personally, I’ve been miffed watching software I’d purchased when I wanted and upgraded when I wanted move to a subscription model. It messed with my sense of self-determination. But when I took five minutes to work out the math, the subscription model worked out to just about what I’d been spending on purchases and upgrades. As an individual consumer, that stigma was all in my head.
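
To put hypothetical numbers on that five-minute math: a $300 license plus a $150 paid upgrade every couple of years works out to roughly $8-9 a month over a six-year stretch -- right in the neighborhood of what a typical subscription would cost over the same period.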

But 37signals sells enterprise software, and for enterprise software, there are some pretty interesting implications if the pendulum starts to swing back. I’m interested to see how some of these turn out.

One of the largest implications of the SaaS model for companies is the way these purchases impact accounting. While purchased software is typically treated as a capital investment, amortized with expenses incurred over the life of the software, SaaS subscriptions typically show up directly as operating expenses. I’m curious whether this has factored into the shift back to purchase-once software.

Another defining characteristic of SaaS is its continuous delivery model. Agile methodologies in general favor frequent deployment, and a true CI/CD pipeline can release very frequently -- far more frequently than any enterprise would want to install on-prem updates. So an on-prem delivery model implies going back to distributing releases and hot-fixes that customers install themselves. Anyone who’s ever supported that model will tell you it’s no piece of cake.

Finally, the announcement from 37signals triggered a mental model that I believe exists in a lot of business decision-makers. The model is rooted in that same CapEx accounting model, and suggests that software can be bought, installed, and left to run as you would treat a filing cabinet. While this model may be appropriate for some applications that change very infrequently, I think this idea can prove harmful when applied to software that needs to grow and change with a business. There’s a nuanced view of the care and feeding and evolution of an application that can be lost in the filing cabinet model, and I’m frankly nervous to see that model perpetuated.

I’ll be watching this development from 37signals as it’s rolled out. They’ve always been thought leaders in the industry, and I’m curious to see whether this is the beginning of a pendulum swing back toward a self-hosted model.

An Unhelpful Exception

Last year, my son introduced me to a podcast called “Black Box Down”.  Each episode is a discussion of an air disaster of some sort, and in the vast majority of cases, the eventual BOOM is preceded by a long sequence of small problems, handled poorly or over-corrected, until recovery is impossible.

Exceptions are like that.

I’ve been working on getting DataStax / Cassandra installed on a machine running on Hyper-V, and although it’s been a little while since I experienced the joy of Java stack traces, it’s coming back to me in a hurry.  I just worked through one that turned out to be rooted in my Java version.  It turns out that when the installation requirements indicated Java 8 or better, they left out, “but not too much better”.  After a bit of googling, I found a link suggesting I might need to roll back my Java version, and that did, in fact, make things better.  The diagnostic process left a bit to be desired, though - starting with the epic stack trace.

What might have been better than this?  How about an assertion on startup to flag the Java version as a problem?  Once upon a time, this software was using a contemporary version of Java, and it would never have occurred to any reasonable developer to put an assertion like this in place, but at this point, it would be really helpful.  Do I have any reasonable expectation that this sort of assertion will make it into the source code?  Nope.  I fully expect that anyone expert enough in this package to contribute to it is well aware of the Java version required, so there’s just no benefit to them in including an assertion like this.  The next problem: packages built on packages built on packages, which makes it pretty hard to pin down who should really be responsible for calling out the version dependency.
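
For illustration, here's roughly the kind of guard I have in mind, sketched in TypeScript for a Node.js app rather than Java (with a made-up supported range) -- the idea translates directly:

```typescript
// Fail fast at startup if the runtime version is outside the tested range.
// The bounds here are hypothetical -- the point is the explicit guard.
const [major] = process.versions.node.split(".").map(Number);

const MIN_SUPPORTED = 16;
const MAX_SUPPORTED = 20; // "...but not too much better"

if (major < MIN_SUPPORTED || major > MAX_SUPPORTED) {
  console.error(
    `Unsupported Node.js version ${process.versions.node}; ` +
      `this package is tested on ${MIN_SUPPORTED}.x through ${MAX_SUPPORTED}.x.`
  );
  process.exit(1); // a one-line error beats an epic stack trace
}
```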

You may be thinking at this point that an IaC or container solution might be helpful, and I agree.  In this particular case, one of the alternatives at my disposal was an Oracle VirtualBox image that was already set up.  I opted out of that because I’ve already set up Hyper-V and Docker Desktop, and I’m concerned about having too many virtualization platforms sparring with one another.  The “bare metal” setup wasn’t ideal right out of the gate, but the silver lining is that I’m learning a ton.

My biggest takeaway from this experience is going to be to “embrace the suck” – paying attention to the frustration and impact on productivity so I can watch for problems like this in software I’m producing.  As always, putting yourself in your users’ shoes will pay dividends, I believe.

Team Topologies and Conway’s Law

Most of the reading I've done lately has been combinatorial in nature; that is, each book I pick up seems to reference another book I feel compelled to add to my reading list. My latest, Team Topologies, has been no different. This book has a pretty dry title but a really compelling pitch: it aims to help companies finally sort their org charts.

The book is structured with a "choose your own adventure" feel: an introductory section followed by a series of deeper dives that apply more or less to different scenarios. In this way, a reader can skip straight to a section that's likely to apply to a problem they're experiencing. In working through the first couple of chapters, though, I was already experiencing "aha" moments.

A central source of inspiration for the ideas in Team Topologies is Conway's Law, which features prominently in the foundational chapters. Conway's Law suggests that organizations build systems that mimic their communication structures, so Team Topologies authors Matthew Skelton and Manuel Pais believe that the human factors of organizational design and the technical factors of systems design are inexorably intertwined. In digesting this book, I found myself reflecting on scenarios that bear out this premise.

If there's a vulnerability in this book at all, it's that intertwining of organizational and technical design. Strictly speaking, it's not a problem with the book at all; rather, it's an inexorable implication of Conway's Law. To the extent you buy into that relationship, it means that no software organization problem is merely technical, and no organizational problem can be fixed by sliding names around on an org chart. Ostensibly, this may make change seem harder. In reality, I believe it points out that no change you may have believed easy was ever really easy at all.

Author Allan Kelly has done some writing on Conway's Law, as well, and I believe his assertion here carries deep implications:

More than ever I believe that someone who claims to be an Architect needs both technical and social skills, they need to understand people and work within the social framework. They also need a remit that is broader than pure technology -- they need to have a say in organizational structures and personnel issues, i.e. they need to be a manager too.

https://www.allankellyassociates.co.uk/archives/1169/return-to-conways-law/

Perhaps this connection of technology and organization helps explain the tremendous challenges found in digital transformation. I'm reminded of a conference talk I attended several years ago. The speaker was the CTO / founder of a red-hot technical startup, and he was sharing his secrets to organizational agility: "Start from scratch and do only agile things," to paraphrase, but only slightly. Looking around the room at the senior leaders from very large corporations in attendance, I saw almost exclusively despair. While a neat summation of this leader's agile journey, that advice does nothing to help an established enterprise effect change.

Does Team Topologies help this problem? Indirectly, I believe it can. Just highlighting the deep connection between people, software, and flow is a huge insight. Plenty of other books and presentations address flow, but the background and earlier research documented by Skelton and Pais in the early chapters helped create a deeper sense of "why" for me. The lightbulb moment may be trite, but it's appropriate for me in this case. I found the book really eye-opening, and it's earned a spot on my short list of books for any business or technical leader.

Why we Architect for Experiences

Change is afoot once again. We've heard a ton about 5G for the last couple of years, and we're just now starting to see the technology emerge in the market. Plenty of pundits have speculated that 5G will have the same magnitude of impact as the introduction of the web or the birth of mobile computing. And last week, we saw Facebook launch a major rebranding. The Metaverse is coming, powered in part by that same 5G network.

What's the connection to 5G? Edge computing and experiences. In this article from Ericsson, Peter Linder cites 5G as a key enabler for AR and VR without the use of a tethered PC. Few of us can imagine how our business applications will take advantage of experiences like this, but recall that just a few years ago, we'd have had a hard time imagining the rich mobile applications we take for granted today.

These new experiences aren't going to take the place of the experiences we've got today -- they're going to add to them. If you're not already seeing the same explosion in channels, that's coming, too. Partners, new brands, bundled products -- all these channels demand access to the capabilities you've got, packaged up in new ways.

There's no way to support this explosion without careful separation of experiences from capabilities. You should never see the mechanisms by which a command is carried out implemented in the same place your customer experiences it!

Spotify has some great experiences enabled by this sort of separation. I was streaming from my desktop PC this week through my "good" speakers (better than my laptop's!), but I also had a Spotify window open on my laptop. I happened to notice the playlist synced on the laptop screen, and when I clicked next there, the session on my desktop followed right along. Although this is an ultra-simple example, it's evidence that Spotify's interfaces send commands to a back-end service -- nothing about the streaming session is tied to the interface issuing the commands; otherwise, the audio source would have changed!
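
Here's a speculative sketch of that separation -- the types and endpoint below are invented for illustration and certainly aren't Spotify's actual API:

```typescript
// Commands describe intent; they say nothing about which device renders audio.
type PlayerCommand =
  | { kind: "next" }
  | { kind: "pause" }
  | { kind: "seek"; positionMs: number };

// Any experience (desktop, laptop, phone) just submits a command to the
// back-end session. It neither knows nor cares where playback is happening.
async function sendCommand(sessionId: string, cmd: PlayerCommand): Promise<void> {
  await fetch(`https://api.example.com/sessions/${sessionId}/commands`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(cmd),
  });
}

// The streaming device subscribes to the same session and reacts to state
// changes -- the capability (playback) stays separate from the experience (UI).
```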

At this point, a little faith may be required in order to see a need like this in your enterprise, but for those with preparation on their side, I believe a myriad of new experiences will be possible in a few short years.

A whole lotta chickens

You may be familiar with the old joke about the chicken & pig's relative contributions to breakfast -- the chicken, of course, being involved by virtue of providing the eggs, and the pig being committed vis-a-vis the bacon.  The origin of this saying is now lost to antiquity, but it has been adopted as an illustration of dedication by sports personalities and business coaches because it manages to capture these relative levels of engagement succinctly and powerfully.

I found myself reaching for the chicken & pig business fable this week in the context of Product Ownership.  We've got a back-office product that's just not getting a lot of love from the business -- lots of people who want to provide input, but nobody who's interested in taking ownership.

This isn't unexpected or unreasonable.  Product Management as a professional discipline is still largely nascent.  On top of that, initiatives that raise the top line are always an easier sell than back-office cost-center programs.  For these reasons, I'm not sure that accounting, billing, document-management and other cross-cutting infrastructure-like programs are likely to lead the way in agile adoption or digital transformation, but transform they must -- eventually.

HTML is dead! Long live HTML!

While catching up on newsletters from CodeProject, I came upon an interesting article talking about the "why" of JavaScript UIs -- not the typical "how".

Why JavaScript is Eating HTML

In "Why JavaScript is Eating HTML", Mike Turley walks through the "classic" static HTML for structure + CSS for appearance + JavaScript for behavior example, and then examines how this application evolves as JavaScript begins to control the application more deeply by interacting directly with the DOM.

Reflecting on this article, we've done this sort of thing before.  Going all the way back to CICS to run terminal applications on mainframes, we've separated UI structure from behavior.  Microsoft Access had its forms, which propagated to Visual Basic, and eventually to .Net, WPF, XAML, and so on.  Static is easy, and frankly, it works pretty well most of the time, but as UI behavioral needs become more sophisticated, these static structures are ill-equipped to handle those needs.

So, I'm skeptical these techniques are going to put HTML out of business anytime soon, but in a dynamic application, they make a boatload of sense.

Design resources for developers

Keeping abreast of technology updates has always been a formidable job.  As always, there are all sorts of options for consuming the fire hose of news.  For me, RSS feeds (I use Feedspot for my reader now) from some trusted sources remains a favorite, and I'd like to share a couple of gems that have been really fantastic for a number of years now.

Both of these are design-oriented sites, focusing on UI technologies, techniques, and reviews as well as design theory, layout reviews, and the like.  Given my typical focus on back-end design and architecture, these sites are a great way to bolster my front-end perspective and give me some tools to jump-start UI work.  I've been following both of these sites long enough that I honestly can't remember where I found them, but I'm really impressed with the track record of both.

Hongkiat.com (odd name, I know, but it's good stuff) is a real grab-bag of articles, so be prepared to filter out some topics you may not be interested in.  If you sift through to topics that are of interest, though, there are some really good resources.  In most cases, these articles are annotated link collections, so don't expect to find source code here.  Do expect to see examples of UI tools and technologies put to use, with links to source sites where you can learn more.

Smashing Magazine tends to cover fewer topics more deeply, with more of a focus on its own content versus curation of content from other sources.  This is a great source for tutorials and backgrounders, and they've published a number of books based largely on rolled-up content from the site.  Again, not everything will be of interest, but the quality of the content here is really high.

If you're looking for a couple good streams of design inspiration, give these two a look, and if you've got any other favorites, let me know!

Agile Enterprise, Part 1

I’ve recently had occasion to see the same challenge pop up in a couple of different settings -- namely, bridging the gap between agile development and integration into an enterprise.  In both cases, development employing agile practices did a good job of producing software, but the actual release into the enterprise wasn’t addressed in sufficient detail, and problems arose - in one case, resulting in wadding up a very large-scale project after an equally large investment in time, energy, and development.

Although the circumstances, scope, and impact of problems like this vary from one project to the next, all the difficulties I’ve seen hinge on one or more of these cross-cutting areas: roadmap, architecture, or testing / TDD / CI.

Of these three, roadmap issues are by far the most difficult and the most dangerous.  Creating and understanding the roadmap for your enterprise requires absolute synchronization and understanding across the business and IT areas of your company, and weaknesses in either area will be exposed.  Simply stated, software can’t fix a broken business, and business will struggle without high-quality software, but business and software working well together can power growth and profit for both.

Roadmap issues, in a nutshell, address the challenge of breaking waterfall delivery of enterprise software into releases small enough to reduce the risk seen in a traditional delivery model.  The web is littered with stories of waterfall projects that run years over their schedule, finally failing without delivering any benefit at all. Agile practice attempts to address this risk with the concepts of Minimum Viable Product (MVP) and Minimum Marketable Feature (MMF), but both are frequently misunderstood, and neither, by themselves, completely address the roadmap threat.

MVP, taken at its definition, is really quite prototype-like.  It’s a platform for learning about additional features and requirements by deploying the leanest possible version of a product.  This tends to be easier to employ in green-field applications -- especially ones with external customers -- because product owners can be more willing to carve out a lean product and “throw it out there” to see what sticks.  Trying to employ this in an enterprise or with a replace / rewrite / upgrade scenario is destined to fail.

Addressing the same problem from a slightly different vector, MMF attempts to define a more robust form of “good enough”, but at a feature-by-feature level.  In this case, MMF describes functionality that would actually work in production to do the job needed for that feature -- a concept much more viable for those typical enterprise scenarios where partial functionality just isn’t enough.  Unfortunately, MMF breaks down when you compound all the functionality for each feature by all the features found in a large-scale enterprise system.  Doing so vaults you right back into huge waterfall delivery, with all its inherent pitfalls.

In backlog grooming and estimation, teams look for cards that are too big to fit into a sprint -- these cards have to be decomposed before they can fit into an agile process.  In the same way, breaking huge projects down by MVP or MMF also must occur with a consideration for how release and adoption will occur, and releasing software that nobody uses doesn’t count!

Architects and developers recognize this sticking point, because it’s the same spot we get into with really large cards.  When we decompose cards, we look for places to split acceptance criteria, which won’t always work for enterprise delivery, but with the help of architecture, it may be possible to create modules that can be delivered independently.  Watch for that topic coming soon.

Armed with all the techniques we can bring to bear to decompose large-scale enterprise software, breaking huge deliveries into an enterprise roadmap will let you and your organization see software delivery as a journey rather than a gigantic event.  It’s this part that’s absolutely critical to have embraced by both business and IT.  The same partnership you’re establishing with your Product Owners on agile teams has to scale up to the whole enterprise for this to work; the trust you’re building at scrum-team scale needs to extend to the entire organization.  No pressure, right?  Make no mistake -- enterprise roadmap planning must be visible and embraced at the C-level of your enterprise in order to succeed.

Buy-in secured, a successful roadmap will exhibit a couple key traits.  First, the roadmap must support and describe incremental release and adoption of software.  Your architecture may need attention in order to carve out semi-independent modules that can work with one another in a loosely-coupled workflow, and you absolutely need to be able to sequence delivery of these modules in a way that lets software you’ve delivered yield real value to your customers.  The second trait found in a roadmap of any size at all is that it’s nearsighted: the stuff closer to delivery will be much more clearly in-focus than the stuff further out. If you manage your roadmap in an agile spirit, you’ll find that your enterprise roadmap will also change slightly over time -- this should be expected, and it’s a reflection of your enterprise taking on the characteristics of agile development.

Next up, I’ll explore some ways architecture can help break that roadmap into deliverable modules.

 


Different is Interesting

Last week, I was reminded of a lesson I learned from one of my mentors many years ago: when you're regression testing, "different" is all you really need to look for.

A little context would probably help, here.  Nearing the end of a sprint, we'd tested the new features in the sprint, but we really weren't sure whether we had good regression coverage.  Testing from prior sprints had part of the answer, but we couldn't really afford the time to run all the tests for this sprint and all the tests from our prior sprints.  Thus, the conversation turned to regression testing, automation, and tactics to speed it all up a bit.

The naive approach to regression testing, of course, is to run every test you've ever run all over again, checking each value, examining the dots on the i's and the crosses on the t's.  It's absolutely irrefutable, and in the absence of automated tools, it's also completely impractical.  With automated tools, it's merely incredibly difficult -- but fortunately, in most cases, there's a better way.

Segue back to my years-ago epiphany, in which I was struggling with a similar problem, and the aforementioned mentor gave me a well-aimed shove in the right direction.  He pointed out that once I'd achieved "accepted", and thus, "good", all I needed to look for when regression testing was "different".   All by itself, this didn't do much to help me, because looking for "different" still sounded like a whole lot of looking.   Combined with the fact that our application was able to produce some artifacts, however, this idea vaulted our testing forward.

Our application, it turned out, supported importing from and exporting to Excel files, so we were able to use this to radically speed up the search for differences -- we saved an export from a known-good test, and future tests would just import this file, do some stuff, and export it back out again.  At that point, a simple file compare told us whether we'd produced the same output we'd previously validated as "good".
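
A bare-bones version of that check might look like this -- the file names are hypothetical, and I'm assuming a text export like CSV rather than a binary spreadsheet:

```typescript
import { readFileSync } from "node:fs";

// Compare the latest export against a baseline we previously validated as
// "good". We don't inspect individual values -- "different" is all we look for.
function assertUnchanged(baselinePath: string, currentPath: string): void {
  const baseline = readFileSync(baselinePath, "utf8");
  const current = readFileSync(currentPath, "utf8");
  if (baseline !== current) {
    throw new Error(`Output differs from baseline: ${currentPath}`);
  }
}

assertUnchanged("exports/baseline.csv", "exports/latest.csv");
```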

Armed with this technique, we began looking for other testable artifacts -- to the point where we built in some outputs that were used only for testing.  At the time, it was time well spent, because the maturity of testing tools made the alternatives pretty prohibitive.

And what do you do about a difference when you find one?  First reactions notwithstanding, it's not necessarily a bug.  A difference could arise as a result of an intended change, so it's not at all unusual to be able to attribute a difference to a change you're actually looking for; in which case, you throw out the old baseline and replace it with a new one.  Just remember you'll need to explain each difference every time you find one, which brings me to a final issue you'll want to sort out if you're going to use this technique.

Any artifact you want to use in this way must be completely deterministic -- that is, when you run through the same steps, you must produce the same output.  This seems like a gimme, but you'll find that a lot of naturally-occurring artifacts contain non-deterministic values like IDs, timestamps, machine names, and so on -- you'll need to scrub these in order to have artifacts you can use effectively for testing.
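
One way to scrub those values before comparing -- the patterns below are examples you'd tailor to your own artifacts:

```typescript
// Replace known non-deterministic fields with stable tokens so that two
// runs of the same steps compare as equal.
function normalize(artifact: string): string {
  return artifact
    // ISO-style timestamps -> fixed token
    .replace(/\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z?/g, "<TIMESTAMP>")
    // GUIDs -> fixed token
    .replace(
      /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi,
      "<GUID>"
    )
    // Hypothetical machine-name pattern -> fixed token
    .replace(/\bHOST-\d+\b/g, "<MACHINE>");
}

// Run both the baseline and the fresh export through the same scrub, and the
// simple file compare from earlier still applies unchanged.
```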

Once you get past this little hurdle, you're likely to find this idea powerful and super-easy to live with, and what more can you ask for in regression testing?