The politics of software

This fall, as healthcare.gov imploded before our eyes, we saw any number of self-proclaimed experts chime in on why it coughed, sputtered, and ground to a halt, and how, exactly, it might be fixed.  My guess is that the answer is more complicated than most of them let on, but I'll bet there's a healthy dose of politics mixed in with whatever technological, security, and requirements issues might have surfaced along the way.


It seems somewhat counter-intuitive to talk about politics at all in the context of software development, of course.  One of the aspects of software that really appealed to me when I entered this field was that for most problems, there existed an actual correct answer, and there are no politics in algorithms.  Ah, to return to the halcyon days of simple problems and discrete solutions!

Today's problems are more complicated than ever before, though.  Prodigious capabilities have bred complex systems and murky requirements under the best of circumstances, and no government project operates under the best of circumstances.  For those of you in private enterprise, you surely are aware of the struggles bred of competing interests and limited resources, but in a government setting, all those factors explode.  Funding is rarely connected directly to stakeholders, opinions are everywhere, and "deciders" are nowhere to be found.   Not to put too fine a point on this, but if we were to think of government-sponsored software as having been congealed rather than developed, we might be on the right track.  It's actually a small miracle when these systems work at all, given the confluence of competing forces working to rip projects in seventeen directions at once.

Think back for a moment on the early days of Facebook or Twitter or any of the other massive applications that serve as today's benchmarks of reliability.  They weren't always so reliable, of course.  Twitter, in particular, gave birth to a famous "fail whale" meme in 2009 as it sorted out its capacity and reliability issues.  To be clear, Twitter operates on a huge scale, but all it's doing is moving 140-character messages around -- there isn't a whole lot of business logic there, short of making sure that messages get to the right person.  It's pretty easy to gloss over some of those growing pains, but virtually every large system has them.  In the case of healthcare.gov, the failures happened under the hot lights of opening night and amid opponents who wanted desperately to see the system fail, and fail hard.

If you work amid politics like this, I'd love to offer a simple solution, but sadly, I have none.  Instead, I'd urge a little empathy; walk a mile in the footprints of developers, project managers, analysts, and testers on projects like this before you criticize too vigorously.  I can assure you that if you think this was a failure with a simple cause, you're mistaken.


Home PC administration — another lost opportunity

I've written in the past about places where Microsoft could absolutely *own* the infrastructure of the home by establishing a beachhead in the living room -- not to mention the previous assertions about their development tools.

I still believe quite strongly that a well-targeted home computing platform is just a couple of software tweaks away for Microsoft.  Today's edition is all about authentication.  I've got a bunch of PC's at home, including some VM's.  I've also got a Drobo 5N and a PS3 and a bunch of networking equipment.

You know what stinks?  I need to set up logins on every single one of these devices individually, and they're not connected to one another (so "Fred" on one box isn't really the same login as "Fred" on another box).

Stupid.

Microsoft, give me a lightweight Active Directory for the home -- something I can obtain without buying a Windows Server license, okay?  Here's a hint: if you built this into the new XBox, I'd buy one, and I bet a bunch of other people would, too.  Let me use this for DNS, so I can type "router" into my browser and actually get my router, instead of making me set up a HOSTS file on every single PC I own.  By the way, how many average consumers would even know that's possible??
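For anyone who hasn't seen one, a HOSTS file is just a text file with one address-and-name pair per line.  Here's a minimal Python sketch that renders those entries; the device names and the 192.168.1.x addresses are made up for illustration, and `render_hosts` is a hypothetical helper, not part of any Microsoft tool.

```python
# Hypothetical home devices and private addresses, for illustration only.
HOME_DEVICES = {
    "router": "192.168.1.1",
    "drobo": "192.168.1.20",
    "ps3": "192.168.1.30",
}


def render_hosts(devices):
    """Return HOSTS-file text: one 'address<TAB>name' entry per line."""
    lines = [f"{address}\t{name}" for name, address in sorted(devices.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_hosts(HOME_DEVICES), end="")
```

The output would still have to be copied by hand into the HOSTS file on each machine (on Windows, `C:\Windows\System32\drivers\etc\hosts`), which is exactly the chore a home directory service would take away.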

I fully expect that the new XBox, when it arrives, will let me stream photos and music off my Drobo, but if you want to really take this idea to the next level, how about selling us a Pogoplug-type device I can give to my Mom & Dad so I can (1) set up user names for them, and (2) let them see photos that I don't plan on uploading to Flickr, etc.?  The idea here, by the way, since I'm spelling everything out in excruciating detail, is that just about every family has one or more members somewhere who (a) own a gaming system, and (b) understand enough about computers to be the family SysAdmin.

Get it??

Oh, and by the way, since you've given up on Windows Home Server for reasons I've never quite been able to fathom, and since you now aspire to be a "devices + services" company, why don't you just go ahead and buy Drobo and make their stuff work with yours?  I'd happily plug mine into a new XBox.

I swear, if Microsoft were able to get their collective heads out of whatever orifices they're lodged within long enough to make an XBox that actually acted like it was part of a family, they'd crank up another WinTel-style monopoly to last them a good dozen more years.


Why is it so hard to buy a standard developer workstation?

It's spring, 2013, and Intel has just released its Haswell processors and chipsets.  Motherboard vendors are touting their new wares and all the manufacturers are announcing new products.  As luck would have it, it's about time to refresh workstations in our office, too, so it's a fine time to take stock of the current state of the art for desktop systems.

So, without further ado, if you're a developer, you want a system along these lines:

  • Intel Core i7 -- Haswell is great, but Ivy Bridge would be fine.
  • At least 16 GB RAM.
  • An SSD boot drive - 128-256 GB.
  • Mirrored 7200rpm data drives - around 2-3 TB.
  • Integrated graphics are ok unless you do any serious graphics work.

As an aside, I checked my notes from 2009, and these are almost identical to the specs I put together the last time I looked at systems.  We've gone through a couple generations of CPUs and chipsets, and the "sweet spot" for storage buys about double the capacity now, but the rough idea is about the same.  I figure the cost for a system like this is down a third since then, too.

Incidentally, if you're a professional in a content-generation field (web design, illustration, photography, video), this is a decent starting spot for you, too, though you'll probably want to toss in a stout video card to help with the graphics.  Although you might be tempted to save a few bucks here or there, every single one of these elements is there because it adds value for a professional who relies on his equipment.  Nothing about this configuration is exotic or surprising.

Despite this, I continue to be astounded that nobody sells systems that look like this.  Obviously, you can build it yourself, and if you know the first thing about computers, I highly recommend this -- you'll wind up getting better parts, and there's something to be said for knowing your kit is built right.  Some shops would obviously rather buy PC's than pay people to assemble them, though, and so it is here.  Off to Dell.com, then.

Before I start ripping Dell, I've got to point out that I've generally been a fan of theirs.  I've used something approaching a dozen of their PC's and laptops at work over the years, and I've got a Dell laptop at home that's recently been retired because it stopped charging. I used to have a Dell 1U server in my basement running VM's, in fact.  Noisy, but a nice little machine.  I've got no ax to grind with Dell, per se.  Despite this, they're dead to me now.

I've always found their product lines to be a bit too complex, and their configurator is just about as much fun as a poke in the eye with a sharp stick.  I'd forgotten about this, but I shopped their site back in 2009 to see if they could build a system along the lines I indicated up front.  They couldn't.  Back then, this was disappointing, but not terribly surprising.  Intel's RAID-enabled chipsets were fairly new, so practical mirroring was fairly novel, and SSD's were just beginning to trickle down from enthusiast systems.  Today, both of these things should be considered absolutely mainstream.  I honestly don't understand how both of these features aren't considered standard equipment for anyone who makes more than minimum wage.

On top of not being able to build the system I wanted, the web site absolutely blew chunks.    As soon as I visited the site, I got one of those "Will you leave your opinion?" pop-ups, and every time I selected a system, the page scrolled over to the top-right, where a "Chat with Dell" window appeared, offering me help.  Offer accepted.  Needless to say, "Brandon" wasn't able to help me: "...and no workstations that we have will allow an SSD boot hard drive and then mirrored 2nd and 3rd hard drives."  No kidding.  I also participated in their feedback session.  At the end, they asked if I had any additional notes for them.  I did:

The last time I tried shopping for a system here was 2009.  At that time, I was looking for a Core i7 desktop with around 16GB RAM, an SSD boot drive and mirrored data drives.  I couldn't find one.  Today, I tried shopping for the same thing, and guess what -- can't get there from here.  This is a standard configuration for developers, and you literally can't buy it from Dell.  You guys might want to worry a little less about how your buyout is going and more about building PC's.  Just sayin'.

So, that was Dell.  The good news is that HP fared better.  I opened the site without drama, and found a desktop right off the bat that would support the configuration I wanted. The system I wound up configuring used an Ivy Bridge processor instead of Haswell, but at least I could get something close.  Dell, I know you probably sell far more PC's to secretaries and call center workers than you do to developers, but if you don't hold the high ground, you're going to get your head handed to you on commodity systems.

Anyway, it's been nice knowin' ya, Dell.


In design, details matter

Have you ever experienced a cascading menu that seemed to run away from you as you navigated it?  This is one of those subtle usability failings that can lead to a disembodied hatred of a site or application.  Very few people will notice what's actually going wrong, let alone what should be done to fix it.

Galileo (Photo credit: dglambert)

Ben Kamens noticed when Amazon got this right.  Not only did he notice, he wrote up what he found and developed a jQuery menu you can use on your own site to achieve the same fix.  The improved implementation, by the way, has its origins in noticing what direction your mouse is moving -- if it's headed toward a sub-menu, this implementation gives you a chance to catch it.
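Here's a rough sketch of that idea in Python.  This is not Kamens's code (his is a jQuery plugin); it's my own simplified take, assuming the test is "does the current mouse position fall inside the triangle formed by the previous mouse position and the two near corners of the sub-menu?"  The function names and coordinates are invented for illustration.

```python
def _cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def heading_toward_submenu(prev_pos, cur_pos, submenu_top, submenu_bottom):
    """True if cur_pos lies inside (or on the edge of) the triangle formed by
    the previous mouse position and the two near corners of the sub-menu."""
    d1 = _cross(prev_pos, submenu_top, cur_pos)
    d2 = _cross(submenu_top, submenu_bottom, cur_pos)
    d3 = _cross(submenu_bottom, prev_pos, cur_pos)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    # The point is inside when it sits on the same side of all three edges.
    return not (has_negative and has_positive)


if __name__ == "__main__":
    # Sub-menu's left edge runs from (200, 0) to (200, 300); mouse was at (100, 100).
    top, bottom, prev = (200, 0), (200, 300), (100, 100)
    print(heading_toward_submenu(prev, (120, 90), top, bottom))   # toward the sub-menu
    print(heading_toward_submenu(prev, (100, 130), top, bottom))  # straight down the menu
```

While the test holds, a menu would keep the current sub-menu open instead of switching to whatever row the mouse happens to cross on its way over.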

The moral of the story?  When you get this stuff right, most people never notice the details, but they'll notice the feeling of quality in the product.  Nobody loves an Apple iAnything because the edges are chamfered exactly so or because the icons are rounded just a bit, but they notice that it feels solid and sorted out.  I'll bet you'd have a hard time finding people who can tell you exactly why a BMW feels better than a Chevy, but most of them will agree that it does.

As a designer, it's important that you do, in fact, notice the little stuff, and that you understand how these details contribute to the quality of your product.  With any luck at all, you'll work for an organization that also gets this stuff, because you'll find it pretty frustrating to try to explain details like this to a bean-counter who hasn't got any awareness of this relationship.


Microsoft still struggling to put pieces together

I've been a Microsoft developer for a lot of years now.  As such, I'm intrinsically motivated to want to see them succeed.  For that reason, it's painful to see what's become of the Microsoft juggernaut.  Office hasn't given us a meaningful improvement since somewhere around Office '97.  Windows fared a little better, probably due in no small part to the dismal showing of Vista, which made Windows 7 look like a breath of fresh air.  Despite this, I still think Microsoft fields the best set of developer tools, top to bottom, of anyone, and I'd love to keep developing solutions with them.

Microsoft Office Mobile on Windows Phone (Photo credit: Wikipedia)

As a fan of Microsoft, then, I'd love to see Windows 8 take off -- on the desktop, tablets, phones -- everywhere, but it's not, and I don't have to look too far to understand why.  I recently upgraded three machines at home from Windows 7 to Windows 8, and I have to admit that the tablet features appear to have been duct-taped onto Windows 8 with little regard to optimizing the experience for either type of client.  I can only imagine what the phone experience is like.  I'm still finding myself at a loss for where various bits and pieces have wandered off to.  Thank God for search, or I very well might have downgraded by now.

Worst of all, there are signs that Win 8's problems are a bit more widespread than my own personal adoption headaches.  Well-known developer evangelist Rocky Lhotka wrote a post this week addressing licensing headaches that could very well keep enterprise customers from adopting WinRT for internal applications, and MVP John Petersen wrote about the continued lack of applications for Windows 8.  Are these problems affecting Microsoft's bottom line?  It may be too early to call, but reports indicate that Microsoft is cutting prices on Windows and Office, and that's not a good sign.

As far back as I can remember, Microsoft has been king of the "platform".  They've always understood that there's a synergistic relationship between OS, applications, developer tools, and users.  It's possible to be successful in one or two of these areas, but if you're able to leverage success in one area to grow in another, the leverage is tough to beat.  It's too late for Microsoft to win mobile by meeting Apple or Android in a heads-up battle.  Same goes for tablets.  If Microsoft hopes to be relevant again (let alone dominant), they need a holistic solution that blows open a market that Apple and Google don't already own.

So, what's left?  Unfortunately, there's very little obvious green field left, but the one real hunk of market where Microsoft actually holds the high ground is entertainment -- namely, Xbox.  Sadly, Microsoft has been running Xbox like its own little company since Day 1, so although it works really well with Windows, there's so much more synergy to be had in home computing and entertainment if Microsoft would merely re-assemble pieces and parts they already own into a platform that would actually add value in the home.

Curious?  Stick around -- next time, I'll lay out the product that could save Microsoft if they'd just break down some walls and build it!


Testing – It’s not just a good idea

Straight out of the business section, here's a story from CNN about a small business owner who came up with an innovative algorithm to generate t-shirt slogans.  Failure to test and monitor the output, however, led to a host of horribly offensive slogans, followed by a social media outcry and a blacklisting from Amazon.
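To make "test and monitor the output" a little more concrete, here's a small Python sketch that generates slogans from a template and refuses to publish any that contain a blocklisted word.  I don't know how the actual business's algorithm worked, so the template, the word lists, and every function name here are invented for illustration.

```python
from itertools import product

# Invented inputs: the real generator's template and vocabulary are unknown.
TEMPLATE = "I {verb} My {noun}"
VERBS = ["Love", "Hate", "Walk"]
NOUNS = ["Dog", "Job", "Neighbors"]
BLOCKED_WORDS = {"hate"}  # stand-in for a human-reviewed list of unacceptable words


def generate_slogans(verbs, nouns):
    """Yield every verb/noun combination the template can produce."""
    for verb, noun in product(verbs, nouns):
        yield TEMPLATE.format(verb=verb, noun=noun)


def is_acceptable(slogan, blocked=BLOCKED_WORDS):
    """Reject a slogan if any of its words is on the blocklist."""
    words = {word.strip(".,!?").lower() for word in slogan.split()}
    return words.isdisjoint(blocked)


def publishable_slogans(verbs, nouns):
    """Return only the slogans that pass the check, never the raw output."""
    return [s for s in generate_slogans(verbs, nouns) if is_acceptable(s)]


if __name__ == "__main__":
    for slogan in publishable_slogans(VERBS, NOUNS):
        print(slogan)
```

A blocklist like this is only a floor.  Combinations of individually harmless words can still be offensive, so somebody should look at a sample of the output before it goes anywhere near a storefront.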

Now, the owner of this business says the company is 'dead'.  Don't let this happen to you -- your business could be one headline away from the same fate if you don't manage risks carefully.

Pardon the interruption

A while back, I experienced a small hacking disaster here, and although it took a while to get that all sorted out, I think I'm finally back up and running.  I honestly had no idea it'd been this long, though...

In any event, I hope to get back to illustrating some of my more colorful software journeys here, so if you're rejoining me after the break, welcome back, and if you're joining me for the first time, thanks for the visit!

Five keys to IT Operations

Any developer who builds a new system dreams of the day their software goes live, and real users start pounding on their application.  This is perhaps the most tangible validation of their work that some developers ever have.  On that day, it's pretty common for developers to either be the team responsible for running the system, or at least to be working shoulder-to-shoulder with the people who are running it.

For those lucky souls who "grow into" this role, though, it may be helpful to have a sense of perspective about what success in Operations means.  Although specific technical and design details are aligned with your implementation, your job ultimately boils down to some simple objectives:

1. Understand what's supposed to happen

Gene Kranz - Image via Wikipedia

While Development is all about designing, building, and testing, Operations is all about Execution.  Go rent Apollo 13, or better yet, pick up a copy of Failure is Not an Option by Gene Kranz.  Happiness in Operations means understanding what's going to happen before it happens, and there’s no such thing as a good surprise.  You can see this in the checklists that all the NASA engineers used, and when someone tries to tell you that you're too smart for checklists, remember that those guys were, in fact, rocket scientists.  Nuff said?

Virtually nothing in Operations is of any use whatsoever unless everyone who’s touching a keyboard knows the exact same stuff, which implies that checklists are written down in a way that ensures everyone’s got the same information.  If you don't have anything better to start with, get a whiteboard and write, "8am: Make sure the servers are still on," and then start filling in from there.

2. Know what's actually happening

This is the IT equivalent of watching the gauges on your car's dashboard.  Failure to pay attention to the gauges could prove costly when your engine starts shooting connecting rods through the hood.

In operations, you're watching for failures and performance problems -- hopefully in time to react to them before your customers start complaining, and you're watching for unusual activity that could indicate problems with other systems you interface with or even hacker activity.  When you get more sophisticated about what you're watching, you may even be able to provide design guidance on what features your customers are using the most, or whether there are parts of your application(s) that users seem to be struggling with, but please make sure you're covering the basics first – server uptime, exceptions, and application performance.

As usual, tools help here.  It’s a whole lot more efficient and effective to have a tool checking to make sure servers are responding properly.  Fortunately, there are all sorts of tools like this, including some free ones.
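If you've never seen what one of these tools does under the hood, here's a minimal Python sketch of the basic "is the server answering?" check, using only the standard library.  The URLs are placeholders and the function names are my own; a real monitoring tool adds scheduling, alerting, and history on top of this.

```python
import urllib.error
import urllib.request


def http_probe(url, timeout=5.0):
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, DNS failures, and 4xx/5xx replies.
        return False


def check_servers(urls, probe=http_probe):
    """Probe each URL and return {url: True/False}."""
    return {url: probe(url) for url in urls}


def failing(results):
    """List the URLs whose probe failed, sorted for stable reporting."""
    return sorted(url for url, ok in results.items() if not ok)


if __name__ == "__main__":
    # Placeholder addresses; substitute your own servers.
    results = check_servers(["http://app.example/health", "http://db.example/health"])
    print("Failing:", failing(results))
```

The `probe` argument is there so the check itself can be tested without touching the network, which seems like a reasonable thing to ask of a tool whose whole job is to tell you the truth.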

Important:  Be sure to understand the difference between IT Operations and Business Operations.  These can, in some cases, be co-resident, but remember that one is focused on your systems and the other is focused on the business.  These two aspects of Operations should communicate liberally back and forth, but it’s important to understand the difference between technical status and management and business status and management.

3. Communicate status

Since it’s Operations’ job to know what’s happening, they therefore serve as a fount of knowledge for other departments.  In a lot of cases, proactive communication is more effective than “pull” communication, and again, whenever you can drive decision-making out of the process, it’s a good thing.  Therefore, operations should know in advance what sort of events should trigger communication, and to whom they’d be communicating.  Some of this could, in fact, be automated.
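One way to drive the decision-making out of this is to write the "who gets told about what" rules down as data.  Here's a small Python sketch; the event types and recipients are hypothetical examples, not a recommended list.

```python
# Hypothetical event types and recipients, agreed on ahead of time.
NOTIFY = {
    "server_down": ["on-call engineer", "help desk"],
    "slow_response": ["on-call engineer"],
    "planned_maintenance": ["help desk", "business operations"],
}
DEFAULT_RECIPIENTS = ["on-call engineer"]


def recipients_for(event_type):
    """Look up who hears about an event; unknown events go to a default."""
    return NOTIFY.get(event_type, DEFAULT_RECIPIENTS)


def build_messages(event_type, detail):
    """Return one (recipient, text) pair per person who should be told."""
    text = f"[{event_type}] {detail}"
    return [(who, text) for who in recipients_for(event_type)]


if __name__ == "__main__":
    for who, text in build_messages("server_down", "web01 stopped responding"):
        print(f"to {who}: {text}")
```

Actually delivering the messages (email, pager, chat) is left out on purpose; the point is that nobody has to decide at 3am who needs to know.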

Status is typically focused on what’s happening right now, but a complete understanding of status also includes a sense for whether measures are trending in one direction or another.  Data about how our system performs over time, for instance, can tell you a lot about whether a performance metric you’re seeing right now is a blip or part of a trend that’s moving steadily toward a big problem.  This sort of long-term information should also help us see performance or resource constraints in time to react to them before they affect customers.
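As a sketch of the blip-versus-trend question, here's one simple approach in Python: fit a straight line to the recent history and look at its slope when the latest reading crosses a limit.  The thresholds and sample numbers are assumptions I picked for illustration, and a least-squares slope is only one of many ways to do this.

```python
def slope(values):
    """Least-squares slope of values against their index (units per sample)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def classify_latest(history, latest, limit, rising_slope):
    """Label the latest reading 'ok', 'blip', or 'trend'.

    history      -- earlier readings, oldest first (latest not included)
    limit        -- value above which a reading counts as a problem
    rising_slope -- per-sample slope at which history counts as climbing
    """
    if latest <= limit:
        return "ok"
    return "trend" if slope(history) >= rising_slope else "blip"


if __name__ == "__main__":
    # Made-up response times in milliseconds.
    print(classify_latest([200, 210, 205, 195], 600, limit=500, rising_slope=20))
    print(classify_latest([200, 300, 400, 500], 600, limit=500, rising_slope=20))
```

With flat history, a sudden 600 reads as a blip; with history that has been climbing 100 per sample, the same 600 is just the next step in a trend you could have seen coming.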

4. Handle catastrophes

Sometimes, bad things happen to good applications.  When the sh*t hits the fan, it’s absolutely imperative that the cure isn’t worse than the disease.  Go watch Apollo 13 again.  Since everything that normally happens in operations should happen according to a checklist or procedure, it should be glaringly obvious to everyone (to the point of discomfort) that you’re now operating off-script.

I’ve heard pilots describe their jobs as “hours of boredom punctuated with seconds of sheer terror.”  This is when you want to open the cockpit door and see Sully sitting there.  Sully uses checklists, too, by the way.

5. Maintenance and planning

Since operations has done such a good job of ensuring our system is running like a Swiss watch, they’ve got some time left to plan for future improvements.  With any luck, this might include stuff like:

  • Preparing and managing hardware and/or virtual servers.
  • Planning infrastructure changes for upcoming software releases.  This is actually a very important form of developer support, because this is where operations and development work together to make sure you can deploy the things you’re building without any undue drama.
  • Tuning / tweaking system monitoring and management tools.
  • Analysis to assist development – where are your servers stressed, what custom tasks do you deal with today that might be built into the application, etc.

This list is just a start, of course, but it’s a pretty good start.

What tips would you add?


Reason #358 why I hate Flash

I use four PCs on a regular basis (two work PCs, plus a laptop and desktop at home), and all four run Windows -- one Windows Server 2003, one Server 2008, and two Windows 7.  All of these boxes are either on 24x7 or hibernated between uses, so the only time I reboot them is to install Windows updates.

And every... single.... time... I reboot any of these machines, I see one of these:

I typically go ahead and let Flash do what it wants to do, and yet it keeps coming back, over and over and over again.  Based on this, I'm forced to conclude that either (1) Flash isn't really updating correctly, or (2) it really does have a new update to install every single time I reboot.

Neither of these is acceptable.  Adobe, you're not building an OS here.  Get it right and get out of my way.  If there's  a *real* new version or a *real* security disaster, then let me know about it, but I just refuse to believe that there are really that many emergencies that you need to install something every single time I reboot.

If you're wondering why folks like Apple have made such a big stink about getting Flash off their systems, this is exactly the sort of issue they had in mind.

The bug “event horizon”

Black holes are fearsome astronomical phenomena that are so dense and have a gravity well so deep that their enormous mass is compressed into a vanishingly small space.  Anything that approaches these monsters is almost certain to be sucked into the hole, and there is a region surrounding each of them where escape is impossible, even for light itself;  its boundary is called the Event Horizon.


Bugs are like that, too.

A Bug's Life

First-year software development students (and even MBA's) learn that bugs in software are more time-consuming (and thus, expensive) to fix the further along in the software development process they're caught.  The easiest bug to fix is the one you prevent in the first place with proper architecture and design.  Sadly, these quantum bugs are hard to quantify, so good design rarely gets to take credit for these precognitive fixes.

Of the bugs that are actually found, the ones caught during development (perhaps by TDD or unit tests) are wonderfully cheap.  In many cases, mere seconds pass between detection and eradication, the developers' fingers never lifting from the keyboard. These bugs, in fact, are the next best thing to the ones that never existed in the first place, since most of them are never recorded in a bug tracking system and are probably never seen beyond the desk of the developer who found and fixed them.

After code is checked into a version control system, bugs move through a build process and on to testers, while simultaneously being propagated to other developers' desktops via the source code.  Bugs found at this point are slightly more costly to fix, because you've involved other people, and there's very likely a process and tracking system engaged to help keep track of the critters at this point.

But there's another reason these bugs are more costly to fix -- context switches.  As soon as a developer checks in his code, he begins moving on to his next task.  He'll forget all about the code he was working on, including in some cases tearing down whatever virtual infrastructure existed to develop that code.  When he's just about eyeball-deep in his next task, that bug comes home to bite him -- a big context switch.  If the bug is urgent, he's going to have to drop everything and get back up to speed on that code again, leaving behind any progress he managed to make on the new project.

The New Normal

Given that a developer's job is to develop, you'd like him to be able to devote his full attention to creating solid designs and stout code.  In most cases, the work we're asking of these people is really fairly difficult to do well, and it can be nearly impossible if he's not able to concentrate.  When design is interrupted by a bug or two per week, there's very likely an impact in terms of productivity as the developer changes contexts, but concentration and software design shouldn't be adversely affected to a large degree.

As interruptions become more frequent and prolonged, however, there can be an impact in the quality of new code that's produced, as well.  Design becomes disjointed and inconsistent.  Code becomes sloppy.  More new bugs are introduced.  We've introduced a vicious cycle of downwardly-spiraling quality.

As bug counts grow, they can start to have an impact beyond development and QA areas.  Help desks can become buried in calls; bugs begin to go un-recorded and uncorrected.  The noise level caused by poor software quality can contribute further to the downward spiral of quality.

Singularity

When code quality suffers so completely that the organization cannot produce high-quality code, the transformation of the organization is complete.  Buggy code eventually results in even buggier code, and software collapses upon itself as if it were a black hole.

Is the Demise Inevitable?

Unlike a real black hole, the problem of software quality can be stemmed -- especially if it's caught early.  Much like bugs themselves, though, the later this issue is addressed, the more difficult and expensive it will be to fix the problem.  When you consider the cost of bugs in your organization, don't forget the cumulative effects of bugs on the quality of the work you're able to produce.
