Testing – It’s not just a good idea

Straight out of the business section, here’s a story from CNN about a small business owner who came up with an innovative algorithm to generate t-shirt slogans.  Failure to test and monitor the output, however, led to a host of horribly offensive slogans, followed by a social media outcry and a blacklisting from Amazon.

Now, the owner of this business says the company is ‘dead’.  Don’t let this happen to you — your business could be one headline away from the same fate if you don’t manage risks carefully.

Pardon the interruption

A while back, I experienced a small hacking disaster here, and although it took a while to get that all sorted out, I think I’m finally back up and running.  I honestly had no idea it’d been this long, though…

In any event, I hope to get back to illustrating some of my more colorful software journeys here, so if you’re rejoining me after the break, welcome back, and if you’re joining me for the first time, thanks for the visit!

Five keys to IT Operations

Any developer who builds a new system dreams of the day their software goes live, and real users start pounding on their application.  This is perhaps the most tangible validation of their work that some developers ever have.  On that day, it’s pretty common for developers to either be the team responsible for running the system, or at least to be working shoulder-to-shoulder with the people who are running it.

For those lucky souls who “grow into” this role, though, it may be helpful to have a sense of perspective about what success in Operations means.  Although specific technical and design details are aligned with your implementation, your job ultimately boils down to some simple objectives:

1. Understand what’s supposed to happen

While Development is all about designing, building, and testing, Operations is all about Execution.  Go rent Apollo 13, or better yet, pick up a copy of Failure is Not an Option by Gene Kranz.  Happiness in Operations means understanding what’s going to happen before it happens, and there’s no such thing as a good surprise.  You can see this in the checklists that all the NASA engineers used, and when someone tries to tell you that you’re too smart for checklists, remember that those guys were, in fact, rocket scientists.  Nuff said?

Virtually nothing in Operations is of any use whatsoever unless everyone who’s touching a keyboard knows the exact same stuff, which implies that checklists are written down in a way that ensures everyone’s got the same information.  If you don’t have anything better to start with, get a whiteboard and write, “8am: Make sure the servers are still on,” and then start filling in from there.

2. Know what’s actually happening

This is the IT equivalent of watching the gauges on your car’s dashboard.  Failure to pay attention to the gauges could prove costly when your engine starts shooting connecting rods through the hood.

In operations, you’re watching for failures and performance problems — hopefully in time to react to them before your customers start complaining, and you’re watching for unusual activity that could indicate problems with other systems you interface with or even hacker activity.  When you get more sophisticated about what you’re watching, you may even be able to provide design guidance on what features your customers are using the most, or whether there are parts of your application(s) that users seem to be struggling with, but please make sure you’re covering the basics first – server uptime, exceptions, and application performance.

As usual, tools help here.  It’s a whole lot more efficient and effective to have a tool checking to make sure servers are responding properly.  Fortunately, there are all sorts of tools like this, including some free ones.
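
If you don't have a monitoring tool yet, even a tiny script beats nobody looking at all.  Here's a minimal sketch of the idea, assuming your servers answer plain HTTP requests; the function names are mine, and the URLs you feed it are whatever your own environment uses.  A real monitoring product will do far more than this.

```python
import urllib.request


def check_server(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return (ok, detail) for one HTTP check.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    check can be exercised without a network connection.
    """
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False, f"unreachable: {exc}"
    if 200 <= status < 300:
        return True, f"HTTP {status}"
    return False, f"HTTP {status}"


def check_all(urls, **kwargs):
    """Check every URL and return only the failures, keyed by URL."""
    failures = {}
    for url in urls:
        ok, detail = check_server(url, **kwargs)
        if not ok:
            failures[url] = detail
    return failures
```

Run check_all() against your list of servers on a schedule, and treat a non-empty result as something a human needs to hear about.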

Important:  Be sure to understand the difference between IT Operations and Business Operations.  These can, in some cases, be co-resident, but remember that one is focused on your systems and the other is focused on the business.  These two aspects of Operations should communicate liberally back and forth, but it’s important to understand the difference between technical status and management and business status and management.

3. Communicate status

Since it’s Operations’ job to know what’s happening, they therefore serve as a fount of knowledge for other departments.  In a lot of cases, proactive communication is more effective than “pull” communication, and again, whenever you can drive decision-making out of the process, it’s a good thing.  Therefore, operations should know in advance what sort of events should trigger communication, and to whom they’d be communicating.  Some of this could, in fact, be automated.
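
One way to take the decision-making out of it is to write the rules down as data.  This is only a sketch: the event names, audiences, and channels below are made up for illustration, and yours will certainly differ.

```python
# Hypothetical routing table: event type -> list of (audience, channel).
NOTIFICATION_RULES = {
    "server_down": [("on-call engineer", "page"), ("help desk", "email")],
    "slow_response": [("on-call engineer", "email")],
    "deploy_finished": [("development team", "email")],
}

# Anything nobody planned for still goes to a named person.
FALLBACK = [("operations lead", "email")]


def who_to_tell(event_type):
    """Look up who hears about an event, and how."""
    return NOTIFICATION_RULES.get(event_type, FALLBACK)
```

The point isn't the code, it's that the "who do I call?" question gets answered before the event happens rather than during it.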

Status is typically focused on what’s happening right now, but a complete understanding of status also includes a sense for whether measures are trending in one direction or another.  Data about how our system performs over time, for instance, can tell you a lot about whether a performance metric you’re seeing right now is a blip or part of a trend that’s moving steadily toward a big problem.  This sort of long-term information should also help us see performance or resource constraints in time to react to them before they affect customers.
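
To make the blip-versus-trend distinction concrete, here's a rough sketch that fits a straight line through recent samples of a metric and estimates how many more samples you have before it crosses a limit.  It assumes evenly spaced samples and a roughly linear drift, which real metrics often don't honor, so treat the answer as a hint rather than a forecast.

```python
def slope(values):
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def samples_until_limit(values, limit):
    """Estimate how many more samples until the fitted line reaches `limit`.

    Returns None when the fitted line is flat or falling (no upward trend),
    and 0.0 when the fitted line is already at or past the limit.
    """
    s = slope(values)
    if s <= 0:
        return None
    n = len(values)
    intercept = sum(values) / n - s * (n - 1) / 2
    fitted_now = intercept + s * (n - 1)
    return max(0.0, (limit - fitted_now) / s)
```

A single spike in an otherwise flat series comes back as None, while a steady climb comes back as a countdown, which is exactly the difference you want to see before your customers do.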

4. Handle catastrophes

Sometimes, bad things happen to good applications.  When the sh*t hits the fan, it’s absolutely imperative that the cure isn’t worse than the disease.  Go watch Apollo 13 again.  Since everything that normally happens in operations should happen according to a checklist or procedure, it should be glaringly obvious to everyone (to the point of discomfort) that you’re now operating off-script.

I’ve heard pilots describe their jobs as “hours of boredom punctuated with seconds of sheer terror.”  This is when you want to open the cockpit door and see Sully sitting there.  Sully uses checklists, too, by the way.

5. Maintenance and planning

Since operations has done such a good job of ensuring our system is running like a Swiss watch, they’ve got some time left to plan for future improvements.  With any luck, this might include stuff like:

  • Preparing and managing hardware and/or virtual servers.
  • Planning infrastructure changes for upcoming software releases.  This is actually a very important form of developer support, because this is where operations and development work together to make sure you can deploy the things you’re building without any undue drama.
  • Tuning / tweaking system monitoring and management tools.
  • Analysis to assist development – where are your servers stressed, what custom tasks do you deal with today that might be built into the application, etc.

This list is just a start, of course, but it’s a pretty good start.

What tips would you add?


Reason #358 why I hate Flash

I use four PCs on a regular basis (two work PCs, plus a laptop and desktop at home), and all four run Windows: one Windows Server 2003, one Server 2008, and two Windows 7.  All of these boxes are either on 24×7 or hibernated between uses, so the only time I reboot them is to install Windows updates.

And every… single… time… I reboot any of these machines, I see one of these:

I typically go ahead and let Flash do what it wants to do, and yet it keeps coming back, over and over and over again.  Based on this, I’m forced to conclude that either (1) Flash isn’t really updating correctly, or (2) it really does have a new update to install every single time I reboot.

Neither of these is acceptable.  Adobe, you’re not building an OS here.  Get it right and get out of my way.  If there’s  a *real* new version or a *real* security disaster, then let me know about it, but I just refuse to believe that there are really that many emergencies that you need to install something every single time I reboot.

If you’re wondering why folks like Apple have made such a big stink about getting Flash off their systems, this is exactly the sort of issue they had in mind.

The bug “event horizon”

Black holes are fearsome astronomical phenomena: objects so dense, with a gravity well so deep, that an enormous mass is compressed into a vanishingly small space.  Anything that strays too close to one of these monsters is almost certain to be sucked in, and each of them is surrounded by a region from which escape is impossible, even for light itself; the boundary of that region is called the Event Horizon.

Bugs are like that, too.

A Bug’s Life

First-year software development students (and even MBAs) learn that bugs in software are more time-consuming (and thus, expensive) to fix the further along in the software development process they’re caught.  The easiest bug to fix is the one you prevent in the first place with proper architecture and design.  Sadly, these quantum bugs are hard to quantify, so good design rarely gets to take credit for these precognitive fixes.

Of the bugs that are actually found, the ones caught during development (perhaps by TDD or unit tests) are wonderfully cheap.  In many cases, mere seconds pass between detection and eradication, the developers’ fingers never lifting from the keyboard. These bugs, in fact, are the next best thing to the ones that never existed in the first place, since most of them are never recorded in a bug tracking system and are probably never seen beyond the desk of the developer who found and fixed them.

After code is checked into a version control system, bugs move through a build process and on to testers, while simultaneously being propagated to other developers’ desktops via the source code.  Bugs found at this point are slightly more costly to fix, because you’ve involved other people, and there’s very likely a process and tracking system engaged to help keep track of the critters at this point.

But there’s another reason these bugs are more costly to fix — context switches.  As soon as a developer checks in his code, he begins moving on to his next task.  He’ll forget all about the code he was working on, including in some cases tearing down whatever virtual infrastructure existed to develop that code.  When he’s just about eyeball-deep in his next task, that bug comes home to bite him — a big context switch.  If the bug is urgent, he’s going to have to drop everything and get back up to speed on that code again, leaving behind any progress he managed to make on the new project.

The New Normal

Given that a developer’s job is to develop, you’d like him to be able to devote his full attention to creating solid designs and stout code.  In most cases, the work we’re asking of these people is really fairly difficult to do well, and it can be nearly impossible if he’s not able to concentrate.  When design is interrupted by a bug or two per week, there’s very likely an impact in terms of productivity as the developer changes contexts, but concentration and software design shouldn’t be adversely affected to a large degree.

As interruptions become more frequent and prolonged, however, there can be an impact on the quality of new code that’s produced, as well.  Design becomes disjointed and inconsistent.  Code becomes sloppy.  More new bugs are introduced.  We’ve introduced a vicious cycle of downwardly-spiraling quality.

As bug counts grow, they can start to have an impact beyond development and QA areas.  Help desks can become buried in calls; bugs begin to go un-recorded and uncorrected.  The noise level caused by poor software quality can contribute further to the downward spiral of quality.


When code quality suffers so completely that the organization cannot produce high-quality code, the transformation of the organization is complete.  Buggy code eventually results in even buggier code, and software collapses upon itself as if it were a black hole.

Is the Demise Inevitable?

Unlike a real black hole, the problem of software quality can be stemmed — especially if it’s caught early.  Much like bugs themselves, though, the later this issue is addressed, the more difficult and expensive it will be to fix the problem.  When you consider the cost of bugs in your organization, don’t forget the cumulative effects of bugs on the quality of the work you’re able to produce.



CodeBetter.Com wants to send you to Redmond!

Here’s a great way for you to get to Redmond this fall for one of the most popular conferences of the season.  CodeBetter.Com is giving away one conference pass to Visual Studio Live! in Redmond (October 17th – 21st, 2011), plus $500 toward travel / hotel costs.  To enter, go visit CodeBetter.Com wants to send you to Redmond! and comment / tweet / trackback to that post.  Note: you can trackback here if you want, but that’s not going to help you win the contest!  They’re going to pick a winner and announce the final recipient this Tuesday, September 20th at 12:00 EST.

Saving Microsoft

The last few years have been trying times for Microsoft.  Late to jump on the web bandwagon, they’ve never owned that platform the way they owned the desktop.  Internet Explorer remains a decidedly un-sexy choice for web browsing, and Microsoft might never recover from the black eye that was Vista.  Office now faces some really credible online competition, and Bing is still light years behind Google in the search engine war.

But the most unkindest cut of all has to be watching Apple ascend to become the world’s most valuable company.  I mean, it wasn’t enough to watch the Mac chip away at Windows (many Windows developers, in fact, claim that MacBooks are the best portable Windows development boxes).  The iPod never even flinched when the Zune came along, and the iPhone, of course, completely decimated Windows Mobile, which was already under heavy pressure from Palm.  The coup de grâce might just be the iPad, which now has some declaring the death of the traditional PC.  While it remains to be seen if (or when) PCs are really dead, there’s no denying that there’s already been a noticeable impact on PC sales, and it’s very possible that this was a factor in HP’s decision to get out of the PC business this week.

Will the last one to leave Redmond…

So is that really the end for Microsoft?

Not necessarily.  I don’t think we’re ever going to see the heady days of near-monopoly that Microsoft enjoyed in the early 90’s, but Microsoft is still huge, they make a lot of money, and you can still find their software on most business computers.  The Xbox is doing well, Bing refuses to give up, and the newly-reborn Windows Phone 7 seems to be winning fans every day.  Microsoft’s Skype acquisition could give WP7 another shot in the arm.

But there’s no mistaking the fact that “business as usual” isn’t getting the job done for Microsoft.  The Windows folks are now hard at work on Windows 8, but early discussions about HTML5 support in Windows 8 have caused quite a lot of anxiety among Silverlight developers, who fear they’re now stuck on a legacy platform.

Mark my words:  If Microsoft’s Windows 8 strategy causes more developers to flee to the iOS or Android platforms, you can go ahead and cue the fat lady.  You see, since the very earliest days of Windows, it’s been the growth and productivity of Microsoft’s development platform that’s attracted developers, who built the applications, which attracted the users.  Ballmer had it right way back in 2000 – they’re in big trouble without “developers, developers, developers.”

What developers want

Despite the momentum of iOS and Android, neither of these platforms can touch Visual Studio for developer productivity.  All things being equal, this should make Visual Studio the automatic winner, and when Windows ruled the desktop, it was.  Now that Windows is no longer the gorilla it once was, though, VS can’t win on productivity because you can’t use Visual Studio to produce apps that run on all the devices you want to support.

That’s the “aha” moment.

Forget Windows vs. WPF vs. Silverlight.  Visual Studio needs to support development of applications that run on all those platforms, plus iOS, plus Android.  There’s no reason Microsoft can’t deliver this functionality in Visual Studio, and nothing short of this will be remotely close to good enough.

Don’t take my word for it

About a week ago, Charlie Kindel announced that he was leaving Microsoft to go do his own thing.  Charlie was the GM of the Windows Phone Developer Ecosystem, and before that, he led the Windows Home Server team.  Although this background admittedly makes Charlie a little biased, in an interview on Geekwire, Todd Bishop asked Charlie an interesting question about mobile platform development:

If you build an app for your new company, which mobile platform will you target first?

Kindel: Hypothetically, if my new company were to build mobile apps, we’d target WP7 first. You know the old saying “Code Talks”:  I know I can build a beautiful and functional WP7 app in a fraction of the time it would take to build an iOS or Android app. Startups are about executing quickly. But I’m sure we’d quickly take what we learned there and apply it on all the popular devices.

Right there, you have the value proposition for a cross-platform development tool, because although I think Charlie is right about the productivity gains on Visual Studio, I’m skeptical that most startups are really going to target WP7 ahead of iOS or Android.  In fact, we’re now on the verge of HTML5 being that go-to platform, and right now, HTML5 development tools are so immature that Visual Studio just doesn’t have a productivity edge vs. anything else.

This is a big image problem for Microsoft now, and as more business apps need to target multiple platforms, it’s going to start costing Microsoft more market share and more profits from its last real stronghold: businesses.

So, what do I want to see?

Ok, here’s the todo list, Microsoft:

  • It’s beyond ridiculous that WPF and Silverlight continue to be separate.  I understand that you can do more stuff on the desktop than you can do when you’re deployed as an internet app.  Fine.  You don’t need a whole other client framework for that.  Stop it. Now.  Thank you.
  • I want to use the VS2010 layout tools I’d use for Silverlight development to build an HTML5 application.
  • I want to have all the declarative validation I create using data annotations create JavaScript for those HTML5 applications.  We’ve seen hints of this in MVC already.
  • While we’re at it, I want to deploy the same application to the Windows desktop (or tablet) or a Silverlight client, or an HTML5 client, or even a native iOS or Android client.  All of these clients use declarative, hierarchical UI layout frameworks, and all of them can support some form of .NET via Mono.
  • Xbox, too — HTML5 would be fine, but I want to run the same apps there, too.
  • If you can’t (or won’t) make Silverlight cross-platform on the client, then can it and double down on HTML5.
  • End the fractured development tool practices that have plagued Microsoft.  Silverlight vs. WPF is the classic example, but there have been countless examples of competing data access technologies and other frameworks, too.  It feels confused and disjointed, and it’s not helping.

With these things in place, Microsoft would have a real shot at being the premier development platform for business applications for another generation.  I know that the position in the past has been to tie Microsoft development to Microsoft deployment platforms, but that fight is lost, and it’s time to garrison the last thing that Microsoft still does better than anyone else on the planet.

With developers in-hand, there’s no reason Microsoft can’t fight to take back OS platforms on phones, tablets, and so on, but if they lose the battle for developers, the flow of apps will dry up, and there will literally be no way to stop the hemorrhaging.



Dangerous Generalizations

I read an article a couple weeks ago on ReadWriteWeb pondering, Are Indian Developers More Skilled Than Americans?, and I just couldn’t help cringing the entire time I was reading the article.

This article was ridiculous on so many levels it’s hard to count, but the two biggest problems I saw were (1) lumping all “American” and all “Indian” developers together and (2) assuming that “better at C” or “better at SQL” means squat when it comes to writing applications that meet the needs of users.

If you’re in a position where an article like this might potentially influence hiring or management decisions, please be sure to apply your own “BS” meter to these ideas.  While you’re at it, be sure to question whether someone listing a skill on a resume really means anything about what that person knows, and whether it’s more important to know the latest framework or to have the grey matter to know if that framework makes sense for your business.  Oh, and by the way: if someone lists every new technology you’ve ever heard of on their resume, you’d be wise to question how deeply they know any of them.

Just sayin’.


Did you test that SQL?

Although IT is a young field compared to accounting or finance, I think a pretty fair number of people have picked up on some of the “big picture” ideas that drive modern development practices.  Whether you’re working with a waterfall process or a more agile process, for instance, most people understand that you just don’t make code changes in live production environments.  We put testing environments in place, devote time from QA staff, and take care to plan installations for low-volume, low-impact times.

Yet how many of those same people are more than happy to let someone with a SQL window reach into their production database and fiddle with data to their heart’s content?  Even though that’s a big ol’ loaded gun pointed straight at their foot, most people don’t recognize it that way — starting with the people with their fingers on the keyboard, and working all the way up to the corner office.

Really? A loaded gun?

How can a data change ever be anywhere near as dangerous as a code change?  Let’s consider an easy example – you update a credit card code field to be “VS” instead of “VISA”, and all your Visa orders fail.  “Nobody’s going to make a mistake like that,” you argue, and you’d probably be right.  But by the same logic, nobody’s going to go into the code for a production server and write “if (cardtype == card.Visa) throw new Exception();”, either.  Real bugs are just a little bit trickier than that.

The data you’re really going to change is much more likely to have lookup values, effective dates, disabled flags, and so on, but the end result of screwing this data up is exactly the same: your system becomes degraded in a big hurry.  And unlike changes to source code, you’re very likely not to have a repository that’ll show you the last umpty-seven changes with comments from the guy who made the changes.

Yet, because we’re conditioned to think about code as the dangerous bits, and data as an innocent bystander, we tend to just gloss right over procedures that we wouldn’t dream of bypassing for code changes.  It’s a big mistake.

Types of data

To begin with, it turns out that all data isn’t created equal.  SAP R/3 classifies three different types of data: configuration data, master data, and transactional data.  In this usage, configuration data is limited to customizations to R/3 itself — changing labels, terms, and so on.  There’s an excellent chance that your system doesn’t have any data like this.  Master data is the scaffolding that makes our systems run.  It’s the customer types table and the status codes table and the rate plans table.  If you don’t have any values in those tables (or the right values), your system isn’t going to do squat.  Transactional data is the meat in the sandwich — it’s the reason you’ve got a system in the first place, but it’s also much less interesting and dangerous than master data.

It would appear, then, that master data is the place to focus with respect to controlling change, and that’s true to a large degree.  Your organization would be doing far better than average just by identifying its master data and putting procedures in place to manage changes to those tables, but does that mean that transactional data is completely inert?

Breakin’ the law

Not so fast, cowboy.  Although the amount of damage you can do by messing with transactional data is small compared to master data, it’s important to recognize that twiddling with transactional data with a SQL prompt is also an unnatural act.  For many lookup fields and references, your database should have foreign key constraints that prevent really bad mistakes.  The insidious bugs in this case come from the fact that you’re bypassing business rules that exist in your online system.
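
Here's a small illustration of the foreign key safety net, using SQLite from Python only because it runs anywhere.  The table and column names are invented for the example, and your database will have its own syntax and defaults.  Note that SQLite in particular only enforces foreign keys once you switch enforcement on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Hypothetical master table and transactional table.
conn.execute("CREATE TABLE card_types (code TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " card_type TEXT NOT NULL REFERENCES card_types(code))"
)
conn.execute("INSERT INTO card_types VALUES ('VISA')")
conn.execute("INSERT INTO orders VALUES (1, 'VISA')")
conn.commit()


def try_update(sql):
    """Run one statement; report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "applied"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"
```

With this in place, try_update("UPDATE orders SET card_type = 'VS' WHERE id = 1") is rejected, and so is renaming the 'VISA' row in card_types out from under the orders that reference it.  What the constraint can't do is tell you whether a perfectly valid value is the wrong one for that order.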

If an order is intended to move from status 1 to statuses 2, 3, 4 and 5, for instance, you’ve probably got some state-machine logic somewhere that defines which transitions are legal, and more importantly, what stuff happens during those transitions (updating order line-items, refreshing inventory numbers, etc.).  If you update a transaction to a different status without regard for these rules, you can introduce all sorts of “interesting” downstream behaviors.
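
To picture what a raw SQL update skips, here's a toy version of that state-machine logic.  The transition table and the inventory side effect are assumptions I've made up to match the example above, not anything from a real system.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: int
    status: int = 1
    history: list = field(default_factory=list)


# Which status may follow which (assumed: strictly 1 -> 2 -> 3 -> 4 -> 5).
LEGAL_TRANSITIONS = {1: {2}, 2: {3}, 3: {4}, 4: {5}, 5: set()}


def refresh_inventory(order):
    # Stand-in for real work such as adjusting stock levels.
    order.history.append("inventory refreshed")


# Work that must happen as part of a given transition.
SIDE_EFFECTS = {(2, 3): [refresh_inventory]}


def move_to(order, new_status):
    """Change status only along a legal path, running its side effects."""
    if new_status not in LEGAL_TRANSITIONS.get(order.status, set()):
        raise ValueError(f"illegal transition {order.status} -> {new_status}")
    for effect in SIDE_EFFECTS.get((order.status, new_status), []):
        effect(order)
    order.status = new_status
```

An "UPDATE orders SET status = 5" from a SQL window is the equivalent of assigning order.status directly: it skips the legality check and every side effect along the way.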

In fairness, I hope you wouldn’t be considering SQL updates if your system was already operating exactly the way it should, so you probably have a pretty good reason for feeling that you’ve got to make changes to put the system back on the rails.  Fair enough.  Just be hyper-aware that you’re taking chances with this data, and if there’s a choice that gets you moving again with less manual intervention, it’s probably the right choice.


Do you know what happens if you highlight the UPDATE part of an update query, leaving off the WHERE clause, and then run it?  ‘Nuff said, I hope.  You have the capacity to do a staggering amount of damage in a very short amount of time.
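
If you absolutely must run an ad-hoc update, one cheap seatbelt is to decide how many rows you expect to touch before you run it, and roll back if the database disagrees.  This is a sketch against SQLite with an invented orders table; the same idea works in any database with transactions, though the details will differ.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway database with three orders, all in status 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, 1)", [(1,), (2,), (3,)])
    conn.commit()  # commit the setup so a later rollback cannot undo it
    return conn


def guarded_update(conn, sql, params, expected_rows):
    """Run one UPDATE; keep it only if it touched exactly expected_rows."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        changed = cur.rowcount
        if changed != expected_rows:
            conn.rollback()
            return False, changed
        conn.commit()
        return True, changed
    except sqlite3.Error:
        conn.rollback()
        raise
```

Against the demo database, guarded_update(conn, "UPDATE orders SET status = 5", (), 1) touches all three rows, notices that you only expected one, and rolls the whole thing back.  Add "WHERE id = ?" with the right id and the same call goes through.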

A better approach

If any of this stuff sounds like it applies to your organization, you might be wondering what you can do to bring a little law and order to your database management world.  Good news!  If you’ve made it this far, you’ve taken your first step: awareness.  Simply understanding that you need to exercise care when making ad-hoc data changes is a great place to start.

Next, try to understand which of your data is transactional and which is master data or configuration data.  This may be a large task, and you might find tables for which the answer isn’t very clear, so start with some of your most visible or notable tables and work through the rest as you’re able.

Once you’ve identified your master data and transactional data, fine-tune your processes for updating these tables.  You should consider making master data changes in a test environment and promoting these to production just as you’d handle a code change, but regardless, be sure to review how you’re managing these changes, including tracking the content and context of any changes.
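
As for tracking the content and context of a change, here's one minimal way to do it: never update a master table without writing a log row in the same transaction.  Everything here (the rate_plans table, the master_data_changes log, the change_rate function) is hypothetical and SQLite-flavored, just to show the shape of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rate_plans (code TEXT PRIMARY KEY, monthly_rate REAL NOT NULL);
CREATE TABLE master_data_changes (
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
    table_name TEXT, row_key TEXT,
    old_value TEXT, new_value TEXT,
    changed_by TEXT, reason TEXT
);
INSERT INTO rate_plans VALUES ('BASIC', 9.99);
""")


def change_rate(conn, code, new_rate, changed_by, reason):
    """Update one rate plan and record who changed what, and why."""
    with conn:  # the update and its log row commit together or not at all
        row = conn.execute(
            "SELECT monthly_rate FROM rate_plans WHERE code = ?", (code,)
        ).fetchone()
        if row is None:
            raise KeyError(code)
        conn.execute(
            "UPDATE rate_plans SET monthly_rate = ? WHERE code = ?",
            (new_rate, code),
        )
        conn.execute(
            "INSERT INTO master_data_changes"
            " (table_name, row_key, old_value, new_value, changed_by, reason)"
            " VALUES (?, ?, ?, ?, ?, ?)",
            ("rate_plans", code, str(row[0]), str(new_rate), changed_by, reason),
        )
```

It's not a version control system, but it does give you the "last umpty-seven changes with comments" that a bare SQL window never will.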

If your organization has any tips for managing master data, be sure to let me know!

