May 2011 – Lex Parsimonae

Although IT is a relatively young field relative to accounting or finance, I think a pretty fair number of people have picked up on some of the "big picture" ideas that drive modern development practices. Whether you're working with a waterfall process or a more agile process, for instance, most people understand that you just don't make code changes in live production environments. We put testing environments in place, devote time from QA staff, and take care to plan installations for low-volume, low-impact times.

Yet how many of those same people are more than happy to let someone with a SQL window reach into their production database and fiddle with data to their heart's content? Even though that's a big ol' loaded gun pointed straight at their foot, most people don't recognize it that way -- starting with the people with their fingers on the keyboard, and working all the way up to the corner office.

Really? A loaded gun?

How can a data change ever be anywhere near as dangerous as a code change? Let's consider an easy example - you update a credit card code field to be "VS" instead of "VISA", and all your Visa orders fail. "Nobody's going to make a mistake like that," you argue, and you'd probably be right. But by the same logic, nobody's going to go into the code for a production server and write "if (cardtype = card.Visa) throw new Exception", either. Real bugs are just a little bit trickier than that.

The data you're really going to change is much more likely to have lookup values, effective dates, disabled flags, and so on, but the end result of screwing this data up is exactly the same: your system becomes degraded in a big hurry. And unlike changes to source code, you're very likely not to have a repository that'll show you the last umpty-seven changes with comments from the guy who made the changes.

Yet, because we're conditioned to think about code as the dangerous bits, and data as an innocent bystander, we tend to just gloss right over procedures that we wouldn't dream of bypassing for code changes. It's a big mistake.

Types of data

To begin with, it turns out that all data isn't created equal. SAP R/3 classifies three different types of data: configuration data, master data, and transactional data. In this usage, configuration data is limited to customizations to R/3 itself -- changing labels, terms, and so on. There's an excellent chance that your system doesn't have any data like this. Master data is the scaffolding that makes our systems run. It's the customer types table and the status codes table and the rate plans table. If you don't have any values in those tables (or the right values), your system isn't going to do squat. Transactional data is the meat in the sandwich -- it's the reason you've got a system in the first place, but it's also much less interesting and dangerous than master data.

It would appear, then, that master data is the place to focus with respect to controlling change, and that's true to a large degree. Your organization would be doing far better than average just by identifying its master data and putting procedures in place to manage changes to those tables, but does than mean that transactional data is completely inert?

Breakin' the law

Not so fast, cowboy. Although the amount of damage you can do by messing with transactional data is small compared to master data, it's important to recognize that twiddling with transactional data with a SQL prompt is also an unnatural act. For many lookup fields and references, your database should have foreign key constraints that prevent really bad mistakes. The insidious bugs in this case come from the fact that you're bypassing business rules that exist in your online system.

If an order is intended to move from status 1 to statuses 2, 3, 4 and 5, for instance, you've probably got some state-machine logic somewhere that defines which transitions are legal, and more importantly, what stuff happens during those transitions (updating order line-items, refreshing inventory numbers, etc.). If you update a transaction to a different status without regard for these rules, you can introduce all sorts of "interesting" downstream behaviors.

In fairness, I hope you wouldn't be considering SQL updates if your system was already operating exactly the way it should, so you probably have a pretty good reason for feeling that you've got to make changes to put the system back on the rails. Fair enough. Just be hyper-aware that you're taking chances with this data, and if there's a choice that gets you moving again with less manual intervention, it's probably the right choice.

SQL = WMD

Do you know what happens if you highlight the UPDATE part of an update query, leaving off the "WHERE" statement, and then run it? 'Nuff said, I hope. You have the capacity to do a staggering amount of damage in a very short amount of time.

A better approach

If any of this stuff sounds like it applies to your organization, you might be wondering what you can do to bring a little law and order to your database management world. Good news! If you've made it this far, you've taken your first step: awareness. Simply understanding that you need to exercise care when making ad-hoc data changes is a great place to start.

Next, try to understand which of your data is transactional and which is master data or configuration data. This may be a large task, and you might find tables for which the answer isn't very clear, so start with some of your most visible or notable tables and work through the rest as you're able.

Once you've identified your master data and transactional data, fine-tune your processes for updating these tables. You should consider making master data changes in a test environment and promoting these to production just as you'd handle a code change, but regardless, be sure to review how you're managing these changes, including tracking the content and context of any changes.

If your organization has any tips for managing master data, be sure to let me know!

Month: May 2011

Did you test that SQL?

Really? A loaded gun?

Types of data

Breakin' the law

SQL = WMD

A better approach