Automated patterns considered harmful

A couple of years ago, if you read some of the "best practices" stuff coming out of Redmond, you'd have thought that software factories were going to transform software development.  Thankfully, this turns out not to have been the case.  I never met a software factory I didn't detest almost immediately, and I'm glad the idea never caught on any more than it did.

Software factories generally consist of some tools to assist software development, but a central theme of these tools is that they generate code for you.  I've always felt that automation in software development is a very good thing, but it's vitally important that you understand what you're left with when you're done.  If automation results in code that you never have to touch or see, then you're probably more productive as a result, but if you're generating code that you're going to have to maintain, there's a very good chance that you're taking one step forward and two steps back.

Most developers are familiar with "Don't Repeat Yourself" (DRY).  The objective of DRY, of course, is not just a reduction in code (and thus, effort) when developing new software, but a centralization of logic that pays dividends as you maintain software.  Code generation often delivers deceptively attractive initial productivity for new development, but the generated output is typically littered with repetition.  This code is "free" when generated, but it's an anchor around your neck every time you have to maintain it, or even when you step through it while debugging.  It clutters your project, reduces readability, and inhibits your capacity to maintain, revise, and refactor your software.  There's a reason why repetitive code is usually right at the top of Code Smells lists.

Given my strong preference for software craftsmanship, it shouldn't be surprising that I consider software factories that barf out projects and classes that you're supposed to maintain an absolute blight upon the landscape of software development.  There are any number of other automation tools available to us, though.  Here are some thoughts on a few of these:

  • Designers.  These remain the best example of code generation done right, in my opinion.  Designers, when properly executed, produce wholly-standalone code files that we can largely ignore.  Classes are declared as "partial" so that if you need to modify them, you can do so without touching the generated code.  Some designers will add [DebuggerStepThrough] attributes so you don't see this code when debugging.  All of these things help the generated code disappear unless we're specifically looking for something, and that's a very good thing.
  • Snippets.  I have mixed feelings about these little gems.  When used correctly, they can be a big help, but in practice, they're almost always a sign that you should be doing things differently.  Whether or not you're using a snippet to generate code, you don't want to end up with code that violates DRY, and this means that the opportunities to use snippets effectively are few and far between.
  • T4.  The best usage I've seen of this generation tool (built into Visual Studio, by the way) is Rob Conery's SubSonic data access project.  T4 creates code from templates written in an ASP-like syntax, and is great for iterating over a database or other object structure to crank out code in for...each loops.  Like designer output, this code is intended to be read-only (you shouldn't modify the generated code), and it can be marked with [DebuggerStepThrough].
  • Reflection.  Yes, it's a little bit of a stretch to call reflection an automation tool, but creative use of reflection can help you achieve some Ruby on Rails-like productivity by adding behavior dynamically.  Reflection is commonly maligned as a performance-killer, but Rocky Lhotka has been doing great things with reflection in CSLA for years with minimal impact on performance.
  • Copy-paste coding.  If you're not even using a tool to help you with your unnecessary code duplication, you're definitely doing it wrong.  'Nuff said, I think.
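
To make the T4 idea a little more concrete, here's a rough sketch of the same pattern.  It's written in Python rather than T4 or C#, and everything in it (the SCHEMA dictionary, render_class, render_module) is a made-up name for illustration, not part of T4 or SubSonic.  The template loops over a structure to crank out read-only code, and the hand-written behavior lives in a separate subclass, which stands in here for what a C# partial class would give you.

```python
# Hypothetical schema; a real template would read this from a database.
SCHEMA = {
    "Customer": ["id", "name", "email"],
    "Order": ["id", "customer_id", "total"],
}

HEADER = "# <auto-generated> Do not edit; regenerate from the schema instead.\n"


def render_class(name, columns):
    """Render the source text of one record class from its column list."""
    params = ", ".join(columns)
    lines = [f"class {name}Base:", f"    def __init__(self, {params}):"]
    for column in columns:  # the "for...each" loop over the structure
        lines.append(f"        self.{column} = {column}")
    return "\n".join(lines) + "\n"


def render_module(schema):
    """Render every class in the schema into a single generated module."""
    return HEADER + "\n".join(render_class(n, c) for n, c in schema.items())


generated_source = render_module(SCHEMA)

# Load the generated text without touching it (a stand-in for a separate file).
generated = {}
exec(generated_source, generated)


# Hand-written code extends the generated class instead of editing it.
class Customer(generated["CustomerBase"]):
    def display_name(self):
        return f"{self.name} <{self.email}>"
```

If the schema changes, you regenerate the base classes and the hand-written Customer code is untouched, which is the whole point: nobody ever maintains the generated text by hand.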

In addition to these "automation" tools, there are language features and frameworks available to us today that can serve the same productivity objectives without resulting in tons of repeated code:

  • Attributes.  Typically used with reflection to act on classes at run-time, attributes can make your code much more expressive by declaring behavior rather than implementing it over and over.
  • MVC.  Since MVC is a framework, it doesn't really do anything at all to enforce DRY (or any other coding practice), but it encourages a declarative style of Model development that's very consistent with the ideas I've been discussing here, and most MVC examples use a very expressive, compact Model syntax.
  • Model-driven development.  Microsoft's data modeling bits (known at various times as "Oslo") consist of tools, modeling syntax, and extensions that create a very dynamic metadata-driven application environment.  We're still looking at the early stages of these tools, but the broad objective is to make object behavior completely declarative and dynamic.  There's a danger that we could trade unmaintainable code for unmaintainable configuration data, but I think that as this technology matures, it's going to move us in the right direction.
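
As a rough illustration of the "declare it, don't implement it over and over" idea behind attributes and reflection, here's a minimal sketch.  It uses Python dataclass field metadata as a stand-in for .NET attributes, and the rule helper, the validate function, and the Customer class are all hypothetical names invented for this example.  The rules are declared once on the model, and a single reflective routine acts on them at run-time.

```python
from dataclasses import dataclass, field, fields


def rule(**constraints):
    """Attach declarative validation metadata to a dataclass field."""
    return field(metadata=constraints)


@dataclass
class Customer:
    # Behavior is declared here, not re-implemented per property.
    name: str = rule(required=True, max_length=20)
    email: str = rule(required=True)
    nickname: str = rule(max_length=10)


def validate(obj):
    """Inspect the declared metadata and return a list of error messages."""
    errors = []
    for f in fields(obj):
        value = getattr(obj, f.name)
        if f.metadata.get("required") and not value:
            errors.append(f"{f.name} is required")
        limit = f.metadata.get("max_length")
        if limit is not None and value is not None and len(value) > limit:
            errors.append(f"{f.name} exceeds {limit} characters")
    return errors
```

Calling validate(Customer("", "x@y.z", "averyverylongnickname")) reports a missing name and an over-long nickname.  Adding a new rule to another model is a one-line declaration, with no new validation code to maintain.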

In addition to the options available to us today, we'll continue to see innovations in the future.  When you're reviewing and evaluating these options, though, remember to always add lightness, because less code is better.

4 Replies to “Automated patterns considered harmful”

  1. Not sure I understand your post.

    Your primary argument seems to be that generated code that does not have to be edited is good, while generated code that must be edited is bad. While blanket statements like this one are rarely water tight, I largely agree with you. Generated code should not have to be edited. However, there are some cases where it can be useful to generate starting points from templates that must then be further developed. For example, I do like the way VS generates solution starting points containing one or more projects from templates.

    However, you also assert that factories spew out repetitive generated code that must be edited, and you contrast them with designers, snippets, T4, reflection, attributes, MVC and MDD, which you claim are good because they don't promote repetition.

    What confuses me about this assertion is that factories are not required to generate any code, or to perform any particular type of automation, for that matter. The essence of a factory is that it provides guidance on how to build a particular type of deliverable, along with useful resources that support the guidance. There is no requirement that it must provide code generators, much less code generators that spew out repetitive code that must be maintained by hand.

    A factory could provide collections of patterns and guidelines. It could supply a framework, or a set of libraries. It could supply a designer, or a set of integrated designers. It could supply snippets or templates. More importantly, it is up to the factory developers to decide what resources to supply, and how they should be used.

    Also, factories introduced model driven development (MDD) with domain specific languages (DSLs) and T4 at Microsoft, and the factory initiative spawned Oslo (not that I'm all that happy about how it's turned out, but that's a topic for another post), so factories can certainly use these technologies judiciously for good effect. They can also abuse them, if the factory developers so choose.

    Sounds like you've been looking at a poorly designed factory. The factory form factor is no more to blame for the improper use of code generation than the library form factor is to blame for poorly factored APIs that make testing difficult.

    1. Thanks for the great comment, Jack. By way of clarification, the stuff I’m objecting to most strenuously is generated code that developers are expected to maintain. You’re right that nothing in the definition of a factory says it has to operate that way, but I’ve seen some that do. I’ve also seen developers who take a pattern and copy-paste it all over the place without ever stopping to refactor the code. The common thread in both of these is the unnecessary volume of code.

      I believe that factories are getting better. The early ones that I saw were really pretty lousy at handling full life-cycle development — an early release of the Web Services Factory comes to mind. That factory was great at generating tons of code, but it had a really hard time with maintenance, to the extent that we gave up and did the maintenance by hand. I believe the current state of the art is greatly improved over those early examples, but I don’t want to lose sight of the fact that if you’re going to have to maintain code by hand, you really don’t want it to be ugly, repetitive code.

      Probably the worst contemporary example of this (note – it’s not a factory) is the Open XML Document Reflector. This tool is great for getting started with Office automation, but the code it emits is really ugly — sort of the ultimate de-normalized code representation of a document.

      The point of the post isn’t to pick on factories (sorry if I gave that impression) — it’s really more of a call to be vigilant about the quality of any generated code that you’re going to have to maintain so that you’re not stuck with a mass-generated mess.
