Last week, while at Tech Ed, Orlando, I made a post regarding a few feelings around “Code Generation” and “ORM”. It was an intentional move on my part to solicit some discussion and see where people stood - and I got some very good and insightful comments!
While not really technically related, I kinda view Code Gen and ORM in the same light and I can really appreciate them. When I say that I don't quite subscribe to their precepts and concepts I do not insinuate that I don't see a place for them nor understand their importance. And yes, I too feel that Scott is an excellent developer and has some really insightful perspectives on things and a very deep understanding of many aspects of software development. My previous commentary was in no way meant as disrespect to him, but rather as a perspective on some issues that were discussed.
When presented with a pattern, practice, concept, or a principle which I don't currently implement, I generally take a reserved and cautious view initially (this applies in everything - from software to my personal life). I am not one to jump on the bandwagon simply because 1, 2, 50, 100 or even 10,000 people say it's right. If I did that, I'd just be going with the flow, following the stream of jargon simply because something is deemed as cool or modern, and not making my own decision based on my own understandings, experiments, and/or implementations. Sometimes the old school teachings and principles still have merit. Majority doesn't make something right, it just makes it popular.
Then again, I'm not saying that either Code Gen or ORM is 'wrong' per se, I'm simply stating that I don't currently implement them in my development process to the extent that others do. This isn't because I love CRUD programming, but because I take a slightly different approach to writing software. I hope to explain a bit more below.
Code Generation
Of the two, I am more reserved about code generation. Sure, writing boilerplate code is pretty tedious, repetitive, prone to errors, and much more, but how often is data access code truly boilerplate? If all your code is doing the same thing (e.g. creating a connection, opening it, running a command, disposing of resources, etc.), perhaps you should revisit that. It might be helpful to have some centralized data access module or library. Then again, if you do that, one could argue that this is exactly the case where code gen is feasible, and I can understand that too. As I see it, the most boilerplate code would exist not in a DAL (because of library autonomy), but in domain objects where properties map more directly to data fields and the properties are simply getters and setters to an underlying field.
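To make the "centralized data access module" idea concrete, here is a minimal sketch. It is written in Python with the standard sqlite3 module purely for brevity; the function name execute and the Customer table are illustrative assumptions, not code from any real project. The point is only that the connect/run/commit/dispose ceremony lives in one place instead of being repeated (or generated) in every method.

```python
import sqlite3
from contextlib import closing


def execute(db_path, sql, params=()):
    """Open a connection, run one statement, commit, and release resources.

    Hypothetical helper: returns the fetched rows, or an empty list for
    statements that produce no result set.
    """
    with closing(sqlite3.connect(db_path)) as conn:  # always closes the connection
        with conn:  # commits on success, rolls back if an exception escapes
            cursor = conn.execute(sql, params)
            return cursor.fetchall()
```

Callers then shrink to a single line each, for example execute(path, "SELECT ID, Name FROM Customer"), which leaves very little repetitive plumbing for a generator to produce.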
There sure is a tendency within the industry to promote code generation and it's definitely not lost on Microsoft. In fact, as Matt mentioned in his comment earlier, there are several new enhancements in the VS 2005 timeframe that really seem to promote this: DSL Tools, GAT (Guidance Automation Toolkit), Software Factories, and partial classes. Things which are frankly pretty awesome!
I believe that there is an inherent brittleness in code-generated objects. Business requirements change. Data storage mechanisms change or vary between objects. Actions pertaining to objects vary widely. I would be lying if I said that all my business and/or data objects behaved the same. I find myself decorating classes with attributes (e.g. GuidAttribute, ClassInterfaceAttribute, SecurityRoleAttribute, DescriptionAttribute, and many more) that vary on a class-by-class basis or a method-by-method basis. Almost ALL of my programming is interface-driven programming (especially when I'm creating EnterpriseServices-based (COM+) classes). There is code within each method that validates parameters passed from the business tier, performs operations on them, calls stored procedures, builds objects, and much more. The stored procedures also validate data, update multiple tables, call other stored procedures, and much more.
I can't see these objects being generated automatically for me. Far fewer than 5% of my procs are as simple as SELECT ID, Name FROM Customer or UPDATE Customer SET Name = 'xxx' WHERE ID = x or DELETE FROM Customer WHERE ID = x. Were that the case, I'd have hundreds upon hundreds of stored procedures that all look alike, are not maintainable, and don't really buy me anything. I could simply use ad hoc queries instead. Additionally, each stored proc has security settings, permissions granted to specific applications and/or users.
The VAST majority of my procs are actions that are performed on the domain objects such as 'Complete Task' or 'Place Order'. I cringe at having to retrieve a Task object, set its Completed property, and then call Update to save it to the data store - such an action might require a bunch of business logic to run or a multi-step database operation.
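A rough sketch of what an action-oriented operation like 'Complete Task' might look like follows. It uses Python and sqlite3 as a stand-in for a stored procedure call; the function name complete_task and the Task and TaskAudit tables are assumptions made up for illustration. The caller states its intent once, and the validation plus the multi-step write happen together in one transaction rather than through a retrieve, set, update round trip.

```python
import sqlite3


def complete_task(conn, task_id, user):
    """Perform the whole 'Complete Task' action as one unit of work.

    Hypothetical schema: Task(ID, Completed) and TaskAudit(TaskID, CompletedBy).
    """
    with conn:  # one transaction: both statements commit, or neither does
        updated = conn.execute(
            "UPDATE Task SET Completed = 1 WHERE ID = ? AND Completed = 0",
            (task_id,),
        ).rowcount
        if updated != 1:
            # Business rule check: the task must exist and still be open.
            raise ValueError(f"task {task_id} is missing or already completed")
        conn.execute(
            "INSERT INTO TaskAudit (TaskID, CompletedBy) VALUES (?, ?)",
            (task_id, user),
        )
```

A generated Update method for Task would know nothing about the audit row or the "still open" rule, which is the kind of per-action logic that varies from one operation to the next.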
Perhaps code generation is more template based than I'm willing to let on. Template-based code generation, in my mind, has much more merit and is much more attractive than what I'll call 'reflective code generation'. I have seen some really fantastic XSD-based and WSDL-based templates that DO allow for some pretty tight classes to be generated (something that Scott did illustrate in his presentation)...I will be looking into that more here in the future.
All said, however, I truly see a use for code generation and will probably employ it in the very near future. Heretofore, I could not honestly get behind it because of 1) the fear of losing any tweak or change I made the next time the code is regenerated and 2) a perceived lack of control over the resulting classes. With the advent of partial classes in .NET 2.0 I can really see its practical usage.
ORM
I tend to view ORM implementation in the same camp as Code Generation because of the tools that automate the creation of the objects. The chief tenets behind ORM, however, I can definitely get behind. Having some form of a 'mapper' class that can create, update, and manage domain classes is paramount to almost all development projects.
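For readers who have not met the 'mapper' idea, here is a minimal hand-written sketch in Python with sqlite3. The Customer class, the CustomerMapper class, and the table layout are all illustrative assumptions, not the output of any ORM tool. The domain class knows nothing about storage, and the mapper is the only place that knows how a Customer corresponds to a row.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """Plain domain object with no knowledge of the database."""
    name: str
    id: Optional[int] = None


class CustomerMapper:
    """Hypothetical mapper: translates between Customer objects and rows."""

    def __init__(self, conn):
        self._conn = conn

    def find(self, customer_id):
        row = self._conn.execute(
            "SELECT ID, Name FROM Customer WHERE ID = ?", (customer_id,)
        ).fetchone()
        return Customer(id=row[0], name=row[1]) if row else None

    def save(self, customer):
        with self._conn:  # commit on success, roll back on error
            if customer.id is None:
                # New object: insert it and record the generated key.
                cursor = self._conn.execute(
                    "INSERT INTO Customer (Name) VALUES (?)", (customer.name,)
                )
                customer.id = cursor.lastrowid
            else:
                self._conn.execute(
                    "UPDATE Customer SET Name = ? WHERE ID = ?",
                    (customer.name, customer.id),
                )
        return customer
```

Nothing here is generated, which is the distinction being drawn: the mapping principle can be applied by hand, one class at a time, without adopting a tool that derives the objects from the schema.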
My object hierarchies don't frequently mirror my database relationships, though similarities often exist. Generally speaking, I don't believe it's a good practice to push the database organization onto the business tier or the UI tier. Sure, sometimes object hierarchies and relationships are understood and rather apparent (e.g. Customer -> Orders -> Order Details) and lend themselves to an object hierarchy, but that, in my mind, is an operation of the UI and business layers - I shouldn't have to knowingly navigate an object hierarchy to simply get the single order detail I want. I should be able to go right to it if I want. Likewise, when I commit a change, I shouldn't have to send an entire object graph to a data layer to have the entire thing analyzed to simply update a misspelling somewhere deep in the hierarchy.
Each application has its own solution, and for some ORM is the right solution; I tend to believe it's more the exception than the rule.
Though I do practice several ORM principles in my code, I can't honestly say that it's pervasive, and it's not generated. I can definitely get behind the ORM bandwagon more easily than I can the Code Generation one. In the days to come I will be experimenting more with it to gain a deeper understanding and will report back with my findings.