ASP.NET MVC 2: validation and binding issues with rich-text input, DataAnnotations, view-models, and partial model updates

by Stephen M. Redd 3. July 2010 06:21

Here I'm going to tackle a series of validation related annoyances with MVC 2 that tend to come up rather frequently.

This will include:

  • Input Validation: Potentially Dangerous Request even when using [ValidateInput(false)] attribute.
  • DataAnnotations validation warnings with partial model updates.
  • Dealing with View-Model binding AND DataAnnotations validation warnings with partial model updates together.

OK... so, you are writing a page on the asp.net MVC 2 framework that creates a record in the DB for you. You have a view. Nice!

You have a controller. Also nice!

The view needs select lists and stuff too, so you have a view-model. Sure thing!

Your controller relies on a model class to handle business logic like auto-populating values on the entity and similar. You bet!

And the entity itself is an EF 4 generated class. Nothing special there!

And you created a meta-data buddy class for the entity so you can flag fields with attributes and get MVC to automate your validation. Absolutely!

So you type in your values into the page, hit submit, and the page blows up!

Fuck!

Now, since you were using a view-model, your controller looks something like this:

[Authorize]
public virtual ActionResult Create()
{
    Ticket ticket = new Ticket();
    var model = new TicketCreateViewModel(ticket);
    return View(model);
}

[Authorize]
[HttpPost]
[ValidateInput(false)]
public virtual ActionResult Create(FormCollection collection)
{
    try
    {
        Ticket ticket = new Ticket();
        UpdateModel(ticket, collection);
        if(TicketService.CreateTicket(ticket))
        {
            RedirectToAction("Success");
        }
        else
        {
            return View(new TicketCreateViewModel(ticket));
        }
    }
    catch { return View(new TicketCreateViewModel(ticket)); }
}

The first error you are likely to come across will be a potentially dangerous request error. This will happen if your view tries to submit a field that contains HTML characters, like with a rich text editor.

Now, you used to be able to work around this by just flagging the action method with [ValidateInput(false)] and all was well; but not with MVC 2.0 you don't!

In .NET 4.0, the MS dev team decided to overhaul (badly) the input validation mechanism. They wanted to have the same mechanism work for asp.net, web services, and just about any other kind of request too. So they now have a new input validation system that operates so high-up in the request pipeline that it dies long before it actually gets to your controller to see if you've overridden the default behavior. More info in this whitepaper.

Basically, to work around this you need to tell ASP.NET to use the old .NET 2.0 validation mechanism instead of the fancy new one. You do this in web.config by adding this to the system.web section:

<httpRuntime requestValidationMode="2.0"/>
  
  

Basically, this change is so retarded, that any application with a rich text editor ANYWHERE has to turn off the new security feature for the entire application. The good news is that the old mechanism still protects asp.net pages, but you are on your own again for web services and other request types.

Good job guys!

Now, once you've fixed that up the next problem you'll likely run across is that the UpdateModel method has trouble populating the data into the right properties. This is because you are using a view-model class, but when the form is posted back you are trying to use UpdateModel on just an entity.

Generally this is pretty easy. You just tell the UpdateModel method the "prefix" for the values in the form that should match the entity you are updating. In my example here, the property in the view-model that had the ticket fields was called "NewTicket"... we we just alter the controller to supply the prefix "NewTicket" and the UpdateModel method can figure out what properties in the form data should go get shoved into our entity.

[Authorize]
[HttpPost]
[ValidateInput(false)]
public virtual ActionResult Create(FormCollection collection)
{
    try
    {
        Ticket ticket = new Ticket();
        UpdateModel(ticket, "NewTicket", collection);
        if(TicketService.CreateTicket(ticket))
        {
            RedirectToAction("Success");
        }
        else
        {
            return View(new TicketCreateViewModel(ticket));
        }
    }
    catch { return View(new TicketCreateViewModel(ticket)); }
}

This is also how the standard example apps for MVC typically do things. And it works fantastic as long as you are posting back ALL of the properties from the view, or at least all the ones that have validations via DataAnnotations attached to them.

But when you have a [Required] attribute from DataAnnotations on entity properties that are NOT being passed back from the view, then you run into the third major problem... the UpdateModel will blow up with validation errors for the fields that your view didn't post to the controller.

This is also due to a late-breaking, and very unwise, change in behavior that the asp.net team made right before MVC 2.0 was released.

The MVC validation stuff in 1.0 only validated properties that were posted up; a mechanism called input based validation. The problem with it is that a hacker could, in theory, get around triggering validation errors by hacking out required fields from the HTTP post. So, they decided to switch to "model based validation" where the validation would check the validity of ALL properties in the model even if they weren't included in the post.

Again... this is an ok idea, but the asp.net team should not have implemented such a breaking-change without giving you some way to easily override the behavior too... after-all, doing a partial model update from a form is NOT exactly an edge case scenario. It is damned common!

Now... Steve Sanderson (smart guy!) posted about a really slick action filter attribute way to handle partial model updates. In short, his mechanism allows you to flag an action method with an attribute, and an action filter will automatically loop in after the model binding is done and remove any model state validation errors for fields that weren't included in the post.

Now... suppose you were doing this instead of what we were doing above:

[Authorize]
[HttpPost]
[ValidateInput(false)]
public virtual ActionResult Create(Ticket newTicket)
{
	// stuff
}

With Steven's simple attribute, you could get the desired partial model update to work, without barfing on the data annotations by just flagging the action with one more attribute; like this:

[Authorize]
[HttpPost]
[ValidateInput(false)]
[ValidateOnlyIncomingValuesAttribute]
public virtual ActionResult Create(Ticket newTicket)
{
	// stuff
}

Fantastic!

Except that it only works if you are using a model binder, but we weren't. Remember, in our case we're using a view-model. We don't want to bind to the view-model on the postback... we just want to bind to an instance of the entity we are trying to create.

The UpdateModel method is NOT a model binder though largely it does the same thing as one. So the action filter attribute mechanism Steven describes doesn't work when using the UpdateModel method.

sigh...

OK, now I could just put similar code to what Steven's attribute was using in the controller itself. All his attribute does is loop through the model state and remove errors for fields that weren't in the posted form collection. But honestly, that's an ugly beast and I HATE having my controllers marshaling values around like that.

Now... this example is actually simplistic, and the easiest way to work around this is to use the default binder and bind to the model automatically. This can be done simply with the Bind attribute, which allows you to supply a "prefix" to use for the binding. It looks something like this:

[Authorize]
[HttpPost]
[ValidateInput(false)]
[ValidateOnlyIncomingValuesAttribute]
public virtual ActionResult Create
(
	[Bind(Prefix = "NewTicket")] Ticket ticket
)
{
	if(TicketService.CreateTicket(ticket))
	{
		RedirectToAction("Success");
	}
	else
	{
    	return View(new TicketCreateViewModel(ticket));
	}
}

This works fantastic for simpler cases like this one. We just supply the prefix to use and the binder takes care of it.

But, for reasons I don't want to delve into here, my code was a tad more complex and this technique didn't quite work out for me...

So... what I did to work around this when the Bind attribute wasn't enough, was to create a custom model binder that could do the same thing we're doing with UpdateModel. Anyway, once we have a custom binder we can then modify the controller action to use the custom binder AND Steven's attribute both.

So... here was my custom model binder:

public class NewTicketModelBinder : DefaultModelBinder
{
    public override object BindModel(ControllerContext controllerContext, ModelBindingContext bindingContext)
    {
        bindingContext.ModelName = "NewTicket";
        return base.BindModel(controllerContext, bindingContext);
    }
}

Simple enough. All it does is supply the "prefix" to the binding context to achieve pretty much the same results as our UpdateModel method used to. Otherwise it uses the DefaultModelBinder's behavior as-is. Again though, this example is omitting some details that just aren't necessary to describe here, but my custom binder has a few other overloads besides what's shown here.

Now the controller looks like this:

[Authorize]
[HttpPost]
[ValidateInput(false)]
[ValidateOnlyIncomingValuesAttribute]
public virtual ActionResult Create
(
	[ModelBinder(typeof(NewTicketModelBinder))] Ticket ticket
)
{
	if(TicketService.CreateTicket(ticket))
	{
		RedirectToAction("Success");
	}
	else
	{
    	return View(new TicketCreateViewModel(ticket));
	}
}

All I had to do here was declare what binder to use and specify Steven's [ValidateOnlyIncomingValuesAttribute]. I was also able to clean up the controller a bit, because now I don't need the try/catch for binding failures either.

Now... one thing I still didn't like was that the custom model binder here is specific to each view model. There isn't a clean way to supply the "prefix" to the binder without some ugly reflection or such. But I didn't like the idea that every time I needed to bind this way I'd have to make a new custom model binder.

So I settled on a "convention" for this and created a single custom model binder that could be used with any view-model, as long as it followed the convention:

public class ViewModelBinder : DefaultModelBinder
{
    public override object BindModel(ControllerContext controllerContext, ModelBindingContext bindingContext)
    {
        bindingContext.ModelName = bindingContext.ModelType.Name;
        return base.BindModel(controllerContext, bindingContext);
    }
} 

Here, the binder requires that the property on the view-model be named the same as the entity's type. So if we want to expose a Ticket entity, we need to name the property in the view-model "Ticket" too.  Previously in my view-model I was using the property named "NewTicket", so I just change that to use property name "Ticket" instead. This way the ViewModelBinder can always find the right prefix name to supply for the default binder.

This still seems like an awful lot of work for a common scenario. I sure hope the MVC team gets all this worked out in the next release of the MVC framework to support this kind of situation more elegantly.

Using T4 templates to generate DataAnnotations buddy classes for EF 4 entity models

by Stephen M. Redd 1. July 2010 06:40

One of the more frustrating things for me lately has been leveraging the ASP.NET MVC 2 validation support using DataAnnotations with a generated Entity Framework 4 model.

Wow! That's was a mouthful! 

Anyway, it's great that the MVC Framework 2.0 can leverage DataAnnotations for both validation, as well as stuff like generating label text and such. Fantastic! But, for reasons that remain retarded, Microsoft chose not to include DataAnnotations attributes with generated EF 4 entity models. You can generate the models, and the models have attributes that mirror many of the ones DataAnnotations uses, but the attributes EF 4 generates are useless for use with MVC 2.0's validation features.

DataAnnotations does have a rather interesting mechanism that you can use to add the annotations to your generated model though. You can't add the attributes directly to the generated code of course, they'd just get blown away next time the code gen ran. And you can't add the annotations to a custom partial class that extends the entities because the partial classes cannot each define two properties with the same name.

Instead, what you are supposed to do is this:

  • Create a custom "meta" class (sometimes called a buddy class) that mimics the entity's public properties
  • Annotate the public properties in the custom meta class using attributes from Data Annotations
  • Create a partial class that extends the generated entity you are annotating
  • Flag the custom partial entity class with an attribute called MetadataTypeAttribute to link the entity class to the meta class where the annotations live    

Assuming you have a generated EF 4 entity named "Ticket", this will look something like this:

[MetadataType(typeof(TicketMetadata))]
public partial class Ticket
{
    //whatever extensions I might want to the entity

    /// 
    /// Class that mimics the entity so you have a place to put data annotations
    /// 
    internal sealed class TicketMetadata
    {
        [DisplayName("Ticket Id")]
        [Required]
        public int? TicketID { get; set; }
    }
}

So, we have a way to add the annotations to our generated model, but you have to manually code this up... which is a BIG pain in the ass honestly, especially for larger entity models. 

This is a job for T4, but I have no interest in modifying the core T4 templates that generate the actual EF 4 entity model. Fortunately Raj Kaimal was kind enough to blog his custom T4 template for generating metadata buddy classes.

This is a pretty slick template that auto generates meta classes by just looking at an EF 4 generated edmx file. The template is super simplistic. It doesn't account for complex types, and there isn't a built-in mechanism where you can custom override or control how the attributes are generated. For example, the T4 generates the display name by just word-splitting at the capital letters in a Pascal cased property name. So if you want a different display name, there isn't a way to override the generated display name. The template does generate the meta classes as partials, so you can implement your own extension to catch any of the complex types that the template cant automatically generate attributes for.

But because this template is so super-simple, it is a great place to start if you wanted to create your own template that can account for more complex needs specific to your own application.

In my case, I found the template useful for generating the buddy classes initially, but then I just removed the T4 template and customized the code it had generated directly. That way I could make my customizations without worrying about the T4 over-writing my changes later. The T4 saved me a boat-load of time though by just doing that initial code generation, and the code it produced was about 95% sufficient for the final product's data annotation needs.  

The annoyance of using data annotations with EF models comes up so often for me that I'm seriously considering writing a Visual Studio Extension, or my own complex T4 template to deal with the problem. Sadly, I doubt I'll ever find the time to actually write the code though. 

Tags: , ,

Filed Under: Code

Don't let unit testing push you around!

by Stephen M. Redd 22. April 2010 08:26

Test Driven Development (TDD) and the ever increasing trend towards Unit Testing as a religion among developers is probably a good thing for software quality in general. I've just  spent a couple of weeks trying to employ the recommended unit testing techniques with Entity Framework 4, and this has reinforced my misgivings about the wisdom of some of the  basic concepts behind how unit testing is supposed to be done.

I've always been somewhat lukewarm towards unit testing. I certainly don't do as much testing as I probably should, but I do agree that some unit testing is essential to developing a quality application. So, I always do some unit testing, but only to the degree that I feel the testing is appropriate or necessary.

But unit testing has grown in popularity over the last 10 years, to a point where it appears that common sense is not a part of the process anymore.

The source of my misgivings about this blind commitment to unit testing deals with one of the fundamentals; that a unit test should only test logic within the specific block of code that is under test.

I don't disagree with this principal, but the sticking point is that you have to fake your code's dependencies, especially the external ones like real databases and external services. In most modern applications, external dependencies are pervasive and sometimes very hard to avoid.

To support testing without hitting external dependencies, you generally design your code in a way that allows for dependency injection. There are several ways to do dependency injection, but the different techniques can range in complexity a lot. On the simpler side you can just use overloaded constructors or write methods so they take their dependencies as parameters. On the other end, there are horribly complex programming acrobatics such as the Inversion of Control (IoC) design pattern.

No matter which techniques you use, the overall point is that your unit tests are able to supply their own versions of any dependencies instead of using real ones. These mock (fake) dependencies will mimic the behavior external systems or other complex code routines.

I'm not opposed to making some concessions in my code in the name of enhanced testability. Indeed, many of the simpler dependency injection techniques lead to better code in general. But there are cases where simple techniques just don't work. This is where I start to have serious misgivings about the wisdom of unit testing.

I've always had the opinion that the use of fancy OOP tricks, complex inheritances in particular, should only be used if there are direct benefits to the application. Likewise, employing advanced design patterns is only worth the effort if the patterns solve issues within your application's actual problem domain.

To some degree, facilitating better unit testing can be considered a good reason to employ some of these more advanced techniques. Indeed, the TDD guys will usually tell you that unit testing is a part of the application's problem domain, and that a full suite of unit tests must be part of the delivered product. I do see their point, but I do believe that there are cases where it is misguided --and unwise.

In my opinion, you should never compromise your application's primary requirements just for the sake of unit testing. Testing is at best a secondary concern; an optional requirement when it doesn't interfere with other priorities.

But that doesn't stop developer after developer from doing insanely stupid shit to their code for the sheer sake of testability. I've seen so many otherwise simple applications become bloated and convoluted beyond reason, and often to the detriment to both developers and the application's functionality.

The most recent case of this that I've run into was with the next version of TicketDesk. I have a class library to encompass the application's core domain logic. Here I had planned to auto-generate my entity model from my database, add some data annotations and custom logic of my own, then just create a simple service API to expose the model to my web application; and in the future potentially other clients such as Silverlight.

All-in-all this is a pretty standard design. My entire reason for choosing the Entity Framework was to leverage the more advanced features, both within the library as well as within the client code consuming it.

After spending a solid two weeks exploring how to make Entity Framework 4 testable the correct way, I'd ended up with an insanely complicated hand-hacked variation of a standard generated EF model. But to support the necessary dependency injections, especially with the problematic EF ObjectContext, I had produced gobs of abstract interfaces and classes. Return types at the service API had become so abstract that the consuming clients had lost access to all of the special functionality of EF models; the very features that had made EF an appealing choice in the first place.

I got far enough into this to prove that achieving a useful level of testability could be done, even with the dreaded ObjectContext. But well before I reached that point, it was apparent that doing so would make my code into a complex and time consuming maintenance nightmare.

I still needed to be able to unit tests my custom logic and service layer though, but I was done with doing it "the right way". Instead, for my tests I pointed the ObjectContext at a real database server, used the built-in CreateDatabase method to generate a temporary fake database schema from the EF model, and wrote my tests against it.

My tests are able to add any fake data needed directly to the temporary database via the strongly typed and auto-generated ObjectContext. With this approach, I can still test the relevant code blocks, but without jumping through hoops to dodge a dependency on a database server.

Sure, it isn't technically the correct way to do unit testing... and yes, many of my tests do spend a bit of time talking to the external database, or managing the test data within it; but all of the extra complexity is encapsulated directly within the unit tests or helpers classes within the test project; not junking up my actual application.

I'm sure the more fanatical unit testing advocates, especially the TDD guys, would foam at the mouth at this approach, but I have nearly 100% code coverage in my tests (excluding generated code). And for me, the biggest advantage is that my application code is simple, elegant, effective, and highly maintainable...

Isn't that the whole point of unit testing?

Tags: ,

Filed Under: Code

ADO.NET Entity Framework: Impressive! Powerful! Useless!

by Stephen M. Redd 22. August 2008 18:57

The new Microsoft Entity Framework is the latest in a long line of very impressive, yet tragic failures in Microsoft's data access strategy...

The basics of ADO.NET are great. SqlCommand, SqlConnection and their relatives for other platforms... awesome. But almost every single version of ADO.NET has failed when it comes to useful higher level abstractions. To be sure, they each demo well and they each have uses with simple applications (like those you'd be shown a demo of).

But whenever it comes to complex applications, the abstractions tend to become cumbersome, restrictive, an inflexible. The result is that most serious application end up using 3rd party frameworks or custom abstraction layers instead and under-the-hood they tend to stick to the basic ADO.NET SqlConnections and SqlCommands to do the dirty work.

In .NET 1.x it was DataSets and SqlAdapters. In .NET 2.x it was DataTables and TableAdapters. And now we have the ADO.NET Entity Framework (EF).

I had high hopes for EF. MS had clearly recognized that a radical new approach would be needed if they were to achieve a useful abstraction without having to re-invent the same wheel over and over again every few years. They had also set some pretty good goals in terms of making EF useful in support of other data stores outside the classic relational database.

Sadly, what has been delivered in the 1.x version of EF is hopelessly crippled by the deliberate lack of implicit lazy loading.

Here is an example of what this means. Assume we have two logical EF entities that map more-or-less directly to physical tables in a database. One is the Order object and the other the Customer. These entities have a navigation relationship between each other (which is analogous to the physical database's foreign key).

If you get a reference to an Order object it will have a property called Customer. This property is how you'd navigate the relationship between the entities.

So you'd expect that if you look at MyOrder.Customer you'd get back a reference to an instance of the Customer entity... But you would be fucking wrong!

The Customer property on the instance of Order may not have been fetched from the database automatically when you obtained your reference to the Order...

Instead of implicit lazy loading, EF has "Deferred Loading"... or if you prefer you can call it "Explicit Lazy Loading". The idea is that you can check to see if the Customer has been loaded for an Order, and if not then you can explicitly load it when and if you need it. But it will not automatically load the data for these properties and related entities unless you explicitly tell the framework to do so (which is unlike most ORM frameworks, LINQ to SQL, etc.).

What happens in real applications is that you never know what has and has not been loaded. So your code is chock-full of bullshit like this:

if(!someOrder.CustomerReference.IsLoaded)
{
	someOrder.CustomerReference.Load();
}
string customerLastName = someOrder.Customer.LastName;

This allows you to do what you need to be doing in your code, but the price is that you HAVE to do this all over the fucking place... every time you want to access a property that traverses a relationship. You end up with more checks for data than you do data in the first place.

Even worse than that though, when you code something to use your EF model, now you now have to somehow magically "know" which properties on your entities are going to traverse a relationship and which don't.

If you know in advance that you are going need the related data later on, then when you fetch your entity you can a technique known as "Eager Loading" to tell EF to go ahead and load up the related data in advance.

This looks like this:

var x = entities.Orders.Include("Customer");

Again, this allows you to do what needs doing, but if you are making the fetched order entity available to other classes (like as a return value from a public method)... then the caller isn't going to know if you pre-loaded the relationships or not... so they'll still have to do the whole "IsLoaded" anti-tard checking shit anyway.

In his response to a nasty online petition called the "Vote of No Confidence", Tim Mallalieu defended the lack of implicit lazy loading with this statement:

"We took a fairly conservative approach in v1.0, because we wanted developers to be aware of when they were asking the framework to make a roundtrip to the database... our take on 'boundaries are explicit'."

That has to be the most depressing statement that I've ever read regarding ANY data access technology ever!

Hey guys!

The entire point of an abstraction layer is so that developers using that layer DON'T have to be aware of the damned internal workings under the abstraction layer!!!!

But most offensive to me is the overall fact that I cannot "trust" the EF model. For example... if I have a Customer entity then I have no way to know if Orders property contains an empty collection because there aren't any orders of if it is empty because the framework hasn't loaded data. Instead of being able to trust the entity model to be accurate I have to baby sit it and constantly ask "are you sure you loaded data for this already?".

Fuck that!

Then we get into the other side effect. All the mechanisms needed to do the paranoid checking-up after EF use some counter-intuitive techniques. Before I check the Customer property on an Order entity, I have to first check up with a CustomerReference property on my Order to get information about the state of the contents of the actual Customer property? Huh?

Yeah.... that's really slick there!

The eager load technique pisses me off even more!

So I can tell the EF to go ahead and load relationships... but to load them I have to use a method that takes a fucking string as an argument?!

So now I also have to be an expert on exactly what each navigation property in my EF model is specifically named... and that without strong type checking or intellisense? Sure... I can do that, but it slows me down and is just begging for a runtime bug (typoed the name or messed up the capitalization). That means I am constantly having to refer to the damned diagram all the time too which just slows me down and annoys the shit out of me at the same time!

Using EF without lazy loading is a good way to drive yourself into becoming so paranoid you'll need to remember to take your anti-psychotic meds before you even open Visual Studio!

There are a few 3rd party attempts out there to get implicit lazy loading features with the current version of EF. These are clever hacks, and I even tested out one of those. Overall, the hacks give you a much better experience than using the stock Entity Framework as is, but this is also code that will be hard to update to use any future releases of EF too, and these impose other limitations on your code too. I suppose though that if you HAD to use EF, you'd still be smart to use one of these 3rd party techniques to get the implicit lazy loading anyway.

LINQ to SQL may lack the ability to provide a true logical abstraction for your physical data model, or do fancy inheritance, or even handle some of the more unusual data mappings...  but at least you can TRUST that properties in a LINQ to SQL Entity will actually contain data that reflects what is in the real database. Plus the overall usage pattern of LINQ to SQL is much clearer and simpler.

Until EF gets built-in implicit lazy loading, screw it... I'll just use LINQ to SQL.

Tags: , , ,

Filed Under: Code

Powered by BlogEngine.NET 1.6.1.6
Theme by Stephen M. Redd