Sunday 20 March 2016

Why I Really Don't Like DataSets

Starting my career as a C# developer in a small software house with very little experience of the C# language and the .NET framework (as an undergraduate, most of my experience was with Java and a few PHP projects on the side), I often needed to get things up and running in a hurry. None of us were familiar with the best practices for .NET so we generally opted for the quickest solution. When connecting applications to databases, we generally used DataSets.
At the time, they seemed like a good idea. We could automatically build our data access layer and they helped to hide some of what was going "under the hood". Even in my current job, we still have a legacy code that based around DataSets, mainly for providing data to UI controls. The advantages they provided over plain old ADO.NET were:

Quick to Set up
Particularly if you have a large database, DataSets provided a quick way of setting up access to our application database. Today there are better ORM frameworks available, such as Entity Framework (which has improves significantly since the initial versions).
We could point the DataSet the relevant stored procedures and, hey presto! You've got a good chunk of your data access layer in place.

Winforms/WebForms Binding
DataSets made it pretty easy to bind data to controls. This was useful where we needed ComboBoxes that were populated from the database (e.g. list of types/statuses) or for DataGridViews that displayed search results.


Ultimately, these justifications are laziness more than anything else. If you're not comfortable with alternative mechanisms/frameworks, it's too easy to stick with what you know.
Now, I try to remove DataSets where possible and here are a few reasons why:

Clutter
Have you ever seen a DataSet with 50 or more TableAdapters? I have! Believe me, it wasn't a pretty sight. DataSets can become a real headache to manage when you have a large number of Table classes and TableAdapters.
If you want an example of how bad this can be, try mapping everything from the AdventureWorks2012 database in to a DataSet!
I've worked on DataSets that have this level of... clutter and they are not that easy to manage.
OMG! 70 TableAdapters? How is that even possible?
Storing The Connection
For a small application, this can make a lot of sense. An application scoped setting that any other datasets (or anything else pointing to that database, for that matter) can use.
The problem comes when you:

  1. Have multiple projects (in which case you have duplicated connections/connection strings)
  2. If you want to change connection based on things like build configuration
It's a problem you can't escape with auto-generated code, you're surrendering a certain degree of control

Extensibility
One of the biggest problems I've found with the Extensibility of DataSets (or, specifically, TableAdapters) is when you need to change the default timeout. In a connection string, you can set the connection timeout, but not the command timeout. This will default to 30 seconds. This poses something of a problem when you have complex searches taking place that may run for longer than this. Making changes to auto-generated files is not a good idea as your changes will be undone next time someone opens the designer and makes a change..
Changes to this file may cause incorrect behavior and will be lost if the code is regenerated.
Fortunately, partial classes can help. A solution I found a few years ago on Stack Overflow suggested the following:
By providing a property that will set the SelectCommandTimeout on all commands in the command collection, we can set the timeout to whatever we want. The only problem is that the same (boilerplate) code is required for every TableAdapter. Not happy with this duplication, I was able to clean it up slightly.
An extension method can apply the override to the command collection:

My biggest frustration is that TableAdapters only inherit from System.ComponentModel. My preferred solution would be to call an extension method on the tableadapter. If there were a common base class that that TableAdapter classes inherit from or a common interface they implement then it would probably be easier (although with the command collection not being exposed, I'd probably have to do something with reflection to find it)

These days I now avoid ORM Frameworks. In theory, they're great. In practice, they can cause headaches. My preferred solution is to use a wrapper (that we wrote ourselves) around ADO.NET to provide a things like mapping of results to classes that represent results (dealing with collections of objects rather than data tables).
It may take longer to write the classes, but it's all easier to maintain (besides, the code generation of ReSharper can speed up things like constructor or property generation)

Another advantage of using plain old ADO.NET is that passing interfaces (e.g. ICommand, IConnection etc instead of SqlCommand, SqlConnection) will make it a little easier to change database provider (e.g. going from Sql Server to Oracle)