Wed 7 Jun 2006
But it looks and sounds like a duck
I recently read an article by Bruce Eckel, published in Joel Spolsky’s very enjoyable and immensely entertaining The Best Software Writing I, in which Bruce referred to Duck Typing. Bruce was talking about Python, but I think the same thing applies to other languages such as Ruby and Smalltalk. The point is that they don’t rely on declared static types, but rather on Duck Typing, as in: if it looks like a duck and behaves like a duck, then we can treat it like a duck.
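To make the idea concrete, here is a minimal Python sketch. The class and function names are made up for illustration and are not taken from Bruce’s article. The function never checks what type it was given; it only assumes the object has a quack method.

```python
class Duck:
    def quack(self):
        return "Quack"


class Robot:
    # Not related to Duck in any way, but it offers the same method.
    def quack(self):
        return "Beep-quack"


def make_it_quack(thing):
    # No type check: any object with a quack() method is accepted.
    return thing.quack()


print(make_it_quack(Duck()))
print(make_it_quack(Robot()))
```

Pass in something without a quack method and Python raises an AttributeError at the moment of the call, not before.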
Now that is all well and good, and I have no argument. But it got me thinking about datasets in .NET. Microsoft loves datasets, and so does almost every author of books about ADO.NET.
But not me. I don’t like them and I don’t use them. And I gather that I am not alone – it seems that many (is it most?) application programmers get their data using some mechanism other than datasets.
I am not going to delve into the mysteries of datasets here, suggest that they are necessarily evil, or even
What I will do, however, is point out that the dataset is really a goose, camouflaged to look like a database, and therein lies the problem, or at least, one of its problems.
Microsoft describes the dataset like this: “The DataSet is an in-memory cache of data retrieved from a data source.” And that is correct, that’s exactly what the dataset is. And the good thing about it is that, unlike its predecessor, the Recordset found in DAO and ADO, it is disconnected from the database.
So when you fill a dataset with data, usually using the DataAdapter’s Fill method, you are creating an in-memory copy of a table, or of several tables and relations, contained in the database. But being a disconnected copy, anything you do to the dataset is not written back to the database until you explicitly tell it to, typically by calling the DataAdapter’s Update method.
Now don’t get me wrong here. This is a Good Thing. Disconnected datasets are a much better way to deal with data than a connected recordset.
The problem is the duck. It looks like a database table, and behaves like a database table, and in fact, it is a table. But it is a copy of the table as it was in the database at the time you filled the dataset. If you keep it around too long, the dataset and the tables in the underlying database can get out of sync.
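The same effect is easy to reproduce outside .NET. The sketch below is an analogy only: it uses Python’s built-in sqlite3 module with an in-memory database instead of ADO.NET, and the table and variable names are invented for the example. It copies a table into memory, changes the database afterwards, and shows that the copy knows nothing about the change.

```python
import sqlite3

# An in-memory database stands in for the real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE birds (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO birds (name) VALUES ('duck')")
conn.commit()

# The "fill" step: copy the rows into plain Python objects.
# From here on the snapshot has no link back to the database.
snapshot = [
    {"id": row[0], "name": row[1]}
    for row in conn.execute("SELECT id, name FROM birds")
]

# Meanwhile, something else updates the database.
conn.execute("UPDATE birds SET name = 'goose' WHERE id = 1")
conn.commit()

# The snapshot still holds the old value.
current = conn.execute("SELECT name FROM birds WHERE id = 1").fetchone()[0]
is_stale = snapshot[0]["name"] != current
print(snapshot[0]["name"], current, is_stale)
```

Nothing in the snapshot itself tells you it has gone stale; you only find out by going back to the database and comparing.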
Of course, this is a problem with all concurrent systems, but the difficulty with the dataset is that it looks so much like a duck that you can easily forget that it is really a goose.