Nearly all my programming accesses databases. In fact, the only application I have written in the past couple of years which didn’t grab data from a formal database was a tiny application for a client, which, unlike Rome, was built in a day. But even this application used some data. It was just that, in this case, the amount of data was very small and was unlikely to change often, if ever. But there was the possibility that it might change and so it needed to be accessible and not hard coded.
I can’t think of any reason, off the top of my head, to hard code data.
So, for this application, the data was put in an XML file. There were less than a dozen nodes. As I said, it was a very small amount of data.
But most commercial applications , and probably most other types too, rely upon a lot mor data than this. These days that data is usually kept in a relational database and the application has to have some mechanism for accessing that data. CRUD is the technical term – Create, Read, Update, Delete (although some people say the R stands for Retrieve) – and it is fundamental to most commercial applications.
The question then arises Which database should I use?
There are many databases and RDBMSs (Relational Database Management Systems) on the market. Somehow you have to choose one.
I have probably used MS Access more than any other, and for a number of reasons. First, I am used to it. I have been using Access since version 1.0, although I didn’t use it for long. It was awful. Version 1.1 was a big improvement, and each subsequent version has added something extra.
The other reason that I use it is that my client’s use it. Most of my clients have Microsoft Office installed. They already know, or else I can show them, how to use Access data in a Word document or an Excel spreadsheet.
There are two downsides to this. The first is compatibility. I have been programming in Visual Basic since Access was first launched. Each version of Access has its own version of the Jet database engine and they are not compatible. You can upgrade a database from one version of Access to the next, but going back is usually impossible. So if a client upgrades their version of Access and then updates the database then the application probably won’t work because it is using the wrong version of the Jet engine. This can be fixed by attaching the correct engine to the program and recompiling.
The other problem is that the client can change the database from within their version of Access. You can secure the database with various protection methods but the bottom line is, the database – its structure and its data – belong to the client. They can do what they like with it. Who am I to keep them out?
However, when choosing a database, I think the decision really boils down to the type of structure, the amount of data and the number of users. Generally, my view is that small is good for Access, big means a proper RDBMS.
Isn’t Access a proper RDBMS? Well, no, not exactly. And herein lies the difference. Access is, in fact, a flat file database. If you look at an MS Access database you will see that all the data, the forms, the reports, the queries, are in one file. You can separate the files by having another database and attaching it the first. And a number of Access applications are written this way. The data is kept in one file and the forms and reports in another.
But even in this case the data is all in one file. Access uses what is called an Indexed Sequential Access Method, or ISAM, file. Notice the word Sequential. That means that if you want to read the file you start at the beginning and keep going until you find what you are after. That is horribly inefficient.
So the file is Indexed. All you have to do now to find some data is look up the index, and this is much quicker, provided the index is arranged efficiently. Access uses a BTree, or Binary Tree to index data. A BTree cuts down on the number of searches you must make through your index file to find what you are after. As far as speed goes the number of searches is a logarithmic function of the number of items. This means that 100 items will take twice as long to search as ten, and a million will take 6 times as long. Note that the B in BTree stands for Binary, so the number of searches is based upon 2 rather than 10, but you get my drift.
Proper RDBMSs are not ISAM files and the way that data is accessed is quite different.
What this means for you is that the method used within a program to access data should be different depending upon the type of database.
When I use an Access database I grab one record at a time. (I will often get a lot of records and fill a list or combo box.) If I want to get data from two tables I get the key to the second table with a read of the first table’s data, then I send a second call to the database to read the second table. There are occasions when I do it slightly differently, but this is the usual method.
If I am getting data from say, SQL Server or MSDE, the Microsoft Desktop Engine, I am more likely to grab a bunch of data with the one call. For filling list boxes this isn’t any different than what I do with Access. But, if want data from more than one table I will construct the join in the program and let the RDBMS select the data.
Updates in Access are nearly always done one record at a time. In SQL Server I am more likely to make changes to a dataset and upload all the transactions in one call.
As I said earlier, for small databases MS Access is often the easiest way to go. But hopefully your business won’t stay small forever. As it grows your data, and hence your RDBMS, needs will change. To get the most out of your data you need to consider the most efficient way to process it.