Information Technology Consulting & Solutions

Home
Up
About Us
Our Clients
FAQ
Recent News
Tech Tips
Links
Mailing List
Contact Us
Client Access

Knowledge Base

DBMS 101

Copyright 2000 by Data Systems Consultants, Inc.

Here are the basic concepts, terms and programming practices necessary for programming database management system (DBMS) applications. You’ll find quite a few code examples here. These samples are written in Visual Basic syntax and use MS Jet objects for the database components. Nevertheless, the concepts provided apply to any language and most database engines. First the basics…

What Is A Database?

Simply put, a database is a collection of information that is arranged in accordance with rules that are both, specific to the information stored in the database and that generally apply to all databases. These rules determine the structure of the data and more or less determine the ease with which the data may be accessed and utilized. Based on this definition, a database could be a file cabinet full of documents or even a pile of paper on your desk. Of course it can also be a collection of files stored on a computer’s hard drive.

A database is managed by the data engine. The date engine is a software program that is responsible for all interactions with the database based on a set rules that you provide as well as rules that apply to all databases serviced by the engine. You can think of the data engine as a file clerk in charge of a collection of files. As with a real life file clerk, only the data engine is allowed direct access to the files. Individuals who need access to the information stored in the files make requests to the engine which then serves the request.

If the file clerk is attentive to her responsibilities, and you provide her with a quality set of rules, the file cabinet database will be arranged based on logical rules that allow for quick and complete access to the information stored in the cabinets. That pile of papers on the other hand has virtually no rules and finding information will be a bit more difficult and inefficient.

When defining databases, our goal is to make sure that the set of rules we establish for the data meets the following criteria:

  • The database must be arranged in logical units that allow complete and quick access to the information. In its simplest form, this might be a simple alphabetical arrangement such as might be done in our file cabinet example. However, it is very rare indeed that we have such a simple model to work with.
  • The database must be secure. This has three meanings in our context. First we must make sure that the data cannot simply be deleted at the file level. Second, we must make sure that only those personnel that are authorized for access are allowed that access. Third, we must make sure that the rules established for the database are not broken when data is added or modified.
  • The database design must be extensible and versatile. We usually cannot predict all of the needs that the database must meet during the initial design. This means that we must build in the ability to enhance and extend the capabilities of the data as new needs are identified.
  • Each discreet data set must be unique based upon the rules for the database as a whole. This means that each piece of information stored in the database must be accessible by a set of criteria that will result in one and only one piece of information being retrieved. This is critically important as you will see.
  • In general, no single piece of information should be stored in more than one place on a persistent basis. This means that the rules we structure for the database must allow for access to all data in a variety of different ways so that no single piece of information will have to be stored in more than one place.
  • The database designs and systems should be fault tolerant. This means that we add code and structures for automatic error and crash recovery.

When these criteria are met, we have normally created a relational database model and the system used to access the model is known as a relational database management system or RDBMS.

The Components of a Database

If we take a look at our file cabinet example, we might notice that there are several items used with the file cabinet that allow items to be stored in the cabinet. Let’s take a look at the components and relate them to the equivalent components in a computer database.

  • Database – Technically, the database is all of the components combined but we can also think of the database as the file cabinet without the drawers, files and documents. With computers, the database could be a container file such as the MDB files provided with Access or it may simply be a directory on a hard drive somewhere.

In reality, even this is a limiting description. There may not be a structured database at all. It may be comprised of several directories or MDB files and could even be on separate computers. The same is true of the file cabinet. A single file cabinet might contain all of the information that relates to a database but, more often than not, it is used in conjunction with other file cabinets to include a variety of different information. This comprises the true database. Indeed, the cabinet itself is merely a convenience in that it’s a nice place to put the drawers. But for now, when you think of a database, think of the file cabinet.

  • Table – A table is equivalent to a file drawer in our example. The table is used to store a collection of discrete but similar pieces of information. For instance, a file clerk might designate one drawer of a file cabinet to store letters sent to customers. Each letter is a discreet piece of information but it is similar to all of the other letters stored in the drawer. They all have the same purpose of relaying information to a customer and are all in a similar format.
  • Record – A record can be thought of as a file folder in our example. Just as a file folder can hold more than one letter, a record can hold more than one piece of information. Further, each piece of information is different but they all relate to each other within each discreet record.
  • Field – A field can be thought of as one of the letters in our example. Each field is different but all of the fields in a single record are related. In the same way, all of the letters in a single file folder are different but they are related in that they all have the same customer as the recipient. We’ll discuss the table, record and field relationships more below.
  • Structure – This usually refers to the structure of a record. A record is made up of fields that are of a certain type and maximum length. There can be a variety of other characteristics and types differ depending on the database format and engine in use.
  • Sort Order – A sort order is a set of rules that sorts the records in a particular way. For instance, our file clerk might arrange her files, the records, in alphabetical order by the customer’s name. The sort order in this case is the rule that says that files for customers whose names start with "A" go before files for customers whose names start with "B". Note that sort orders physically order the records, or files, in our file cabinet. This is an important distinction as you will see.
  • Index – An index is similar to a sort order in that it also establishes rules for the order of access to the records but it has some unique and important differences. First, an index does not physically sort the records in the database. Rather it uses an index key to determine the next record to access based on the rules of the index.

Let’s take a closer look at indexes. Suppose our file clerk needs to access her files in two ways, both by customer name and by customer number. She starts out by using her sort order tool and physically resorts the files in her cabinet in name order when she needs to access them by name and then resorts them in customer number order when she needs to access them by customer number.

Very quickly our file clerk realizes this is just not going to work. She’s spending all of her time resorting files instead of providing the information her boss needs. Since resorting the files is extremely time consuming, she knows she’s got to come up with a better way. In our computer world, physically sorting records in a database is also very time consuming so we need a better way as well.

Our file clerk is pretty sharp. She first pulls all of the labels off her files and replaces them with a unique file number. Next, she sorts her files in file number order. Finally, she makes two lists on a spreadsheet in her computer. One list orders the file numbers in alphabetical name order and the other list orders the file numbers in customer number order. Here’s part of her lists.

Customers by Name

Customer Name

File #

Acme Supply

102

Baker Components

765

Boston Iron Works

093

Cash and Porter

565

Customers by Number

Customer Number

File #

00001

254

00002

925

00003

645

00004

821

Now, when her boss asks for a letter from the Cash and Porter file, she refers to her list by customer name and finds that this is file number 565. She quickly opens the drawer and locates the file for her boss. Likewise, when the accounting department up the hall request all letters for customer number 00003, she consults her list by customer number and locates file number 645. She never again needs to resort files. All she has to do is keep her lists up to date as she adds new and removes old files.

What our resourceful file clerk has done is create two indexes. The first index has an ascending order based on the index key, customer name. The second index has an ascending order based on the index key, customer number. She still must resort or reindex her lists from time to time, but it’s much faster to sort the lists in her spreadsheet then it is to reorder the files in her cabinet.

Also, she can scroll down her lists much faster than she can go through the file cabinet drawers. She finds that it’s much quicker to glance at the entries in the list than it is to dig through those drawers. This is also true in our computer world. The computer must process much less data when "scrolling" through an index than when scrolling through records. Indexes contain much less data, only a key and a file number. In our case, that file number is a record number or bookmark. The term used for these unique bookmarks vary from engine to engine but, in all cases, the bookmark is a unique identifier used to specifically identify a single record. Note however, that we cannot rely on the bookmark to be persistent. In Jet, the bookmark is almost always different each time the table is opened.

Back to our file room. Our clerk’s index lists works for a while, but soon the lists are getting pretty long and it’s taking some time to find entries in them. She also notices something else. Usually, when a request is made for information, the files she needs to pull are related in some way. For instance, just the other day, she was asked to pull all of the files for heavy industry customers for the sales department. She realizes that if she were to use her master lists to create some secondary lists, she could speed things up a bit. So she creates a new list like this that is based on her master name list.

Customers by Name – Heavy Industry Only

Customer Name

File #

Acme Supply

102

Baker Components

765

Boston Iron Works

093

Cash and Porter

565

Acme Supply is an office supply company and Cash and Porter is a law firm so neither is a heavy industry and they are scratched from her secondary list. She now has a much shorter list to go through when finding heavy industry customers in name order. What our file clerk has done is to create a filtered index. Not only is the index ordered by the index key of customer name, but it’s also limited to only certain customers based on the customers’ type. She realizes that if this works for her name order list she can also create secondary lists for her customer number list. In fact, she can create secondary lists for other customer types too.

Once our file clerk completes these lists, she notices something else. While the filtered index has saved her time, she now has to maintain several lists, one for each filter. She realizes that she could accomplish the same thing if she were to add a sort criteria to her master list for industry type. So she goes back to her master list and adds an industry type classification for each entry. Then, she resorts first based on the industry type and second based on the name or number of the customer. Here’s part of her name order list with the new classification.

Customers by Classification and Name

Customer Name

Type

File #

Baker Components

HVY

765

Boston Iron Works

HVY

093

Cash and Porter

SRV

565

Acme Supply

SUP

102

Note that the names are no longer in overall order but are within each type classification. However, they are in name order within each type classification. Our file clerk has created a segmented or multi-key index. That is, there are at least two field values used to establish the index sort. In this case, the records are first sorted by Type and then by name.

Okay that’s a quick look at indexes. Let’s move on.

  • Relationships – Relationships relate two or more tables together based on a common piece of information. As you’ll see, there are a variety of different relationship types.
  • One-to-one Relationship – Our file clerk got a new request the other day. This time, she needed to pull not only the correspondence to a customer but also the correspondence from a customer. Her letters from customers are stored in a totally separate drawer (table) but she has taken the time to create lists for these files as well. She realizes that it would be very handy if she could refer to a single list to pull both files for a customer. So back she goes to her spreadsheet and creates a new list. Here’s part of it.

Customers by Number – Incoming and Outgoing Correspondence

Customer Number

Outgoing File #

Incoming File #

00001

254

012

00002

925

654

00003

645

 

00004

821

957

Now, when she needs to pull both files for a customer number she has both of the file numbers handy with only one lookup in her lists. She has created a one-to-one relationship, based on customer number (the common piece of information), between the tables that store incoming and outgoing correspondence. That is, for each outgoing file, there will be a maximum of one and only one incoming file when based on customer number.

If our clerk were dealing with computer databases, here is what the data would look like.

Outgoing Table

Incoming Table

Customer Number

Outgoing File #

Customer Number

Incoming File #

00001

254

00001

012

00002

925

00002

654

00003

645

   

00004

821

00004

957

She also notes that there is no entry for an incoming file for customer number 00003. That is, customer 00003 has not sent letters to her company but has had letters sent from her company. "Hmmm…", she ponders, "Is this going to cause a problem?" She quickly determines that it does not as long as she understands that she need not look for incoming letters when there is no entry for the incoming file number.

We also have to be aware of this. In our case, depending on the data engine we’re dealing with, we’ll either find a Null field value when there’s no match or the table that has no match will be at end of file. Just as our file clerk must make sure to not look for a file that has no number, we must make sure that we don’t look for a field value that doesn’t exist.

"Okay, that’s pretty slick!", she says. "I should do the same thing for my customer name list." She tries this but quickly runs into a problem. Here’s the part of her list that just won’t seem to work.

Customers by Name

 

Customer Name

Outgoing File #

Incoming File #

Rowen, Inc.

652

824

Smith, Inc.

???

???

Smith, Inc.

???

???

Smith, Inc.

???

???

She finds that her customer list by name has three entries for Smith, Inc. That is, her company has three separate customers that are all named Smith, Inc. She’s not sure which incoming file goes with which outgoing file and realizes that this just isn’t going to work. She now knows that for a one-to-one relationship to work, the common piece of relating information must be a unique identifier in both tables. If the identifier is not unique, the one-to-one relationship can relate incorrect data.

  • One-to-many Relationship – Another request comes in for our file clerk. She’s asked to pull all of the invoices for a customer. Invoices are stored in yet another file drawer and are sorted by invoice number. She realizes that she must go through each invoice to see if it was billed to this customer. "There has to be a better way!", she says. After pulling the invoices she sits back down at her computer and starts a new list. Here’s part of it.

Customer #

Invoice #

00001

102125

 

103254

00002

102965

 

102966

 

104754

Now, by looking up the customer number, she can quickly find the invoice number of each invoice billed to that customer. She has created a one-to-many relationship. That is, for each customer record there are zero or more matching invoice records. If our clerk were dealing with computer data tables, here’s what they would look like.

Customer Table

Invoice Table

Customer #

Customer #

Invoice #

00001

00001

102125

 

00001

103254

00002

00002

102965

 

00002

102966

 

00002

104754

As you can see, the common field for the relationship is the customer number. In the customer table, there will be one and only one record with a particular customer number. But, in the invoice table, there may be zero to an unlimited number of records for a particular customer number.

  • Many-to-many Relationship – This type of relationship relates many records from one table to many records in another table. In this case, our file clerk is asked to not only pull all of the invoices for a customer but also locate all of the purchase orders for the same customer. So she starts another list. Here’s part of it.

Customer Number

PO #

Invoice #

00001

64515

102125

 

86544

103254

00002

456465

102965

 

4454654

102966

   

104754

In this case, there are zero or more purchase orders and zero or more invoices for each customer. That is, there are two one-to-many relationships. Together, the two one-to-many relationships make a single many-to-many relationship. Here’s what the computer data would look like.

 

PO Table

Customer Table

Invoice Table

PO #

Customer #

Customer #

Customer #

Invoice #

64515

00001

00001

00001

102125

86544

00001

 

00001

103254

456465

00002

00002

00002

102965

4454654

00002

 

00002

102966

     

00002

104754

Note that there is no particular relation between each individual purchase order record and each individual invoice record. Effectively, each record in the PO table is related to all records in the invoice table that have the same customer number and vice versa. Also note that there are always at least three tables used in a many-to-many relationship and the joining table always has information that is common to both tables. In this case, the joining table is the customer table and the common information is the customer number.

  • Complex Relationships – Complex relationships are comprised of more than one relationship of the types listed above. One of the more common of these is the one-to-one-to-one relationship. For instance, suppose our clerk wanted to take the many-to-many relationship above one step further. Suppose she actually needed to know which purchase order goes with which invoice. How would she do this? Simple. By combining two one-to-one relationships to create a one-to-one-to-one relationship. Here the joining table has to have a unique one-to-one relationship with the two outer tables.

The relationship might look something like this.

PO Table

Joining Table

Invoice Table

PO #

Customer #

PO #

Invoice #

Customer #

Invoice #

64515

00001

64515

102125

00001

102125

86544

00001

86544

103254

00001

103254

456465

00002

456465

102965

00002

102965

4454654

00002

4454564

102966

00002

102966

     

104754

00002

104754

The joining table acts to relate each individual purchase order to one and only one individual invoice. Note that the customer number is no longer used in the relationship. In reality it probably would be and would be used in conjunction with the PO #. Do you know why? Think about it. The PO number comes from our clerk’s customers. That means that there’s no guarantee that each PO number will be unique. But, a combination of our clerk’s customer number and the customer’s PO number should be unique. This is a very typical example of a segmented index key.

  • Joins – Different name, same game. SQL (Structured Query Language usually and incorrectly pronounced sequel) calls a relationship a join. In SQL, there are two basic join types, inner joins and outer joins. Inner joins are further categorized as Full Inner Joins, Left Inner Joins or Right Inner Joins. Here’s this list of joins and their equivalents to the types listed above.

Full Inner Join – Often simply called Inner Join. A one-to-one relationship with a twist. That is, include only those records in the left table that have corresponding records in the right table and include only those records in the right table that have corresponding records in the left table. This means that only records from both tables that match on the joining condition are included from either table.

Left Inner Join – A one-to-one relationship that includes all records from the left table and only those records that match the join condition from the right table.

Right Inner Join – A one-to-one relationship that includes all records from the right table and only those records that match the join condition from the left table.

Outer Join – A many-to-many relationship with a twist. This type of join requires only two tables to execute but has absolutely no restrictions on records to include. That is, all records in the left table and all records in the right table will be included regardless of whether or not they match the join condition.

To duplicate the many-to-many relationship described above you again need three tables with a left inner join and a right inner join.

  • Filters – Filters allow a limited number of records to be returned from a table. SQL calls this a where condition. The advantage of a filter is that it eliminates a lot of extraneous data that would otherwise have to be tested and discarded in code. This elimination is done by the data engine, usually at the server. Thus, this is a very efficient way to limit the number of records returned by a request. For instance, if you had a table of addresses, you could request a filter where State = "OK". All records that are not noted as Oklahoma addresses would be automatically discarded when the data is retrieved. The filter condition must always be a logical expression, meaning it returns either true or false. Also, for most engines, for performance reasons, it’s helpful if there is an index in the underlying table that matches with the filter condition. For instance, in the example above, we would want to make sure that the address table had an index on the state field.
  • Queries – Simply put, a query is nothing more than one or more fields from one or more tables joined by one or more relationships ordered in some particular order with zero or more filters. That’s simply put, but defining them is not so simple. Maybe there will have to be an SQL 101 but we’ll put that off to another time.
  • Edit Buffers – Edit buffers provide temporary locations in memory to edit the current record’s field values. Note that these locations are in memory, not on the disk. In most of the database formats we use, an edit must be committed to be written to disk. That is, you change the buffer values and then instruct the system to flush the changes to the record on the disk. Some systems will automatically flush the changes when the record position is changed while others scrap the changes. Because of this, it’s important that we specifically either flush the changes or scrap the changes in our code before moving the record position. This way, our systems are data format independent.

Some Basic Rules and Philosophies

Okay. Enough with definitions. Let’s take a look at some rules and philosophies. First a quick review of the criteria we have established for a well designed RDBMS.

  • The database must be arranged in logical units that allow complete and quick access to the information.
  • The database must be secure.
  • The database design must be extensible and versatile.
  • Each discreet data set must be unique based upon the rules for the database as a whole.
  • In general, no single piece of information should be stored in more than one place on a persistent basis.
  • The database designs and systems should be fault tolerant.

These criteria tend to force a few rules on us. Let’s take them one by one.

The database must be arranged in logical units that allow complete and quick access to the information.

What does this mean and how do we meet this criteria? Let’s break this up too see.

The database must be arranged in logical units…

Logical units in this case are records and tables. By logical we mean that the structures of the records and tables make sense. For instance, if we were storing information about widgets and wing-dings, two totally different things, we would not store this information in the same table unless it made sense to do so.

Widgets will demand one particular field structure while wing-dings will require something totally different. So this information would be stored in two separate tables. By storing this information in two different tables, the record structures for those tables will be somewhat self evident, i.e., the structures will reflect the unique characteristics of each object type. The short version. We don’t try to fit a square peg into a round hole. It’s much easier to cut a square hole, that is, add another table.

…that allow complete and quick access to the information.

Here we’re usually referring to indexes and queries. When we design a table, we also want to consider the logical ways this table will need to be accessed. What are the typical orders we will need to see this data in? Is there related information that will need to be seen in conjunction with this data? The answers to these questions tell us what indexes and queries are needed for accessing this table.

The database must be secure.

The first security issue is more one of network file security. Nevertheless, we cannot assume that our customers have the skills necessary to establish this security so we do it for them.

The second security issue has to do with internal database security and also relates back to the original table and record designs. What security will be required for this information? Am I mixing open data with classified data in the same table? We obviously must not do this and must accommodate these issues in the original design. Last is the security during run time. We must provide, either programmatically or internal to the database itself the necessary security required to meet the customer’s needs.

The third security issue has to do with data validation. In general, we never write data to a production table until it has been fully validated. To accomplish this we must do three things. First we have to establish exactly what the validation rules need to be during the design of the table. Second, we must provide a temporary editing location for the user input of data especially if the edit involves multiple records. Third we must write validation procedures that test for the validation rules before the data is written.

The database design must be extensible and versatile.

This has two meanings. First we must design the database in such a way that adding and relating additional tables and fields is a simple process. Second, we must accommodate changes in user policies by the creation of user editable default tables rather than hard coding values into the systems. This means writing more code but provides exceptional and very rare versatility to the client.

Each discreet data set must be unique based upon the rules for the database as a whole.

In two words, this means unique IDs. In general, we almost always provide a system generated unique ID for each record in a production table, we ALWAYS relate tables on system generated unique IDs and we NEVER allow a unique ID to be reused or to be derived from user entered data. This makes our lives much simpler later on.

An example. Suppose we were developing a system that tracked various types of information about individuals. Also suppose that the information being tracked required two or more tables for storage. We might immediately note that an obvious identifier for each individual would be his social security number. Great! We add a field for the SSN to the primary individual table. Now suppose there’s another table that contains the various addresses for each individual. Most people have a couple, their home address and their work address, and they can have more. Simple, we create a table that’s related to the primary individual table on social security number. Done!

Now, here comes the problem.

By definition, because we’re using it to relate in a one to many fashion to other tables, the SSN in the primary table must be unique. That means that the SSN entry in the primary individual data cannot be duplicated for any two or more individual records. Okay, no problem, we have to add a validation rule to make sure the entered SSN is unique.

Now suppose a user does enter a duplicate SSN and further suppose that the existing record, not the new one, is the one that’s in error. First, we can’t allow the save of the new record because we must not create a duplicate ID, even though the new record is correct in every way!

Second, to correct the error, we must change the SSN in the original record and we must change all records that contain that SSN in all related tables. What if we can’t change them all? What if one’s locked by another user?

Worse, what if we change all the data in all of the tables except that new table we added in the last rev? Shucks, we just forgot to modify this one little save routine but the ramifications are potentially huge. This means that we will orphan what is perhaps critical data for the existing individual and still worse, we’re going to link that orphaned data to the new individual! There is no indicator that will catch this. There’s nothing wrong with the data, it’s just related to the wrong individual and there’s absolutely no way the system or the user can know this.

To prevent this from happening we must toss the whole edit until one or the other user finishes with the related record and we must be certain that we catch every occurrence of the SSN field when we make changes. Simply put, we are complicating our lives horribly when we have to change more than one record for a simple edit of a single field.

Now lets suppose we do it the right way. We’ll keep the SSN field in the primary data and we’ll add another field, an auto number field, to the primary table as well. Let’s call this field UID. The user may never see this field and most certainly cannot change it. It’s strictly used to relate the primary individual data with the secondary individual tables. Since this is an auto number field, it is by definition unique.

Now, the user enters the new individual and the SSN entered is duplicated with an existing record. Again, the existing record is the one that is in error. No problem. We pop up a warning box that says, "The SSN you entered is already in the system. You can save this record now and correct the problem later or you can click the Edit button below to edit the existing record."

We’ve given them two choices. Let’s look at the first.

The user chooses to save the individual with a duplicate SSN. This is not correct but it’s not a catastrophic problem. We still have our internal UID to keep things straight in the related tables. At worst, it means that the user’s going to see two SSNs that are the same in a search window. Not right but not a big deal and this will take place only until the existing individual’s SSN is corrected.

Here’s the second option. The user chooses to edit the existing record. Okay, we pop a new edit window up over the first and allow the user to change the SSN in the existing record. The user chooses to save the record. None of the related records for the existing individual are affected in any way. They are still related on the UID field to the primary individual table and that UID NEVER changes. What was a very complicated situation has been drastically simplified simply by the addition of a single auto number field that we control.

This concept is a critical part of an overall concept called referential integrity which simply means that the rules for relating data are not broken. By always using internal UIDs this becomes a simple task.

In general, no single piece of information should be stored in more than one place on a persistent basis.

In another two words, related tables, but along two separate concepts. First concept, related data.

By related data I mean a set of tables that when taken as a whole fully represents an object. For instance, we saw in the example above how it is possible for a single individual to be related to an indeterminate number of addresses. Given this example, we have a couple of choices when we define the tables for this database. We can:

  1. Limit each individual to having, say a maximum of three addresses, by adding three sets of address fields to the primary individual table.
  2. Allow an unlimited number of addresses by creating a related table that is related on the individual’s UID field.

Let’s look at the ramifications of both. First option 1.

  • We’re allowing up to three addresses for each individual which will certainly accommodate the vast majority of people but maybe not all.
  • Given the fact that most individuals may need only one or at most two addresses, we could be wasting a lot of disk space storing empty fields for unused address entries.
  • The interface for this situation will be difficult to implement. We’re going to take up a bunch of screen real estate for all of these fields or we’re going to have to create a complicated interface to accommodate the other addresses.

Option 2

  • We’re allowing any number of addresses for any individual. That’s the most versatility we can provide. The client will never come to us asking for more address space.
  • Because most individuals will have one or two addresses, and the rare exception will have more than two, we’re actually going to take less disk space because we’re not going to be storing empty fields.
  • The interface for this implementation is simple. We have controls bound to the address table fields with a simple interface for scrolling through and adding to the individual’s addresses.

This same philosophy can be applied to all sorts of related data, phone numbers, bank and investment accounts, you name it.

Second concept, associated data.

By associated data I mean data that represents an object in and of itself but that is still associated to another object. And example of this might be a data record that represents a subscription to a magazine. In this example, there would be an individual related to the subscription record for retrieval of mailing information. Now let’s consider two separate designs, one that is cumbersome (WRONG) and the other that is simple (RIGHT).

First the wrong design:

In this design the magazine subscription record has the recipient’s name and address stored within it as well as the expiration date, the start date, the periodical to be mailed, etc. Let’s say this subscription is for Time magazine.

Now, the individual moves and sends us a change of address notice. No problem, we retrieve this subscription and modify the address as required.

The only rub is that we are a mailing service for Fortune magazine as well and this individual takes that magazine too. Since we changed only the sub record for Time, our customer’s not going to get his Fortune. We’ve got at least one more record change to make for the Fortune subscription and maybe more. We don’t really know and we won’t find out until our angry customer calls to let us know.

Now the right design:

In this design we still have a subscription table that holds information related to the subscription, that is, the periodical, the expiration date, etc., and one other field. A relating ID to a separate individual table.

Okay, we get the change of address notice, pull up the Time subscription and edit the address as required. However, what we are actually editing is the record in the individual table, not fields in the subscription table. Now, since the Fortune subscription record is also related to this same individual record, Fortune is automatically going to be mailed to the new address as well. The customer’s happy and the clerk who made the change had to make only one change to accommodate both subscriptions.

Now let’s expand on this right way by adding one other feature for related data that we’ve already been using. That is, the concept of multiple related addresses per individual. Suppose that our subscriber wants Fortune sent to his office rather than to his home because he’s trying to impress the boss.

Still no problem. We simply add another field to the subscription record for a unique address ID. Now our mailing system looks in the subscription data to determine the subs that need to be mailed. Next, it looks in the individual data, based on the unique individual ID stored in the sub record, to get a name to mail to. Finally, it looks in the address table, based on the unique address ID stored in the sub table, to get an address to mail to.

One other point to make here, remember that our subscriber is moving and wants only one of his subscriptions changed, or more accurately, he wants all of the subscriptions mailed to his home to be mailed to the new address. Okay, by changing the home address entry in the address table we do exactly that. However, we have not affected his Fortune subscription. It’s still going to be mailed to the office because it’s related to a different unique address.

Now, I said above that the simple way was the right way. Yet you may be saying, "Wait a minute! Having three related tables is much more complex than having a single table! What do you mean simple is right?" True enough. This is a more complex data structure. However, the implementation is simpler both from a programming standpoint and from a user standpoint.

We almost always prefer this because we define the structures only once but may implement them any number of times in code. Further, the user will use it many more times than we will program it and our solution is the most versatile solution that we can provide for the client. So, the rule is, first make it simple and versatile for the client, second, make it simple to implement in code even if this means it is complex to design up front.

This is why we must spend a great deal of time up front when defining database structures. We must always consider worst case scenarios and accommodate them as well as we can before we start coding the application.

The database designs and systems should be fault tolerant.

This philosophy has several contexts. First, when updating, data we treat it as an all or nothing proposition. Because of the design of our databases, it’s very common for us to update several records from several tables as part of a single user edit. As noted above, we want to avoid this as much as possible but it still happens a lot.

To be fault tolerant, this sort of update must be either completely written or not written at all. We must always ensure that we never update only a part of an edit. Usually this is done by wrapping multiple record and table updates in transactions. This allows us to write and verify all of the updates and then commit them all as a single transaction. Likewise, should something fail, we can rollback the changes as a single transaction. This implies a rule; Whenever updating multiple records as part of a single edit, always wrap the updates in a transaction.

We also provide fault tolerance by writing our systems for automatic crash recovery. Crashes can be the result of a variety of causes. These can be things as pedestrian as simple power failures to more complex problems like system lockups. In all of these cases, we want to provide for automatic recovery if it is feasible to do so. There are some tools that we can use to provide this. Consider the following:

We provide a form in a system that adds and updates several records and tables simultaneously. Let’s say this form adds and edits customer invoices. First, note that we’ll have a one-to-many relationship between a table that contains the invoice header information. Let’s call this one Header. We’ll also have a table that contains the line item detail information for each invoice. Let’s call this one Detail. There will always be one or more detail records for each header record and the tables will be related on a unique invoice ID. These tables are stored on a server somewhere.

Okay. Let’s consider how we can make this system fault tolerant. That is, let’s find a way for this system to recover exactly where the user left off in the event of a system crash or power failure.

First, though the user sees this as a single edit of a single invoice, we must treat it as an edit of multiple records in two separate tables. In spite of this, we still want the user to see this as a single edit. If this edit is going to be able to recover automatically, it has to store the partial entries in some location that is persistent. We know that we must not store a partially entered invoice in the tables on the server. Se we need temporary locations where the user may enter data.

To do this, we create two temporary tables on the user’s hard drive and give them names that will allow us to detect them when the system is restarted after the crash. When we create them, there will be no records in either table but we know we will have one and only one header record and at least one detail record so we go ahead and add them now. This gives the user a starting location for the required edits.

Also, because only this user has access to these temporary tables, we can give her exclusive access meaning we don’t need to worry about user contention and record locking. Okay, the user begins her entries and, as she edits, we write the changes she’s making to these temporary tables. Let’s suppose, she’s entered all of the header information and half of the line item details. Along comes some doofus and kicks her computer’s power cord out of the wall. Obviously, without power, our user’s system has crashed and crashed hard. She reboots, logs back in to the server and restarts our application thinking she’s going to have to reenter all of that information. But we’re going to surprise her.

When she restarts the edit invoice form, we’re going to look in that temporary table location and find the two temporary tables from her interrupted edit. The data she’s entered to this point is still there. All we need to do is ask her if she would like to continue. So we give her a yes no message box that says, "We’ve detected an incomplete invoice edit. Would you like to continue?"

If the user answers yes, we open these temporary tables and magically (to the user anyway) all of the entries she’s made appear in the form. She can now complete the line item entries and save the invoice to the server tables.

If she answers no, we can still open these tables but we delete all of the entries made. She’s now working with a fresh invoice entry.

Once the user saves the invoice, that is, once we’ve validated her entries, started a transaction, written her temporary entries to the server tables and committed the transaction, we delete the temporary tables and close the form. The next time the form is requested, it will not find temporary tables and will create new ones for the user to edit.

Now let’s consider the advantages of this technique.

  • We can provide automatic recovery for the user in the event of a system failure. Users love this! It’s an indicator of a well thought out and user oriented system. This means that we’ve taken the extra effort to make the user’s life just a little bit easier and we save her some time. Users’ bosses love this!
  • We are protecting the production data on the server from invalid entry by allowing the user to edit to temporary tables. Remember she sees this as a single invoice edit but we have to manage multiple records. If we require validation and correction as she goes along, she’s going to get really frustrated with the system. The time to tell a user there’s a problem is after she’s entered all of the data. Not during the entry.
  • We are also ensuring that a partial entry can’t be made to the production data. Consider this. Suppose when entering this data, the system was physically writing to the production tables on the server and that doofus still comes along and kicks out the power cord. At this point, we have half of an invoice in the production data and probably no way to detect it. What’s the user going to do? She’ll start over and enter a completely new invoice. Now we’ve got a partial duplicate invoice and a mess that we have no way to detect.
  • By editing to local tables on the user’s hard drive we reduce network traffic during the edit. When we save the records to the server, we’ll be performing a burst transmission. That is, for a very short period of time, the network traffic will spike. It’s better to have short spikes of very high network traffic than a continuous moderately high traffic rate. The overall performance is much better.

So let’s come up with some rules that will always provide this robustness.

  1. Anytime a single edit involves multiple records, the edits must be made to temporary tables on the user’s hard drive. The user must have exclusive access to the temporary tables and each field edit must be immediately written to the temporary table if disaster recovery is to be provided.
  2. Generally speaking, all validations to multiple record edits are to be done on request to save the edit. There are special circumstances where this is either not possible or not desired but they are rare.
  3. All multiple record saves to production tables are to be wrapped in a transaction to insure an all or nothing update has been made.
  4. Even when editing a single record, it is often desirable to perform the edit in a temporary table. This would be done when the single record edit is complex and disaster recovery is desired.

Data Access and Optimization Techniques

If we’re not careful, it’s very easy to code an application that runs like molasses when a little extra attention can result in an application that runs at mach one. This section points out some potential pitfalls to watch for that can have drastic effects on application performance.

The Slowest Of the Slow

By far, the slowest single operation that can be performed on a database or recordset is opening it. So, if an application needs to repeatedly access a database, it is best to open it once and leave it open for subsequent access. In the VB world this usually means creating a commonly used form the first time it’s requested then, rather than destroying the form and recreating it on the next request, simply make it invisible and keep it in memory for the life of the application execution.

Important! This technique works only for forms that are allowed to have only a single instance during the run of the application. Forms that may need to be run multiple times at the same time must be created and destroyed as required or there will be no data encapsulation, huge memory leaks and subsequent crashes.

Loops Are a Killer

Always keep the operations within a loop to the absolute minimum necessary to get the job done. Here’s an example of some code that’s not well thought out followed by some code that is. Both will work without crashing but the difference in performance for the second example would be orders of magnitude faster than the first example.

Private Sub ImADummy()
' Illustrates how NOT to code loops.
Dim i as Integer
Dim dbsSomeData as Database
Dim rstSomeRST as RecordSet
For i = 1 to 1000
' Open a database to add a record.
Set dbsSomeData = OpenDatabase("c:\junk\SomeData.MDB")
Set rstSomeRST = dbsSomeData.OpenRecordset("SomeTable")
' Add a new record to the database and replace counter field with i.
With rstSomeRST
.AddNew
!Counter = i
' Update the recordset
.Update
End With
' Close the recordset and the database.
rstSomeRST.Close
dbsSomeData.Close
Next i
Set rstSomeRST = Nothing
Set dbsSomeData = Nothing
End Sub

See anything wrong here? This code will run without error but compare it to this next routine.

Private Sub ImASmarty()
' Illustrates HOW to code loops.
Dim i as Integer
Dim dbsSomeData as Database
Dim rstSomeRST as RecordSet
' Open a database to add records.
Set dbsSomeData = OpenDatabase("c:\Junk\SomeData.MDB")
Set rstSomeRST = dbSomeData.OpenRecordSet("SomeTable")
With rstSomeRST
For i = 1 To 1000
' Add a new record to the database and replace the counter
' field with i.
.AddNew
!Counter = i
' Update the recordset.
.Update
Next i
End With
' Close the recordset and database.
rstSomeRST.Close
rstSomeData.Close
Set rstSomeRST = Nothing
Set rstSomeData = Nothing
End Sub

‘ Close the recordset and the database.
rstSomeRST.Close
dbsSomeData.Close

End Sub

It’s worthwhile to note all of the differences in these routines.

  1. There are nine operations within the for loop in the first routine. The second has three. Worse, two of these operations are the slowest that can be done to data objects, that is, opening the database and recordset. This means, that the database and recordset must be opened and closed one thousand times while the second routine opens and closes them only once.
  2. Overall, the operation count for the entire run of both functions compares this way:
  3. ImADummy – 11,003 Operations
    ImASmarty – 5,009 Operations

    That’s a difference of 220%. However the difference in time will be much greater because of the opens in the loop.

  4. Note the care taken in the second routine to keep operations outside of the loop. Even the With / End With statements are outside. These are the least expensive in clock ticks to run of any of the statements here, but, if they were inside the loop they would add another 1,998 operations to the ImaSmarty routine.

Obviously, it’s very important to write loops in an as optimal a way as possible.

Is Tinkerbell Alive?

If you remember Peter Pan, then you know that Tinkerbell glows when she’s healthy and if she’s not glowing… Well, she’s dead or at least not well. While our application is processing loops we need to let the user know that Tinkerbell is still alive and well.

Consider this next code snippet:

nFound = 0
' Count the number of occurrences of cSomeValue in rst.
With rstSomeRST
' Find the first occurrence and loop for all other occurrences.
.FindFirst "SomeField = '" & cSomeValue & "'"
Do Until .NoMatch
' Found one.  Increment the counter and find the next one.
nFound = nFound + 1
.FindNext "SomeField = '" & cSomeValue & "'"
Loop
End With
' Let the user know how many we found.
MsgBox "We found " & nFound & " occurrences of " & cSomeValue & "."

On first inspection, this looks like well written and efficient code since we’ve kept the number of statements in the Do loop to the absolute minimum. However, the number of times this loop must execute is indeterminate. It depends on the number of occurrences of cSomeValue in rstSomeRST. This can be anywhere from 0 to the number of records in the recordset which might be thousands, millions or even billions of records.

Since we don’t know how many times this loop will iterate, we also don’t know how long it will take to process the loop. This could be anywhere from nanoseconds to hours so we need to let the user know that things are still working while the loop is processing.

Users are willing to wait an amazing amount of time for a process as long as we distract their attention and keep them entertained. In general, something needs to move. This might be a progress bar, some text value on the form, a bouncing ball; anything at all will do as long as it let’s the user know that Tinkerbell is still alive. Without this indicator, users will get frustrated and quit after only a few seconds. Here’s a new version of this code that does this.

nFound = 0
' Let the user know what's about to happen.
SetStatusText "Finding occurrences of " & cSomeValue & ". Please wait..."
CursorMode = vbHourGlass
' Count the number of occurrences of cSomeValue in rst.
With rstSomeRST
' Find the first occurrence and loop for all other occurrences.
.FindFirst "SomeField = '" & cSomeValue & "'"
Do Until .NoMatch
' Found one.  Increment the counter and find the next one.
nFound = nFound + 1
' Update the status bar every tenth time through.
If nFound Mod 10 = 0 then
SetStatusText "Finding occurrences of " & cSomeValue & _
	". Please Wait... " & nFound & " found so far."
End If
.FindNext "SomeField = '" & cSomeValue & "'"
Loop
End With
' Reset the mouse cursor and the status bar.
CursorMode
SetStatusText "Processing Complete."
' Let the user know how many we found.
MsgBox "We found " & nFound & " occurrences of " & cSomeValue & "."

Let’s look at the differences. First, note we’ve added some code inside the do loop. While this is going to increase the total processing time, it’s a trade off between letting the user know things are still running versus just processing in the background with the likely appearance of a system lockup. As long as they see that things are progressing, they’ll watch that number count up for a very long time indeed.

In this case, we’ve compromised adequately between fast performance and distracting the user. The status bar will be updated every tenth time through the loop rather than every time. This is done because the SetStatusText routine has substantial error checking that takes some time to run. Also, this loop will run fast enough that if the text is updated every time the loop iterates, the user won’t have time to read the status before it’s changed again. It may well be that we should update this text perhaps every 100 times. Testing the routine would indicate this and it’s something that should be checked. In general, we want that update to happen once every couple of seconds so we need to fine tune the routine to accomplish this.

Also note that we’ve changed the cursor to an hourglass right before the loop. This is the standard Windows "hold on a second" indicator so we definitely want to use it. Also, we’ve added status text updates to let the user know what’s happening both before and after the loop and after the loop we’re careful to change the mouse cursor back to the default.

All of this extra code is for no other purpose then to let the user know that the system is not locked up and is still working. But suppose there are billions of records? This means the loop may very well take hours to run and the loop has control of the entire system. While this loop is running, the user can’t even switch to another application. So we need to give the user a way to do this. Here’s an even more enhanced version of this code.

nFound = 0
' Let the user know what's about to happen.
SetStatusText "Finding occurrences of " & cSomeValue & ". Please wait..."
CursorMode vbHourGlass
' Disable the process button to insure that this process can't be run again.
cmdStartProc.Enabled = False
' Count the number of occurrences of cSomeValue in rst.
With rstSomeRST
' Find the first occurrence and loop for all other occurrences.
.FindFirst "SomeField = '" & cSomeValue & "'"
Do Until .NoMatch
' Found one.  Increment the counter.
nFound = nFound + 1
' Update the status bar every tenth time through.
If nFound Mod 10 = 0 then
SetStatusText "Finding occurrences of " & cSomeValue & _
	". Please wait... " & nFound & " found so far."
' Let Windows have control for a bit.
DoEvents
End If
' Find the next one.
.FindNext "SomeField = '" & cSomeValue & "'"
Loop
End With
' Reset the mouse cursor and the status bar.
CursorMode
SetStatusText "Processing Complete."
' Let the user know how many we found.
MsgBox "We found " & nFound & " occurrences of " & cSomeValue & "."

Do you see the differences? First the command button that was clicked to start this process is disabled. Second is the addition of the call to DoEvents within the If/End If statement. See the help topic for DoEvents for details but the short version is that every tenth time through the loop, the loop is going to yield control to the operating system.

Assuming there are events in the events queue, Windows will process these events and then return control to the loop. If the user has done nothing, no events are processed and the loop continues virtually without interruption.

This is also why we disable the start process button. We must make certain that this process cannot be started while it’s still running this would most likely cause a general protection fault if we did.

Now, suppose the user does switch to a different application? In this case, the loop does not continue in the background. It just waits for Windows to set focus back to it so that it can continue. If we need true background processing things are a bit more complex. We need to use a timer or spawn an ActiveX object that can run on it’s own multi-tasking thread. Either is doable but neither is simple and they are normally not necessary especially if we distract the user.

Now let’s enhance this routine for the last time to actually give the user a way out of the loop before it’s finished. First, suppose that there is a Stop Process button on the form with its cancel property set to true. Here’s the click method for the button.

Private Sub cmdStopProc_Click()
' Set's the stop process flag defined in module header to true.
lStopProcess = True
End Sub

As you can see this method does nothing more than set the value of a flag.

Here’s the final revision to the code for the do loop. There’s quite a few more changes here so study it closely.

nFound  = 0
' Let the user know what's about to happen.
SetStatusText "Finding occurrences of " & cSomeValue & _
	". Press the ESC key or click the Stop Process Button to Cancel..."
CursorMode = vbArrowHourGlass
' Disable the process button to insure that this process can't be run again.
cmdStartProc.Enabled = False
' Set the stop process flag to allow the process to run.
lStopProcess = False
' Enable the stop button so the user can stop the process.
cmdStopProc.Enabled = True
' Count the number of occurrences of cSomeValue in rst.
With rstSomeRST
' Find the first occurrence and loop for all other occurrences.
.FindFirst "SomeField = '" & cSomeValue & "'"
Do Until .NoMatch
' Found one. Increment the counter.
nFound = nFound + 1
' Update the status bar every tenth time through.
if nFound Mod 10 = 0 then
SetStatusText "Finding occurrences of " & cSomeValue & _
	". Press the ESC key or click the Stop Process " & _
	"Button to Cancel... " & nFound & " found so far."
' Let Windows have control for a bit.
DoEvents
' Did the user cancel?
If lStopProcess then
If MsgBox("Cancel?", vbYesNo, "Are You Sure?") = vbYes then
Exit Do
Else
lStopProcess = False
End If ' MsgBox
End If ' lStopProcess
End If ' nFound Mod 10 = 0
' Find the next one.
.FindNext "SomeField = '" & cSomeValue & "'"
Loop
End With
' Reset the mouse cursor and the status bar.
CursorMode
If lStopProcess then
SetStatusText "Processing Canceled."
MsgBox "Process was canceled before completion."
Else
SetStatusText "Processing Complete."
' Let the user know how many we found.
MsgBox "We found " & nFound & " occurrences of " & cSomeValue & "."
End If
' Disable the stop process button.
cmdStopProc.Enable = False
' Enable the start process button.
cmdStartProc.Enabled = True

Okay. What happens is simple. First, before the loop starts we make sure that everything is initialized to let the loop process and to let the user cancel. That is, we make sure the stop flag is false and the stop button is enabled. We also give the user some additional instructions in the status bar that tells him how to cancel the process. The loop then begins and runs as it always has.

Now, when the do events function is called, if the user has either pressed the escape key or clicked the stop button, the stop flag will be true. Since we have the cancel property of the stop button set to true, pressing the escape key has the same effect as clicking the stop button. We give the user one more chance to continue the process just to make sure. If they answer Yes, we let them out and report that the process is not complete. If they answer No, we reset the stop flag and continue the processing.

Unless you know for certain that there are a limited number of records to process, it’s always a good idea to add code that works similar to this anytime you have a do loop that controls processing on records. Keep in mind that we keep a limited amount of data around for testing purposes. Chances are the user will have many more records then we have.

For loops are a little easier to control because we can tell up front how many iterations of the loop will be required. This means that for loops often don’t need this attention to detail. However, if a for loop is likely to take a substantial time to run, we, at the very least, want to throw in some Tinkerbell indicators to let the user know she’s still alive.

Detecting Changes on Form Unloads

When we close an edit form, we typically want to check and make sure the user has saved the changes before the close. This is just a convenience that helps the user to avoid accidentally tossing out changes that need to be written to a table. Edit forms usually have a Cancel button that allows the user to abort the changes and close the form. Since this is the case, it might seem logical for a check for changes to occur in the Cancel button’s click method.

However, there’s another way for the user to close a form. This is by clicking the X button in the control box at the right end of the form’s banner. If the user clicks this button, the Cancel button’s click method is not executed so the user won’t be warned about losing changes.

Also, suppose the user chooses to close the entire application by exiting the app through the menus or by clicking the MDI form’s X button. This to would bypass the Cancel button’s click method.

Fortunately, Visual Basic provides an event we can use to take care of this problem. This is the QueryUnload event. All forms, regardless of their type have this event and the event is fired when a request to unload the form is detected. Even if the user chooses to close the parent MDI form, the MDI form will first fire the QueryUnload event for each of its child forms. This then is the logical place to put code that checks for changes and, in general, the Cancel button’s click method should have nothing more than an Unload Me statement.

Some forms require no special code in the query unload event. These include search windows, dialog windows and other windows that don’t actually edit data but all forms that do edit data should have a change check that warns the user that they haven’t saved their changes.

The code that detects and reports possible changes should look something like this for MS Jet record sets. Other database types will have different requirements.

Dim dbs As Database
Dim rst As RecordSet
' Check each database in the current workspace for an incomplete edit.
For Each dbs In DBEngine.Workspaces(0).Databases
' For through each recordset in this dbs' record set collection.
For Each rst in dbs.Recordsets
If rst.EditMode <> dbEditNone then
If MsgBox("Are you sure you wish to close this window?" & vbCRLF & _
	  "Any changes you have made will be lost.", _
	vbExclamation + vbYesNo, Me.Caption) <> vbYes then
Cancel = 1
Exit For
End If ' MsgBox
End If ' rst.EditMode
Next ' Each rst
Next ' Each dbs

This routine checks each record set for some edit underway. Unfortunately, MS Jet does not provide a property for a record set or field that returns whether or not a change has actually been made. Thus, the message box will be displayed for the first recordset for which an Edit or AddNew has been called and not updated.

By setting the Cancel value to a non-zero number, the form unload event is interrupted and the user has the opportunity to save changes that have taken place. If none of the record sets are in an edit mode, the Cancel value is left at zero and the form is allowed to close.

Also, note that by using the Databases and Recordsets collections, this code will work regardless of the number of tables opened in the form.

Handling User Contention

What is user contention? Simply put, it is an attempt by two users to edit the same piece of data at the same time. A single user can also have contention with himself if he is attempting to edit the same data in two separate windows but this is rare and should generally not be allowed by our applications. You’ll see some techniques below that will prevent this.

Since all of our systems are written as multi-user applications, we must assume that any time we attempt to update a piece of data the update may fail. Indeed, simply opening the table can fail as you will see below. Here are the various types of contention that can occur and the steps we take to resolve them.

  1. Two users attempt to save the same data at the same time. This is a condition that is called deadlock and can be thought of as two people trying to go through the same door at the same time. One or the other must graciously allow the other to go first or both will be stuck in the door frame forever.
  2. The remedy is for both users to attempt the save again. However, it is possible that both users will again try the save at the same time and again, and so on. The way we prevent this is to wait a random time interval before we retry. This guarantees that one or the other user will be able to save the data. The user that does not save the data will most likely experience user contention as described in the next type.

    Then again, he may not. Since MS Jet locks pages of data rather than single records, it is very possible that the first user is editing a different record that happens to be on the same page of data. In this case, the second user will be able to save once the first user releases the lock on the page. That is, the second user will place a lock on the same page and then update the different record.

  3. Two users are editing the same data and the first user requests and succeeds with his save. When the second user attempts to save, the system will not allow it because by doing so the second user will overwrite changes made by the first. Here’s the sequence of events that can cause this.
  • User 1 retrieves the data to his edit buffer.
  • User 2 retrieves the same data to his edit buffer.
  • User 1 edits and saves the data. At this point, User 2 has made no changes but the information in his edit buffer no longer agrees with the actual information in the record. User 1 has written new values to the record but User 2’s edit buffer still has the original values.
  • User 2 edits and attempts to save. The save will not succeed because the record has been changed by User 1.

This condition is resolved by updating the second user’s edit buffer with the changes made by the first. This is easily done by moving zero records. That is, make a call like this.

rst.Move 0

This refreshes the second user’s edit buffer with the new values from the record. Note that when you do this, you are throwing out the second user’s changes so it’s polite to let them know with a message box before you actually do this.

Also, if the edit underway involves multiple records, it’s best to refresh all of the records rather than just those that were changed by the first user. That is, throw out all of the changes made by the second user so that the second user can start his edit from scratch. This prevents the second user from being confused.

  1. A user attempts to edit and update data that is read-only. This should never occur and, if it does, it’s a program bug that we should catch in our testing. The table has to have been opened in read-only mode which will cause the program to error when a call to Edit is made.
  2. The remedy to this is obviously to correct the program.

  3. A user attempts to open a table in read-write mode but another user has the table opened exclusively.
  4. We normally don’t handle this condition because it’s most unlikely to happen. Generally, we don’t open production tables exclusively unless we are performing system maintenance of some sort.

  5. A user attempts to open a table exclusively but another user has the table opened in read-write or exclusive mode.
  6. We do handle this one. Again, this would normally occur when a user is performing system maintenance and, when this happens, it’s a simple thing to handle. We simply inform the user that exclusive access was not available and close the form.

  7. The same user is attempting an edit on the same record in two instances of the same form.

If this happens, we haven’t done our job to well. Here’s a typical call from a search window that would start an edit on an existing record. Typically this would be the click method of an Edit button.

Private Sub cmdEdit_Click()
Dim oForm as New frmEdit
oForm.nUID = rstSomeTable!UID
oForm.Show
End Sub

You can see that if the user clicked this button twice on the same record, two instances of frmEdit would be created and both would be editing the same record. How do we prevent this? Here’s the code that will.

Private Sub cmdEdit_Click()
dim oEditForm as New frmEdit
dim oForm as Form
' Make sure this record is not already being edited.
For Each oForm in Forms
If oForm.Name = oEditForm.Name Then
' We found an instance of the form.  Same record?
If oForm.nUID = rstSomeTable!UID Then
' Yes it is.  Make this form the active form.
oMDIMain.ActiveForm = oForm
' Make sure this form isn't minimized.
oForm.WindowState = 0
Exit Sub
End If ' oForm.nUID
End If ' oForm.Name
Next ' oForm
' If we get to here we aren't already editing this record.
oEditForm.nUID = rstSomeTable!UID
oEditForm.Show
End Sub

What we do here is look at each form in the forms collection and see whether or not an instance of the edit form we need is already running. If there is an instance of this form, we see if it’s editing the same record. If it is, we pop that form to the top of the window stack and make sure it’s not minimized. Then we exit the sub to prevent spawning a new form. On the other hand, if we don’t find an edit underway for this record, we go ahead and spawn the new form. This technique makes it impossible for one user to edit the same record twice, at least with this form.

A slightly simpler version of this technique can be used to prevent a user from having more than one instance of a particular form running at the same time. Simply remove the if condition that test nUID and make the existing instance of the form the active form.

There is one other way that a user can be editing the same record at the same time. That is if we allow multiple instances of the same application to be running at the same time. We always prevent this in the startup routine of the application so that’s handled too.

What Really Happens When We Add New Records?

Depending on your past experience, the AddNew method of a record set may behave in ways you don’t really expect. Older, PC oriented, database engines, especially Xbase engines (those based more or less on Dbase) typically handled the addition of new records in the following sequence of events.

The program calls an Append function which physically adds a new record to the database, locks the new record and positions the record pointer to the new record.

The record’s fields are edited.

The lock is released.

In this scenario, note that the record position is immediately moved to the new record on the append call and the record position stays on the new record after the record lock is released. MS Jet works a bit differently. Here is the sequence of events.

A call is made to the record set’s AddNew method. The edit buffer’s field values are cleared or initialized to the default field values for the recordset. The record pointer is still positioned at what ever record is current at the time of the AddNew call.

The fields in the edit buffer are edited.

The record is written to the recordset by a call to Update. Since we used the AddNew method to start the edit, the Update method physically adds the new record and updates the new records fields with the values in the edit buffer. The edit buffer is then reinitialized with the current record’s field values. That is, the original record before the call to AddNew.

This second sequence is very common among more advanced data engines such as Btrieve, Oracle and SQL but it is not the traditional way thing have been done with PC based databases. Also, most of the time you add a new record to a table, you typically want to be positioned on that record after the update. Here’s how to accomplish this.

' Start an AddNew transaction and update the edit buffer.
With rst
.AddNew
!Field1 = cSomeString
!Field2 = nSomeNumber
' Update the recordset.
.Update
' Reposition the record pointer at the new record.
.BookMark = .LastModified
End With

Note the statement right before the End With statement. This statement repositions the bookmark, that is, the current record pointer, to the last modified record in the record set which is the one I just added.

Conclusion

That’s it! You now know how to handle most of the situations that we run across. Of course, each application is unique and will most likely require some special attention. Normally though, you can apply the techniques laid out here in one form or another to handle just about every situation we encounter.

Happy DBMSing!

Return To Top