Full Circle Magazine FR

In the early days of computers, a company called Digital Equipment Corporation (DEC) created its 32-bit VAX computer using OpenVMS as its operating system. Because a VAX/VMS computer is so reliable, a large number of them are still in use today, more than 25 years later. But, in the end, even these reliable computers will have to be replaced. As described in part 1, you could migrate from VAX/VMS to Linux, as the way Linux works is largely compatible with VAX/VMS. If you use Pascal as your programming language, you will find that Lazarus/Free Pascal is a good replacement. But there are technical functions used in VMS with no apparent replacement in Linux. In this article, I will describe the migration of the network-type database DBMS32.

Network vs relational database

Today you have a choice of different databases, varying from a free MySQL database to the very expensive Oracle database. But they all have one thing in common: they are relational databases. Relational databases have a lot of advantages, but also a big disadvantage: accessing a large database can take quite some time, and it is hard to predict how long it will take. When you are creating some kind of report, this is acceptable. But in a real-time environment this might lead to disruptions.

Digital Equipment Corporation (today a part of Hewlett-Packard) created a different kind of database for its VAX/VMS computers: a network database called DBMS32. In this case, the word “network” does not refer to a LAN or the Internet, but to the internal organization of the data. The different types of data (records) are not linked to each other through a relation but by a doubly linked list. Finding the first/next/last member of a set is lightning fast, because you only have to follow the link, instead of reading all records in the database to see if the relation is satisfied. This is, of course, only true if a set (relation) is defined at database design time. Searching through the database as you would do in a relational database is still possible, but that must be implemented within the application. The advantage of implementing it within the application is control of the flow. With a relational database, an unexpectedly large query result can cause problems with memory allocation or with the time the query takes. In your own application, you can specify a limit on the results and abort the action, instead of freezing up or crashing.
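As a minimal sketch of this idea (not DBMS32's actual API), the Python fragment below models a set as a doubly linked list between an owner and its members, and shows an application-side search that gives up once a result limit is exceeded. The names `Owner`, `Member`, `connect` and `search` are made up for this illustration.

```python
class Member:
    """A member record; prev/next are the links that place it in a set."""

    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None


class Owner:
    """An owner record; it only stores links to its first and last member."""

    def __init__(self, key):
        self.key = key
        self.first = None
        self.last = None

    def connect(self, member):
        """Append a member at the end of this owner's set."""
        member.prev = self.last
        member.next = None
        if self.last is None:
            self.first = member
        else:
            self.last.next = member
        self.last = member

    def members(self):
        """Walk the set by following the next links, one member at a time."""
        node = self.first
        while node is not None:
            yield node
            node = node.next


def search(owner, predicate, limit):
    """Application-side query that stops once more than `limit` members match.

    Returns (matches, truncated).
    """
    matches = []
    for member in owner.members():
        if predicate(member):
            if len(matches) == limit:
                return matches, True  # abort instead of collecting everything
            matches.append(member)
    return matches, False


customer = Owner("CUSTOMER-1")
for number in range(1, 6):
    customer.connect(Member("ORDER-%d" % number))

matches, truncated = search(customer, lambda m: True, limit=3)
print([m.key for m in matches], truncated)
```

Reaching the first or last member costs one link dereference, whatever the size of the database, and the caller decides what happens when the limit is hit.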

Another advantage of linked lists over relations is the order of the items in the list. This can be organized and changed just as you wish, whereas, with a relation, you need to define some attribute to specify the order. When you insert or remove an item somewhere in the middle, the order attribute of ALL following items has to be changed, which is time consuming. In DBMS32 you can also use more than one list with the same set definition, so an item can be assigned to one or the other list or to none at all (but not to two or more lists).
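The difference in cost can be illustrated with a small Python sketch. It assumes a simple integer order attribute named `seq` for the relational case; both functions and the `Item` class are hypothetical and only serve the comparison.

```python
class Item:
    """A list item with links to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None


def insert_after(anchor, new):
    """Linked list: only the two neighbours of the new item are touched."""
    new.prev = anchor
    new.next = anchor.next
    if anchor.next is not None:
        anchor.next.prev = new
    anchor.next = new


def insert_with_order_attribute(rows, position, name):
    """Order attribute: every row at or after `position` must be renumbered.

    Returns the number of existing rows that had to be updated.
    """
    updated = 0
    for row in rows:
        if row["seq"] >= position:
            row["seq"] += 1
            updated += 1
    rows.append({"name": name, "seq": position})
    return updated


# Linked list: put X between A and B.
a, b, c = Item("A"), Item("B"), Item("C")
insert_after(a, b)
insert_after(b, c)
insert_after(a, Item("X"))

# Order attribute: the same insert forces updates on B and C.
rows = [{"name": "A", "seq": 1}, {"name": "B", "seq": 2}, {"name": "C", "seq": 3}]
updated = insert_with_order_attribute(rows, 2, "X")
print(updated)
```

With three items the difference is trivial, but the number of renumbered rows grows with the length of the list, while the linked-list insert always touches the same few links.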

To create a network database, you must create a database definition and run a database generation program. This database definition cannot be changed at run time as with a relational database (define table as….). This makes a network database inflexible, but when you create a set of programs with a dedicated task (for example, to control a manufacturing machine), speed is more important than flexibility.

Another advantage of using a database definition is the possibility of recreating the database in case of corruption (do you remember every change you made to your relational database before it became unusable?). Making a backup of your database will help, but I have witnessed an attempt to restore a backup, only to find the backup was incomplete or the result also was corrupted. No fun there.

And what about a planned change to the database? With a relational database, the changes must be implemented manually at execution time, which takes time. And what if you make a mistake? What if the entire change has to be rolled back? With DBMS32, you can change the database definition and create a new database in advance; at execution time, you only unload the old database and load the new one. The same mechanism can be used to go back to the old database definition in case the entire change must be rolled back. This gives you complete version control.

The drawback is that you have to recompile and link all of the programs using the database, as they all have to be made aware of the changed layout. But that can be done (and tested!) before the change is executed.

When unloading the database, you get the contents of the database in readable form, in a text file. If your database is very large, this text file is also very large and the unload process will take some time. This may be unacceptable, in which case a network database is not suited for your task.

Because the unload file is plain text, you can modify the contents of your database using a simple text editor. You can cut a large number of records connected to one record, and paste them under a different record, in one move. This saved my ass last Christmas!

For the communication with the application, DBMS32 uses shared memory called the User Work Area (UWA). The application fills a part of the UWA with data, and then issues a database request, specifying what is to be done. A program called DATABASE_MANAGER handles the request, taking the data from the UWA, accessing the physical database, and putting the result back in the UWA. In the UWA there is space for exactly one record of every type, so each request to read the database can have only one result.
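The request cycle can be pictured with a toy model in Python. This is only an illustration of the "one buffer per record type" principle; the classes `UserWorkArea` and `ToyDatabase` and the `fetch` request are invented here and do not mirror the real DBMS32 calls.

```python
class UserWorkArea:
    """One buffer per record type, so a request yields at most one record."""

    def __init__(self, record_types):
        self.buffers = {name: {} for name in record_types}


class ToyDatabase:
    """Stands in for the component that serves requests against the UWA."""

    def __init__(self, uwa, stored):
        self.uwa = uwa
        self.stored = stored  # record type -> list of stored records (dicts)

    def fetch(self, record_type, key_field):
        """Look up the record whose key matches the value placed in the UWA.

        On success the full record replaces the buffer content.
        """
        wanted = self.uwa.buffers[record_type].get(key_field)
        for record in self.stored[record_type]:
            if record[key_field] == wanted:
                self.uwa.buffers[record_type] = dict(record)
                return True
        return False


uwa = UserWorkArea(["CUSTOMER", "ORDER"])
db = ToyDatabase(uwa, {
    "CUSTOMER": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}],
    "ORDER": [],
})

# The application fills its part of the UWA, then issues the request.
uwa.buffers["CUSTOMER"] = {"id": 2}
found = db.fetch("CUSTOMER", "id")
print(found, uwa.buffers["CUSTOMER"])
```

Reading a second customer overwrites the same buffer, which is why an application that needs several records has to fetch them one after another.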

My implementation

DBMS32 was created in the 1980s. Memory and hard disks were expensive and therefore limited. When you designed a database using DBMS32, you had to think carefully about the distribution of the data over the available hard disks. Nowadays, we don't care anymore because disk space is cheap and abundant. Some of the specifications in the original database definition file are therefore obsolete. When you are migrating from a VAX/VMS system to Linux, you do not have to remove these items, because in my implementation they will simply be ignored. No changes to the definition file are necessary!

Modern relational databases use TCP/IP for the communication between an application and the database. Besides decoupling application and database, this also makes it possible to put the database on a different server somewhere on the network. Multiple computers can connect to such a database at the same time. DBMS32 did not require TCP/IP to be installed, and only local hard disks could be used. In my implementation, I decided to keep it that way. In DBMS32, you assigned groups of records to “AREA”s (files) and, for each “AREA”, specified on which disk and in which directory it resides. In my implementation, every record type gets its own file, and all files reside on the same disk and in the same directory. There is no DATABASE_MANAGER, and every application accesses the record files itself by mapping them into shared memory (“memory mapped files”). Synchronization and locking are handled through the same shared memory. With shared memory, the operating system controls the assignment of physical memory and decides how much of the record files is actually read and loaded into memory. This allows for very large record files without a huge amount of memory or long access times.
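To give an impression of what memory-mapped record files look like in practice, here is a small Python sketch using the standard `mmap` and `struct` modules. The record layout (a 32-bit id plus a 16-byte name), the file name and the helper functions are assumptions made for this example, and the synchronization and locking mentioned above are left out entirely.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record layout: a 32-bit id plus a 16-byte name.
RECORD = struct.Struct("<i16s")


def create_record_file(path, capacity):
    """Create a zero-filled record file with room for `capacity` records."""
    with open(path, "wb") as f:
        f.truncate(capacity * RECORD.size)


def write_record(mm, index, rec_id, name):
    """Store one record in slot `index` of the mapped file."""
    RECORD.pack_into(mm, index * RECORD.size, rec_id, name.encode("ascii"))


def read_record(mm, index):
    """Read slot `index`; the OS pages in the part of the file it needs."""
    rec_id, raw = RECORD.unpack_from(mm, index * RECORD.size)
    return rec_id, raw.rstrip(b"\0").decode("ascii")


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "machine.rec")
        create_record_file(path, capacity=100_000)
        # Two independent mappings stand in for two applications.
        with open(path, "r+b") as fa, open(path, "r+b") as fb:
            with mmap.mmap(fa.fileno(), 0) as app_a, \
                 mmap.mmap(fb.fileno(), 0) as app_b:
                write_record(app_a, 90_000, 7, "VALVE")
                return read_record(app_b, 90_000)


print(demo())
```

The file holds 100,000 slots, yet only the slot that is touched is accessed; neither "application" reads the file from start to end.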

Changes to the database are also written to a journal file with a timestamp. When you create a full backup at regular intervals, and incremental backups at shorter intervals (a new journal file is created every time a backup is created), you can restore the database to any point in time on another computer. You would simply copy a set of connected files (full + incremental backup + journal file) to this computer, and restore the database up to time X. This allows for analyzing the data in the database at exactly (within a millisecond) the time a “bad thing happened”.
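The principle of restoring "up to time X" can be sketched in a few lines of Python. The sketch simplifies the scheme to one backup plus one journal, and the journal entry format (timestamp in milliseconds, action, key, contents) and the action names are assumptions for illustration, not the actual file format.

```python
import copy


def restore(full_backup, journal, up_to_ms):
    """Rebuild the database state as it was at time `up_to_ms`.

    `full_backup` maps record keys to contents; `journal` holds
    (timestamp_ms, action, key, contents) entries written after the backup.
    """
    database = copy.deepcopy(full_backup)
    for timestamp_ms, action, key, contents in sorted(journal, key=lambda e: e[0]):
        if timestamp_ms > up_to_ms:
            break  # everything later happened after the moment of interest
        if action == "store":
            database[key] = contents
        elif action == "erase":
            database.pop(key, None)
    return database


backup = {"PUMP-1": {"state": "on"}}
journal = [
    (1000, "store", "PUMP-2", {"state": "off"}),
    (1500, "store", "PUMP-1", {"state": "fault"}),
    (2000, "erase", "PUMP-2", None),
]

# State one millisecond before PUMP-1 reported a fault.
print(restore(backup, journal, up_to_ms=1499))
```

Because the backup itself is never modified, the same files can be replayed again with a different time X to compare the state just before and just after an incident.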

My clone of DBMS32 consists of four programs. The first is the database generator. It reads the database description and creates all type definitions and database access routines for the applications, plus routines for use by the other programs. The second program is a GUI replacement for DBQ, the database query program of DBMS32. It can be used to read, and navigate through, the database without interfering with the production applications, and to manage the database. Managing the database includes creating the record files for a new (empty) database, (un)loading the database, making a full or incremental backup, and restoring a backup. The last two actions are executed by the remaining two programs. Because these programs run separately, they can also be started from the terminal or by a script, which makes it possible to create the aforementioned backups at regular intervals.

Conclusion

This is the last part of my series about the migration from VAX/VMS to Linux. Although I talk about the VAX, this article is in fact valid for all computers using OpenVMS, so also for the Alpha. The most important requirement is that your programs are written in Pascal. Because I am going into early retirement, I have a lot of time to assist you if you still have one or more of these “old girls”. Migration to Linux is not only less expensive; its most important advantage is that the result is predictable. No functionality “lost in translation”, no production loss, no hidden bugs (unless already present…) that will hit you at the worst possible moment, and no intrusion from hackers or viruses to stop your production.

I hope you enjoyed reading my series. If you want to know more about VMS, Pascal or DBMS32 (or network databases in general), you can always send me an email: info@theovanoosten.nl.