Full Circle Magazine FR

In the early days of computing, Digital Equipment Corporation (DEC) built its 32-bit VAX computer with OpenVMS as its operating system. Because VAX/VMS computers are so reliable, a large number of them are still in use today, more than 25 years later. In the end, though, even these reliable computers will have to be replaced. As described in part 1, you can migrate from VAX/VMS to Linux, because the way Linux works is largely compatible with VAX/VMS. If Pascal is your programming language, you will find Lazarus/Free Pascal a good replacement. But VMS has technical functions with no apparent replacement in Linux. In this article I will describe, among other things, how to replace mailboxes and how to deal with file version numbers and packed arrays of char (strings).

Mailboxes

For IPC (interprocess communication), VMS offers mailboxes. Linux has a perfect replacement, but the way you create and use it is very different:

• To create a mailbox in VMS, you call $CREMBX (handle, logical_name, permanent). The identifier of the opened link to the (new) device named MBAxxxx (xxxx is a number between 0 and 9999) is assigned to “handle”. The boolean argument “permanent” tells VMS to create a mailbox that still exists after all processes linked to it are gone (otherwise it is deleted and its content is lost). The given logical name (for an explanation of logicals, see part 3) is used to check whether the mailbox already exists. VMS looks for it in the table LNM$PERMANENT_MAILBOXES or LNM$TEMPORARY_MAILBOXES, depending on the value of “permanent”. If it is found, the logical is “translated” and the mailbox it points to is referenced; if not, a new mailbox is created, and a logical pointing to it is created in the table mentioned above.

In Linux you have to get a key first. The examples in the documentation use the function ftok (file-to-key) for that. You pass it the path of an existing file plus a small number between 0 and 255, and it converts these into a large number, the key. The same file and number always give the same key. This might suggest that the file holds the content of the mailbox, but it does not: the file is only used to obtain a unique key. Its role can be compared to that of the logical used by VMS.
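
The key step can be sketched in C, the language in which ftok itself is documented. This is a minimal sketch, not the code of the migration tool: the helper name mailbox_key and its parameter names are made up for illustration. Note that the Linux manual page for ftok asks for a nonzero number, so numbers from 1 to 255 are the safe choice.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/types.h>

/* Hypothetical helper: derive the key for one mailbox.
 * anchor_path must name an existing file or folder; mailbox_number is the
 * "small number" (ftok only uses its low 8 bits).
 * Returns the key, or (key_t)-1 on error. */
key_t mailbox_key(const char *anchor_path, int mailbox_number)
{
    key_t key = ftok(anchor_path, mailbox_number);

    if (key == (key_t)-1)
        perror("ftok");     /* for example: the anchor file does not exist */
    return key;
}
```

Calling mailbox_key twice with the same path and number gives the same key in every process, which is what lets unrelated programs find the same mailbox.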

Now you can create or link to a mailbox, using this key as the identifier to get a handle. There are two ways to organize the keys: either create a dummy file for every mailbox you want to use and always use 0 as the small number, or use, for example, the folder at the base of the current version of your project (see part 3), which gives separate mailboxes for every version, and assign a constant (smaller than 256) to every mailbox to get a unique key. Whichever method you use, the created mailbox will be permanent; Linux has no temporary mailboxes. Be aware of that when starting a new session: the mailboxes might still hold old messages from a previous run.

• Now you are ready to send and receive messages. To send data in VMS, you fill a buffer with data and call $QIO(W) (see part 2) with the handle of the mailbox you want to put the message into, a pointer to the buffer, and the size of the buffer (plus an eventflag, see part 2). To receive data, you make the same call to $QIO(W), only with the function specifier “read” instead of “write”. On return, the buffer holds the received message and the size parameter holds its size.
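
On the Linux side, getting a handle from a key and dealing with leftovers from a previous run could look like the sketch below. It assumes the System V calls msgget and msgctl; the function names mailbox_open, mailbox_delete and mailbox_open_fresh, and the permission value 0600, are illustrative choices, not part of the migration tool.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

/* Create the queue for this key, or attach to it when it already exists.
 * Returns the handle (queue id), or -1 on error. */
int mailbox_open(key_t key)
{
    int handle = msgget(key, IPC_CREAT | 0600);

    if (handle == -1)
        perror("msgget");
    return handle;
}

/* Remove the queue together with any messages still in it.
 * Returns 0 on success, -1 on error. */
int mailbox_delete(int handle)
{
    return msgctl(handle, IPC_RMID, NULL);
}

/* Start a session with an empty mailbox: drop a leftover queue, if there
 * is one, and create a fresh queue for the same key. */
int mailbox_open_fresh(key_t key)
{
    int old = msgget(key, 0600);    /* attach only, do not create */

    if (old != -1)
        mailbox_delete(old);
    return mailbox_open(key);
}
```

Whether a new session should throw away old messages or process them depends on the project; mailbox_open keeps them, mailbox_open_fresh discards them.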

To send data in Linux, you fill a buffer just as in VMS, but there must be an extra integer at the beginning of the buffer, and its value must be greater than zero! Then you call msgsnd() with the handle, a pointer to the buffer, and the size of the buffer, not counting the extra integer! To receive a message, you call msgrcv() with the same parameters plus a message identifier. This identifier filters the messages sent to you, so that you get only the messages whose extra integer matches it. If the message identifier is zero, there is no filtering. For example, suppose five messages have been sent to you: message A with the extra integer set to 1, B with 2, C with 1 again, D with 3, and E with 2 again. Calling msgrcv (…, 2) twice gives you message B first and then message E. Then msgrcv (…, 3) retrieves message D and, finally, msgrcv (…, 0) gets the rest: A and C. A call to msgrcv (…, 4) returns no message: it waits until a message with that number arrives, unless the IPC_NOWAIT flag is given. The message size, again not counting the extra integer, is returned as the result of the function.

Another difference is that in VMS you must specify the total size of the mailbox when you create it. In Linux the size is fixed. This might be a problem when using large messages or a huge number of them.
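
The buffer layout and the filtering can be tried out with a small C sketch. The structure name mailbox_message, the two helper functions and the 128-byte payload limit are assumptions made for this example; only msgsnd, msgrcv and IPC_NOWAIT come from the system. IPC_NOWAIT is used so that a receive without a matching message returns at once instead of waiting.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define MAILBOX_TEXT_MAX 128    /* payload limit chosen for this sketch */

/* The buffer layout Linux expects: the extra integer first, then the data. */
struct mailbox_message {
    long type;                      /* the "extra integer", must be > 0 */
    char text[MAILBOX_TEXT_MAX];    /* the payload proper */
};

/* Send a text with the given type. Returns 0 on success, -1 on error. */
int mailbox_send(int handle, long type, const char *text)
{
    struct mailbox_message msg;
    size_t size = strlen(text);

    if (size > MAILBOX_TEXT_MAX)
        return -1;
    msg.type = type;
    memcpy(msg.text, text, size);
    /* The size passed to msgsnd excludes the leading long. */
    return msgsnd(handle, &msg, size, 0);
}

/* Fetch the first message whose type equals filter (0 = no filtering).
 * text must have room for MAILBOX_TEXT_MAX + 1 bytes.
 * Returns the payload size, or -1 when no matching message is queued. */
long mailbox_receive(int handle, long filter, char *text)
{
    struct mailbox_message msg;
    ssize_t size = msgrcv(handle, &msg, MAILBOX_TEXT_MAX, filter, IPC_NOWAIT);

    if (size < 0)
        return -1;
    memcpy(text, msg.text, (size_t)size);
    text[size] = '\0';              /* terminate so it can be used as a C string */
    return (long)size;
}
```

Sending A to E with the numbers 1, 2, 1, 3, 2 and then receiving with the filters 2, 2, 3, 0, 0 returns the messages in the order B, E, D, A, C, as in the example above.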

The implementation

In the beginning, I simply replaced every reference to mailboxes with the corresponding Linux calls. I used the specified logical as the name of a dummy file in a dedicated folder to get the unique key. For a small project this would be acceptable, but it is not the intention of this migration tool. So I created the functions _CREMBX and _QIO(W) to behave the same as in VMS. (A “$” sign is not allowed in names in Free Pascal, so my conversion program replaces the “$” with an underscore. The same problem arose when using the terminal; I will discuss this in part 5 about DCL.)

• _CREMBX uses a dedicated logical to point to a file containing info about assigned numbers. This file is also used for creating the unique key. _CREMBX assigns a free number in the range 0-255 if the mailbox does not already exist. The given logical in table LNM_PERMANENT_MAILBOXES is used or created just as VMS would do it. Then the mailbox is linked to or created using the key, and the handle is returned.

• _QIO(W) sends or receives, depending on the function specifier. It uses a new thread and the specified eventflag to create the same asynchronous behavior as in VMS (see part 2).

This way, the only things you have to do for the migration are to define the dedicated logical mentioned above and to add the extra integer in front of the declaration of the buffer. Removing the logical used for a mailbox causes a new mailbox to be created upon opening, while the processes that are still connected keep using the old one, exactly as in VMS.
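
The bookkeeping of assigned numbers could work roughly as in the C sketch below. This is a guess at the idea, not the code of _CREMBX: the file format (one line “number name” per mailbox), the function name registry_number and the 63-character name limit are all invented for the example. A real version might start numbering at 1, since the manual page of ftok asks for a nonzero number.

```c
#include <stdio.h>
#include <string.h>

#define REGISTRY_SLOTS 256      /* numbers 0..255, as in the text */
#define REGISTRY_NAME_MAX 63    /* name limit chosen for this sketch */

/* Look up logical_name in the registry file; if it is not there, assign the
 * lowest free number and append a line "<number> <name>".
 * Returns the number, or -1 when the file is unusable, the name is too long
 * or all numbers are taken. Names must not contain whitespace. No locking
 * is done, so this is not safe when several processes register at once. */
int registry_number(const char *registry_path, const char *logical_name)
{
    char used[REGISTRY_SLOTS] = {0};
    char name[REGISTRY_NAME_MAX + 1];
    int number;
    FILE *file;

    if (strlen(logical_name) > REGISTRY_NAME_MAX)
        return -1;
    file = fopen(registry_path, "a+");      /* creates the file if needed */
    if (file == NULL)
        return -1;
    rewind(file);                           /* read from the beginning */
    while (fscanf(file, "%d %63s", &number, name) == 2) {
        if (number < 0 || number >= REGISTRY_SLOTS)
            continue;
        if (strcmp(name, logical_name) == 0) {
            fclose(file);
            return number;                  /* already registered */
        }
        used[number] = 1;
    }
    for (number = 0; number < REGISTRY_SLOTS; number++) {
        if (!used[number]) {
            fseek(file, 0, SEEK_END);       /* switch from reading to writing */
            fprintf(file, "%d %s\n", number, logical_name);
            fclose(file);
            return number;
        }
    }
    fclose(file);
    return -1;                              /* all 256 numbers are in use */
}
```

The returned number, together with the path of the registry file, would then go to ftok to produce the key for that mailbox.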

File version numbers

This part is about file systems. In the beginning, there was DOS (and CP/M), which used FAT as its file system. File names were 8 characters long, with a file type of 3 characters. Later, NTFS (New Technology File System) offered long names and types, and more attributes for better security. Both use assigned drive letters to identify the device (disk). Linux mostly uses “Ext” as its default file system, currently Ext4. As mentioned in part 3, Linux does not use assigned drive letters, but mount points. From a functional perspective, that’s the only difference from FAT/NTFS.

When DEC created ODS-2 (On Disk Structure version 2), it took a different approach. It named all devices (not just disks) with two characters to specify the type, one character (A, then B, and so on) to identify the adapter, plus a serial number. The base folder (directory) on a disk is named [000000], and file names and types are long, as in NTFS and Ext. And DEC decided that several versions of the same file can exist in the same directory (folder). An example:

Suppose I create a text file with an editor and save it. Besides the name and type that I specify, it also gets a version number, starting with “1”. If I use the editor to change this file and save it again, the existing file is not overwritten as in Linux and Windows. The editor creates a new file with the same name and type, but with a higher version number, “2” in this case. The same happens with log files, executables (programs), and so on. The advantage is that you can see the history of a file and even restore a previous version. For example, if a new version of a program does not work as it should, you can just kill the process and start the previous version; it is still there (unless you deleted it). But of course there is also a downside: a thousand versions of a file need a thousand times the disk space, and the version number is limited to 32767. If you go beyond that, the creating program will crash!

If your project depends on this behavior (the file versions, not the crashing), you will have to change your programs, either by adding a “version number” to the name or type, or by changing your project in such a way that it no longer depends on the file versions. Which solution is best depends on the project; no single solution fits all possibilities.
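
As an illustration of the first option, the C sketch below appends the version number to the file name, so that “report.txt” becomes “report.txt.1”, “report.txt.2” and so on. The naming scheme with a dot and the function names highest_version and next_version_path are assumptions for this example; other schemes would work just as well.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define VERSION_MAX 32767   /* the VMS limit mentioned in the text */

/* Return the highest version N for which "<base>.<N>" exists in dir,
 * 0 when there is none, or -1 when the directory cannot be read. */
long highest_version(const char *dir, const char *base)
{
    size_t base_len = strlen(base);
    long highest = 0;
    struct dirent *entry;
    DIR *handle = opendir(dir);

    if (handle == NULL)
        return -1;
    while ((entry = readdir(handle)) != NULL) {
        const char *suffix;
        char *end;
        long version;

        if (strncmp(entry->d_name, base, base_len) != 0 ||
            entry->d_name[base_len] != '.')
            continue;
        suffix = entry->d_name + base_len + 1;
        if (!isdigit((unsigned char)suffix[0]))
            continue;
        version = strtol(suffix, &end, 10);
        /* Accept only a purely numeric suffix within the valid range. */
        if (*end != '\0' || version < 1 || version > VERSION_MAX)
            continue;
        if (version > highest)
            highest = version;
    }
    closedir(handle);
    return highest;
}

/* Build the path of the next version in out. Returns the new version
 * number, or -1 on error or when VERSION_MAX has been reached. */
long next_version_path(const char *dir, const char *base,
                       char *out, size_t out_size)
{
    long highest = highest_version(dir, base);
    int written;

    if (highest < 0 || highest >= VERSION_MAX)
        return -1;
    written = snprintf(out, out_size, "%s/%s.%ld", dir, base, highest + 1);
    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return highest + 1;
}
```

Unlike VMS, this sketch reports an error instead of crashing when the limit is reached, and it leaves cleaning up old versions to the caller.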

Packed array of chars

In Free Pascal, two worlds collide. On the one hand are the C-type strings, and on the other the Pascal-type strings. C-type strings are either a pointer to a data structure that can change in size dynamically, or a data structure of fixed size that starts with a byte containing the length of the string. Both contain a zero-terminated run of characters. Pascal-type strings are essentially just another fixed-size array, without the terminating zero. VAX-Pascal has a variant with a length word in front of the array (VARYING OF CHAR), but Free Pascal does not recognize the keyword VARYING. You have to use fixed-size C-type strings to replace a VARYING, but the behavior is not exactly the same, so the migration needs some work for this type of string.

Big problems arise when you compare two strings, at least one of which is a packed array of chars. In VAX-Pascal, the remainder of a packed array of chars is filled with spaces; in Free Pascal, the content of the remainder is unknown! As a result, a comparison fails even if the strings you use are equal. Suppose you have a packed array of chars of size 10 ([1..10]) named STR. Fill it with a text of size 4, as in “STR := 'test';”, and compare it with the same text: “IF STR = 'test' THEN …. ELSE …..”. Both should be equal, but the ELSE part is executed! This is a problem with a string constant, but it also fails if you use a C-type string to fill a packed array of chars.
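
The VAX-Pascal behavior that the migrated code relies on can be pinned down in a few lines. The sketch below is written in C rather than Pascal, so it illustrates the semantics (pad with spaces on assignment, compare as if padded), not Free Pascal syntax; the function names padded_assign and padded_equal are made up for the example.

```c
#include <string.h>

/* Assign text to a fixed-size field the VAX-Pascal way: copy what fits
 * and fill the remainder with spaces. The field is NOT zero-terminated. */
void padded_assign(char *field, size_t field_size, const char *text)
{
    size_t length = strlen(text);

    if (length > field_size)
        length = field_size;            /* too long: truncate */
    memcpy(field, text, length);
    memset(field + length, ' ', field_size - length);
}

/* Compare a fixed-size field with a text as if the text were padded
 * with spaces to the size of the field. Returns 1 when equal, else 0. */
int padded_equal(const char *field, size_t field_size, const char *text)
{
    size_t length = strlen(text);
    size_t i;

    if (length > field_size)
        return 0;
    if (memcmp(field, text, length) != 0)
        return 0;
    for (i = length; i < field_size; i++) {
        if (field[i] != ' ')
            return 0;                   /* remainder is not blank */
    }
    return 1;
}
```

With a 10-character field, padded_assign followed by padded_equal with the text “test” reports equality, while a field whose remainder holds anything other than spaces does not compare equal, which is the situation described above.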

This is clearly a bug, but I didn't have the time to fill out a bug report. As a workaround, you can append a string of spaces with the same length as the size of the packed array of chars, because Free Pascal does not complain about the string being too long; it just truncates it. So “STR := 'test' + '          ';” fills the remainder with spaces as in VAX-Pascal, but this must be done with every assignment to a packed array of chars! You also have to do something similar when comparing. That is a lot of work to get around this behavior, so it might be easier, if possible, to switch to another type of string. But then there will be problems with code that depends on the size of the packed array of chars rather than the size of its contents.

There are also problems when assigning a packed array of chars to a C-type string. If the remainder of the packed array of chars consists of zeros (usually the case), the debugger shows the content of the string as “test” in the example above, but the length of the string is 10, so a comparison with “test” fails. In the debugger they look the same, but the “if” statement still executes the “else” part. It took me days to figure that out. It shows an inconsistency in the way Free Pascal handles strings, as the terminating zero is ignored. Worth a bug report!

Next month: In the next article, I will go more in depth about DCL (Digital Command Language), the interface used by Digital when interacting with the terminal, and ASTs (Asynchronous System Traps), also named “callback routines”, and, related to that, catching the signals that result from pressing ^C, ^Y, ^T or ^Z.
