Full Circle Magazine FR

In this series of articles, I am building a text-based application with Free Pascal, using its text-mode interface for user interaction. This is combined with other, more modern technologies such as database access using SQL and Web access over HTTP. The final aim of the project is to demonstrate how Pascal can be used to build a modern application while avoiding the overhead of a graphical interface built on a widget set such as GTK or Qt.

In the previous part of the series, we set up a small SQLite database, then built a command-line Free Pascal program to access it. Finally, we integrated the database code into our Free Vision application through a new Dialog type that connects to the database and displays the data retrieved. In this fourth part of the series, we will connect to the Internet in order to refresh the information in our database directly from the Full Circle Magazine website.

Tools to connect to the network

The default installation of Free Pascal does not always come with the network units; if we installed the compiler using apt under Ubuntu, we will need to install a supplementary package:

sudo apt install fp-units-net

This makes several units available to us, both for simple HTTP connections and for more complex scenarios such as those involving OpenSSL. In our case, we will merely be connecting to the Internet to pull down the RSS feed of our favorite magazine.

For this task, we can follow one of several strategies. One would be to build our own HTTP client, starting from bare sockets. This would certainly be feasible and not very complex for a single plain request, given the simple, text-based nature of HTTP. However, things get more complex when HTTPS connections are required, as they are nowadays for many services such as Google. Besides, parsing HTTP responses by hand quickly gets tedious. This is where a second strategy comes in: use an existing library to encapsulate these tasks. The application programmer can then concentrate on the actual data being transferred, leaving the low-level connection mechanisms to the library.

The well-known libcurl library is standard in many POSIX environments, and the Free Pascal project has made it available through an appropriate unit. The CURL or “C-URL” (pronounced “see URL”) library is, no doubt, a bit of overkill for our task, but it is very easy to set up and use. The interested reader can peruse the documentation at the project’s site (https://curl.haxx.se/) to find out more. A short extract from the website gives an idea of what it is capable of:

“A free and easy-to-use client-side URL transfer library, supporting DICT, FILE, FTP, FTPS, GOPHER, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET and TFTP. libcurl supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading, Kerberos, SPNEGO, HTTP form based upload, proxies, cookies, user+password authentication, file transfer resume, http proxy tunneling, and more!”

However, most Ubuntu installations contain the libcurl library itself, but not its development files (the header files and the library link used when building programs against it). In order to use it from a compiled language such as Free Pascal, we will need to install the corresponding development package as well. The exact package name may change across distribution versions, but in Ubuntu 16.04 and Linux Mint 18 the following command should set you up:

sudo apt install libcurl4-gnutls-dev

As a final note, it must be said that CURL is fast, usually much faster than opening the same page in a web browser. This fits in well with Free Pascal’s generally lightweight and fast operation.

What shall we be downloading from the Web?

What we are basically looking for is a list of recent issues of Full Circle Magazine. This is quite a simple task for a human reader: simply navigate to the web page and read through the articles, choosing those with “Full Circle Magazine” in the title (as opposed to “Weekly News” or something else). However, this is more difficult for a computer program, since it must somehow learn to distinguish between articles and background elements, images, etc. Modern web pages are in fact quite complex assemblies of information. So let us help our program by using the facilities built into a modern Content Management System (CMS) such as WordPress, which FCM uses. One of these is the Rich Site Summary (RSS) feed.

An RSS feed is a way of querying the site’s database of articles. Most CMSs allow us to build a query along several different lines. In the case of FCM, we can use the following query to get a list of all articles:

http://fullcirclemagazine.org/feed/

If we open it in our web browser, we get a properly formatted page, since modern browsers understand the underlying eXtensible Markup Language (XML) code and can parse it correctly. However, we should be aware that what has really come across the network looks like what is shown in the image below.

If anybody is interested, I actually obtained the accompanying screenshot using the curl command by itself:

$ curl http://fullcirclemagazine.org/feed/ | less

Other queries can be handled in a similar fashion. For example, to locate all articles tagged with “podcast”, “python”, “pascal”, etc.:

http://fullcirclemagazine.org/tag/podcast/feed/
http://fullcirclemagazine.org/tag/python/feed/
http://fullcirclemagazine.org/tag/pascal/feed/

Ronnie has actually included a simplified version of the feed for the Weekly News podcast:

http://fullcirclemagazine.org/feed/podcast

The same mechanism can be used to select articles published by a specific author. Since articles announcing a new issue of FCM are invariably by Ronnie, we can use this URL to pick them out:

http://fullcirclemagazine.org/author/ronnie-2/feed/

This is what we will be using to help our search for new issues.

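Since the tag and author feeds all follow the same pattern, the addresses could be assembled in code rather than typed out one by one. The following is only a small sketch: the helper names TagFeedURL and AuthorFeedURL are made up for illustration, and the URL layout is simply the one seen in the examples above.

```pascal
program FeedUrls;
{$mode objfpc}{$H+}

const
  // Base address taken from the article; the helpers below are illustrative.
  BaseURL = 'http://fullcirclemagazine.org';

// Hypothetical helper: feed listing the articles that carry a given tag.
function TagFeedURL(const Tag: string): string;
begin
  Result := BaseURL + '/tag/' + Tag + '/feed/';
end;

// Hypothetical helper: feed listing the articles written by a given author.
function AuthorFeedURL(const Author: string): string;
begin
  Result := BaseURL + '/author/' + Author + '/feed/';
end;

begin
  // Print the two feed addresses used as examples in the text.
  WriteLn(TagFeedURL('pascal'));
  WriteLn(AuthorFeedURL('ronnie-2'));
end.
```

Run as is, this should print the “pascal” tag feed and Ronnie's author feed exactly as they are written in the text.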

Combining the pieces

So far, we have a means of connecting to the Web and downloading a single page from its URL. We also have a specific URL that returns dynamic content built by the FCM content management system. Let us put these together in a 20-line Pascal program (counting blank lines and nice formatting). First, we need to include the unit and declare a couple of variables:

uses
  LibCurl;

Var
  URL : Pchar = 'http://fullcirclemagazine.org/author/ronnie-2/feed/';
  hCurl : pCurl;

The first variable is simply the URL we will be passing on to libcurl, though in PChar format. Classic Pascal strings (as popularized by Turbo Pascal) are arrays of up to 256 bytes. Bytes in positions 1 to 255 hold the codes of the characters, while position 0 holds the number of characters in the string. Needless to say, this scheme has several limitations, including the inability to hold more than 255 characters and the difficulty of using multi-byte encodings (think Unicode). On the other hand, the C programming language, in which libcurl and much of the Internet's software are written, traditionally uses null-terminated strings. These are simply an area of contiguous memory long enough to hold the string, with a special null character at its end. Luckily, these can also easily be used in Pascal through the PChar type, which is simply a pointer to a character.

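The difference between the two string layouts can be seen with a few lines of code. This is a minimal sketch, not part of the article's program; it assumes only the standard Strings unit, which supplies StrLen for null-terminated strings.

```pascal
program StringDemo;
{$mode objfpc}

uses
  Strings;  // StrLen and other routines for null-terminated strings

var
  S : ShortString;  // classic length-prefixed Pascal string
  P : PChar;        // pointer to a null-terminated, C-style string

begin
  S := 'FCM';
  // Position 0 holds the length; the characters start at position 1.
  WriteLn('Length byte: ', Ord(S[0]), ', first character: ', S[1]);

  P := 'FCM';
  // No length is stored: StrLen counts characters up to the terminating #0.
  WriteLn('StrLen: ', StrLen(P), ', terminator code: ', Ord(P[StrLen(P)]));
end.
```

Both lines report a length of 3; the second one also shows the zero byte that marks the end of the C-style string.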

The second variable is a pointer to the CURL handle that will manage our connection. The following code has been copied directly from the unit’s example:

hCurl:= curl_easy_init;
if Assigned(hCurl) then
begin
  curl_easy_setopt(hCurl,CURLOPT_VERBOSE, [True]);
  curl_easy_setopt(hCurl,CURLOPT_URL,[URL]);

  curl_easy_perform(hCurl);
  curl_easy_cleanup(hCurl);
end;

A handle is set up with curl_easy_init. If this worked (we have access to the dynamic library and enough free memory), we can then set a pair of options: verbose output on screen, and the URL we wish to fetch. curl_easy_perform does the actual work of downloading the page, and finally curl_easy_cleanup frees any memory used.

The complete code of this program is available here: http://pastebin.com/QM9m3jug

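The fragment above ignores the value returned by curl_easy_perform, so a failed download passes silently. As a hedged variation (this is not the author's listing), the sketch below stores the result and reports any error. It assumes that the LibCurl unit exposes the CURLcode type, the CURLE_OK value and curl_easy_strerror under the same names as the C API; the program name and the messages are made up.

```pascal
program FetchFeed;
{$mode objfpc}{$H+}

uses
  LibCurl;

var
  URL   : PChar = 'http://fullcirclemagazine.org/author/ronnie-2/feed/';
  hCurl : pCurl;
  Res   : CURLcode;  // assumed to mirror the C enum of result codes

begin
  hCurl := curl_easy_init;
  if not Assigned(hCurl) then
  begin
    WriteLn(StdErr, 'Could not initialise libcurl');
    Halt(1);
  end;

  curl_easy_setopt(hCurl, CURLOPT_URL, [URL]);

  // Download the feed; with no write callback set, it is echoed on screen.
  Res := curl_easy_perform(hCurl);
  if Res <> CURLE_OK then
    WriteLn(StdErr, 'Transfer failed: ', curl_easy_strerror(Res));

  curl_easy_cleanup(hCurl);
end.
```

Writing the error to StdErr keeps it apart from the XML on standard output, which matters if the output is piped into less or redirected to a file.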

If we compile and run the above program, we simply get the XML code returned by the FCM server echoed on screen. To do something useful with this data, we could for example write it to a file. To be more precise, we will need to get the CURL handle to do this for us. The process takes place in two steps.

In the first step, we will give the handle a callback function that receives each block of downloaded data and writes it to a byte stream. This function needs to be declared with a very precise signature, since it will in fact be called by the CURL library itself, which is written in C. This is a good example of using multiple programming languages within the same compiled program.

Function DoWrite(Ptr : Pointer; Size : size_t; nmemb: size_t; Data : Pointer) : size_t; cdecl;
begin
  DoWrite := TStream(Data).Write(Ptr^,Size*nmemb);
end;

In the second step, we will create a new file stream and instruct the handle to use the DoWrite function to populate it. So, begin by declaring an appropriate TFileStream and creating it:

Var
  f : TFileStream;

begin
  f:=TFileStream.Create('fcm.xml',fmCreate);

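As an aside, the callback contract can be tried out without any network access. The sketch below stands in for libcurl and calls a DoWrite-style function by hand, passing it two chunks of data and a TMemoryStream as the destination. It is only an illustration: it uses csize_t from the ctypes unit instead of the size_t name, so that it compiles without the LibCurl unit, and the chunk contents are invented.

```pascal
program CallbackDemo;
{$mode objfpc}{$H+}

uses
  Classes, ctypes;

// Same shape as the article's DoWrite, written with csize_t from ctypes
// (an assumption made so that the sketch does not need the LibCurl unit).
function DoWrite(Ptr: Pointer; Size: csize_t; nmemb: csize_t;
  Data: Pointer): csize_t; cdecl;
begin
  // Data is whatever pointer the caller registered: here, a stream.
  Result := TStream(Data).Write(Ptr^, Size * nmemb);
end;

const
  // Invented sample data standing in for two blocks of a download.
  Chunk1: AnsiString = '<rss>';
  Chunk2: AnsiString = '</rss>';

var
  Buffer  : TMemoryStream;
  Handled : csize_t;

begin
  Buffer := TMemoryStream.Create;
  try
    // Play the role of libcurl: deliver the data in two separate calls.
    Handled := DoWrite(PChar(Chunk1), 1, Length(Chunk1), Pointer(Buffer));
    WriteLn('First chunk, bytes handled: ', Handled);
    Handled := DoWrite(PChar(Chunk2), 1, Length(Chunk2), Pointer(Buffer));
    WriteLn('Second chunk, bytes handled: ', Handled);
    // The stream now holds both chunks, one after the other (5 + 6 bytes).
    WriteLn('Total bytes in stream: ', Buffer.Size);
  finally
    Buffer.Free;
  end;
end.
```

The callback only assumes that Data points to some TStream, which is why the same function serves equally well for the TFileStream used in the article. The try/finally block is a design choice of this sketch: it makes sure the stream is freed even if a write fails.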

Continue by setting up the hCurl options as before, but add:

  curl_easy_setopt(hCurl,CURLOPT_WRITEFUNCTION,[@DoWrite]);
  curl_easy_setopt(hCurl,CURLOPT_WRITEDATA,[Pointer(f)]);

The curl_easy_perform and curl_easy_cleanup stages take place as before. Do not forget to close the file stream at the end of the program:

  f.Free;

The complete code for this program is available at this link: http://pastebin.com/Ayth2cfK

In this part of our series on Free Pascal, we deviated a bit from Free Vision and went into the technical details of how to use the CURL library from Pascal to connect to an RSS feed on FCM’s content management system. At this stage, we know how to connect to the server and download an XML file containing a list of recent articles published on the website. In the next part of our series, we will see how to parse the XML code to retrieve the information we are aiming for: issue numbers and download URLs.
