In this series of articles, I will be building a text-based application with Free Pascal, using its text-based interface for user interaction. This will be combined with other, more modern, technologies such as database access using SQL and Web access with HTTP. The final aim of the project is to demonstrate how Pascal can be used to build a modern application, while avoiding the overhead associated with a graphical interface that uses a widget set such as GTK or Qt.

In the previous part of the series, we covered setting up a small Sqlite database, then building a command-line Free Pascal program to access it. Finally, we integrated the database code into our Free Vision application through a new Dialog type to connect to the database and display data retrieved. In this fourth part of the series, we will connect to the Internet in order to refresh the information in our database directly from the Full Circle Magazine website.
Tools to connect to the network

The default installation of Free Pascal does not always come with the network units; if we installed the compiler using the apt method under Ubuntu, we will need to install a supplementary package:

apt install fp-units-net

This will make several units available to us, both for simple HTTP connections and for more complex scenarios such as OpenSSL. In our case, we will merely be connecting to the Internet to pull down the RSS feed for our favorite magazine. For this task, we can follow one of several different strategies. One would be to build our own HTTP protocol client, starting from bare sockets. This would certainly be feasible, and not very complex for a simple connection, given the simple nature of the HTTP protocol. However, things can get a tad more complex when HTTPS connections are required - as, indeed, they are nowadays for many services such as Google. Besides, parsing HTTP code quickly gets tedious. This is where a second strategy comes in, which is to use an existing library to encapsulate these tasks. The application programmer can then concentrate on the actual data being transferred, leaving low-level connection mechanisms to the library.
The well-known library libcurl is standard in many POSIX environments, and the Free Pascal project has made it available through an appropriate unit. The CURL or “C-URL” (pronounced “see URL”) library is, no doubt, a bit of overkill for our task, but it is very easy to set up and use. The interested reader can peruse the documentation at the project’s site (https://curl.haxx.se/) to find out more. A short extract from the website will help understand what it is capable of doing: “A free and easy-to-use client-side URL transfer library, supporting DICT, FILE, FTP, FTPS, GOPHER, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET and TFTP. libcurl supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading, Kerberos, SPNEGO, HTTP form based upload, proxies, cookies, user+password authentication, file transfer resume, http proxy tunneling, and more!”
However, most Ubuntu installations contain the libcurl library itself, but not the header files. In order to use it in conjunction with a compiled language such as Free Pascal, we will need to install the corresponding header files as well. The actual version may change across distro versions, but in Ubuntu 16.04 and Linux Mint 18 the following command should set you up:

apt install libcurl4-gnutls-dev

As a final note, it must be said that CURL is fast - usually much faster than opening a web page in a browser. This fits in well with Free Pascal’s general lightweight and fast operation.

What shall we be downloading from the Web?

What we are basically looking for is a list of recent publications of Full Circle Magazine. This is quite a simple task for a human reader: simply navigate to the web page, and read the articles, choosing those with “Full Circle Magazine” in the title (as opposed to “Weekly News” or something else). However, this is more difficult for a computer program than for a human, since the program must somehow learn to distinguish between articles and background elements, images, etc. Modern web pages are in fact quite complex assemblies of information. So let us help our program by using the facilities built into a modern Content Management System (CMS) such as Wordpress, used by FCM. One of these is the Rich Site Summary (RSS) feed.
An RSS feed is a way of querying the site’s database of articles. Most CMS allow us to build a query along several different lines. In the case of FCM, we can use the following query to get a list of all articles:

http://fullcirclemagazine.org/feed/

If we open it in our web browser, we will get a properly formatted page, since modern browsers are aware of the underlying code in eXtensible Markup Language (XML) and can parse it correctly. However, we should be aware that what has really come across the network looks like that shown in the image below. If anybody is interested, I actually obtained the accompanying screenshot using the curl command by itself:

$ curl http://fullcirclemagazine.org/feed/ | less
Other queries can be handled in a similar fashion. For example, to locate all articles that are tagged with “podcast”, “python”, “pascal”, etc.:

http://fullcirclemagazine.org/tag/podcast/feed/
http://fullcirclemagazine.org/tag/python/feed/
http://fullcirclemagazine.org/tag/pascal/feed/

Ronnie has actually included a simplified version of the feed for the Weekly News podcast:

http://fullcirclemagazine.org/feed/podcast

This mechanism can also be used to select articles published by a specific author. Since articles announcing a new issue of FCM are invariably by Ronnie, we can use this URL to single them out:

http://fullcirclemagazine.org/author/ronnie-2/feed/

This is what we will be using to help our search for new issues.
Combining the pieces

So far, we have a means of connecting to the Web and downloading a single page from its URL. On the other hand, we have a specific URL that allows us to get dynamic content built from the FCM content management system. Let us put this together in a 20-line Pascal program (with empty lines and nice formatting). First, we need to include the unit, and declare a couple of variables:

uses LibCurl;

Var
  URL : Pchar = 'http://fullcirclemagazine.org/author/ronnie-2/feed/';
  hCurl : pCurl;

The first variable is simply the URL we will be passing on to libcurl, though in PChar format. Pascal strings were originally arrays of 256 bytes. Bytes in positions 1 to 255 held the ASCII codes for each character, while position 0 held the total number of characters in the string. Needless to say, this scheme has several limitations, including the handling of strings with more than 255 characters and the use of multi-byte encodings (think Unicode). On the other hand, the C programming language - in which most of the Internet has been developed - traditionally uses null-terminated strings. These are simply an area of contiguous memory long enough to hold the string, with a special null character at its end. Luckily, these can also easily be used in Pascal through the PChar type, which is simply a pointer to a character.
The second variable is a pointer to the CURL handler that will manage our connection. The following code has been directly copied from the unit’s example:

hCurl := curl_easy_init;
if Assigned(hCurl) then
begin
  curl_easy_setopt(hCurl, CURLOPT_VERBOSE, [True]);
  curl_easy_setopt(hCurl, CURLOPT_URL, [URL]);
  curl_easy_perform(hCurl);
  curl_easy_cleanup(hCurl);
end;

A handler process is set up with curl_easy_init. If this worked (we have access to the dynamic library and enough free memory), we can then set a pair of options (verbose output on screen, and the URL we wish to retrieve). curl_easy_perform does the actual work of downloading the page, and finally curl_easy_cleanup frees up any memory used. The complete code of this program is available here:

http://pastebin.com/QM9m3jug
If we compile and run the above program, we simply get the XML code returned by the FCM server echoed on screen. To do something useful with this data, we could, for example, write it to a file. To be more precise, we will need to get the CURL handler to do this for us. The process takes place in two steps. In the first step, we pass the handler a callback function that transforms its internal data structure into a byte stream. This function needs to be declared with a very precise syntax, since it will in fact be called by the main CURL handler - which has been written in C. This is, in fact, a good example of using multiple programming languages within the same compiled object.

Function DoWrite(Ptr : Pointer; Size : size_t; nmemb : size_t; Data : Pointer) : size_t; cdecl;
begin
  DoWrite := TStream(Data).Write(Ptr^, Size * nmemb);
end;

In the second step, we create a new file stream and instruct the handler to use the DoWrite function to populate it. So, begin by creating an appropriate TFileStream, and open it:

Var
  f : TFileStream;

begin
  f := TFileStream.Create('fcm.xml', fmCreate);
Continue by setting up the hCurl options as before, but add:

curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, [@DoWrite]);
curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, [Pointer(f)]);

The curl_easy_perform and curl_easy_cleanup stages take place as before. Do not forget to close the file stream at the end of the program:

f.Free;

The complete code for this program is available at this link:

http://pastebin.com/Ayth2cfK

In this part of our series on Free Pascal, we deviated a bit from Free Vision and went into the technical details of how to use the CURL library from Pascal to connect to an RSS feed on FCM’s content management system. At this stage, we know how to connect to the server and download an XML file containing a list of recent articles published on the web page. In the next part of our series, we will see how to parse the XML code to retrieve the information we are aiming for: issue numbers and download URLs.