Full Circle Magazine FR

In this series of articles, I will be building a text-based application with Free Pascal, using its text-based interface for user interaction. This will be combined with other, more modern, technologies such as database access using SQL and Web access with HTTP. The final aim of the project is to demonstrate how Pascal can be used to build a modern application, while avoiding the overhead associated with a graphical interface that uses a widget set such as GTK or Qt.

In the previous part of the series, we deviated a bit from Free Vision and went into the technical details on how to use the CURL library from Pascal to connect to an RSS feed on FCM’s content management system. At this stage, we know how to connect to the server and download an XML file containing a list of recent articles published on the web page. In this part, we will see how to parse the XML code to retrieve the information we are aiming for: issue numbers and download URLs. We will then put it all together, and update the database our application uses with fresh data from the Web.

Understanding the XML language

Extensible Markup Language (XML) is a simple text-based language that structures data as a tree: each element may have no child nodes, one, or several. Conversely, each element must have exactly one parent element, with the sole exception of a single node in the entire tree, which is called the root element. Each element opens with a start tag, such as <element>, and closes with the corresponding end tag, </element>.

Perhaps an example may help. If we wish to codify a library, for instance, our root element will be the library itself. This library may then contain an element defining its owner, and perhaps another giving the date on which the data set was compiled. Finally, we will need to create an element for each book in the library, giving its title and author.

The beauty of this scheme is that it can easily be adapted to different purposes. For instance, one of the books in such a library may be marked with a genre, while another is not.
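As a minimal sketch, the library described above could be written out as follows. The element names (library, owner, compiled, book, title, author, genre), the owner, the date, and the helper name SampleLibraryXML are all illustrative assumptions. The XML is held in a Pascal string and saved as test.xml, so that there is a small file at hand to experiment with.

```pascal
program MakeLibraryXML;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

{ Returns a small, hypothetical library data set as XML text. }
function SampleLibraryXML: string;
begin
  Result :=
    '<library>' + LineEnding +
    '  <owner>Jane Doe</owner>' + LineEnding +
    '  <compiled>2016-08-01</compiled>' + LineEnding +
    '  <book>' + LineEnding +
    '    <title>Treasure Island</title>' + LineEnding +
    '    <author>Robert Louis Stevenson</author>' + LineEnding +
    '    <genre>Adventure</genre>' + LineEnding +
    '  </book>' + LineEnding +
    '  <book>' + LineEnding +
    '    <title>Walden</title>' + LineEnding +
    '    <author>Henry David Thoreau</author>' + LineEnding +
    '  </book>' + LineEnding +
    '</library>' + LineEnding;
end;

var
  lines: TStringList;
begin
  lines := TStringList.Create;
  try
    // One root element, two books, and a genre on the first book only.
    lines.Text := SampleLibraryXML;
    lines.SaveToFile('test.xml');
  finally
    lines.Free;
  end;
  Write(SampleLibraryXML);
end.
```

Note that only the first book carries a genre element; nothing else in the file has to change to allow for that.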

Writing a program to traverse a data set in XML has its own challenges; luckily for us, the Free Pascal project provides a set of standard classes that we can use for our own purposes. Let us write a simple program that reads in an XML file and outputs on screen each element's name in sequence, or its value if it is a text node. To begin with, let us include the standard and XML classes and define our variables (previous page, bottom right).

The TFileStream will be used to access the XML file on our local disk, and make it available to a TXMLTextReader through an adaptor class, TXMLInputSource. The TXMLReaderSettings object is needed to pass parameters to the reader.

We begin by configuring the settings, basically telling the reader to ignore superfluous whitespace (actual spaces, but also line breaks and tabs), and to use XML namespaces if available - though we will not need them here:

settings := TXMLReaderSettings.Create;

settings.PreserveWhiteSpace := false;

settings.Namespaces := true;

We access the file, and create a TXMLInputSource from the resulting stream:

f := TFileStream.Create('test.xml', fmOpenRead);

input := TXMLInputSource.Create(f);

Now (shown above) we can create our TXMLTextReader, and have it parse each element encountered.

Finally, let’s not forget to close the file stream neatly:

f.Free;
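Put together, the steps above might look like the following sketch. It rests on a few assumptions: that the classes live in the xmlreader and xmltextreader units, as in recent Free Pascal releases, and that TXMLTextReader offers a constructor taking the input source and the settings. A TStringStream stands in for the TFileStream so that the sketch runs without test.xml, and the ListNodes helper is a hypothetical name.

```pascal
program ListXMLNodes;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, xmlreader, xmltextreader;

{ Walks the XML text and returns one line per element name or text value. }
function ListNodes(const xml: string): string;
var
  src: TStringStream;
  input: TXMLInputSource;
  settings: TXMLReaderSettings;
  reader: TXMLTextReader;
begin
  Result := '';
  src := TStringStream.Create(xml);
  settings := TXMLReaderSettings.Create;
  settings.PreserveWhiteSpace := false;
  settings.Namespaces := true;
  input := TXMLInputSource.Create(src);
  reader := TXMLTextReader.Create(input, settings);
  try
    // Read advances node by node and returns false at the end of the data.
    while reader.Read do
      if reader.NodeType = ntElement then
        Result := Result + 'element: ' + UTF8Encode(reader.Name) + LineEnding
      else if reader.NodeType = ntText then
        Result := Result + 'text: ' + UTF8Encode(reader.Value) + LineEnding;
  finally
    reader.Free;
    input.Free;
    settings.Free;
    src.Free;
  end;
end;

begin
  Write(ListNodes('<library><book><title>Walden</title></book></library>'));
end.
```

To read from disk instead, the TStringStream can be replaced by TFileStream.Create('test.xml', fmOpenRead), as in the lines above.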

The code for the complete program is available at this link: http://pastebin.com/PtciSAQb .

Parsing our RSS feed in XML

In the last part of this series, we obtained the RSS feed for Full Circle Magazine using the CURL library. This is a piece of XML data with the following structure, cleaned up a bit to showcase the relevant elements (shown below).

So, what we want to do is isolate each <title> element and the <link> that goes with it. On the one hand, we have a routine from CURL that fetches the contents of a URL and writes them to a stream. On the other hand, we have an XML parser that reads its input from a stream. The link is obvious: we now need a mechanism to pipe data from the first stream into the second, and, in Free Pascal, this mechanism is a piped stream. Let’s do it. First, we will need several sets of variables (next page, top right).

The first set is used for the CURL library, the second holds the input and output streams to be piped together, and the third is for the XML parser. Finally, the two strings and associated boolean variables are needed to link each element (of type ntElement) with its associated value (type ntText) - which is not the element itself, but a child node inserted inside the parent element. Unfortunately, this text node is not always in the first position among the element’s children, so a rather convoluted set of flags (the boolean variables) must be used to detect it.

We will not go over the use of the CURL library, which was described in the previous part of this series, or over the XML parser. We will concentrate instead on the use of the piped streams. We create the two streams together:

CreatePipeStreams (inPipe, outPipe);

The CURL handle can then be configured with the outPipe section, to which it will write the data obtained from the Internet:

curl_easy_setopt(hCurl,CURLOPT_WRITEDATA,[Pointer(outPipe)]);

The XML reader’s input will be connected to the inPipe section, from which it will read back the bytes:

input := TXMLInputSource.Create(inPipe);
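The behaviour of the piped streams can be tried in isolation before CURL and the XML reader are attached. The following sketch writes a string into outPipe and reads it back from inPipe. The ThroughPipe helper and the sample text are hypothetical; CreatePipeStreams comes from the Pipes unit.

```pascal
program PipeRoundTrip;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, Pipes;

{ Writes text into one end of a pipe and reads it back from the other. }
function ThroughPipe(const text: string): string;
var
  inPipe: TInputPipeStream;
  outPipe: TOutputPipeStream;
  buf: array[0..255] of Byte;
  n: LongInt;
  chunk: string;
begin
  Result := '';
  CreatePipeStreams(inPipe, outPipe);
  try
    if Length(text) > 0 then
      outPipe.WriteBuffer(text[1], Length(text));
    // Closing the write end lets the reading side see the end of the data.
    FreeAndNil(outPipe);
    repeat
      n := inPipe.Read(buf, SizeOf(buf));
      if n > 0 then
      begin
        SetString(chunk, PAnsiChar(@buf[0]), n);
        Result := Result + chunk;
      end;
    until n <= 0;
  finally
    outPipe.Free;   // already nil here unless an exception occurred earlier
    inPipe.Free;
  end;
end;

begin
  WriteLn(ThroughPipe('<rss>hello</rss>'));
end.
```

One caveat is worth keeping in mind: a pipe holds only a limited amount of data, so a writer that is not being read from will block once the pipe is full. The sketch stays well below that limit by using a short string.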

Finally, the XML reader’s main loop can be configured to detect title/link pairs. For the time being, they will simply be output on screen (shown bottom right).
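A simplified version of such a loop is sketched below. It is not the article's code: the feed is a hypothetical, heavily trimmed sample with example.com links, the PairTitlesWithLinks helper is an invented name, and the unit names and the TXMLTextReader constructor are assumed to be those of recent Free Pascal releases. A flag records that we are inside an <item>, so that the feed's own title and link, which precede the items, are ignored.

```pascal
program RssPairs;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, xmlreader, xmltextreader;

const
  // Hypothetical sample; the real feed is fetched with CURL.
  SampleFeed =
    '<rss><channel><title>Feed</title><link>http://example.com/</link>' +
    '<item><title>Full Circle Magazine #111</title>' +
    '<link>http://example.com/issue-111</link></item>' +
    '<item><title>Full Circle Magazine #110</title>' +
    '<link>http://example.com/issue-110</link></item>' +
    '</channel></rss>';

{ Returns one "title -> link" line for each item found in the feed. }
function PairTitlesWithLinks(const xml: string): string;
var
  src: TStringStream;
  input: TXMLInputSource;
  settings: TXMLReaderSettings;
  reader: TXMLTextReader;
  current, title: UnicodeString;
  inItem: Boolean;
begin
  Result := '';
  current := '';
  title := '';
  inItem := false;
  src := TStringStream.Create(xml);
  settings := TXMLReaderSettings.Create;
  settings.PreserveWhiteSpace := false;
  input := TXMLInputSource.Create(src);
  reader := TXMLTextReader.Create(input, settings);
  try
    while reader.Read do
      if reader.NodeType = ntElement then
      begin
        current := reader.Name;          // remember which element we entered
        if current = 'item' then
          inItem := true;
      end
      else if (reader.NodeType = ntText) and inItem then
      begin
        if current = 'title' then
          title := reader.Value
        else if (current = 'link') and (title <> '') then
        begin
          Result := Result + UTF8Encode(title) + ' -> ' +
                    UTF8Encode(reader.Value) + LineEnding;
          title := '';
        end;
      end;
  finally
    reader.Free;
    input.Free;
    settings.Free;
    src.Free;
  end;
end;

begin
  Write(PairTitlesWithLinks(SampleFeed));
end.
```

The sketch assumes that each item's title comes before its link and that no text appears directly inside <item>; a real feed may need the more defensive flags described above.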

The complete program can be found at this link: http://pastebin.com/ciVGXvy6 .

Integrating XML parsing into our Free Vision application

At this stage, we have on the one hand a working Free Vision application, that consults its internal SQLite database of FCM issues and gives the result in a scrolling list on screen. On the other hand, we have a mechanism to connect to the Internet and update the database. Now, we need to connect the two, so that the database is updated before the data is shown to the user.

Perhaps the most elegant way of doing this - and the least expensive in terms of writing code - is to create a new Dialog type. Called TUpdateDialog, it will be shown on screen just before TDisplaySQLDialog is created. So, in the main application’s HandleEvent procedure, we have what's shown on the next page, top right.

The TUpdateDialog will need no outside inputs, since it will always be using the same target URL to connect to the Internet, and local database filename to append any data found on new issues of FCM. This Dialog will just need a constructor that builds it, and sets off the process:

TUpdateDialog = object(TDialog)
  constructor Init (FileName : String);
end;
PUpdateDialog = ^TUpdateDialog;

This constructor procedure will need a whole lot of variables, but they can be classified into separate categories. We will need:
• A TRect and PLabel to set up this Dialog on screen; this is the Free Vision part.
• A URL and PCurl to get to the Internet and retrieve a stream accessing FCM’s feed.
• Two pipes, to set up the connection between the incoming stream from the Internet, and an outgoing stream towards the XML reader.
• The XML reader itself, associated settings, and several variables to identify each new issue’s identification code (e.g. ‘111’), title (‘Full Circle Magazine #111’) and download link.
• A handler for the SQLite connection to the local database.

So, have a look at the code shown bottom right.

Most of the code will not be reproduced here, since it is in essence a mashup of that written in our previous part and the beginning of this one. Salient points would include the use of a Regular Expression parser (regexp) in order to parse the titles from the XML stream, identifying which contain the text identifying a new issue of FCM. We are looking for something such as ‘#109’, ‘#110’, ‘#111’…, so basically a pound sign ‘#’ followed by a series of digits. This can be made systematic with the following code:

re := TRegExpr.Create;

re.Expression := '#[0-9]+';

We can now use ‘re’ in the following way to check whether the next value found by the XML reader contains the expression we are looking for. If so, it can be isolated and used to prime the issue code for insertion into the database (shown bottom left).
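As a standalone illustration of the matching step, the sketch below pulls the issue number out of a title with TRegExpr from the RegExpr unit. It uses a variant of the expression, '#([0-9]+)', which requires at least one digit and captures the digits in a group; both that variant and the ExtractIssueCode helper are assumptions made for this example.

```pascal
program IssueCode;
{$mode objfpc}{$H+}

uses
  SysUtils, RegExpr;

{ Returns the digits following '#' in a title, or '' if there are none. }
function ExtractIssueCode(const title: string): string;
var
  re: TRegExpr;
begin
  Result := '';
  re := TRegExpr.Create;
  try
    re.Expression := '#([0-9]+)';
    if re.Exec(title) then
      Result := re.Match[1];   // group 1 holds the digits without the '#'
  finally
    re.Free;
  end;
end;

begin
  WriteLn(ExtractIssueCode('Full Circle Magazine #111'));   // prints 111
  WriteLn(ExtractIssueCode('Full Circle Weekly News'));     // prints an empty line
end.
```

Requiring at least one digit means that a title containing a lone ‘#’ is not mistaken for an issue announcement.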

Now, all we need to do is determine, for each issue announcement found in the XML stream, whether this issue is already in our database. To do so, we go back to the SQLite driver and search the existing issues for the same identifying code. If no match is found, the issue is a new one and needs to be appended to the existing table (shown top right).
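The check-then-append logic can be sketched without a database at all. In the following example, a TStringList stands in for the SQLite table, and the AddIfNew helper is a hypothetical name; the application itself performs the lookup and the insertion through SQL.

```pascal
program NewIssues;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

const
  // Issue codes as they might arrive from the feed, including a repeat.
  Found: array[0..3] of string = ('111', '111', '110', '107');

{ Adds code to known unless already present; returns True if it was new. }
function AddIfNew(known: TStrings; const code: string): Boolean;
begin
  Result := known.IndexOf(code) < 0;
  if Result then
    known.Add(code);
end;

var
  known: TStringList;
  newItems, i: Integer;
begin
  known := TStringList.Create;
  try
    // Stand-in for a table already holding issues 108, 109 and 110.
    known.Add('108');
    known.Add('109');
    known.Add('110');
    newItems := 0;
    for i := Low(Found) to High(Found) do
      if AddIfNew(known, Found[i]) then
        Inc(newItems);
    WriteLn('Found ', newItems, ' new issues...');   // prints: Found 2 new issues...
  finally
    known.Free;
  end;
end.
```

Because the lookup happens before each insertion, an issue that is announced twice in the feed is counted and stored only once.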

Once the XML code has been completely parsed, we can alter the label on the Dialog to notify the user of how many new issues of FCM have been found. In my case, my database had been initialized by hand with issues 108, 109 and 110. I launched the application, and several new issues were detected from the XML feed: previous issue 107, and newer issue 111. This last one was identified twice, since two different posts on FCM’s feed referenced this issue, but was inserted only once into the database.

msgLabel^.Text := NewStr('Found ' + IntToStr(newItems) + ' new issues...');

DrawView;

The finalized application’s code can be found here: http://pastebin.com/H422xg3V .

In this part of the series, we put our complete application together using Free Vision for the user interface, SQLite to create a local database, and CURL and XML to retrieve fresh data from an RSS feed on the Web to update our database. In the next part, we will see various ways in which our application can run on a Raspberry Pi.