1

Before we get started on this month’s actual python subject, let me toot my own horn for just a minute. In late December and early January, my first book on Python was published by Apress. It is named “The Python Quick Syntax Reference”, and is available from a number of places. You can find it on the Apress site (http://www.apress.com/9781430264781), Springer.com (http://www.springer.com/computer/book/978-1-4302-6478-1) and Amazon (http://www.amazon.com/The-Python-Quick-Syntax-Reference/dp/1430264780) as well as others. It is, as the title suggests, a syntax reference that will help those of us who program in other languages as well as Python, to remember how a certain command works and the requirements for that command. Please help a poor old programmer make a living by buying the book, if you can. Now on to bigger and better things.

Avant de commencer l'article python de ce mois-ci, permettez-moi une courte page d'auto-publicité. Fin décembre et début janvier, mon premier livre sur Python a été publié aux éditions Apress. Il est intitulé « The Python Quick Syntax Reference », et est disponible dans un certain nombre d'endroits. Vous pouvez le trouver sur les sites Apress (http://www.apress.com/9781430264781), Springer.com (http://www.springer.com/computer/book/978-1-4302-6478-1) et Amazon (http://www.amazon.com/The-Python-Quick-Syntax-Reference/dp/1430264780) ainsi que d'autres. Il s'agit, comme le titre l'indique, d'une référence de la syntaxe qui aidera ceux d'entre nous qui programment aussi dans d'autres langages que Python à se rappeler comment une certaine commande fonctionne, et les prérequis pour cette commande. S'il vous plaît aidez un ancien programmeur à gagner sa vie en achetant le livre, si vous le pouvez.

Passons maintenant à des choses plus importantes et passionnantes.

2

While I was working on my latest book for Apress, I rediscovered a SQL command that I didn't discuss when we were working with SQL databases a long time ago, so I thought I'd share the information with you. It is the CREATE TABLE AS SELECT command, which allows us to pull a query from one table (or joined tables) and create another table on the fly. The general syntax is: CREATE TABLE [IF NOT EXISTS] {New Table Name} AS SELECT {query} The part in square brackets (IF NOT EXISTS) is totally optional, which will create the table only if it doesn’t exist already. The part in curly brackets, however, is not. The first is the new table name and the second is the query that you want to use to pull data and create the new table.

Alors que je travaillais sur mon dernier livre pour Apress, j'ai redécouvert une commande SQL dont je n'avais pas parlé lorsque nous travaillions avec des bases de données SQL, il y a longtemps, alors j'ai pensé que je partagerais l'information avec vous. Il s'agit de la commande CREATE TABLE AS SELECT, qui nous permet d'exécuter une requête sur une table (ou des tables jointes) et de créer une autre table à la volée. La syntaxe générale est :

CREATE TABLE [IF NOT EXISTS] {Nouveau nom de la table} AS SELECT {requête}

La partie entre crochets (IF NOT EXISTS) est totalement optionnelle, et créera la table seulement si elle n'existe pas déjà. Les parties entre accolades, par contre, ne le sont pas. La première est le nouveau nom de la table et la seconde est la requête que vous souhaitez utiliser pour extraire des données et créer la nouvelle table.

3

Assume we have a database that has multiple tables in it. One of the tables is named “study” that holds data from a receiving operation. There are six fields which are shown below.

One of the datasets that we will need to produce from this raw data is a grouping of package count and the number of days within the study that quantity of packages came in on, assuming that the days are weekdays (Monday thru Friday) and that the day is not a holiday, since holidays have less than normal number of packages. Our query is shown above.

This then provides us with data that would look something like this:

pkgs CountOfDow 31 1 32 2 33 1 … 48 3

4

So the data is showing that during the study of 65 days, only one weekday had 31 packages but 3 weekdays had 48 packages and so on. Similar queries could be created that would cover holidays and weekends.

While having the data simply as a returned dataset from the query, we might want to do further analysis on the data, so we want to put the resulting data from the query into a table. That's why we would create a table from the query. So in the following example, shown above right, we create a table named “weekdays” using the same query we just showed above.

Now anytime we need the data for that weekday result set, we can just run a query on the weekdays table.

5

Once we know what we need, and have tested the query, then we can begin our code. Assuming we already have the study table created and populated, we can use Python to then create our new table in the main database. Just as an FYI, I am using the APSW SQLite library to do the database work.

We, of course, have to open a connection (right) and create a cursor to the SQLite database. We have covered this in a number of past articles.

Now we need to create the routine that will actually create the table with the returned dataset from the query, shown below, then alter it and run some calculations.

6

As you can see, we want to create a second cursor, so that we don’t run any risk of the first cursor having data we need to maintain. We will be using it in the final part of the code. We then drop the table if it exists and run our query on the “study” table.

Now we create three more columns (shown below) within the weekdays table named “probability”, “lower” and “upper”. We do this by using the “ALTER TABLE” SQL command.

The next step (top right) will be to sum the data in the CountOfDOW field.

7

There is only one record returned, but we do the for loop thing anyway. Remember from the above discussion that the “CountOfDow” field holds the number of days during the study that a particular number of packages came in. This gives us a value that contains the sum of all of the “CountOfDow” entries. Just so you have a reference as we go forward, the number I got from all my dummy data is 44.

upquery = “SELECT * FROM weekdays”

c1 = cursor.execute(upquery)

8

Here we have done a ‘SELECT all’ query so every record in the datatable is in the ‘c1’ cursor. We’ll walk through each row of the dataset, pulling the pkgs (row[0]) and CountOfDow (row[1]) data into variables.

LastUpper = .0 for row in c1:

cod = row[1]
pkg = row[0]

Now we will create a probability of each daily package count in the database and calculate an upper and lower value that will be used in another process later on. Notice that we check to see if the LastUpper variable contains ‘.0’. If it does, we set it to the probability value, otherwise we set it to the lower plus the probability value.

9

Finally we use an update SQL statement to put the new computed values into the database.

What we end up with is a package count (pkgs), a count of the number of days that package count came in, a probability of that occurring within the whole of the study (31 packages on 1 day out of a total of 44 (weekdays in that 60+ day study), will have a probability of 0.02.).

If we add up all the probability values in the table it should add up to 1.0 .

10

The upper and lower values then reflect a number between floating point number 0 and 1 that will mirror the possibility of any random number within that range that will give us a randomized number of packages. This number can then be used for a statistics analysis of this data. A “normal real-world” example would be to predict the number of cars that arrive at a carwash based on observational data done in the field. If you want to understand more, you could look at http://www.algebra.com/algebra/homework/Probability-and-statistics/Probability-and-statistics.faq.question.309110.html to see an example of this. All we did is generate (the hard part) easily with Python.

The code for the two routines that we presented this time is at: http://pastebin.com/kMc9EXes

Until next time.