issue158:python
Differences
Below are the differences between two revisions of the page.
issue158:python [2020/07/01 14:04] – auntiee | issue158:python [2020/07/03 14:21] – auntiee | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | This month I've decided to continue our discussion of dealing with data. This time, we will look at the " | + | **This month I've decided to continue our discussion of dealing with data. This time, we will look at the " |
Why did I decide to put Law in quotes? Because it's not really a law: | Why did I decide to put Law in quotes? Because it's not really a law: | ||
• There IS a Law of Large numbers that basically states that if you perform the same experiment a large number of times, the average of the results should be close to the expected outcome. | • There IS a Law of Large numbers that basically states that if you perform the same experiment a large number of times, the average of the results should be close to the expected outcome. | ||
- | • The Law of Truly Large Numbers states that "with a large enough sample of data, many odd ' | + | • The Law of Truly Large Numbers states that "with a large enough sample of data, many odd ' |
- | This month, we will experiment to see if we can experience either of these two " | + | This month I've decided to continue our discussion of dealing with data. This time, we will look at the "Law" of Truly Large Numbers. |
+ | |||
+ | Why did I decide to put Law in quotes? Because it's not really a law: | ||
+ | • There IS a Law of Large Numbers that basically states that if you perform the same experiment a large number of times, the average of the results should be close to the expected outcome. | ||
+ | • The Law of Truly Large Numbers states that "with a large enough sample of data, many odd | ||
+ | |||
+ | **This month, we will experiment to see if we can experience either of these two " | ||
Let's first take a look at random numbers. Computers CAN NOT, by themselves, generate TRULY random numbers. They can get pretty close, and most of us are fine with close enough. But what exactly are random numbers? | Let's first take a look at random numbers. Computers CAN NOT, by themselves, generate TRULY random numbers. They can get pretty close, and most of us are fine with close enough. But what exactly are random numbers? | ||
Line 11: | Line 17: | ||
A random number is a number that is independent – with no correlations between any successive numbers. | A random number is a number that is independent – with no correlations between any successive numbers. | ||
- | Probability theory says, basically, that if you have two outcomes that are equally likely to occur (the heads of a coin in this case), there is an equally likely chance that either will occur, or in the case of a coin toss, 50% that it will end up with heads and 50% that it will end up with tails. | + | Probability theory says, basically, that if you have two outcomes that are equally likely to occur (the heads of a coin in this case), there is an equally likely chance that either will occur, or in the case of a coin toss, 50% that it will end up with heads and 50% that it will end up with tails.** |
- | Michael Crichton' | + | This month, we will run some experiments to see if we can verify one or the |
- | Now, let's create a VERY simple Python program to check this out. We will use the numpy library for the random number generator, rather than the built-in Python random number generator. While both are pretty much the same thing, the numpy library has some additional options that make it a better choice for future work. It's not good enough for serious cryptography use, but for what we need, it's fine. Because of the f-string formatting, you will need to use Python 3.6 or greater. | + | D' |
- | from numpy.random import seed | + | A random number is a number that is independent, with no correlation between successive numbers. |
+ | |||
+ | Probability theory says, basically, that if you have two outcomes that have equal chances of | ||
+ | |||
+ | **Michael Crichton' | ||
+ | |||
+ | Now, let's create a VERY simple Python program to check this out. We will use the numpy library for the random number generator, rather than the built-in Python random number generator. While both are pretty much the same thing, the numpy library has some additional options that make it a better choice for future work. It's not good enough for serious cryptography use, but for what we need, it's fine. Because of the f-string formatting, you will need to use Python 3.6 or greater.** | ||
+ | |||
+ | Michael Crichton's Jurassic Park (either the book or the movie, though in my opinion the movie is more fun) contains a good (if simplified) presentation of Chaos Theory, in which Ian Malcolm (played by Jeff Goldblum) describes the direction a drop of water will take as it slides down the hand of Dr. Ellie Sattler (played by Laura Dern). The same can be said of a coin falling to the floor or into the palm of your hand. One or the | ||
+ | |||
+ | Now, let's create a VERY simple Python program to check this out. We will use the numpy library for the random number generator, | ||
+ | |||
+ | **from numpy.random import seed | ||
from numpy.random import randint | from numpy.random import randint | ||
Line 23: | Line 41: | ||
Of course, we start with the imports. In the next line of code, we set the seed value of the random generator to a value of one. If you do this, you will get the same values that I do. To run this independently from me, just comment out the seed(1) line (above). | Of course, we start with the imports. In the next line of code, we set the seed value of the random generator to a value of one. If you do this, you will get the same values that I do. To run this independently from me, just comment out the seed(1) line (above). | ||
- | Now, we’ll run the loop ten times and generate 10 random numbers between 0 and 1 (zero = Tails and 1 = Heads). The randint function gets a minimum value, a maximum value and the number of results to return in a list. The reason we use a value of 2 for the maximum value is that numpy takes this value and always returns values 1 less than the maximum. | + | Now, we’ll run the loop ten times and generate 10 random numbers between 0 and 1 (zero = Tails and 1 = Heads). The randint function gets a minimum value, a maximum value and the number of results to return in a list. The reason we use a value of 2 for the maximum value is that numpy takes this value and always returns values 1 less than the maximum. |
- | Now step through the list of returned numbers and count the number of zeros and ones. | + | from numpy.random import seed |
+ | |||
+ | from numpy.random import randint | ||
+ | |||
+ | Of course, we start with the imports. In the next line of code, we set the seed value of the random generator to one. If you do this, you will get the same values that I do. To run this independently from me, just comment out the seed(1) line (above). | ||
+ | |||
+ | Now, we'll run the loop ten times and generate ten random numbers between 0 and 1 (0 = Tails and 1 = Heads). The randint function takes a minimum value, a maximum value, and the number of results to return. We pass 2 as the maximum because numpy treats that bound as exclusive: the values returned are always at least 1 less than the maximum. | ||
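A minimal sketch of the two points just described, seeding and the exclusive upper bound of randint. It uses only the two numpy.random functions the article imports; the variable names `first` and `second` are illustrative and do not come from the original program.

```python
from numpy.random import seed, randint

# Seeding makes the pseudo-random sequence repeatable.
seed(1)
first = randint(0, 2, 10)    # low=0 (inclusive), high=2 (exclusive), 10 values

seed(1)
second = randint(0, 2, 10)   # same seed, so the same ten "flips"

print(first)
print((first == second).all())              # the two sequences are identical
print(set(randint(0, 2, 1000).tolist()))    # only 0 and 1 ever appear
```

Commenting out both seed(1) calls should make the two arrays differ on most runs, which is the "run this independently from me" case.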
+ | |||
+ | **Now step through the list of returned numbers and count the number of zeros and ones. | ||
Name your program as cointoss.py and run it. You should see the following output… | Name your program as cointoss.py and run it. You should see the following output… | ||
Line 38: | Line 64: | ||
Percentage of Heads: 40.0% | Percentage of Heads: 40.0% | ||
- | It’s not what you would expect. You would expect 50% Heads each time. Take a coin and try it. You will find a similar result. It won’t be 50% each time. Remember Chaos Theory? | + | It’s not what you would expect. You would expect 50% Heads each time. Take a coin and try it. You will find a similar result. It won’t be 50% each time. Remember Chaos Theory?** |
- | Now, the value of 10 “flips” is a fairly low number of samples. Let’s try it with a larger sample size. Change the todo value to 1000 and re-run your program. | + | Now step through the list of returned numbers and count the number of zeros and ones. |
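The full listing is not reproduced on this page, so here is a hedged reconstruction of what cointoss.py could look like, based only on the description: seed, generate `todo` flips with randint, then step through them and count. The helper name `flip_series` and the choice of two series per run are assumptions, not the author's code.

```python
from numpy.random import seed, randint

seed(1)      # comment this line out to get different flips on each run
todo = 10    # number of flips per series


def flip_series(count):
    """Flip `count` coins; return the raw flips plus head and tail counts."""
    flips = randint(0, 2, count)   # 0 = Tails, 1 = Heads
    heads = 0
    tails = 0
    for flip in flips:             # step through the results and count
        if flip == 1:
            heads += 1
        else:
            tails += 1
    return flips, heads, tails


for _ in range(2):                 # assumption: two series per run
    flips, heads, tails = flip_series(todo)
    print(flips)
    print(f'Heads: {heads} - Tails: {tails}')
    print(f'Percentage of Heads: {heads / todo * 100}%')
```

Counting in an explicit loop mirrors the text; `flips.sum()` would give the number of heads in one call.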
+ | |||
+ | Name your program cointoss.py and run it. You should see the following output... | ||
+ | |||
+ | $ python cointoss.py | ||
+ | |||
+ | [1 1 0 0 1 1 1 1 1 0] | ||
+ | Heads: 7 - Tails: 3 | ||
+ | Percentage of Heads: 70.0% | ||
+ | [0 1 0 1 1 0 0 1 0 0] | ||
+ | Heads: 4 - Tails: 6 | ||
+ | Percentage of Heads: 40.0% | ||
+ | |||
+ | It's not what you would expect. You would expect 50% Heads each time. Take a coin and try it. You will find a similar result. It won't be 50% each time. Remember Chaos Theory? | ||
+ | |||
+ | **Now, the value of 10 “flips” is a fairly low number of samples. Let’s try it with a larger sample size. Change the todo value to 1000 and re-run your program. | ||
I’m going to shorten the output (shown below) to save space, but here is what you should see… | I’m going to shorten the output (shown below) to save space, but here is what you should see… | ||
This time, our results were much closer to 50%, but not really close enough. What would it look like if we do a series of 100000 flips? Change the todo variable to 100000 and re-run the program. | This time, our results were much closer to 50%, but not really close enough. What would it look like if we do a series of 100000 flips? Change the todo variable to 100000 and re-run the program. | ||
+ | |||
+ | [1 1 0 ... 0 0 0] | ||
+ | Heads: 49771 - Tails: 50229 | ||
+ | Percentage of Heads: 49.771% | ||
+ | [0 0 0 ... 0 0 1] | ||
+ | Heads: 49943 - Tails: 50057 | ||
+ | Percentage of Heads: 49.943%** | ||
+ | |||
+ | Il s' | ||
+ | |||
+ | I'm going to shorten the output (shown below) to save space, but here is what you should see... | ||
+ | |||
+ | This time, our results were much closer to 50%, but not really close enough. What would it look like if we did a series of 100000 flips? Change the todo variable to 100000 and re-run the program. | ||
[1 1 0 ... 0 0 0] | [1 1 0 ... 0 0 0] | ||
Line 53: | Line 107: | ||
Percentage of Heads: 49.943% | Percentage of Heads: 49.943% | ||
- | Now we are very close to what we expect the result to be, close enough to say that, yes we do get almost 50% distribution. In addition, we have now seen the Law of Large Numbers take effect. | + | **Now we are very close to what we expect the result to be, close enough to say that, yes we do get almost 50% distribution. In addition, we have now seen the Law of Large Numbers take effect. |
But what about the “Law of Truly Large Numbers”? One of the examples that is often used to explain this would be (http:// | But what about the “Law of Truly Large Numbers”? One of the examples that is often used to explain this would be (http:// | ||
- | “In July 1975, a taxi in Hamilton, Bermuda knocked Erskine Lawrence Ebbin from his moped, killing him. The year before, his brother Neville Ebbin had been killed by the same driver driving the same taxi and carrying the same passenger while riding the same moped on the same street.” | + | “In July 1975, a taxi in Hamilton, Bermuda knocked Erskine Lawrence Ebbin from his moped, killing him. The year before, his brother Neville Ebbin had been killed by the same driver driving the same taxi and carrying the same passenger while riding the same moped on the same street.”** |
+ | |||
+ | Now we are very close to what we expect the result to be, close enough to say that, yes, we do get an almost 50% distribution. In addition, we have now seen the Law of Large Numbers take effect. | ||
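Rather than editing todo by hand three times, the same effect can be sketched in one run. This is an illustrative variation, not the article's program; `heads_percentage` is a hypothetical helper name.

```python
from numpy.random import seed, randint

seed(1)


def heads_percentage(count):
    """Return the percentage of heads in `count` simulated coin flips."""
    flips = randint(0, 2, count)          # 0 = Tails, 1 = Heads
    return float(flips.sum()) / count * 100


# The gap between the observed percentage and 50% tends to shrink as the
# number of flips grows, which is the Law of Large Numbers at work.
for todo in (10, 1000, 100000):
    pct = heads_percentage(todo)
    print(f'{todo:>7} flips: {pct:.3f}% heads, off by {abs(pct - 50):.3f}')
```

The gap is not guaranteed to shrink at every step in a single run. The law is a statement about what happens on average as the sample grows.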
+ | |||
+ | But what about the "Law of Truly Large Numbers"? One of the examples often used to | ||
- | In another example, | + | **In another example, |
+ | |||
+ | from numpy.random import seed | ||
+ | from numpy.random import randint | ||
+ | import datetime | ||
+ | # seed random number generator | ||
+ | seed(1)** | ||
+ | |||
+ | In another example: "At a football match with 50,000 fans, most fans probably share their birthday with 135 others in the | ||
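The figure in that quote can be sanity-checked with a line of arithmetic. This back-of-the-envelope sketch assumes birthdays are spread evenly over the year, which real birthdays only approximate; the variable names are illustrative.

```python
fans = 50000
days_in_year = 365.25                  # assumption: average year length, leap years included

same_birthday = fans / days_in_year    # expected number of fans born on any given day
others = same_birthday - 1             # the same figure, excluding the fan themselves

print(f'{same_birthday:.1f} fans per birthday, {others:.1f} others on average')
```

The result lands in the same range as the quoted 135, so the "coincidence" is simply what the numbers predict.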
from numpy.random import seed | from numpy.random import seed | ||
Line 66: | Line 132: | ||
seed(1) | seed(1) | ||
- | Again, we start off with our imports (we added datetime for this example) and set the seed value. Next we set the number of random numbers in our list to be 50000 and create an empty list. | + | **Again, we start off with our imports (we added datetime for this example) and set the seed value. Next we set the number of random numbers in our list to be 50000 and create an empty list. |
todo = 50000 | todo = 50000 | ||
Line 73: | Line 139: | ||
Now we loop through a series of statements that pick valid dates at random. (I use Kite for my programming and they provided the base example for this code. I modified it slightly). Once we have the date, we append that to the list (top right) | Now we loop through a series of statements that pick valid dates at random. (I use Kite for my programming and they provided the base example for this code. I modified it slightly). Once we have the date, we append that to the list (top right) | ||
- | Finally, we create a date (I picked my son’s birthday) to see if it is in the list and print the number of times it occurred, if in fact it did. | + | Finally, we create a date (I picked my son’s birthday) to see if it is in the list and print the number of times it occurred, if in fact it did.** |
- | datetocheck = datetime.date(1986, | + | Again, we start off with our imports (we added datetime for this example) and set the seed value. Next we set the number of random dates in our list to 50000 and create an empty list. |
+ | |||
+ | todo = 50000 | ||
+ | dates = [] | ||
+ | |||
+ | Now we loop through a series of statements that pick valid dates at random. (I | ||
+ | |||
+ | Finally, we create a date (I picked my son's birthday) to see if it is in the list and print the number of times it occurred, if in fact it did. | ||
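The date-picking loop itself is not shown on this page, so the following is only a sketch of one way to do it, not the Kite-derived code the author used. The date range (`START`, `END`), the helper `random_dates` and the date being checked are all placeholders chosen for illustration.

```python
import datetime
from numpy.random import seed, randint

seed(1)
todo = 50000

# Hypothetical range; the article does not say which years it draws from.
START = datetime.date(1950, 1, 1)
END = datetime.date(1999, 12, 31)


def random_dates(count, start=START, end=END):
    """Return `count` dates drawn uniformly between start and end, inclusive."""
    # Ordinals are consecutive day numbers, so every draw is a valid date.
    ordinals = randint(start.toordinal(), end.toordinal() + 1, count)
    return [datetime.date.fromordinal(int(o)) for o in ordinals]


dates = random_dates(todo)
datetocheck = datetime.date(1975, 6, 15)   # placeholder, not the author's date
print(f'Found {dates.count(datetocheck)} occurrences')
```

Drawing day ordinals avoids having to reject impossible dates such as 30 February, which a separate year/month/day draw would need to handle.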
+ | |||
+ | **datetocheck = datetime.date(1986, | ||
print(f' | print(f' | ||
Line 84: | Line 159: | ||
Found 3 occurrences | Found 3 occurrences | ||
- | We could even modify the code to do this a number of times, keep track of the results and at the end, provide an average of the occurrences. I named this “birthdays2.py” | + | We could even modify the code to do this a number of times, keep track of the results and at the end, provide an average of the occurrences. I named this “birthdays2.py”** |
- | Here’s the result (shortened of course): | + | datetocheck = datetime.date(1986, |
+ | |||
+ | print(f' | ||
+ | |||
+ | You shouldn't be surprised that there are a few occurrences. | ||
+ | |||
+ | $ python birthdays.py | ||
+ | Found 3 occurrences | ||
+ | |||
+ | We could even modify the code to do this a number of times, keep track of the results and, at the end, provide an average of the occurrences. I named this "birthdays2.py". | ||
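A sketch of how such a repeated run might look, again under assumptions: the date range, the target date, the helper `count_matches` and the choice of 40 runs (the length of the results list in the sample output) are illustrative and not taken from birthdays2.py.

```python
import datetime
from numpy.random import seed, randint

seed(1)
todo = 50000     # random dates per run
runs = 40        # assumption: the sample output lists 40 results

# Hypothetical range and target date, chosen only for illustration.
START = datetime.date(1950, 1, 1)
END = datetime.date(1999, 12, 31)
datetocheck = datetime.date(1975, 6, 15)


def count_matches(target, count):
    """Draw `count` random dates and return how many equal `target`."""
    ordinals = randint(START.toordinal(), END.toordinal() + 1, count)
    return int((ordinals == target.toordinal()).sum())


results = []
for _ in range(runs):
    found = count_matches(datetocheck, todo)
    print(f'Found {found} occurrences')
    results.append(found)

print(f'Results: {results}')
print(f'Average is {sum(results) / len(results)}')
```

With this range the average should settle near todo divided by the number of days between START and END, a little under 3.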
+ | |||
+ | **Here’s the result (shortened of course): | ||
$ python birthdays2.py | $ python birthdays2.py | ||
Line 95: | Line 181: | ||
Average is 2.96 | Average is 2.96 | ||
- | I hope that this has given you an appreciation of Large Numbers and Truly Large Numbers and random numbers in general. | + | I hope that this has given you an appreciation of Large Numbers and Truly Large Numbers and random numbers in general.** |
+ | |||
+ | Here's the result (shortened of course): | ||
+ | |||
+ | $ python birthdays2.py | ||
+ | Found 4 occurrences | ||
+ | ... | ||
+ | Found 4 occurrences | ||
+ | Results: [4, 3, 5, 5, 6, 3, 1, 3, 3, 4, 0, 2, 1, 0, 5, 2, 3, 3, 3, 3, 4, 3, 3, 6, 5, 3, 3, 1, 3, 2, 4, 4, 2, 4, 2, 2, 2, 4, 0, 1] | ||
+ | Average is 2.96 | ||
+ | |||
+ | **I’ve put the code files up on PasteBin: | ||
+ | Cointoss.py | ||
+ | https:// | ||
+ | Birthdays.py | ||
+ | https:// | ||
+ | Birthdays2.py | ||
+ | https:// | ||
+ | |||
+ | Until next time; stay safe, healthy, positive and creative!** | ||
- | I’ve put the code files up on PasteBin: | + | I've put the code files up on PasteBin: |
Cointoss.py | Cointoss.py | ||
https:// | https:// | ||
Line 105: | Line 210: | ||
https:// | https:// | ||
- | Until next time; stay safe, healthy, positive and creative! | + | Jusqu' |
issue158/python.txt · Last modified: 2020/07/03 14:54 by andre_domenech