Full Circle Magazine FR

HELLO WORLD! I hate using that phrase when introducing someone to a new programming language or concept; so much, in fact, that I refuse to use it. I change it to something like “Hello from Python” or something equally close but equally different.

You might notice above that this is article #98 in my Python programming series. If everything goes according to plan, my 100th article will be in December's Full Circle Magazine.

Now let's start with this month's article… the reason you are here…

Text to Speech. It has been around for many years, but on Linux the choices are fairly limited, especially when it comes to free software. Add a requirement that it be usable from Python and the list gets shorter still, so let's explore what's out there. Another requirement is that it be maintained somewhat regularly and have documentation that a normal person can actually understand.

Remember, as we go through this, the old saying “You get what you pay for”; in this instance it's true to some extent.

The best that I could find that fits all those requirements is a package called eSpeak (https://sourceforge.net/projects/espeak/). While it appears that there hasn't been any forward progress on it since the end of 2017, there is a fork of the project, called 'eSpeak NG' (https://github.com/espeak-ng/espeak-ng), that is currently being worked on. The eSpeak projects support over 100 languages and accents. That said, the voice quality is very robotic, to say the least. It is nothing like what you get with Google Assistant, Alexa, Cortana or Siri. However, with the proper manipulations, it can sound understandable, at least in English. I always say that I know only two languages, English and BAD English, so I'm at the mercy of those who speak other languages to determine the actual usability.

How to use it…

Luckily, installing eSpeak NG on Ubuntu is pretty easy.

:~$ sudo apt-get install espeak-ng-espeak

To test it, while you are still in the terminal, try this…

:~$ espeak-ng "Welcome to free and open source Text to Speech processing."

Now you can hear what I'm talking about. It's pretty much robotic and somewhat reminds you of listening to the voice of Stephen Hawking. If you listen carefully, it can be mostly understood.

There are many command-line arguments that you can use to change things around and to provide other options. A quick documentation page is at https://github.com/espeak-ng/espeak-ng/blob/master/src/espeak-ng.1.ronn. I'll try to distill them down, like a fine Scotch whisky, for you. Let's take a quick look at some of them.

If you want to see the various languages that are available, just type:

:~$ espeak-ng --voices

You will receive the output shown on the next page (top right).

I've cut that list down considerably to save space here in the article. And to be brutally honest, I wouldn't begin to know whether some of these voices are even close to reality or not.
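If you would rather pick through the voice list from Python than scroll through it in the terminal, something like the following sketch might help. Treat it as a rough idea only: the sample text is an illustrative assumption about the column layout that espeak-ng --voices prints (it is not output copied from my machine), and parse_voices and list_installed_voices are hypothetical helper names of my own.

```python
import shutil
import subprocess

# Illustrative sample only: the column layout is an assumption about what
# `espeak-ng --voices` prints, not output taken from the article.
SAMPLE_OUTPUT = """\
Pty Language       Age/Gender VoiceName          File                 Other Languages
 5  af              --/M      Afrikaans          gmw/af
 5  en-gb           --/M      English_(Great_Britain) gmw/en            (en 2)
 5  es              --/M      Spanish_(Spain)    roa/es
"""


def parse_voices(listing):
    """Return (language_code, voice_name) pairs from a voices listing."""
    voices = []
    for line in listing.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 4:
            # Assumed fields: priority, language, age/gender, voice name, ...
            voices.append((fields[1], fields[3]))
    return voices


def list_installed_voices():
    """Run espeak-ng --voices if the program is installed, else return []."""
    if shutil.which("espeak-ng") is None:
        return []
    result = subprocess.run(
        ["espeak-ng", "--voices"], capture_output=True, text=True, check=True
    )
    return parse_voices(result.stdout)


if __name__ == "__main__":
    for code, name in parse_voices(SAMPLE_OUTPUT):
        print(code, name)
```

The language code in the first element of each pair is the value you would hand to the voice option described next.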

To use a specific voice, such as Spanish, you can use:

:~$ espeak-ng -ves "Buenos días. ¿Cómo estás?"

We can also change the speed of the vocal output (in words per minute) by using the -s <integer> option:

:~$ espeak-ng -ves -s 125 "Buenos días. ¿Cómo estás?"
:~$ espeak-ng -ves -s 90 "Buenos días. ¿Cómo estás?"

Another thing that we can do is to change the pitch (from 0 to 99) using the -p <integer> option:

:~$ espeak-ng -ves -s 125 -p 75 "Buenos días. ¿Cómo estás?"
:~$ espeak-ng -ves -s 125 -p 35 "Buenos días. ¿Cómo estás?"
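Before we get to a proper Python library, here is a minimal sketch of driving those same command-line options from Python with the standard subprocess module. The function names build_espeak_command and speak are my own inventions for illustration; only the -v, -s and -p options come from the commands above.

```python
import shutil
import subprocess


def build_espeak_command(text, voice="en", speed=None, pitch=None):
    """Build the argument list for an espeak-ng call using -v, -s and -p."""
    command = ["espeak-ng", "-v", voice]
    if speed is not None:
        command += ["-s", str(speed)]
    if pitch is not None:
        command += ["-p", str(pitch)]
    command.append(text)
    return command


def speak(text, **options):
    """Speak the text if espeak-ng is installed; return True if it was run."""
    if shutil.which("espeak-ng") is None:
        return False
    subprocess.run(build_espeak_command(text, **options))
    return True


if __name__ == "__main__":
    # Print the two pitch variations from the article as argument lists.
    for pitch in (75, 35):
        print(build_espeak_command(
            "Buenos días. ¿Cómo estás?", voice="es", speed=125, pitch=pitch))
```

Passing the arguments as a list means the text never goes through the shell, so quotes and accented characters in the phrase need no escaping. To actually hear it, call speak("Buenos días. ¿Cómo estás?", voice="es", speed=125, pitch=75).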

That's fine for the command-line, but what we really want to do is create the speech from a Python program. No problem.

We need a library to interface with eSpeak NG. Luckily, there is a pretty nice one that can be installed via pip. It's called py-espeak-ng. It works on both Python 2.x and 3.x. The homepage is https://github.com/gooofy/py-espeak-ng.

pip install py-espeak-ng

or

pip3 install py-espeak-ng

Once py-espeak-ng is installed, fire up your favorite version of Python. The documentation shows a slightly different sequence of commands, but it doesn't work on my system. This sequence does… The first thing we have to do is import the library…

>>> from espeakng import ESpeakNG

Next, we need to instantiate the engine:

>>> esng = ESpeakNG()

Then we need to assign a voice…

>>> esng.voice = 'en'

Now we can finally have the engine speak to us…

>>> esng.say('Hello from Python. Welcome to text to speech from Python.')

Now, let's change the voice, this time to French:

>>> esng.voice = 'fr'
>>> esng.say('Bonjour. Comment vas-tu?')

Now, let's change the pitch as we did before. The syntax is a bit different, but still simple:

>>> esng.pitch = 32
>>> esng.say('Bonjour. Comment vas-tu?')

What if we want to find out the current speed or pitch? It's just this simple…

>>> p = esng.pitch
>>> print(p)
32
>>> sp = esng.speed
>>> print(sp)
175

Even finding out the current voice is simple:

>>> print(esng.voice)
fr

To get the list of voices:

>>> print(esng.voices)

(output is below)

Many more options are available, and you can pretty much use everything shown above to figure out how to carry on from here.
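To pull those pieces together, here is a small sketch that speaks a list of phrases, switching voices as it goes. It assumes only what we have already seen: an engine object with a voice attribute and a say() method. The say_in_voices helper and the RecordingEngine stand-in are hypothetical names of mine, added so the code can still be tried on a machine without py-espeak-ng installed.

```python
def say_in_voices(engine, phrases):
    """Speak each (voice, text) pair, switching the engine's voice first.

    `engine` is any object with a `voice` attribute and a `say(text)`
    method, which is how the ESpeakNG object is used in this article.
    """
    spoken = []
    for voice, text in phrases:
        engine.voice = voice
        engine.say(text)
        spoken.append((voice, text))
    return spoken


class RecordingEngine:
    """Hypothetical stand-in that records calls instead of making sound."""

    def __init__(self):
        self.voice = "en"
        self.calls = []

    def say(self, text):
        self.calls.append((self.voice, text))


if __name__ == "__main__":
    try:
        from espeakng import ESpeakNG  # provided by py-espeak-ng
        engine = ESpeakNG()
    except ImportError:
        engine = RecordingEngine()  # fall back when the library is missing
    say_in_voices(engine, [
        ("en", "Hello from Python."),
        ("fr", "Bonjour. Comment vas-tu?"),
    ])
```

Keeping the engine as a parameter, rather than creating it inside the function, makes it easy to swap in a different speech engine later.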

Now there is one other Text to Speech option that we have available to us. The reason I haven't mentioned it until now is that it isn't quite free. It's the Google Translate TTS API. You need to have at least Python 3.4 to start, so if you are still hanging on to Python 2.x, you are out of luck for this one. You also need to add a few packages. For Ubuntu and other Debian-based distributions, in a terminal type:

:~$ sudo apt-get install sox libsox-fmt-mp3

Next, install the google_speech library using pip:

:~$ pip3 install google_speech

Once we have that done, let's try it on the command-line.

:~$ google_speech -l en "Hello $USER, it is $(date)"

For some reason I get 'sox WARN alsa: can't encode 0-bit Unknown or not applicable', but that's ok.

There is a small amount of documentation available at https://github.com/desbma/GoogleSpeech that you can also try.

You can even try the code shown above.

Now, let's look at google_speech in Python.

>>> from google_speech import Speech
>>> text = 'Hello user from the google speech api'
>>> lang = "en"
>>> speech = Speech(text, lang)
>>> speech.play()
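The command-line test earlier greeted you by name and told you the date. Here is one way that might look in Python, as a sketch only. The build_greeting and speak_greeting helpers are hypothetical names of mine, and the time format is my own choice rather than what the date command prints; the only google_speech calls used are Speech(text, lang) and play(), exactly as above.

```python
import os
from datetime import datetime


def build_greeting(user, now):
    """Return a message similar to the shell example, built in Python."""
    return "Hello {}, it is {}".format(user, now.strftime("%H:%M on %Y-%m-%d"))


def speak_greeting(lang="en"):
    """Speak the greeting with google_speech; return False if it is missing."""
    try:
        from google_speech import Speech
    except ImportError:
        return False
    user = os.environ.get("USER", "user")  # same variable the shell expands
    Speech(build_greeting(user, datetime.now()), lang).play()
    return True


if __name__ == "__main__":
    # Only print the text here; call speak_greeting() to hear it.
    print(build_greeting(os.environ.get("USER", "user"), datetime.now()))
```

Remember that speak_greeting() needs an internet connection, since the audio comes from Google's service.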

And now for something completely different…

>>> lang = 'nb'
>>> text = 'God morgen. Hvordan har du det?' # "Good morning. How are you?" in Norwegian
>>> speech = Speech(text, lang)
>>> speech.play()

You can certainly hear that the speech is much better and more understandable. Why not stick with this? One of the requirements I stated earlier was that it needed to be free. That applies not only to the software we use, but also to the service behind the engine and to being able to work without an internet connection. If these last two don't bother you, then this is for you. You do, however, need to be aware of the cost of using the Google API for this. According to https://cloud.google.com/text-to-speech/pricing, for the “Standard” (non-WaveNet voices) service, there is a monthly free tier that (the way I read it) runs from 0 to 4 million characters. Anything over that amount per month would be charged at $4.00 USD per million characters. If you look at their example near the top of the page…

<speak>
<say-as interpret-as="cardinal">12345</say-as>
and one more </speak>

would count as 79 characters. So be careful when you attempt to calculate your usage. There is also the possibility that, if you send too much data too quickly, the system might block you for a while if you don't have an account.
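If you want a rough feel for what a month of usage might cost, the arithmetic can be sketched like this. The numbers are simply my reading of the pricing page quoted above (4 million free characters, then $4.00 per million), and the assumption that every character sent is counted, tags and whitespace included, is mine. The function names are hypothetical, so check the current pricing before relying on any of it.

```python
FREE_CHARACTERS_PER_MONTH = 4_000_000  # free tier, as read in the article
PRICE_PER_MILLION_USD = 4.00           # Standard (non-WaveNet) voices


def billable_characters(request_text):
    """Count every character sent, assuming tags and whitespace all count."""
    return len(request_text)


def estimate_monthly_cost(total_characters):
    """Estimate the monthly charge in USD for a given character total."""
    over = max(0, total_characters - FREE_CHARACTERS_PER_MONTH)
    return over / 1_000_000 * PRICE_PER_MILLION_USD


if __name__ == "__main__":
    for total in (3_000_000, 4_500_000, 10_000_000):
        print(total, "characters:", estimate_monthly_cost(total), "USD")
```

The exact count for a marked-up request depends on its spaces and line breaks, which is one more reason to measure the text you actually send rather than the words you expect to hear.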

Well, that's about it for this month. Until next time, keep coding!