issue81:c_c
Differences
Below are the differences between two revisions of the page.
issue81:c_c [2014/05/04 16:38] – andre_domenech | issue81:c_c [2014/05/14 12:56] (current version) – andre_domenech | ||
---|---|---|---|
Line 4: | Line 4: | ||
Last month I received an email from John, a reader of C&C. He had turned to me for advice on using Sed to insert semi-colons within the text file created by Task Warrior. The reason he wanted to do this was to use the conkytext script to format the To-Do list nicely for his Conky. Included in the email was the file as created by Task Warrior. We then spent a couple of days putting together a functioning Sed script (and going through a few format changes), and the end result was an excellent basis for an article. Hopefully by the end of this article, the reader will have an idea how to approach Sed expressions in order to tackle tasks that may at first seem complex. | Last month I received an email from John, a reader of C&C. He had turned to me for advice on using Sed to insert semi-colons within the text file created by Task Warrior. The reason he wanted to do this was to use the conkytext script to format the To-Do list nicely for his Conky. Included in the email was the file as created by Task Warrior. We then spent a couple of days putting together a functioning Sed script (and going through a few format changes), and the end result was an excellent basis for an article. Hopefully by the end of this article, the reader will have an idea how to approach Sed expressions in order to tackle tasks that may at first seem complex. | ||
+ | |||
+ | **Titre : Courriers des lecteurs - Sed** | ||
+ | |||
+ | **Le mois dernier, j'ai reçu un courriel de John, un lecteur de C & C. Il s' | ||
+ | ** | ||
The Task | The Task | ||
We want to add a semi-colon after the contents of every column (in the text shown top right, ignoring the white space). As you can imagine, the fact that the number of spaces varies can make this a difficult task. Also, the last line (tasks) is supposed to be preceded by three semi-colons (“;;;10 tasks”). After our first attempt, John came back to me and told me he'd decided to leave the first column semi-colon-less (shown above). | We want to add a semi-colon after the contents of every column (in the text shown top right, ignoring the white space). As you can imagine, the fact that the number of spaces varies can make this a difficult task. Also, the last line (tasks) is supposed to be preceded by three semi-colons (“;;;10 tasks”). After our first attempt, John came back to me and told me he'd decided to leave the first column semi-colon-less (shown above). | ||
+ | |||
+ | **L' | ||
+ | |||
+ | **Nous voulons ajouter un point-virgule après le contenu de chaque colonne (dans le texte affiché en haut à droite, en ignorant l' | ||
+ | ** | ||
My Script | My Script | ||
Due to the fact that the script is rather long, as it offers extra functionality (supports some arguments, outputting to a file, etc), I've put it up on pastebin: http:// | Due to the fact that the script is rather long, as it offers extra functionality (supports some arguments, outputting to a file, etc), I've put it up on pastebin: http:// | ||
+ | |||
+ | **Mon script** | ||
+ | |||
+ | **Du fait de la longueur du script, car il comporte des fonctionnalités supplémentaires (prise en charge des arguments, la sortie vers un fichier, etc.), je l'ai mis sur pastebin : http:// | ||
+ | ** | ||
The Thought Process | The Thought Process | ||
Line 22: | Line 37: | ||
• Declaring a set number of repetitions can be done with: \{3\} for 3 repetitions, | • Declaring a set number of repetitions can be done with: \{3\} for 3 repetitions, | ||
• You must escape the semi-colon. | • You must escape the semi-colon. | ||
+ | |||
+ | ** L' | ||
+ | |||
+ | **Avant de commencer, quelques notions importantes : | ||
+ | • La syntaxe typique d'une commande sed est : sed s /< | ||
+ | ce qui indique à Sed de remplacer toutes les occurrences, | ||
+ | • L' | ||
+ | Il existe certains caractères spéciaux qui peuvent être utilisés dans Sed. Nous utiliserons surtout l' | ||
+ | • La déclaration d'un certain nombre de répétitions peut être faite avec : \{3\} pour 3 répétitions, | ||
+ | • Vous devez « échapper » le point-virgule (placez un antislash devant) afin qu'il ne soit pas interprété. | ||
+ | ** | ||
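To make the syntax notes above concrete, here is a minimal sketch of a substitution that uses two groups and a fixed repetition count. The sample text and the function name `swap_demo` are made up for illustration and are not taken from the original script.

```shell
# Hypothetical helper: swaps two words separated by exactly three spaces.
# \( ... \) captures a group; \{3\} requires exactly three of the preceding
# character (a space here); \1 and \2 refer back to the captured groups.
swap_demo() {
  sed -e 's/\([a-z]*\) \{3\}\([a-z]*\)/\2 \1/'
}

printf 'left   right\n' | swap_demo
```

The left-hand side describes what to find, the right-hand side rebuilds the line from the captured pieces. Changing `\{3\}` to `\{2,\}` would accept two or more spaces instead.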
Some tips as to how I decide on each expression: | Some tips as to how I decide on each expression: | ||
Line 28: | Line 54: | ||
• If you have issues with step 2 because you can't get the regular expressions working, try using grep and the same regular expression. This lets you rule out the expression itself being wrong, and indicates it's a quirk of Sed's you haven' | • If you have issues with step 2 because you can't get the regular expressions working, try using grep and the same regular expression. This lets you rule out the expression itself being wrong, and indicates it's a quirk of Sed's you haven' | ||
• If you want the same formatting at the end, the RHS of the expression should almost always be the same, and if it isn't, it's an indicator that you're either going too complicated, | • If you want the same formatting at the end, the RHS of the expression should almost always be the same, and if it isn't, it's an indicator that you're either going too complicated, | ||
+ | |||
+ | ** Quelques conseils sur la façon de faire pour chaque expression : Déterminez où vous devez insérer le caractère, car c'est cela qui définira où placer les groupes (dans notre cas, avant les espaces, donc le deuxième groupe commence presque toujours avant le caractère Espace). | ||
+ | • Travaillez par petits bouts. Commencez par une commande sed simple comme : sed -e " | ||
+ | • Si vous avez des problèmes avec l' | ||
+ | • Si vous voulez le même formatage à la fin de la commande, le côté droit (RHS) de l' | ||
+ | ** | ||
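The grep tip above can be sketched as a two-step check. By default grep and sed both use basic regular expressions, so the same pattern can usually be reused unchanged. The sample line and the variable names (`sample`, `pattern`, `matches`, `result`) are assumptions made for this illustration.

```shell
# Hypothetical sample line and pattern, chosen only for illustration.
sample='ID Project  Due'
pattern=' \{2,\}'   # two or more consecutive spaces

# Step 1: let grep confirm that the regular expression matches the line.
matches=$(printf '%s\n' "$sample" | grep -c "$pattern")
echo "lines matched by grep: $matches"

# Step 2: reuse the same pattern in sed, wrapped in a group, and insert
# the semi-colon in front of the matched spaces.
result=$(printf '%s\n' "$sample" | sed -e "s/\($pattern\)/;\1/")
echo "$result"
```

If step 1 reports a match but step 2 leaves the line untouched, the problem is more likely in the grouping or the replacement than in the pattern itself.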
The Expressions | The Expressions | ||
+ | |||
+ | **Les Expressions** | ||
first_expression=" | first_expression=" | ||
+ | |||
second_expression=" | second_expression=" | ||
Line 38: | Line 73: | ||
fourth_expression=" | fourth_expression=" | ||
- | fifth_expression=" | + | fifth_expression=" |
+ | |||
+ | **Les Expressions | ||
+ | |||
+ | first_expression=" | ||
+ | |||
+ | second_expression=" | ||
+ | |||
+ | third_expression=" | ||
+ | |||
+ | fourth_expression=" | ||
+ | |||
+ | fifth_expression=" | ||
+ | |||
+ | |||
+ | # Check for any number of capital letters at the start of a line, followed by a space and more text, and insert a semicolon. | ||
+ | |||
+ | **# Recherchez chaque ligne commençant par une ou plusieurs lettres majuscules suivies d'une espace et encore du texte et insérez un point-virgule.** | ||
The explanations | The explanations | ||
The first expression tells Sed “Look for any character (a-z, A-Z, or 0-9), and see if it's followed by 2 or more spaces, then add a semi-colon before the spaces”. The trick to this is knowing that Sed can group matches to the regular expressions. This is why we have escaped parentheses around the expressions. “\([a-zA-Z0-9]\)” then becomes match “\1” in the replacement section of Sed. We are essentially forming two groups – the character that precedes the spaces, and the spaces themselves. Then, in the replacement step, we're inserting a semi-colon between the two groups. This corresponds to column 2 and column 4 in our file, as well as all the headers except ID. The reason why ID isn't included is due to the fact that we state 2 or more spaces, and changing that to one or more would cause issues in all the descriptions. Note: The semi-colon must be escaped (have a backslash in front of it). Also, if you want to match more than 15 spaces, simply leave that side of the comma empty - \{2,\}. | The first expression tells Sed “Look for any character (a-z, A-Z, or 0-9), and see if it's followed by 2 or more spaces, then add a semi-colon before the spaces”. The trick to this is knowing that Sed can group matches to the regular expressions. This is why we have escaped parentheses around the expressions. “\([a-zA-Z0-9]\)” then becomes match “\1” in the replacement section of Sed. We are essentially forming two groups – the character that precedes the spaces, and the spaces themselves. Then, in the replacement step, we're inserting a semi-colon between the two groups. This corresponds to column 2 and column 4 in our file, as well as all the headers except ID. The reason why ID isn't included is due to the fact that we state 2 or more spaces, and changing that to one or more would cause issues in all the descriptions. Note: The semi-colon must be escaped (have a backslash in front of it). 
Also, if you want to match more than 15 spaces, simply leave that side of the comma empty - \{2,\}. | ||
+ | |||
+ | **Les explications** | ||
+ | |||
+ | **La première expression dit à Sed « Cherche n' | ||
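The first expression itself is cut off on this page, so the following is only a reconstruction from the prose description, not the author's actual expression. The function name `first_pass` and the sample header line are hypothetical, and the semi-colon is written unescaped here because sed treats it literally inside the replacement.

```shell
# Sketch of "a letter or digit followed by two or more spaces":
# group 1 is the character before the gap, group 2 is the gap itself,
# and the semi-colon is inserted between the two groups.
first_pass() {
  sed -e 's/\([a-zA-Z0-9]\)\( \{2,\}\)/\1;\2/g'
}

# Hypothetical header line: ID is followed by a single space only.
printf 'ID Project   Due         Description\n' | first_pass
```

With this input, `Project` and `Due` each get a semi-colon while `ID` does not, which is consistent with the remark that ID is left out because it is followed by a single space.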
The second expression tells Sed “Look for any 3 consecutive digits that are followed by a space and a letter or number, then insert a semi-colon”. What this matches is the date – the format of the date is always going to be so long that only one space is inserted between columns. Naturally, you could check for any number of spaces, but that could cause issues if you use numbers in your Projects. This will apply to any format of date where the year is at the end. This handles column 3 in our file. | The second expression tells Sed “Look for any 3 consecutive digits that are followed by a space and a letter or number, then insert a semi-colon”. What this matches is the date – the format of the date is always going to be so long that only one space is inserted between columns. Naturally, you could check for any number of spaces, but that could cause issues if you use numbers in your Projects. This will apply to any format of date where the year is at the end. This handles column 3 in our file. | ||
+ | |||
+ | **La seconde expression dit à Sed « Recherche 3 chiffres consécutifs qui sont suivis par une espace et une lettre ou un chiffre, puis insère un point-virgule ». Cela correspond à la date - le format de la date sera toujours si long qu'une seule espace sépare la date de la colonne suivante. Bien sûr, on pourrait chercher un nombre quelconque d' | ||
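As a rough illustration of the second expression (again reconstructed from the description, since the real expression is truncated here), the sketch below matches three digits followed by one space and a letter or digit. The function name `second_pass` and the sample date are assumptions.

```shell
# Sketch of "three consecutive digits, then a space and a letter or digit":
# the three digits are group 1, the space plus following character group 2.
second_pass() {
  sed -e 's/\([0-9]\{3\}\)\( [a-zA-Z0-9]\)/\1;\2/'
}

# Hypothetical line whose date ends in a four-digit year.
printf '4/5/2014 Write article\n' | second_pass
```

The match lands on the last three digits of the year, so the semi-colon ends up directly after the date.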
The third expression can be translated as “Find all letters followed by a 1 or 2 digit number, followed by a slash, and insert the semi-colon.” The only column that contains a slash is our formatted date column – this applies therefore to the column before it (Project). The reason why I didn't include numbers in this case, is because the second expression could handle this if you tell Sed to accept any number of spaces after the 3 digits. This handles column 2 in our file. | The third expression can be translated as “Find all letters followed by a 1 or 2 digit number, followed by a slash, and insert the semi-colon.” The only column that contains a slash is our formatted date column – this applies therefore to the column before it (Project). The reason why I didn't include numbers in this case, is because the second expression could handle this if you tell Sed to accept any number of spaces after the 3 digits. This handles column 2 in our file. | ||
+ | |||
+ | **La troisième expression peut être traduite comme « Recherche toutes lettres suivies par un nombre de 1 ou 2 chiffres, suivis d'un slash (ou barre oblique) et insère un point-virgule ». La seule colonne qui contient une barre oblique est notre colonne de date, cela s' | ||
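A possible shape for the third expression is sketched below. It assumes a single space between the project name and the date, which the description does not state explicitly, and it uses `|` as the delimiter of the `s` command so that the slash in the pattern needs no escaping. The name `third_pass` and the sample line are hypothetical.

```shell
# Sketch of "a letter, then a 1 or 2 digit number followed by a slash":
# group 1 is the last letter of the project name, group 2 is the space,
# the digits and the slash that start the date.
third_pass() {
  sed -e 's|\([a-zA-Z]\)\( [0-9]\{1,2\}/\)|\1;\2|'
}

printf 'Home 4/5/2014\n' | third_pass
```

Because only the date column contains a slash, the semi-colon is placed after the column that precedes it.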
The fourth expression handles the last line of the file by inserting the 3 semi-colons before tasks. It essentially groups the entire line (10 tasks) and then inserts three semi-colons before that group. If you're adding semi-colons before any lines starting with numbers, then you should move this expression to the start of the list of expressions, | The fourth expression handles the last line of the file by inserting the 3 semi-colons before tasks. It essentially groups the entire line (10 tasks) and then inserts three semi-colons before that group. If you're adding semi-colons before any lines starting with numbers, then you should move this expression to the start of the list of expressions, | ||
+ | |||
+ | **La quatrième expression gère la dernière ligne du fichier et insère les trois points-virgules avant « tasks ». En fait, il fait de la ligne entière (10 tasks) un groupe, puis insère trois points-virgules avant ce groupe. Si vous ajoutez un point-virgule avant toutes les lignes commençant par des chiffres, alors vous devez passer cette expression au début de la liste des expressions, | ||
+ | ** | ||
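The summary-line handling could look roughly like this. It is a sketch based on the description only; the function name `fourth_pass` and the assumption that the line consists of a number followed by the word "tasks" are mine.

```shell
# Sketch of "group the whole summary line and put three semi-colons
# in front of it". ^ and $ anchor the match to the complete line.
fourth_pass() {
  sed -e 's/^\([0-9][0-9]* tasks\)$/;;;\1/'
}

printf '10 tasks\n' | fourth_pass
```

Anchoring on both ends keeps ordinary task lines that merely start with a number from being changed.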
The fifth expression simply states “Find the line that starts with any number of capital letters, and insert a semi-colon afterwards”. I go a little more specific, and state “followed by any number of spaces and more letters”. However, it's not necessary in our example, and is simply there to be a bit more robust. | The fifth expression simply states “Find the line that starts with any number of capital letters, and insert a semi-colon afterwards”. I go a little more specific, and state “followed by any number of spaces and more letters”. However, it's not necessary in our example, and is simply there to be a bit more robust. | ||
+ | |||
+ | **La cinquième expression indique tout simplement « Trouve la ligne qui commence avec un nombre quelconque de lettres majuscules et insère une espace après ». Je deviens plus précis en annonçant « suivi par un nombre quelconque d' | ||
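Following the comment attached to the expressions ("capital letters at the start of a line, followed by a space and more text, and insert a semicolon"), a sketch of the fifth expression might be written as below. The real expression is truncated on this page, so the function name `fifth_pass`, the exact character classes and the sample header are assumptions.

```shell
# Sketch of "capital letters at the start of the line, followed by
# spaces and more letters": group 1 is the run of capitals, group 2 is
# the spaces plus the first following letter.
fifth_pass() {
  sed -e 's/^\([A-Z]\{1,\}\)\( \{1,\}[A-Za-z]\)/\1;\2/'
}

# Hypothetical header line after the earlier expressions have run.
printf 'ID Project;   Due;\n' | fifth_pass
```

This covers the ID header, which the first expression skips because only one space follows it.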
That about covers the steps I undertook in this scenario. I realize that this is a relatively specific occasion, and not everyone will want to have this exact formatting. My hope is that following my process will help you understand how to approach these sorts of problems. If it's wished for, I can spend an article focusing on short formatting problems, and working through it step by step. If anyone is interested in that sort of article, please let me know via email. As always, any questions/ | That about covers the steps I undertook in this scenario. I realize that this is a relatively specific occasion, and not everyone will want to have this exact formatting. My hope is that following my process will help you understand how to approach these sorts of problems. If it's wished for, I can spend an article focusing on short formatting problems, and working through it step by step. If anyone is interested in that sort of article, please let me know via email. As always, any questions/ | ||
+ | |||
+ | **Et voilà pour les étapes par lesquelles je suis passé dans ce scénario. Mais je me rends compte que c'est un peu spécifique et que cet exercice de mise en forme détaillée ne va pas être utile à tout le monde. Néanmoins, j' |
issue81/c_c.1399214318.txt.gz · Last modified: 2014/05/04 16:38 by andre_domenech