So I had to export something from MySQL at work and send it to myself so I could email it to one of our developers. While I still retained some of my know-how when it came to MySQL, I almost drew a blank when it came to the compression part. How long has it been since I archived something on the command line? Donkey’s years!! I felt like a total n00b, and here are my thoughts as a ‘n00b’, as I can totally relate. I’m not going to tell you muscle memory kicked in; rather the ugly truth: I went to duckduckgo.com and looked it up. Most of this stuff I have not done in ten-plus years. I have been so spoiled by GUI archivers that I never really needed the command line. Un-archiving was not a problem, but zipping something with sane switches? Having been a wizard with ARJ back in the DOS days, I felt it was time to get reacquainted with compression tools. I will start you off with the basics, and try to keep switches out of it.
Let’s start with the one everyone knows, zip. For .zip files it is simple:
zip <destination file> <source file>
Plain ol’ zip does not delete the source file once the destination file is created.
Just for completeness’ sake, zip’s opposite is:
unzip <filename.zip>
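A minimal sketch of the round trip (notes.txt is just a made-up file name):

```shell
# Make a little test file to play with.
echo "hello from the command line" > notes.txt

# Destination first, then the source:
zip notes.zip notes.txt

# notes.txt is still there - zip is non-destructive.
ls notes.txt notes.zip

# And back out again (-o overwrites without asking,
# since notes.txt is still sitting here):
unzip -o notes.zip
```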
What stands out with plain ol’ zip is that you can split your zipped files. What you will find, more often than not, is gzip.
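That splitting is done with zip’s -s switch, which sets the maximum size of each piece. A sketch, with made-up file names:

```shell
# Make a file big enough to be worth splitting (random data,
# so it will not compress down below the split size).
dd if=/dev/urandom of=big.bin bs=1024 count=2048 2>/dev/null

# Split into 1 MB chunks: big.z01, big.z02, ... plus big.zip.
zip -s 1m big.zip big.bin

# -s 0 glues the pieces back into one ordinary archive.
zip -s 0 big.zip --out whole.zip
```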
Gzip aims to be a simpler zipper, with only one parameter needed.
gzip <source file>
and bam! All your base are belong to us. Gzip DOES delete the source file once the destination file is created.
Again, for completeness’ sake, the opposite is:
gunzip <filename.gz>
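So, with a made-up file name, the round trip looks like this (the -k switch, for ‘keep’, needs a reasonably recent gzip):

```shell
echo "some log lines" > app.log

gzip app.log          # app.log is gone, app.log.gz appears

gunzip -k app.log.gz  # -k keeps the .gz, so now you have both
```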
If you ever need help with gzip, simply type
gzip -h
Bzip, or bzip2 as you will see it on modern systems, works the same as gzip for all intents and purposes. The opposite is bunzip2.
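Same dance, different band (diary.txt being an example name):

```shell
echo "dear diary" > diary.txt

bzip2 diary.txt        # diary.txt becomes diary.txt.bz2, source deleted
bunzip2 diary.txt.bz2  # and back again
```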
You may have noticed xz when installing a Linux OS, so it may well be available everywhere too. It works just the same as the above, so unxz it is.
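Again, just a sketch with an example file name:

```shell
echo "squeeze me" > config.bak

xz config.bak       # config.bak -> config.bak.xz, source deleted
unxz config.bak.xz  # back to config.bak
```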
I’m not getting into compression ratios or speed here, this is more like an overview to help you remember what goes where.
If you do not need any real compression, simply blobbing files together, there is always tar, the tape archiver. It blobs, and it does not delete the source files once done. However, I prefer compressing things tightly when they need to be transferred over a network. I hate waiting. With tar, you need to remember switches. IN it will be cfz (no, the c is for create, not compression; the z is what adds the gzip compression, and it is meh) and OUT it will be xvf. Honestly, I have probably used it twice in my life, though for some reason I remember the switches as cz for the old Czechoslovakia and fx for effects.
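For the record, those switches look like this (project/ is a made-up directory):

```shell
mkdir -p project
echo "readme" > project/README

# IN: c(reate), f(ile) - the archive name follows f - and z for gzip.
tar cfz project.tar.gz project

# OUT: x(tract), v(erbose), f(ile).
tar xvf project.tar.gz

# project/ is still there; tar never deletes the source.
ls project
```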
Since remembering what does what is something I’d rather not do, I’d suggest picking a tool and sticking to it. Whilst it is not usually found on servers, unless the person setting it up had proper foresight, p7zip would be my poison. I remember it by: “It’s an a and an e and it’s non-destructive”, meaning you use an ‘a’ to archive and an ‘e’ to extract, and it does not delete the source file. Though the package is named p7zip (not to be confused with PeaZip), the command is simply 7z.
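In practice (gem.txt is just an example):

```shell
echo "precious" > gem.txt

7z a gems.7z gem.txt  # a is for archive
7z e -y gems.7z       # e is for extract (-y answers yes to overwriting)

ls gem.txt gems.7z    # both survive - non-destructive both ways
```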
That said, I was looking at our logs the other day and I was thinking: if I had to take those (literally gigabytes in size), I’d use gzip. It is the fastest one on the list above.
Now… Your mission…. Should you choose to accept it…
Take one of your movie files and use ‘time’ to see how long each one takes to compress and uncompress that movie, and draw your own conclusions. You know how to do this; it was three issues ago.
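A rough outline of that mission, using a random blob as a stand-in (swap movie.mkv for one of your own files to get meaningful numbers):

```shell
# Stand-in for a movie file; random data barely compresses,
# so a real movie will behave differently.
dd if=/dev/urandom of=movie.mkv bs=1024 count=512 2>/dev/null

time gzip  -k movie.mkv   # -k keeps the original, so every tool
time bzip2 -k movie.mkv   # gets the exact same input
time xz    -k movie.mkv

ls -l movie.mkv*          # compare the sizes while you are at it
```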
As to the switches: this is the reason I suggest you pick one and stick to it.
Now I’m not saying everything you read here is 100% accurate, my observation skills are not the best, probably why my role play characters always have a really high perception skill, but it should be as close as dammit is to swearing. To add insult to injury, I just realised my seed brittle is a sesame seed brittle, that no-one but my budgies will like…
Back in the days of floppies, the split ability was very important, hence my ARJ obsession, but just so you know, ARJ, LHA, RAR, etc. are all still valid. (You may have noticed that some files, like when you use NZB, are split up into smaller compressed ones.) They are just not in a base Linux distro, as they are not free or open source. So chances are that your Alpine Linux container will have none of them, only zip, gzip or xz. Keep that in mind. Should base Linux contain non-free or proprietary compression algorithms? Are free and open source compression algorithms behind the times? Is the network transfer time saved by using better compression wasted on the compression time itself? Let us know your thoughts.
Did I make a boo-boo? misc@fullcirclemagazine.org