Somewhere in the back of my mind, I believe that I touched on sets in an article many, many years ago. If I did, I don’t think that I really did the subject justice, so I decided to fix that this month.

What is a set? The easiest way to answer that is to show you by example. Assume we have two lists.

List1 = ['S01E01', 'S01E02', 'S01E03', 'S01E04', 'S01E05', 'S01E06', 'S01E07', 'S01E08']
List2 = ['S01E01', 'S01E03', 'S01E05', 'S01E07', 'S01E08', 'S01E03']

Let’s say that List1 contains all of the episodes of a show called “My Life” that have aired so far, and that List2 contains the episodes that we have recorded on our home PVR. (The Sxx stands for the Season (Series for my friends outside of the US) of the show, and the Exx stands for the Episode number.) Just looking at the lists written down, it’s easy to see that List2 (what we have recorded) is missing 'S01E02', 'S01E04' and 'S01E06', and that 'S01E03' was recorded twice. But how can we find that out programmatically?
One solution is to use the in operator: step through each of the items in List1 and check whether the item is in List2.

for epi in List1:
    if epi not in List2:
        print(f'Missing Episode {epi}')
print('Finished')

Result:

Missing Episode S01E02
Missing Episode S01E04
Missing Episode S01E06
Finished

The not in approach takes three lines (not including the definition of the lists) to find the missing episodes. We can, however, do better by using sets.

print(set(List1).difference(set(List2)))
{'S01E06', 'S01E02', 'S01E04'}
Using sets, we can do the same thing with only one line of code, and without scanning List2 again for every item in List1. Calling a.difference(b) returns all the items of set a which are not in set b.

What happens if we reverse the sets in the difference statement, comparing List2 to List1? Since every item in List2 is in List1, we get an empty set returned to us.

print(set(List2).difference(set(List1)))
set()

Now, what exactly is a set? In Python, a set is defined as “an unordered collection of items. All items are unique within the set.” If you remember, List2 has S01E03 duplicated, so how does List2 look when it’s converted to a set?

print(set(List2))
{'S01E08', 'S01E03', 'S01E07', 'S01E05', 'S01E01'}

You can see that the set has just five items rather than the six we defined, so the duplicate S01E03 is excluded. Also notice that, as the definition of a set says, the order is totally different from what we defined in the list. The order you see on your own machine may well differ from the one shown here.
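Because a set is unordered and drops duplicates, the one-line difference prints the missing episodes in an arbitrary order and hides the fact that S01E03 was recorded twice. The following is a minimal sketch of one way to handle both. The names missing and duplicates are my own, and collections.Counter is simply one convenient way to count repeats, not the only one.

```python
from collections import Counter

List1 = ['S01E01', 'S01E02', 'S01E03', 'S01E04', 'S01E05', 'S01E06', 'S01E07', 'S01E08']
List2 = ['S01E01', 'S01E03', 'S01E05', 'S01E07', 'S01E08', 'S01E03']

# sorted() accepts any iterable and returns a list, so the result has a stable order
missing = sorted(set(List1).difference(set(List2)))
print(missing)

# Count how often each episode appears in the recordings and keep the repeats
duplicates = [epi for epi, count in Counter(List2).items() if count > 1]
print(duplicates)
```

Sorting works here only because the episode codes happen to sort correctly as plain strings.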
You can also use a shortened version of our set difference statement. To do this, we use the - operator.

print(set(List1) - set(List2))

So what else can we do with a set? We can add an item to a set and remove one as well.

set1 = set(List1)
set1.add('S01E09')
print(set1)
{'S01E04', 'S01E05', 'S01E01', 'S01E06', 'S01E02', 'S01E08', 'S01E09', 'S01E07', 'S01E03'}

set2 = set(List2)
set2.discard('S01E05')
print(set2)
{'S01E01', 'S01E03', 'S01E07', 'S01E08'}

The add() method works with only a single item. There is also an update() method that adds multiple items taken from any iterable, such as a list, a tuple, or another set. (A string works too, but it is added one character at a time.) You can also use remove() to remove items. The difference is that, if the item to be removed is not in the set, discard() does nothing, while remove() raises a KeyError.

We have many other methods available to us that work with sets. They include intersection, union, and symmetric difference. For the next examples, we will use the following values.

SetA={1,2,3,4,5}
SetB={5,6,7,8,9}
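Before moving on to those methods, here is a short sketch of how add(), update(), discard() and remove() behave. The episode values and the names letters, whole and removed are made up for illustration only.

```python
set1 = {'S01E01', 'S01E02'}

# add() takes exactly one item
set1.add('S01E03')

# update() takes any iterable and adds each of its elements
set1.update(['S01E04', 'S01E05'])

# A string is an iterable of characters, so update() splits it up
letters = set()
letters.update('S01')      # adds 'S', '0' and '1'

# Wrapping the string in a list adds it as one item
whole = set()
whole.update(['S01'])      # adds the single string 'S01'

# discard() silently ignores an item that is not in the set
set1.discard('S09E99')

# remove() raises KeyError for an item that is not in the set
try:
    set1.remove('S09E99')
    removed = True
except KeyError:
    removed = False

print(sorted(set1))
print(sorted(letters), whole, removed)
```

If you are not sure whether an item is present, discard() is usually the safer choice. Use remove() when a missing item should be treated as an error.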
Union

The union method returns all items from both SetA and SetB. For example:

SetA.union(SetB)

returns {1, 2, 3, 4, 5, 6, 7, 8, 9}

Notice that the value 5 is in both sets, but because a set cannot include duplicates, it is in the union only once. You can also use the | operator to perform the union operation.

SetA | SetB

Difference

We have already seen the difference method earlier. It returns the values that are in SetA that are not in SetB.

SetA.difference(SetB)

returns {1, 2, 3, 4}

You can also use the - operator to perform the difference operation.

SetA - SetB
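One practical difference between the method form and the operator form is worth a quick sketch. The methods accept any iterable as their argument, while the operators expect a set on both sides. ListB and operator_worked below are illustrative names of my own.

```python
SetA = {1, 2, 3, 4, 5}
ListB = [5, 6, 7, 8, 9]   # a plain list, not a set

# The methods accept any iterable as their argument
print(SetA.union(ListB))
print(SetA.difference(ListB))

# The operators insist on sets on both sides, so this raises TypeError
try:
    SetA | ListB
    operator_worked = True
except TypeError:
    operator_worked = False
print(operator_worked)

# Converting the list first makes the operator form work
print(SetA | set(ListB))
```

This is why the earlier episode example had to wrap both lists in set() before using the - operator.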
Intersection

Intersection returns only the values that are in BOTH SetA and SetB.

SetA.intersection(SetB)

returns {5}

You can also use the & operator to perform the intersection operation.

SetA & SetB

Symmetric Difference

The symmetric difference method returns all the values that are in either SetA or SetB, but not in both.

SetA.symmetric_difference(SetB)

returns {1, 2, 3, 4, 6, 7, 8, 9}

You can also use the ^ operator to perform the symmetric difference operation.

SetA ^ SetB

Sets don’t have indexes like lists have. So while you can do something like

>>> MyList = [1,2,3,4,5,6]
>>> print(MyList[4])
5
If you try that with a set, the attempt will fail.

>>> myset={1,2,3,4,5,6}
>>> print(myset[4])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object is not subscriptable

There are also the methods issuperset() and issubset(). SetA is considered a superset of SetB if it contains all of the items in SetB. SetB is considered a subset of SetA if all of its items are in SetA.

>>> SetA={1,2,3,4,5,6,7,8}
>>> SetB={1,2,3,4}
>>> SetA.issuperset(SetB)
True
>>> SetB.issuperset(SetA)
False
>>> SetB.issubset(SetA)
True

However, consider the following situation…

>>> SetC={1,2,3,4,10}
>>> SetC.issubset(SetA)
False
>>> SetA.issuperset(SetC)
False

SetC is not a subset of SetA, since SetC contains the value 10 and SetA does not. In the same way, SetA is not a superset of SetC.
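As with the other methods, the subset and superset tests have operator forms. The sketch below reuses the SetA, SetB and SetC values from the examples above and shows the comparison operators, including the stricter "proper subset" test that the methods do not offer.

```python
SetA = {1, 2, 3, 4, 5, 6, 7, 8}
SetB = {1, 2, 3, 4}
SetC = {1, 2, 3, 4, 10}

print(SetB <= SetA)   # same test as SetB.issubset(SetA)
print(SetA >= SetB)   # same test as SetA.issuperset(SetB)
print(SetC <= SetA)   # fails because 10 is not in SetA

# A set is a subset of itself, but not a proper subset of itself
print(SetA <= SetA, SetA < SetA)

# The difference shows which items stop SetC from being a subset of SetA
print(SetC - SetA)
```

The last line ties back to the difference operation: a set is a subset of another exactly when that difference is empty.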
Frozenset

Frozenset is a class that has many of the same characteristics as a set, but a frozenset is immutable, so there are no add or remove methods.

>>> SetA=frozenset([1,2,3,4,6])
>>> SetB=frozenset([5,6,7,8,9])
>>> SetA | SetB
frozenset({1, 2, 3, 4, 5, 6, 7, 8, 9})
>>> SetA - SetB
frozenset({1, 2, 3, 4})
>>> SetA & SetB
frozenset({6})
>>> SetA ^ SetB
frozenset({1, 2, 3, 4, 5, 7, 8, 9})
>>> SetA.add(10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'add'

There is a lot more that you can do with sets, but I wanted to cover the basics. I hope that you can see how beneficial sets can be for you.

Until next time, as always; stay safe, healthy, positive and creative!