All Python objects also respond to comparisons: tests for equality, relative magnitude, and so on. Python comparisons always inspect all parts of compound objects until a result can be determined. In fact, when nested objects are present, Python automatically traverses data structures to apply comparisons from left to right, and as deeply as needed. The first difference found along the way determines the comparison result.
This is sometimes called a recursive comparison—the same comparison requested on the top-level objects is applied to each of the nested objects, and to each of their nested objects, and so on, until a result is found. Later in this book—in Chapter 19—we’ll see how to write recursive functions of our own that work similarly on nested structures. For now, think about comparing all the linked pages at two websites if you want a metaphor for such structures, and a reason for writing recursive functions to process them.
In terms of core types, the recursion is automatic. For instance, a comparison of list objects compares all their components automatically until a mismatch is found or the end is reached:
>>> L1 = [1, ('a', 3)]          # Same value, unique objects
>>> L2 = [1, ('a', 3)]
>>> L1 == L2, L1 is L2          # Equivalent? Same object?
(True, False)
Here, L1 and L2 are assigned lists that are equivalent but distinct objects. As a review of what we saw in Chapter 6, because of the nature of Python references, there are two ways to test for equality:
The == operator tests value equivalence. Python performs an equivalence test, comparing all nested objects recursively.
The is operator tests object identity. Python tests whether the two are really the same object (i.e., live at the same address in memory).
In the preceding example, L1 and L2 pass the == test (they have equivalent values because all their components are equivalent) but fail the is check (they reference two different objects, and hence two different pieces of memory). Notice what happens for short strings, though:
>>> S1 = 'spam'
>>> S2 = 'spam'
>>> S1 == S2, S1 is S2
(True, True)
Here, we should again have two distinct objects that happen to have the same value: == should be true, and is should be false. But because Python internally caches and reuses some strings as an optimization, there really is just a single string 'spam' in memory, shared by S1 and S2; hence, the is identity test reports a true result. To trigger the normal behavior, we need to use longer strings:
>>> S1 = 'a longer string'
>>> S2 = 'a longer string'
>>> S1 == S2, S1 is S2
(True, False)
Of course, because strings are immutable, the object caching mechanism is irrelevant to your code—strings can’t be changed in place, regardless of how many variables refer to them. If identity tests seem confusing, see Chapter 6 for a refresher on object reference concepts.
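To see why sharing matters for mutable objects but not for immutable strings, the following sketch contrasts the two tests on lists. The name L3 and the appended value 99 are illustrative additions, not part of the sessions above.

```python
L1 = [1, ('a', 3)]
L2 = [1, ('a', 3)]          # Equal value, separate object
L3 = L1                     # Second name for the same object

print(L1 == L2, L1 is L2)   # True False
print(L1 == L3, L1 is L3)   # True True

L1.append(99)               # An in-place change made through one name...
print(L3)                   # ...shows up through the other: [1, ('a', 3), 99]
print(L1 == L2)             # False: L2 was never shared
```

Because a string cannot be changed in place, no such side effect is possible for S1 and S2, whether or not they share one object.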
As a rule of thumb, the == operator is what you will want to use for almost all equality checks; the is operator is reserved for highly specialized roles. We’ll see cases later in the book where both operators are put to use.
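One specialized role that is common in practice is testing for the None placeholder object. The sketch below uses a hypothetical class, AlwaysEqual, whose equality method is deliberately unhelpful. It relies on classes that change how they are compared, a topic for later chapters, so treat it as a preview rather than a recommended design.

```python
class AlwaysEqual:
    """Hypothetical class that claims to equal anything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
eq_result = (obj == None)   # Runs AlwaysEqual.__eq__, which returns True
is_result = (obj is None)   # Identity test: a class cannot customize it

print(eq_result, is_result)  # True False
```

Since an identity test cannot be customized by a class, `is None` answers the question "is this really the None object?" even when == has been redefined.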
Relative magnitude comparisons are also applied recursively to nested data structures:
>>> L1 = [1, ('a', 3)]
>>> L2 = [1, ('a', 2)]
>>> L1 < L2, L1 == L2, L1 > L2  # Less, equal, greater: tuple of results
(False, False, True)
Here, L1 is greater than L2 because the nested 3 is greater than 2. By now you should know that the result of the last line is really a tuple of three objects: the results of the three expressions typed (an example of a tuple without its enclosing parentheses).
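To make the left-to-right, recursive process concrete, here is a minimal sketch of a hand-written comparison for nested lists and tuples. The function name compare and its -1/0/1 return convention are assumptions made for illustration. Python's built-in comparisons are implemented internally and differ in detail, so this only approximates the behavior described here.

```python
def compare(a, b):
    """Return -1, 0, or 1 if a is less than, equal to, or greater than b."""
    same_sequence_type = type(a) is type(b) and isinstance(a, (list, tuple))
    if same_sequence_type:
        for x, y in zip(a, b):          # Walk both sequences left to right
            result = compare(x, y)      # Recur into nested parts
            if result != 0:
                return result           # The first difference decides
        # No difference in the shared part: the shorter sequence is smaller
        return (len(a) > len(b)) - (len(a) < len(b))
    return (a > b) - (a < b)            # Leaf values: use built-in operators

print(compare([1, ('a', 3)], [1, ('a', 2)]))   # 1
print(compare([1, ('a', 3)], [1, ('a', 3)]))   # 0
print(compare([2], [1, 2]))                    # 1
print(compare('abc', 'ac'))                    # -1
```

The first call returns 1 for the same reason L1 > L2 is true in the session above: the walk reaches the nested 3 and 2 before finding any difference.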
More specifically, Python compares types as follows:
Numbers are compared by relative magnitude, after conversion to the common highest type if needed.
Strings are compared lexicographically (by the code point values returned by ord), and character by character until the end or first mismatch ("abc" < "ac").
Lists and tuples are compared component by component from left to right, and recursively for nested structures, until the end or first mismatch ([2] > [1, 2]).
Dictionaries compare as equal if their sorted (key, value) lists are equal. Relative magnitude comparisons are not supported for dictionaries in Python 3.X, but they work in 2.X as though comparing sorted (key, value) lists.
Nonnumeric mixed-type magnitude comparisons (e.g., 1 < 'spam') are errors in Python 3.X. They are allowed in Python 2.X, but use a fixed but arbitrary ordering rule based on the type name string. By proxy, this also applies to sorts, which use comparisons internally: nonnumeric mixed-type collections cannot be sorted in 3.X.
In general, comparisons of structured objects proceed as though you had written the objects as literals and compared all their parts one at a time from left to right. In later chapters, we’ll see other object types that can change the way they get compared.
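A few of these rules can be checked directly. The following lines are a small sketch for Python 3.X. The specific values are chosen for illustration, and the code point numbers shown are those of the ASCII letters involved.

```python
print(ord('b'), ord('c'))       # 98 99
print('abc' < 'ac')             # True: 'b' sorts before 'c' at the second character
print('Z' < 'a')                # True: code point 90 is less than 97
print([2] > [1, 2])             # True: decided at the first items, 2 > 1
print([1, 2] < [1, 2, 3])       # True: no mismatch, and the shorter list ends first
print({'a': 1, 'b': 2} == {'b': 2, 'a': 1})   # True: same keys and values
```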
Per the last point in the preceding section’s list, the change in Python 3.X for nonnumeric mixed-type comparisons applies to magnitude tests, not equality, but it also applies by proxy to sorting, which does magnitude testing internally. In Python 2.X these all work, though mixed types compare by an arbitrary ordering:
c:\code> c:\python27\python
>>> 11 == '11'                  # Equality does not convert non-numbers
False
>>> 11 >= '11'                  # 2.X compares by type name string: int, str
False
>>> ['11', '22'].sort()         # Ditto for sorts
>>> [11, '11'].sort()
But Python 3.X disallows mixed-type magnitude testing, except numeric types and manually converted types:
c:\code> c:\python33\python
>>> 11 == '11'                  # 3.X: equality works but magnitude does not
False
>>> 11 >= '11'
TypeError: unorderable types: int() >= str()
>>> ['11', '22'].sort()         # Ditto for sorts
>>> [11, '11'].sort()
TypeError: unorderable types: str() < int()
>>> 11 > 9.123                  # Mixed numbers convert to highest type
True
>>> str(11) >= '11', 11 >= int('11')    # Manual conversions force the issue
(True, True)
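The same manual conversion can be applied to sorting in 3.X by passing a conversion function as the key argument of sorted (the list sort method accepts the same argument). The list mixed below is made-up sample data, and the sketch assumes every item can be converted by int or str.

```python
mixed = [11, '9', 100, '11']            # Illustrative mixed-type data

try:
    sorted(mixed)                       # 3.X: int and str cannot be ordered
    raised = False
except TypeError:
    raised = True

by_number = sorted(mixed, key=int)      # Compare every item as an integer
by_text = sorted(mixed, key=str)        # Compare every item as a string

print(raised)      # True
print(by_number)   # ['9', 11, '11', 100]
print(by_text)     # [100, 11, '11', '9']
```

The key function is used only to decide the order; the items in the result keep their original types.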
The second-to-last point in the preceding section also merits illustration. In Python 2.X, dictionaries support magnitude comparisons, as though you were comparing sorted key/value lists:
C:\code> c:\python27\python
>>> D1 = {'a':1, 'b':2}
>>> D2 = {'a':1, 'b':3}
>>> D1 == D2                    # Dictionary equality: 2.X + 3.X
False
>>> D1 < D2                     # Dictionary magnitude: 2.X only
True
As noted briefly in Chapter 8, though, magnitude comparisons for dictionaries are removed in Python 3.X because they incur too much overhead when equality is desired (equality uses an optimized scheme in 3.X that doesn’t literally compare sorted key/value lists):
C:\code> c:\python33\python
>>> D1 = {'a':1, 'b':2}
>>> D2 = {'a':1, 'b':3}
>>> D1 == D2
False
>>> D1 < D2
TypeError: unorderable types: dict() < dict()
The alternative in 3.X is to either write loops to compare values by key, or compare the sorted key/value lists manually; the items dictionary method and the sorted built-in suffice:
>>> list(D1.items())
[('b', 2), ('a', 1)]
>>> sorted(D1.items())
[('a', 1), ('b', 2)]
>>>
>>> sorted(D1.items()) < sorted(D2.items())     # Magnitude test in 3.X
True
>>> sorted(D1.items()) > sorted(D2.items())
False
This takes more code, but in practice, most programs requiring this behavior will develop more efficient ways to compare data in dictionaries than either this workaround or the original behavior in Python 2.X.
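As one possible shape for the loop-based alternative, the sketch below compares values key by key. The helper name compare_by_key and its choice to report a symbol per shared key are assumptions for illustration, not a standard library facility; real programs would tailor the loop to whatever they need to know about the two dictionaries.

```python
def compare_by_key(d1, d2):
    """Map each key the two dictionaries share to '<', '==', or '>'."""
    results = {}
    for key in d1:
        if key in d2:                   # Skip keys present in only one dictionary
            if d1[key] < d2[key]:
                results[key] = '<'
            elif d1[key] > d2[key]:
                results[key] = '>'
            else:
                results[key] = '=='
    return results

D1 = {'a': 1, 'b': 2}
D2 = {'a': 1, 'b': 3}
print(compare_by_key(D1, D2) == {'a': '==', 'b': '<'})   # True
```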