- Published on
TIL: You can have "duplicate" keys in Python dicts
Lemma 1: You can override __hash__
for any object.
Lemma 2: You can also override the __str__
and __repr__
methods.
Combine these two, and you can end up in a situation like this-
...
>>> mydict
{"a": 1, "b": 2, "a": 3} # notice something funny?
Behind the scenes, we have two a
objects, each with a different hash, but with the same stringified value ("a"
). This issue becomes particularly relevant when using dataclasses with inheritance:
@dataclass(unsafe_hash=True)
class FunkyStr:
val: str
def __repr__(self):
return self.val
@dataclass(unsafe_hash=True, repr=False)
class SubclassStr(FunkyStr):
# no extra members, all I wanted
pass
mydict = {FunkyStr("a"): 1, FunkyStr("b"): 2, SubclassStr("a"): 3}
print(mydict) # gives {"a": 1, "b": 2, "a": 3}
Both the a
objects have different hashes because they are instances of different classes. The real title of this post could have been- Python hashing is nominal, not structural! (but that's not clickbait-y enough)
To be honest - in the above example, the real crime is abusing __repr__
to return the internal value. See docs for correct usage.
More info at: Multiple identical keys in a Python dict - yes, you can!