I've just spent 3 days trying to debug some new code in buildbot.
The code in question changes how we do nightly builds so that they use the same revision for all platforms.
I was hitting a KeyError exception inside buildbot's util.loop code, specifically at a line where it tries to delete a key from a dictionary. In simplified form, the loop is doing this:
    for k in d.keys():
        if condition:
            del d[k]  # Raises KeyError....sometimes...
Tricky bit was, it didn't happen every time. I'd have to wait at least 3 minutes between attempts.
So I added a bunch of debugging code:
    print d
    print d.keys()
    for k in d.keys():
        print k
        if condition:
            try:
                del d[k]  # Raises KeyError....sometimes...
            except KeyError:
                print k in d         # sanity check 1
                print k in d.keys()  # sanity check 2
Can you guess what the results of sanity checks 1 and 2 were?
'k in d' is False
'k in d.keys()' is True
whhhaaaaa? Much head scratching and hair pulling ensued. I tried many different variations of iterating through the loop, all with the same result.
In the end, I posted a question on Stack Overflow.
At the same time, Bear and Dustin were zeroing in on a solution. The crucial bit here is that the keys of d are (follow me here...) methods of instances of my new scheduler classes, which inherit from ComparableMixin. __cmp__ is used in the 'k in d.keys()' test, but __hash__ is used in the 'k in d' test.
Some further digging revealed that my scheduler was modifying state that ComparableMixin.__hash__ was referring to, resulting in the scheduler instances not having stable hashes over time.
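To see the effect in isolation, here's a minimal sketch of the same failure mode. It is not buildbot's code: FakeScheduler is a hypothetical stand-in, it is written for Python 3 (so it uses __eq__ rather than __cmp__), and it keys the dictionary on the instance itself rather than on a bound method. Python 3's d.keys() is a view that also looks keys up by hash, so the sketch wraps it in list() to mimic the list that Python 2 returned.

```python
class FakeScheduler:
    """Hypothetical stand-in for a scheduler whose hash depends on mutable state."""

    def __init__(self, revision):
        self.revision = revision

    def __eq__(self, other):
        return isinstance(other, FakeScheduler) and self.revision == other.revision

    def __hash__(self):
        # The hash follows self.revision, so it changes whenever revision does.
        return hash(self.revision)


def demo():
    s = FakeScheduler(revision=1)
    d = {s: "pending"}   # stored under the hash computed while revision == 1
    s.revision = 2       # mutate the state that the hash depends on

    results = []
    for k in list(d.keys()):
        in_dict = k in d               # hash-based lookup, now using the new hash
        in_keys = k in list(d.keys())  # linear scan over a list, like Python 2's keys()
        try:
            del d[k]
            raised = False
        except KeyError:
            raised = True
        results.append((in_dict, in_keys, raised))
    return results


if __name__ == "__main__":
    print(demo())
```

In this sketch the key is still physically in the dictionary, but the hash lookup goes looking for it in the wrong place, so the two sanity checks disagree and the del fails. Keeping mutable attributes out of whatever __hash__ computes avoids the problem.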
In the end, the fix was simple: just a few extra lines of Python code. Too bad it took me 3 days to figure out!