It's pretty common in Python 2 apps to see code like this:
    if not hasattr(thing, '__iter__'):
        thing = [thing]
This sort of code is used when the input is expected to be either a single
thing or a sequence of those things. The single-thing form is usually
supported as an API convenience. Often the "thing" being checked for the
absence of __iter__ is a string.
Here's an example of using this pattern in a function that checks that a user has permission to perform an action based on an ACL:
    def check(acl, username, permission):
        for ace in acl:
            ace_action, ace_username, ace_permissions = ace
            if username == ace_username:
                if not hasattr(ace_permissions, '__iter__'):
                    ace_permissions = [ace_permissions]
                if permission in ace_permissions:
                    return ace_action == 'allow'
        return False
Let's pretend you've got an existing Python-2-only codebase that contains the above function, and it's been working for a long time. Now you want the same code to also run on Python 3. To your delight, as you make your codebase cross-Py2/Py3 compatible, you need to make no changes to the above function! It "just works". Your existing tests pass. You move on.
But there's a problem:
    acl = [
        ('allow', 'fred', 'edit_pictures'),
        ('allow', 'bob', ['view_pictures', 'delete_pictures']),
    ]
    check(acl, 'fred', 'edit')
On Python 2, the above call to check will return False. This is
correct, because fred doesn't actually possess the edit permission.
He possesses the edit_pictures permission, but not the edit
permission.
On Python 3, however, the above call to check will incorrectly return True.
Why? Because the if not hasattr(ace_permissions, "__iter__") check will
evaluate to False: in Python 3, instances of str have an __iter__
attribute, unlike instances of str in Python 2. The string is therefore
never wrapped in a list, and the following line, if permission in
ace_permissions, boils down to if "edit" in "edit_pictures", which
evaluates to True because in performs a substring check when both operands
are strings.
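The two facts the bug depends on can be checked in isolation. This is a minimal sketch meant to be run on Python 3; the variable names are only illustrative.

```python
# Meant for Python 3: shows the two behaviours that combine into the bug.
single_permission = 'edit_pictures'

# On Python 3, str defines __iter__, so the "is this a single thing?" test
# no longer recognises a bare string as a single thing.
str_has_iter = hasattr(single_permission, '__iter__')

# Because the string is never wrapped in a list, `in` does a substring check.
substring_match = 'edit' in single_permission

# Had the string been wrapped, `in` would compare whole list elements instead.
element_match = 'edit' in [single_permission]

print(str_has_iter, substring_match, element_match)
```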
If such a bug makes it into a production release, it'll be a pretty embarrassing security hole, at least on Python 3. The current solution for cross-compatible code is to define a compatibility function like so:
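    import sys

    PY3 = sys.version_info[0] == 3

    if PY3:
        def is_nonstr_iter(v):
            # str has __iter__ on Python 3, so rule it out explicitly
            if isinstance(v, str):
                return False
            return hasattr(v, '__iter__')
    else:
        def is_nonstr_iter(v):
            # Python 2 str has no __iter__, so hasattr alone is enough
            return hasattr(v, '__iter__')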
Then use it wherever you previously wrote
if not hasattr(ace_permissions, '__iter__'):
    def check(acl, username, permission):
        for ace in acl:
            ace_action, ace_username, ace_permissions = ace
            if username == ace_username:
                if not is_nonstr_iter(ace_permissions):
                    ace_permissions = [ace_permissions]
                if permission in ace_permissions:
                    return ace_action == 'allow'
        return False
Bugs caused by this minor incompatibility will remain latent for long periods of time. You cannot rely on statement coverage, branch coverage, or condition coverage to uncover them, and 2to3 won't help at all. Your test suite won't have an explicit test case for substring matching in the single-string case. Why would it?
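For what it's worth, the missing test is short once you know to write it. The sketch below restates a Python-3-only is_nonstr_iter and the fixed check so that it runs standalone; the test function name is hypothetical, and the ACL is the one from the example above.

```python
def is_nonstr_iter(v):
    # Python 3 variant: a str is iterable but must count as a single thing
    if isinstance(v, str):
        return False
    return hasattr(v, '__iter__')


def check(acl, username, permission):
    for ace in acl:
        ace_action, ace_username, ace_permissions = ace
        if username == ace_username:
            if not is_nonstr_iter(ace_permissions):
                ace_permissions = [ace_permissions]
            if permission in ace_permissions:
                return ace_action == 'allow'
    return False


def test_single_string_permission_is_not_substring_matched():
    acl = [
        ('allow', 'fred', 'edit_pictures'),
        ('allow', 'bob', ['view_pictures', 'delete_pictures']),
    ]
    # The regression: a prefix of a single-string permission must not match
    assert check(acl, 'fred', 'edit') is False
    # Exact matches still work for both the string and the list forms
    assert check(acl, 'fred', 'edit_pictures') is True
    assert check(acl, 'bob', 'view_pictures') is True
    assert check(acl, 'bob', 'view') is False


test_single_string_permission_is_not_substring_matched()
```

Run against the original hasattr-based check on Python 3, the first assertion is the one that fails.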
If you're a porter, what can you do to avoid getting embarrassed by a bug caused by
this backwards incompatibility? You'll want to grep your codebases for __iter__,
ensuring in each usage that you don't use its presence to test whether the value
you're being passed is not a string. You'll need to do this "by eye"; there's no
automation for it.
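The search half of that job can at least be scripted, even though judging each hit still has to be done by a person. The following is a rough sketch of a grep-like scan; iter_mentions and scan_tree are hypothetical helper names, and it assumes the files you care about end in .py and can be read as UTF-8.

```python
import io
import os


def iter_mentions(source):
    """Return (line_number, stripped_line) pairs for lines mentioning __iter__."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), 1)
        if '__iter__' in line
    ]


def scan_tree(root):
    """Yield (path, line_number, line) for each __iter__ mention under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith('.py'):
                continue
            path = os.path.join(dirpath, filename)
            # errors='replace' keeps the scan going past oddly encoded files
            with io.open(path, encoding='utf-8', errors='replace') as f:
                source = f.read()
            for number, line in iter_mentions(source):
                yield path, number, line


# Example use, listing candidates for manual review:
# for path, number, line in scan_tree('.'):
#     print('%s:%d: %s' % (path, number, line))
```

Every hit is only a candidate: a class that defines __iter__ is fine, while a hasattr(..., '__iter__') test applied to a value that might be a string is the case to look at closely.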
It would have been better in general if Python 3 str
instances continued to have no __iter__, matching its absence in Python 2.
If that meant that you couldn't do for c in "abcdef", that would have been fine
by me, and even preferable; I've seen enough ["s", "t", "r", "i", "n", "g"] results in buggy code to know
that the feature is already a bug magnet. An explicit "to_iter" method on strings to
produce an iterable object for folks who really do want to iterate character by character
would have sufficed.
What's the origin of your hasattr idiom?