Python - How to check if an iterable allows more than one pass?




In Python 3, how can I check whether an object is a container (rather than an iterator that may allow only one pass)?

Here's an example:

    def renormalize(cont):
        '''
        each value from the original container is scaled by the same factor
        such that their total becomes 1.0
        '''
        total = sum(cont)
        for v in cont:
            yield v/total

    list(renormalize(range(5)))             # [0.0, 0.1, 0.2, 0.3, 0.4]
    list(renormalize(k for k in range(5)))  # [] - a bug!

Obviously, when the renormalize function receives a generator expression, it does not work as intended. It assumes it can iterate through the container multiple times, while a generator allows only one pass through it.

Ideally, I'd like to do this:

    def renormalize(cont):
        if not is_container(cont):
            raise ContainerExpectedException
        # ...

How can I implement is_container?

I suppose I could check whether the argument is empty right as we're starting the second pass through it. But this approach doesn't work for more complicated functions where it's not obvious when exactly the second pass starts. Furthermore, I'd rather put the validation at the function entrance, rather than deep inside the function (and have to shift it around whenever the function is modified).

I can of course rewrite the renormalize function to work correctly with a one-pass iterator. But that would require copying the input data to a container. The performance impact of copying millions of large lists "just in case they are not lists" is ridiculous.

Edit: the original example used a weighted_average function:

    def weighted_average(c):
        '''
        returns weighted average of a container c
        c contains values and weights in tuples
        weights don't need to sum up to 1 (automatically renormalized)
        '''
        return sum((v * w for v, w in c)) / sum((w for v, w in c))

    weighted_average([(0,1), (1,1)])              # 0.5
    weighted_average([(k, 1) for k in range(2)])  # 0.5
    weighted_average((k, 1) for k in range(2))    # mistake

But it was not the best example, since the version of weighted_average rewritten to use a single pass is arguably better anyway:

    def weighted_average(it):
        '''
        returns weighted average of an iterator it
        it yields values and weights in tuples
        weights don't need to sum up to 1 (automatically renormalized)
        '''
        total_value = 0
        total_weight = 0
        for v, w in it:
            total_value += v
            total_weight += w
        return total_value / total_weight

Although all iterables should subclass collections.abc.Iterable (exposed as collections.Iterable before Python 3.10), not all of them do, unfortunately. So here is an answer based on what interface the objects implement, instead of what they "declare".
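To illustrate why the declared type is not enough, here is a minimal sketch. The class name Squares is made up for the example. It is iterable only through the older __getitem__ protocol, so the Iterable ABC does not recognize it, while the same ABC accepts both a list and a one-pass iterator:

```python
from collections.abc import Iterable


class Squares:
    """Hypothetical class that is iterable only via __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index


squares = Squares()

print(list(squares))                   # [0, 1, 4]
print(isinstance(squares, Iterable))   # False, although iteration works
print(isinstance([], Iterable))        # True
print(isinstance(iter([]), Iterable))  # True, so the ABC cannot tell the two apart
```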

Short answer:

A "container" as you call it, i.e. a list/tuple that can be iterated over more than once as opposed to being a generator that will be exhausted, will typically implement both __iter__ and __getitem__. Hence you can do this:

    >>> def is_container_iterable(o):
    ...     return hasattr(o, '__iter__') and hasattr(o, '__getitem__')
    ...
    >>> is_container_iterable([])
    True
    >>> is_container_iterable(())
    True
    >>> is_container_iterable({})
    True
    >>> is_container_iterable(range(5))
    True
    >>> is_container_iterable(iter([]))
    False

Long answer:

However, you can make an iterable that will not be exhausted and that does not support getitem. For example, a function that generates prime numbers. You could repeat the generation many times if you want, but having a function to retrieve the 1065th prime would take a lot of calculation, so you may not want to support that. :-)
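A minimal sketch of such an object, assuming a made-up Primes class with a limit parameter that is not part of the original answer. It can be iterated any number of times, yet a check based on __getitem__ would reject it, as it would a built-in set:

```python
class Primes:
    """Hypothetical re-iterable object: primes below `limit`, no __getitem__."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        # A generator method: each call to iter() starts a fresh pass.
        for n in range(2, self.limit):
            if all(n % d for d in range(2, int(n ** 0.5) + 1)):
                yield n


primes = Primes(20)
first_pass = list(primes)
second_pass = list(primes)

print(first_pass)                      # [2, 3, 5, 7, 11, 13, 17, 19]
print(first_pass == second_pass)       # True, nothing was exhausted
print(hasattr(primes, '__getitem__'))  # False
print(hasattr(set(), '__getitem__'))   # False, sets are re-iterable too
```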

So is there any more "reliable" way?

Well, all iterables will implement an __iter__ function that will return an iterator. The iterators will have a __next__ function. This is what is used when iterating over it. Calling __next__ repeatedly will in the end exhaust the iterator.
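As a rough illustration of that exhaustion (the variable names are made up for the example), calling next() repeatedly on an iterator eventually raises StopIteration, while the list it came from can still be iterated again:

```python
numbers = [1, 2, 3]
it = iter(numbers)    # list.__iter__ returns a fresh list iterator

print(next(it))       # 1
print(next(it))       # 2
print(next(it))       # 3

try:
    next(it)
except StopIteration:
    exhausted = True
else:
    exhausted = False

print(exhausted)      # True
print(list(it))       # [], the iterator stays exhausted
print(list(numbers))  # [1, 2, 3], the list itself is unaffected
```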

So if it has a __next__ function it is an iterator, and will be exhausted.

    >>> def foo():
    ...     for x in range(5):
    ...         yield x
    ...
    >>> f = foo()
    >>> f.__next__
    <method-wrapper '__next__' of generator object at 0xb73c02d4>

Iterables that are not yet iterators will not have a __next__ function, but will implement an __iter__ function that returns an iterator:

    >>> r = range(5)
    >>> r.__next__
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'range' object has no attribute '__next__'
    >>> ri = iter(r)
    >>> ri.__next__
    <method-wrapper '__next__' of range_iterator object at 0xb73bef80>

So you can check that the object has __iter__ but that it does not have __next__.

    >>> def is_container_iterable(o):
    ...     return hasattr(o, '__iter__') and not hasattr(o, '__next__')
    ...
    >>> is_container_iterable(())
    True
    >>> is_container_iterable([])
    True
    >>> is_container_iterable({})
    True
    >>> is_container_iterable(range(5))
    True
    >>> is_container_iterable(iter(range(5)))
    False

Iterators also have an __iter__ function, which will return self.

    >>> iter(f) is f
    True
    >>> iter(r) is r
    False
    >>> iter(ri) is ri
    True

Hence, you can do this variation of the check:

    >>> def is_container_iterable(o):
    ...     return iter(o) is not o
    ...
    >>> is_container_iterable([])
    True
    >>> is_container_iterable(())
    True
    >>> is_container_iterable({})
    True
    >>> is_container_iterable(range(5))
    True
    >>> is_container_iterable(iter([]))
    False

That would fail if you implement an object that returns a broken iterator, one that does not return self when you call iter() on it again. But then your (or a third-party module's) code is actually doing things wrong.
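A sketch of that failure mode, using a made-up BrokenCounter class and _drain helper that are not from any real library. The object is a one-pass iterator, but because its __iter__ returns a new object instead of self, the identity check takes it for a container:

```python
def _drain(counter):
    """Hypothetical helper: yield from the counter's shared state until it ends."""
    while True:
        try:
            value = next(counter)
        except StopIteration:
            return
        yield value


class BrokenCounter:
    """Hypothetical one-pass iterator with a non-conforming __iter__."""

    def __init__(self, limit):
        self.current = 0
        self.limit = limit

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current

    def __iter__(self):
        # Broken on purpose: a well-behaved iterator would `return self`.
        return _drain(self)


def is_container_iterable(o):
    return iter(o) is not o


counter = BrokenCounter(3)
misclassified = is_container_iterable(counter)
first_pass = list(counter)
second_pass = list(counter)

print(misclassified)  # True, although only one pass is possible
print(first_pass)     # [1, 2, 3]
print(second_pass)    # []
```

The hasattr-based check on __next__ would classify this particular object correctly, since it does define __next__.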

It does depend on making an iterator though, and hence on calling the object's __iter__, which in theory may have side effects, while the above hasattr calls should not have side effects. OK, so hasattr calls __getattribute__, which could have side effects. But you can fix that thusly:

    >>> def is_container_iterable(o):
    ...     try:
    ...         object.__getattribute__(o, '__iter__')
    ...     except AttributeError:
    ...         return False
    ...     try:
    ...         object.__getattribute__(o, '__next__')
    ...     except AttributeError:
    ...         return True
    ...     return False
    ...
    >>> is_container_iterable([])
    True
    >>> is_container_iterable(())
    True
    >>> is_container_iterable({})
    True
    >>> is_container_iterable(range(5))
    True
    >>> is_container_iterable(iter(range(5)))
    False

This one is reasonably safe, and should work in all cases except if the object generates __next__ or __iter__ dynamically on __getattribute__ calls, but if you do that you are insane. :-)

Instinctively, my preferred version would be iter(o) is o, but I haven't ever needed to do this, so that's not based on experience.
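Putting the pieces together, here is one possible sketch of the entrance validation the question asks for, assuming the identity-based check. ContainerExpectedException is defined here only because the question names it; it is not a built-in. Note that renormalize is written as a plain function returning a generator expression. If it were itself a generator function, the check would not run until the first value was requested.

```python
class ContainerExpectedException(TypeError):
    """Hypothetical exception, named after the one used in the question."""


def is_container(o):
    # An iterator returns itself from iter(); a re-iterable container does not.
    return iter(o) is not o


def renormalize(cont):
    # Plain function, so the validation runs at call time.
    if not is_container(cont):
        raise ContainerExpectedException('expected a re-iterable container')
    total = sum(cont)
    return (v / total for v in cont)


print(list(renormalize(range(5))))  # [0.0, 0.1, 0.2, 0.3, 0.4]

try:
    renormalize(k for k in range(5))
except ContainerExpectedException:
    rejected = True
else:
    rejected = False

print(rejected)  # True
```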

