How does Python deal with functions that return objects?

Anyone who can answer that question will be able to decipher the following sketch:

>>> def arr():
...     a = []
...     print(id(a))
...     return a
...
>>> print(id(arr()))

Is Python able to release the function's scope on return, or does it, like JavaScript, keep it in memory because of the object binding? We know the list above will be released because it is never assigned, but what if the returned object is assigned to a name that lives for the entire session? Is there a memory leak?


Hmm, I’m unfamiliar with JavaScript’s behaviour so I might be misinterpreting the question. Apologies if so.

The function call itself is definitely forgotten once it has finished executing: the entire function frame is discarded, along with any local name bindings such as the function-local a. However, any objects created within the function will persist so long as there’s at least one reference to them; that’s Python’s basic memory management mechanism, reference counting, in action.

At the moment the example behaves like a factory function for list objects. After the function executes, the list’s reference count decreases by 1 as the function frame and the names it contains are discarded. If, as you mention, the returned list is assigned to a name, its reference count will not reach zero while that name exists. So there is no leak as far as the function goes, but long-term it depends on what the developer then does with that reference.
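
For anyone who wants to watch that happen, here is a minimal sketch. The names Records, make_records and probe are made up for illustration, and a small list subclass is used only because plain lists cannot be weakly referenced. The immediate release after del relies on CPython's reference counting; the gc.collect() call is there in case another implementation frees objects later.

```python
import gc
import weakref


class Records(list):
    """Plain list subclass; unlike list itself, it supports weak references."""


def make_records():
    r = Records()   # local name bound to a newly created object
    return r        # the frame and the name r are discarded on return


kept = make_records()       # a module-level name now keeps the object alive
probe = weakref.ref(kept)   # a weak reference does not keep it alive
alive_while_bound = probe() is kept

del kept                    # drop the last strong reference
gc.collect()                # not needed on CPython; helps on other implementations
alive_after_del = probe() is not None

print(alive_while_bound, alive_after_del)
```

On CPython this should print True False: the object outlives the function call for exactly as long as something still refers to it.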


In JS, as I understand it, losing scope is not an option. So the function frame stays in place to provide the namespace.

Judging by our sketch, and your description, we get a pretty clear answer… The object itself is not tied to a particular scope once it is given a reference.

Thanks for your insight, @tgrtim.


Ah yes, that does seem like different behaviour in JS.

So far as I’m aware, Python’s objects (ignoring certain optimisations with constants and such) are allocated dynamically, using malloc or similar tools, and simply passed around as object references. I don’t think typical objects are tied to a particular frame/scope (for CPython at least). Of course objects created this way must be freed at some point, but Python generally handles that itself.

I’m not sure if every Python implementation follows that behaviour though.

It certainly does not follow JS behavior, whew! It gives the references the scope, not the objects. Much better philosophy by a stretch. Yet we must contend with this in the JS world.

The real reason I asked this question was to quell my concerns in the Hurricane Analysis project. Is it preferable to construct a global array or return one?

def darr():
  from data import db  # an iterator
  global dlu  # appends to the module-level list
  kys = ['name', 'month', 'year', 'max_sustained_winds', 'areas_affected', 'damages', 'deaths']
  try:
    while db:
      dlu.append(dict(zip(kys, next(db))))
  except StopIteration:  # raised by next() once db is exhausted
    print(len(dlu), 'records in store')
  return 1

dlu = []
def darr(arr):
  from data import db  # an iterator
  kys = ['name', 'month', 'year', 'max_sustained_winds', 'areas_affected', 'damages', 'deaths']
  try:
    while db:
      arr.append(dict(zip(kys, next(db))))
  except StopIteration:  # raised by next() once db is exhausted
    print(len(arr), 'records in store')
  return arr  # the same list object that was passed in

dlu = darr([])

No difference, it would seem, in Python, but a big difference in JS. Needless to say, this is Python. Which approach would one best side with?

I suppose the object itself is arguably global in Python (it can be accessed and modified from anywhere so long as a reference is available) no matter how you set it up, so the only part you’d need to concern yourself with is the scope of the names that are bound to that object. So the question is whether a global name binding is preferred.

In this case I think not. Without the global keyword the same function can be used in any scope with the same behaviour (e.g. within a nested scope, or imported into other modules without carrying that additional need for a global assignment). Conceivably one might wish to re-use the function and assign each new output to a different name, each with its own scope. To me that seems a little less fragile and more re-usable, and so far as I can tell you don’t really lose anything by using it that way in Python.
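
A rough sketch of what that re-use might look like. The names build_records, sample_rows and report are hypothetical, and the rows are invented placeholders standing in for the project's data.db iterator, which isn't shown in this thread.

```python
KEYS = ['name', 'month', 'year', 'max_sustained_winds',
        'areas_affected', 'damages', 'deaths']


def build_records(rows, keys=KEYS):
    """Return a new list of dicts; no global name is read or written."""
    return [dict(zip(keys, row)) for row in rows]


# Placeholder rows, invented purely for illustration (not real data).
sample_rows = [
    ('Alpha', 'June', 1900, 100, ['Somewhere'], '1M', 10),
    ('Beta', 'July', 1901, 120, ['Elsewhere'], '2M', 20),
]


def report():
    local_records = build_records(sample_rows)  # name is local to report()
    return len(local_records)


module_records = build_records(iter(sample_rows))  # an iterator works too
print(report(), len(module_records), module_records[0]['name'])
```

The same function serves both callers, and each caller decides what name (and therefore what scope) the result is bound to.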


I’m a little too chicken to try… Will two invocations of darr([]) collide?

No, that should never be the case, any more than two newly created list objects would ever be the same object.

[] == []  # True (they are equivalent)
[] is []  # False (different object, different id)

New objects should always be new objects. Of course there are a few exceptions to that rule for built-ins like None, and there might be some optimisation tricks for built-in immutables (e.g. an empty tuple, such that tuple() is tuple() is probably True), but that might be an implementation detail rather than a language feature (I’m not entirely sure if that’s ever guaranteed by the language). Of course re-using the same object for mutables would mean that operations like a = []; a.append(stuff) would do some very troubling things to the only “empty” list.
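
To make the "collide" question concrete, here is a small sketch with a hypothetical fill function that, like darr(arr), appends to whatever list it is handed and returns it. Passing a fresh [] each time gives independent lists; results only overlap if the caller deliberately passes the same list twice.

```python
def fill(target, items):
    """Append each item to target and return that same list."""
    for item in items:
        target.append(item)
    return target


first = fill([], 'ab')    # each [] literal creates a brand-new list
second = fill([], 'cd')
print(first is second, first, second)

shared = []
third = fill(shared, 'ab')   # both calls receive the very same object
fourth = fill(shared, 'cd')
print(third is fourth, shared)
```

The first print should show False along with two separate two-item lists; the second should show True and one four-item list, because both names refer to shared.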

The first couple of paragraphs in Memory Management — Python 3.10.1 documentation might be useful.
