HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

What does "prompt collection" mean in the context of memory management?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
thewhatmeancollectionmanagementdoesmemorycontextprompt

Problem

I often see people assert that reference counting techniques such as shared_ptr in C++ provide prompt collection (e.g. here and here) but I am not sure what exactly is meant by this. Some people tell me that they mean "collects at the earliest possible point" but I think that is not true of scope-based reference counting because it defers collection to the end of scope even if a variable is dead. Therefore, I'm wondering if other people have a different interpretation that may be correct.

Do computer scientists use the term "prompt" in this context and, if so, what do they mean by it?

More concretely, I once studied the behaviour of various OCaml and F# programs and found that they often collect values before they fall out of scope and, therefore, more "promptly" than scope-based reference counting. For example, the following function runs in bounded memory:

let rec loop tmp i =
  if i<=0 then tmp else
    loop (Array.copy (loop (Array.copy tmp) (i-1))) (i-100)


Even though the argument tmp is in scope for the entire body of the function it is collected before even the first recursive call to loop.

EDIT Here is a simpler F# example with a single recursive call and some post-processing to ensure that the call is not in tail position so it cannot be eliminated:

let rec loop tmp i =
  if i<=0 then tmp else
    let tmp = loop (Array.copy tmp) (i-1)
    tmp.[0] <- tmp.[0] + 1
    tmp


Between the copying of the array and the call to loop the argument tmp dies and, indeed, I find that it is garbage collected so this program requires only O(1) space.

Here is the equivalent reference counted C++:

shared_ptr > loop(shared_ptr > tmp, int i) {
    if (i > tmp1(new vector(*tmp));
        shared_ptr > tmp2 = loop(tmp1, i-1);
        ++(*tmp2)[0];
        return tmp2;
    }
}


Although tmp dies between the creation of tmp1 and the call to loop, scope-based reference counting keeps it allocated until the end of scope. So n recursive

Solution

Do computer scientists use the term "prompt" in this context and, if so, what do they mean by it?

What is meant by prompt, is that it is guaranteed to be deleted and more importantly, destructed, when it goes out of scope. The main point of this, is to take advantage of the mainly-C++-ish paradigm of Resource Acquisition Is Initialization.

Look at the following program in pythonish:

def procedure():
  f = open('path')

  ... #use f

def main():
  procedure()
  #f *might* still be alive here, which means it is hogging a file resource


So, languages like python therefore must supply some manual cleanup methods. One is supposed to call f.close() before leaving procedure(). However, this can be error prone, because of early-return (you must remember to close at every return, and remember this when changing code later; it is very cumbersome) and exception safety (what if somewhere before returning, something throws an exception?, then your f.close();return isn't called at all).

The C++ solution to this is very elegant; when things go out of scope, they are guaranteed to call a destructor. The destructor of a resource object will/should clean up the resource. So you have the following code in C++:

void procedure(){
  std::ofstream f("path");
  ... //do stuff with f
  // we *can* call f.close(), but we don't have to, because f::~() (the destructor) will clean it up, and it is guaranteed to be called when going out of scope.
}


The python solution to this is to use the finally exception block like so:

def procedure():
  f = open('path')
  try:
    ... #use f
  finally:
    f.close()
def main():
  procedure()


Alternatively, there is some syntax sugar for some special resource classes (you can add this to any custom class as well) that accomplishes the same thing:

def procedure():
  with open('path') as f:
    ... #use f
  #f.close() is called once out of "scope" of the with; it is essentially the same as the finally block.
def main():
  procedure()


So it is simply two different paradigms; C++'s concept of the "on-the-stack" objects allows self-cleanup, python and most other languages require manual cleanup, but enjoy garbage collection.

Now, the issue is, pointers don't have destructors; they are essentially POD (plain old data). Therefore, if you want to keep an object around longer, you would allocate it on the heap, and save a pointer to it, but you must manually call the destructor (via delete ptr;), but only when you "know" you are done with it. In many situations, it is hard to know when you are done with a pointer, for example, if it is used in many places ("shared ownership"); therefore a reference counter such as shared_ptr can be used, and the underlying objected that is pointed to will be destructed (the destructor will be called) "promptly" when it goes out of scope, ie. when the last user gets rid of the pointer.


More concretely, I once studied the behaviour of various OCaml and F# programs and found that they often collect values before they fall out of scope and, therefore, more "promptly" than scope-based reference counting.

This is probably not guaranteed though. A garbage collector can decide to clean things up whenever it wants to, they usually don't take resources other than memory into consideration, and AFAIK, never provide any such guarantees. C++, and the finally blocks provide guarantees of resource cleanup.

Code Snippets

def procedure():
  f = open('path')

  ... #use f

def main():
  procedure()
  #f *might* still be alive here, which means it is hogging a file resource
void procedure(){
  std::ofstream f("path");
  ... //do stuff with f
  // we *can* call f.close(), but we don't have to, because f::~() (the destructor) will clean it up, and it is guaranteed to be called when going out of scope.
}
def procedure():
  f = open('path')
  try:
    ... #use f
  finally:
    f.close()
def main():
  procedure()
def procedure():
  with open('path') as f:
    ... #use f
  #f.close() is called once out of "scope" of the with; it is essentially the same as the finally block.
def main():
  procedure()

Context

StackExchange Computer Science Q#16039, answer score: 2

Revisions (0)

No revisions yet.