
Why does garbage collection extend only to memory and not other resource types?

Submitted by: @import:stackexchange-cs

Problem

It seems like people got tired of manual memory management, so they invented garbage collection, and life was reasonably good. But what about other resource types? File descriptors, sockets, or even user-created resources like database connections?

This feels like a naive question, but I cannot find any place where anyone has asked it. Let's consider file descriptors. Say a program knows when it starts that it will only be allowed to have 4000 fds available. Whenever it performs an operation that opens a file descriptor, what if it did the following:

  • Check to make sure that it isn't about to run out.
  • If it is, trigger the garbage collector, which will free a bunch of memory.
  • If some of the memory freed held references to file descriptors, close them immediately. It knows the memory belonged to a resource because the memory tied to that resource was registered into a 'file descriptor registry', for lack of a better term, when it was first opened.
  • Open a new file descriptor, copy it into new memory, register that memory location into the 'file descriptor registry' and return it to the user.

So the resource would not be freed promptly, but it would be freed whenever the GC ran, which includes, at the very least, the moment right before the resource is about to run out (assuming the descriptors are not all still in use).

It seems like that would be sufficient for many user-defined resource cleanup issues. I managed to find a single comment here that describes doing similar cleanup in C++, with a thread that holds a reference to a resource and cleans it up when only a single reference remains (the one held by the cleanup thread), but I can't find any evidence of this being a library or part of any existing language.
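The four steps above can be sketched roughly as follows. This is only an illustration of the idea, not an existing library: `FakeDescriptor` and `DescriptorRegistry` are hypothetical names, the "descriptor" is a plain integer rather than a real OS resource, and the registry is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an OS-level descriptor; not a real .NET type.
public sealed class FakeDescriptor
{
    public int Id { get; }
    public FakeDescriptor(int id) { Id = id; }
}

// Hypothetical registry following the four steps above. Not thread-safe.
public sealed class DescriptorRegistry
{
    private readonly int _limit;
    private int _nextId;

    // Open descriptor id -> weak reference to the managed object that wraps it.
    private readonly Dictionary<int, WeakReference<FakeDescriptor>> _open =
        new Dictionary<int, WeakReference<FakeDescriptor>>();

    public DescriptorRegistry(int limit) { _limit = limit; }

    public int OpenCount { get { return _open.Count; } }

    public FakeDescriptor Open()
    {
        // Step 1: check whether the budget is about to run out.
        if (_open.Count >= _limit)
        {
            // Step 2: trigger the garbage collector.
            GC.Collect();
            // Step 3: close every descriptor whose wrapper was collected.
            Sweep();
            if (_open.Count >= _limit)
                throw new InvalidOperationException("All descriptors are still in use.");
        }
        // Step 4: open a new descriptor, register it, and return it.
        var descriptor = new FakeDescriptor(_nextId++);
        _open[descriptor.Id] = new WeakReference<FakeDescriptor>(descriptor);
        return descriptor;
    }

    private void Sweep()
    {
        var dead = new List<int>();
        foreach (var entry in _open)
        {
            FakeDescriptor target;
            if (!entry.Value.TryGetTarget(out target))
                dead.Add(entry.Key);
        }
        foreach (var id in dead)
        {
            // A real implementation would close the OS descriptor here.
            _open.Remove(id);
        }
    }
}
```

With `new DescriptorRegistry(2)`, a third `Open()` call succeeds only if the collector has already found one of the first two wrappers unreachable. Whether that happens at a given point depends on the runtime and build settings, which is part of why this scheme frees resources late rather than promptly.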

Solution

A GC deals with a predictable, reserved resource. The VM has total control over that resource and over which instances are created and when. The keywords here are "reserved" and "total control". Handles, by contrast, are allocated by the OS, and pointers are, well, pointers to resources allocated outside the managed space. Because of that, handles and pointers are not restricted to managed code: they can be, and often are, used by both managed and unmanaged code running in the same process.

A "Resource Collector" would be able to verify whether a handle or pointer is being used within the managed space, but by definition it is unaware of what is happening outside its memory space (and, to make things worse, some handles can be used across process boundaries).

A practical example is the .NET CLR. One can use a .NET flavor of C++ (C++/CLI, for example) to write code that works with both managed and unmanaged memory spaces; handles, pointers, and references can be passed between managed and unmanaged code. The unmanaged code must use special constructs and types so that the CLR can keep track of references made to its managed resources. But that is the best it can do. It cannot do the same with handles and pointers, so the Resource Collector would not know whether it is OK to release a particular handle or pointer.
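To make the tracking problem concrete, here is a small sketch (the class name `EscapingHandleDemo` is made up for illustration). It copies the raw OS handle out of a `FileStream`. Once copied, the value is just a number: it could be handed to native code, and the garbage collector has no way to know who still holds it.

```csharp
using System;
using System.IO;

public static class EscapingHandleDemo
{
    public static void Main()
    {
        // Creates an empty temporary file so the example has something to open.
        string path = Path.GetTempFileName();
        IntPtr raw;

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // DangerousGetHandle returns the raw OS handle as a plain value
            // that the GC does not trace.
            raw = fs.SafeFileHandle.DangerousGetHandle();
            Console.WriteLine("raw handle while open: " + raw);
        }

        // The stream is closed, yet the copied number survives. It no longer
        // refers to an open file, and the OS may reuse it for something else.
        Console.WriteLine("copied value after close: " + raw);

        File.Delete(path);
    }
}
```

The reverse situation is the one that matters for a collector: if managed code dropped its last reference while native code still held the copied value, collecting the wrapper and closing the handle would pull the resource out from under the native code.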

edit: Regarding the .NET CLR, I'm not experienced with C++ development on the .NET platform. Maybe there are special mechanisms in place that allow the CLR to keep track of references to handles and pointers between managed and unmanaged code. If that's the case, then the CLR could manage the lifetime of those resources and release them when all references to them are cleared (at least in some scenarios). Either way, best practice dictates that handles (especially those pointing to files) and pointers should be released as soon as they are no longer needed. A Resource Collector would not comply with that, which is another reason not to have one.

edit 2: On the CLR, the JVM, and VMs in general, it's relatively trivial to write code that frees a particular handle if it's used only inside the managed space. In .NET it would be something like:
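For contrast, this is what releasing a handle "as soon as it is not needed" usually looks like in C#: a `using` block that disposes of the stream at the closing brace instead of waiting for a collection. It is a minimal sketch, and the class name `PromptReleaseDemo` is only illustrative.

```csharp
using System;
using System.IO;

public static class PromptReleaseDemo
{
    public static void Main()
    {
        // Creates an empty temporary file so the example is self-contained.
        string path = Path.GetTempFileName();

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
        {
            fs.WriteByte(42);
        } // fs.Dispose() runs here, even if the block throws.

        // The handle has already been released, so an exclusive reopen
        // does not have to wait for a garbage collection.
        using (var again = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None))
        {
            Console.WriteLine(again.ReadByte()); // prints 42
        }

        File.Delete(path);
    }
}
```

If the first stream were left for the collector instead, the second exclusive open could fail until a collection and the finalizers happened to run.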

using System;
using System.IO;
using System.Threading;

// This class offends many best practices, but it would do the job.
public class AutoReleaseFileHandle : IDisposable {
    // Keeps track of how many instances of this class still hold an open file.
    private static int _toBeReleased = 0;

    // The threshold at which a garbage collection should be forced.
    private const int MAX_FILES = 100;

    // 0 = still open, 1 = already released (by Dispose or by the finalizer).
    private int _released = 0;

    public AutoReleaseFileHandle(FileStream fileStream) {
        // Force a garbage collection if the maximum number of files is reached,
        // and wait for the finalizers, since they are what closes the files.
        if (Volatile.Read(ref _toBeReleased) >= MAX_FILES) {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        // Increment the counter.
        Interlocked.Increment(ref _toBeReleased);
        FileStream = fileStream;
    }

    public FileStream FileStream { get; private set; }

    private void ReleaseFileStream() {
        // Make sure the counter is decremented only once per instance.
        if (Interlocked.Exchange(ref _released, 1) != 0) {
            return;
        }
        Interlocked.Decrement(ref _toBeReleased);
        // Dispose also closes the stream, so no separate Close() call is needed.
        if (FileStream != null) {
            FileStream.Dispose();
        }
        FileStream = null;
    }

    // Dispose the stream when this instance is collected by the GC.
    // (FileStream has a finalizer of its own, so this mainly shows the idea.)
    ~AutoReleaseFileHandle() {
        ReleaseFileStream();
    }

    // Because it's .NET, this class should also implement IDisposable to
    // allow the user to dispose of the resource imperatively.
    public void Dispose() {
        ReleaseFileStream();
        // Tells the GC not to call the finalizer for this instance.
        GC.SuppressFinalize(this);
    }
}

// Use it (for example, inside a method).
// For it to work, fs.Dispose() should not be called directly.
var fs = File.Open("path/to/file", FileMode.Open);
var autoRelease = new AutoReleaseFileHandle(fs);

Context

StackExchange Computer Science Q#52735, answer score: 4
