The Tragedy of Finalizers

2018-04-04, David Crawshaw

Like many garbage collected languages, Go lets you register a finalizer on an object. The finalizer is a function that the language runtime calls when the object is garbage collected.

Finalizers are deeply unsatisfying. They are almost impossible to use well.

The obvious use of finalizers is to tie resource management to object lifetime. This is such an obvious use that you can even find an unfortunate example of it in the Go standard library. In package os:

runtime.SetFinalizer(f.file, (*file).close)

The idea here is that an *os.File holds a resource from the operating system, a file descriptor. At some point those OS resources need to be cleaned up. Some short-lived small programs can ignore the problem because the OS will clean up the descriptors when the process exits. But long-lived programs, or programs that want to open lots of descriptors, eventually have to call the Close method.

Tracking when precisely a file should be closed is usually straightforward, but occasionally it involves a great deal of busywork. So a common instinct we all have is to say "gee, it sure would be nice if the file descriptor were closed when the File is garbage collected, let's use a finalizer". This does not work.

Here is an example where the finalizer fails:

package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"path/filepath"
)

func fatal(err error) {
	fmt.Fprintf(os.Stderr, "%v\n", err)
	os.Exit(2)
}

func main() {
	dir, err := ioutil.TempDir("", "finalizers-")
	if err != nil {
		fatal(err)
	}
	defer os.RemoveAll(dir)
	fmt.Printf("temp directory: %s\n", dir)
	for i := 0; i < 2000; i++ {
		path := filepath.Join(dir, fmt.Sprintf("tmp-%d", i))
		f, err := os.Create(path)
		if err != nil {
			fatal(err)
		}
		fmt.Fprintf(f, "temp file %d\n", i)

		// f is no longer live, can be GCed
	}
}

On macOS the result is:

$ go run junk.go 
temp directory: /tmp/finalizers-802262722
open /tmp/finalizers-802262722/tmp-252: too many open files
exit status 2

Oops.

Garbage collectors keep their own hours

The garbage collection contract says nothing about when collection occurs. So while there are GC algorithms that free memory the moment it is no longer used (for example, reference counting), it is typical to process memory in batches to reduce the CPU cycles spent collecting garbage.

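If there is not a lot of demand for new heap space, the Go GC may take a very long time to free memory, and thus a very long time to get around to calling the finalizer. That is what happens in the example above: a few hundred small *os.File values put almost no pressure on the heap, so the process hits its descriptor limit long before a collection is triggered. The cleanup time of objects is open ended. It is a bug to depend on the GC to run before your resources are exhausted.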
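For comparison, here is a sketch of the same loop with an explicit Close, so that at most one descriptor is open at a time. The writeTempFiles and run helpers are names invented for this sketch, and it assumes Go 1.16 or later for os.MkdirTemp and os.ReadDir.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeTempFiles creates n files in dir and closes each one before
// opening the next, so at most one descriptor is open at a time.
func writeTempFiles(dir string, n int) error {
	for i := 0; i < n; i++ {
		path := filepath.Join(dir, fmt.Sprintf("tmp-%d", i))
		f, err := os.Create(path)
		if err != nil {
			return err
		}
		_, werr := fmt.Fprintf(f, "temp file %d\n", i)
		// Close explicitly; do not wait for the GC to notice f is dead.
		cerr := f.Close()
		if werr != nil {
			return werr
		}
		if cerr != nil {
			return cerr
		}
	}
	return nil
}

// run writes n files into a fresh temporary directory and returns how
// many directory entries exist afterwards.
func run(n int) (int, error) {
	dir, err := os.MkdirTemp("", "finalizers-")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	if err := writeTempFiles(dir, n); err != nil {
		return 0, err
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	n, err := run(2000)
	if err != nil {
		fmt.Fprintf(os.Stderr, "%v\n", err)
		os.Exit(2)
	}
	fmt.Printf("wrote %d files\n", n)
}
```

The Close error is checked as well as the write error, because some write failures are only reported when the file is closed.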

What are finalizers good for?

Nothing generally.

If you are deeply familiar with the GC algorithm being used in your runtime, you may be able to use them to manage a resource whose use is tied closely to heap use. Maybe C heap.

If you are reasonably sure the algorithm will eventually process all objects (which is not required by the GC contract!), you could manage effectively-inexhaustible resources with finalizers. Maybe very small C objects.

Conceptually you could use finalizers for statistics and book-keeping, but in practice they cost too many CPU cycles.

A hypothetical Resource Collector

I can imagine a programming language where finalizers are useful.

The language's runtime would, in addition to tracking statistics about heap use, allow the programmer to specify custom resource trackers. An object that registers a finalizer would also tell the runtime how much pressure the new object applies to a resource. If the resource is near exhaustion, the runtime finds the objects that are both unreachable in memory and use that particular resource, and collects them.

If you build such a language, I would like to try it. You could even do it as an extension to an existing language like Go, though you would have to fork the runtime.

Otherwise, I believe I have one very small use for finalizers as they exist today, which I will talk about in a followup blog post.

