Java ThreadLocal – Proper Usage

Learn how to properly create and use ThreadLocal to share a variable among threads while giving each thread its own copy of the data.

“A lie gets halfway around the world before the truth has a chance to get its pants on.”
― Winston S. Churchill

1. Introduction

Getting multi-threaded programming right is something of an art; it can be hard to foresee and avoid the many pitfalls. One such hazard comes from attempting to share a variable across multiple threads.

First of all, why do we need to share a variable among threads? Why not just create the object in each thread where it is needed, use it and dispose of it when done?

2. Avoiding Initialization Cost

Well, some objects are expensive to create. Maybe there is a non-trivial initialization sequence that the object needs to go through during construction.

(Yes, I am looking at you, SimpleDateFormat.)

Imagine using this kind of object when servicing each request, say when processing an HTTP request. Such requests could arrive at a rate of tens to hundreds per second. Typical architectures for servicing these requests demand the use of threads.

3. Caching Objects

And herein lies the rub.

When an object's construction does not depend on any data in the request, it makes sense to “pre-construct” such an object and re-use it to service each request. Add threads to the mix and you have a recipe for disaster.

The standard way of keeping threads from stepping on each other in these cases is to use synchronization.

That brings its own problem in the form of thread contention.
A synchronized block can be executed by at most one thread at a time for a given lock; other threads wanting to enter a block guarded by the same lock must wait.
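To make the trade-off concrete, here is a minimal sketch of the synchronized approach. The class name SynchronizedFormatDemo, the date pattern, and the helper methods are illustrative assumptions and do not come from the code later in this article.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SynchronizedFormatDemo {
    // One pre-constructed formatter shared by all threads (hypothetical pattern).
    private static final SimpleDateFormat SHARED = new SimpleDateFormat("yyyy-MM-dd");

    // Every caller takes the same lock, so only one thread formats at a time;
    // the others wait, which is where the contention comes from.
    static String format(Date date) {
        synchronized (SHARED) {
            return SHARED.format(date);
        }
    }

    // Formats the same date from several threads and collects the results.
    static List<String> formatConcurrently(Date date, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<String> task = () -> format(date);
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        formatConcurrently(new Date(), 4).forEach(System.out::println);
    }
}
```

The results are correct, but every call to format() queues behind the same lock, no matter how many threads are available.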

4. The Case for ThreadLocal

To avoid this, you can wrap such objects in a ThreadLocal instance. The ThreadLocal variable can be shared by multiple threads without resorting to thread synchronization.

How does it work?

The ThreadLocal instance maintains an instance of the wrapped object for each thread; when the code extracts the object from the ThreadLocal, it retrieves the copy specific to that thread.
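The per-thread behaviour described above can be checked with a small sketch. The class name PerThreadCopyDemo, the field FORMAT, and the two helper methods are hypothetical names chosen for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PerThreadCopyDemo {
    // A single ThreadLocal variable, visible to every thread.
    static final ThreadLocal<SimpleDateFormat> FORMAT =
        new ThreadLocal<SimpleDateFormat>() {
            @Override protected SimpleDateFormat initialValue() {
                return new SimpleDateFormat("HH:mm:ss");
            }
        };

    // Repeated get() calls on one thread return the same instance.
    static boolean sameThreadGetsSameCopy() {
        return FORMAT.get() == FORMAT.get();
    }

    // A second thread calling get() receives a different instance.
    static boolean otherThreadGetsOwnCopy() {
        SimpleDateFormat mine = FORMAT.get();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<SimpleDateFormat> task = () -> FORMAT.get();
            SimpleDateFormat theirs = pool.submit(task).get();
            return mine != theirs;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("same thread, same copy: " + sameThreadGetsSameCopy());
        System.out.println("other thread, own copy: " + otherThreadGetsOwnCopy());
    }
}
```

Both checks should report true: the variable is shared, but the object behind it is not.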

5. Demonstrating the Problem

The following code demonstrates the problems arising from sharing a variable between threads without proper care.
The SimpleDateFormat instance below (df) is shared between the tasks (taskA, taskB, and taskC).
Each task sets its own pattern and then sleeps for a short time.
On wakeup, the date format in each task is likely to be different from the one that task set.
This is because another thread could have stepped in and modified the shared variable.

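// Needs imports from java.text, java.util, and java.util.concurrent;
// invokeAll() below can throw InterruptedException.
// A single formatter shared by every task.
SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZZZ");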
ExecutorService esvc = Executors.newCachedThreadPool();
List<Callable<String>> tasks =
    Arrays
    .asList(() -> {
        df.applyPattern("HH:mm");
        TimeUnit.MILLISECONDS.sleep(300);
        return String.format("%s => %s [%s]",
                 "taskA", df.format(new Date()),
                 Thread.currentThread().getName());
    },
    () -> {
        df.applyPattern("HH:mm:ss");
        TimeUnit.MILLISECONDS.sleep(200);
        return String.format("%s => %s [%s]",
                 "taskB", df.format(new Date()),
                 Thread.currentThread().getName());
    },
    () -> {
        df.applyPattern("HH:mm:ss.SSS");
        TimeUnit.MILLISECONDS.sleep(100);
        return String.format("%s => %s [%s]",
                 "taskC", df.format(new Date()),
                 Thread.currentThread().getName());
    },
    () -> { esvc.shutdown(); return "shutdown"; } );
esvc.invokeAll(tasks)
    .forEach(f -> {
        try { System.out.println(f.get()); }
        catch (Exception ex) {
            System.out.println("get failed: " + ex.getMessage());
        }
    });

The following output from one run shows that all three tasks ended up using the pattern set in taskC. Whichever task calls applyPattern() last wins, so the result can vary between runs.

taskA => 09:42:21.123 [pool-1-thread-1]
taskB => 09:42:21.023 [pool-1-thread-2]
taskC => 09:42:20.924 [pool-1-thread-3]
shutdown

6. Solving it with ThreadLocal

The following code shows how to store the SimpleDateFormat instance in a ThreadLocal and use it from each task.

Store the SimpleDateFormat instance within a ThreadLocal as follows:

ThreadLocal<SimpleDateFormat> df = new ThreadLocal<SimpleDateFormat>() {
    @Override protected SimpleDateFormat initialValue() {
        return new SimpleDateFormat();
    }
};
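Since the code in this article already uses lambdas, it needs Java 8 or later, and there the same initialization can also be written with the ThreadLocal.withInitial() factory method. The sketch below is an alternative spelling, not a change to the approach; the class name WithInitialDemo and the helper formatNow() are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

class WithInitialDemo {
    // Equivalent to overriding initialValue(): the supplier runs once per
    // thread, the first time that thread calls get().
    static final ThreadLocal<SimpleDateFormat> DF =
        ThreadLocal.withInitial(() -> new SimpleDateFormat());

    // Applies a pattern to the calling thread's own formatter and uses it.
    static String formatNow(String pattern) {
        SimpleDateFormat local = DF.get();
        local.applyPattern(pattern);
        return local.format(new Date());
    }

    public static void main(String[] args) {
        System.out.println(formatNow("HH:mm:ss"));
    }
}
```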

Other than this, the only change from the above code is referring to the SimpleDateFormat instance as df.get() instead of df.
Here is the complete code.

ThreadLocal<SimpleDateFormat> df = new ThreadLocal<SimpleDateFormat>() {
    @Override protected SimpleDateFormat initialValue() {
        return new SimpleDateFormat();
    }
};
ExecutorService esvc = Executors.newCachedThreadPool();
List<Callable<String>> tasks =
    Arrays
    .asList(() -> {
        df.get().applyPattern("HH:mm");
        TimeUnit.MILLISECONDS.sleep(300);
        return String.format("%s => %s [%s]",
                 "taskA", df.get().format(new Date()),
                 Thread.currentThread().getName());
    },
    () -> {
        df.get().applyPattern("HH:mm:ss");
        TimeUnit.MILLISECONDS.sleep(200);
        return String.format("%s => %s [%s]",
                 "taskB", df.get().format(new Date()),
                 Thread.currentThread().getName());
    },
    () -> {
        df.get().applyPattern("HH:mm:ss.SSS");
        TimeUnit.MILLISECONDS.sleep(100);
        return String.format("%s => %s [%s]",
                 "taskC", df.get().format(new Date()),
                 Thread.currentThread().getName());
    },
    () -> { esvc.shutdown(); return "shutdown"; } );
esvc.invokeAll(tasks)
    .forEach(f -> {
        try { System.out.println(f.get()); }
        catch (Exception ex) {
            System.out.println("get failed: " + ex.getMessage());
        }
    });

The output from running this code shows that each task keeps the pattern it set, before and after sleeping.

taskA => 09:50 [pool-2-thread-1]
taskB => 09:50:57 [pool-2-thread-2]
taskC => 09:50:57.704 [pool-2-thread-3]

7. ThreadLocal Gotchas

We can now come to the following conclusions.

  • With ThreadLocal, there is no chance that threads can step on each other’s instances.
  • Since synchronization is not required, thread contention is eliminated.
  • And the original problem of expensive construction per request is avoided, since each thread constructs the object only once.

So everything is peachy, right?

With the ThreadLocal hammer in hand, everything is looking like a nail!

Not so fast.

ThreadLocal stores objects as long as the thread is alive. With long-lived threads (such as from a thread pool inside an application server or web server), this means the objects could be held without being cleaned up for a long time.
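A small sketch can make this retention visible. It assumes a single-threaded pool standing in for a server's worker thread; the class name PooledThreadRetentionDemo, the BUFFER field, and the "request-N" strings are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadRetentionDemo {
    // Hypothetical per-thread buffer standing in for any cached object.
    static final ThreadLocal<StringBuilder> BUFFER =
        ThreadLocal.withInitial(() -> new StringBuilder());

    // Runs two unrelated tasks on the same pooled thread and returns what
    // the second task finds in that thread's buffer.
    static String runTwoTasksOnOneThread() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> first =
                () -> BUFFER.get().append("request-1;").toString();
            Callable<String> second =
                () -> BUFFER.get().append("request-2;").toString();
            pool.submit(first).get();
            return pool.submit(second).get();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Prints "request-1;request-2;": the first task's data is still there.
        System.out.println(runTwoTasksOnOneThread());
    }
}
```

The second task sees what the first one left behind, because the value belongs to the thread and not to the task.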

Memory leaks, anyone?

The ThreadLocal class does offer a remove() method for removing the instance for the current thread. The problem is: when and where should this method be invoked?

At what point can a worker thread decide to remove a ThreadLocal?

After 10 requests? After 100 requests? Why such an arbitrary number?

In other words, there is no reasonable way to determine when to clean up the resources held inside a ThreadLocal.
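Deciding when to call remove() is the hard part; what the call does is simple. The following sketch only illustrates the mechanics: remove() discards the current thread's value, and the next get() on that thread constructs a fresh one. The class name RemoveDemo and the CREATED counter are hypothetical and exist only to count constructions.

```java
import java.text.SimpleDateFormat;
import java.util.concurrent.atomic.AtomicInteger;

class RemoveDemo {
    // Counts how many formatters have been constructed.
    static final AtomicInteger CREATED = new AtomicInteger();

    static final ThreadLocal<SimpleDateFormat> DF = ThreadLocal.withInitial(() -> {
        CREATED.incrementAndGet();
        return new SimpleDateFormat();
    });

    // Returns how many formatters the calling thread constructs below.
    static int creationsAroundRemove() {
        DF.remove();                 // start from a clean slate on this thread
        int before = CREATED.get();
        DF.get();                    // first get(): constructs a formatter
        DF.get();                    // second get(): reuses it
        DF.remove();                 // drops this thread's formatter
        DF.get();                    // constructs a new one
        return CREATED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println(creationsAroundRemove()); // prints 2
    }
}
```

Note that remove() only affects the thread that calls it, so each worker thread would have to clean up its own copy.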

To avoid these problems, it is recommended that ThreadLocal not be used in code that can be long-lived (such as an application server or web server).

And that is where [the situation stands](https://dzone.com/articles/memory-leak-protection-tomcat) today.

Summary

Service objects that do not depend on a request should be created outside of the request processing loop. When multiple threads handle requests, access to these shared service objects must be synchronized, which slows the application down because of thread contention. Wrapping the service objects in a ThreadLocal eliminates that contention, since each thread works on its own copy. However, cleaning up objects wrapped in a ThreadLocal is a challenge because, in a long-lived application, it is hard to determine when to do it.