Java Concurrency: An Introduction to Project Loom

Posted on 17 June 2022

Project Loom is an experimental version of the JDK. It extends Java with virtual threads that allow lightweight concurrency. Preview releases are available and show what’ll be possible.

Server-side Java applications should be able to process a large number of requests in parallel. But the model that’s still most widely used in server-side Java programming today is thread-per-request. An incoming request is assigned a thread, and everything that needs to be done to produce a suitable response is processed on this thread. However, this severely limits the maximum number of requests that can be processed concurrently. Java threads, each of which uses a native operating system thread, aren’t exactly lightweight: neither in terms of memory consumption (each thread takes up more than one megabyte of memory by default) nor in terms of the cost of switching between threads (context switch).

What are the others doing?
Virtual threads may be new to Java, but they aren't new to the JVM. Those who know Clojure or Kotlin probably feel reminded of "coroutines" (and if you've heard of Flix, you might think of "processes"). Those are technically very similar and address the same problem. However, there's at least one small but interesting difference from a developer's perspective. For coroutines, there are special keywords in the respective languages (in Clojure a macro for a "go block", in Kotlin the "suspend" keyword). The virtual threads in Loom come without additional syntax. The same method can be executed unmodified by a virtual thread, or directly by a native thread.

In response to these drawbacks, many asynchronous libraries have emerged in recent years, for example using CompletableFuture. As have entire reactive frameworks, such as RxJava, Reactor, or Akka Streams. While they all make far more effective use of resources, developers need to adapt to a somewhat different programming model. Many developers perceive the different style as “cognitive ballast”. Instead of dealing with callbacks, observables, or flows, they would rather stick to a sequential list of instructions.
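To make the difference in style concrete, here is a minimal sketch of the same two-step lookup, written once sequentially and once with CompletableFuture. The class name and the helpers findUser and countOrders are hypothetical stand-ins for remote calls and are not taken from any library.

```java
import java.util.concurrent.CompletableFuture;

class AsyncVersusSequential {

    // Hypothetical stand-ins for calls to a database or another service.
    static String findUser(int id) {
        return "user-" + id;
    }

    static int countOrders(String user) {
        return user.length();
    }

    // Sequential style: each line waits for the previous one to finish.
    static String sequential(int id) {
        String user = findUser(id);
        int orders = countOrders(user);
        return user + " has " + orders + " orders";
    }

    // Asynchronous style: the steps become callbacks chained onto a future.
    static CompletableFuture<String> asynchronous(int id) {
        return CompletableFuture
                .supplyAsync(() -> findUser(id))
                .thenApply(user -> user + " has " + countOrders(user) + " orders");
    }

    public static void main(String[] args) {
        System.out.println(sequential(42));
        // join() blocks here only so that the demo can print the result.
        System.out.println(asynchronous(42).join());
    }
}
```

Both variants compute the same string. The asynchronous one does not hold a thread while it waits, but the control flow is expressed through chained callbacks rather than plain statements.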

Is it possible to combine some desirable characteristics of the two worlds? To be as effective as asynchronous or reactive programming, but still program in the familiar, sequential style? Oracle’s Project Loom aims to explore exactly this option with a modified JDK. It brings a new lightweight construct for concurrency, named virtual threads, and a modified standard library based on them.

Virtual Threads

Does anyone remember multithreading in Java 1.1? At that time Java only knew the so-called “Green Threads”. Multiple operating system threads weren’t used at all; threads were only emulated within the JVM. Only from Java 1.2 onwards was a native operating system thread used for each Java thread. And now, some 20 years later, the “Green Threads” are celebrating a revival, albeit in a significantly changed, modernized form. On paper, virtual threads aren’t that different: they’re threads that are managed within the JVM. However, they’re no longer a replacement for the native threads, but a supplement. A (relatively small) set of native threads is used as “carrier threads” to process an (almost arbitrarily large) number of virtual threads. The overhead of a virtual thread is so low that the programmer doesn’t need to worry about how many of them to start.

A native thread in a 64-bit JVM with default settings reserves one megabyte for the call stack alone (the “thread stack size”, which can also be set explicitly with the -Xss option). On top of that, there’s some metadata. And even if memory isn’t the limit, the operating system will refuse to create further threads once its own limit is reached, typically at a few thousand.

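    // Imports needed by this listing
    import static java.lang.System.out;

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ThreadFactory;
    import java.util.concurrent.TimeUnit;
    import java.util.stream.IntStream;

    // Attention, this can freeze the computer...
    void platformThreads() throws InterruptedException {
        // Factory for platform threads (one native thread each)
        ThreadFactory factory = Thread.ofPlatform().factory();
        ExecutorService executor = Executors.newFixedThreadPool(10000, factory);
        IntStream.range(0, 10000).forEach((num) -> {
            executor.submit(() -> {
                try {
                    out.println(num);
                    // We wait a bit, so that the threads really all
                    // run simultaneously
                    Thread.sleep(10000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            });
        });
        // Accept no new tasks, then wait for the submitted ones to finish
        executor.shutdown();
        executor.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
    }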

Listing 1

The attempt in listing 1 to start 10,000 threads will bring most computers to their knees (or crash the JVM). Be careful: the program may reach the thread limit of your operating system, and your computer might actually “freeze”. More likely, though, the program will crash with an error message like the one below.

Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 4k, detached.
[2,459s][warning][os,thread] Failed to start the native thread for java.lang.Thread "Thread-8163"
Exception in thread "main" java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached

With virtual threads, on the other hand, it’s no problem to start a whole million threads. Listing 2 will run on the Project Loom JVM without any problems.

    void virtualThreads() throws InterruptedException {
        // Virtual Threads Factory
        ThreadFactory factory = Thread.ofVirtual().factory();
        ExecutorService executor = Executors.newFixedThreadPool(1000000, factory);
        IntStream.range(0, 1000000).forEach((num) -> {
            executor.submit(() -> {
                try {
                    out.println(num);
                    // Thread.sleep sends only the virtual thread to sleep here
                    Thread.sleep(10000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            });
        });
        executor.shutdown();
        executor.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
    }

Listing 2

JDK APIs

So now we can start a million threads at the same time. This may be a nice effect to show off, but is probably of little value for the programs we need to write.

When are threads coming for everyone?
Project Loom is keeping a very low profile about which Java release will include the features. At the moment everything is still experimental and APIs may still change. However, if you want to try it out, you can either check out the source code from the Loom GitHub repository and build the JDK yourself, or download an early access build. The source code in this article was run on build 19-loom+6-625.

Things become interesting when all these virtual threads only use the CPU for a short time. Most server-side applications aren’t CPU-bound, but I/O-bound. There might be some input validation, but then it’s mostly fetching (or writing) data over the network, for example from the database, or over HTTP from another service.

In the thread-per-request model with synchronous I/O, this results in the thread being “blocked” for the duration of the I/O operation. The operating system recognizes that the thread is waiting for I/O, and the scheduler switches directly to the next one. This might not seem like a big deal, as the blocked thread doesn’t occupy the CPU. However, each context switch between threads involves an overhead. By the way, this effect has become relatively worse on modern, complex CPU architectures with multiple cache layers and non-uniform memory access (NUMA for short).

To utilize the CPU effectively, the number of context switches should be minimized. From the CPU’s point of view, it would be perfect if exactly one thread ran permanently on each core and was never replaced. We won’t usually be able to achieve this state, since there are other processes running on the server besides the JVM. But “the more, the merrier” doesn’t apply to native threads - you can definitely overdo it.

To be able to execute many parallel requests with few native threads, the virtual thread introduced in Project Loom voluntarily hands over control when waiting for I/O and pauses. However, it doesn’t block the underlying native thread, which executes the virtual thread as a “worker”. Rather, the virtual thread signals that it can’t do anything right now, and the native thread can grab the next virtual thread, without CPU context switching. But how can this be done without using asynchronous I/O APIs? After all, Project Loom is determined to save programmers from “callback hell”.

This is where the advantage of providing the new functionality in the form of a new JDK version becomes apparent. A third-party library for a current JDK is dependent on using an asynchronous programming model. Project Loom, on the other hand, comes with a modified standard library instead. Many I/O libraries have been rewritten to use virtual threads internally (see box below). A familiar network call is - without any change to the program code - suddenly no longer “blocking I/O”. Only the virtual thread pauses. This trick means that existing programs also benefit from the virtual threads without the need for any adaptations. The following classes in the standard library have been modified so that blocking calls in them no longer block the native thread, but only the virtual thread:

java.net.Socket
java.net.ServerSocket
java.net.DatagramSocket/MulticastSocket
java.nio.channels.SocketChannel
java.nio.channels.ServerSocketChannel
java.nio.channels.DatagramChannel
java.nio.channels.Pipe.SourceChannel
java.nio.channels.Pipe.SinkChannel
java.net.InetAddress

Standard Library Classes Modified to Use Virtual Threads
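As an illustration, the following sketch runs a tiny echo server and its clients over the loopback interface, using only the familiar blocking java.net.Socket and java.net.ServerSocket calls. It assumes JDK 21 or later, where virtual threads are a final feature and Executors.newVirtualThreadPerTaskExecutor() is available; the early access build mentioned in this article may differ. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BlockingEchoSketch {

    // Server side: read one line, answer with one line. Both calls block,
    // but on a virtual thread only that virtual thread is paused.
    static void handle(Socket connection) {
        try (Socket socket = connection;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true, StandardCharsets.UTF_8)) {
            out.println("echo: " + in.readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client side: plain blocking connect, write and read.
    static String echo(int port, String message) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true, StandardCharsets.UTF_8);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(message);
            return in.readLine();
        }
    }

    static List<String> runDemo(int clients) {
        try (ServerSocket server = new ServerSocket(0, clients, InetAddress.getLoopbackAddress());
             ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            int port = server.getLocalPort();
            // Acceptor: one virtual thread per accepted connection.
            executor.submit(() -> {
                for (int i = 0; i < clients; i++) {
                    Socket connection = server.accept();
                    executor.submit(() -> handle(connection));
                }
                return null;
            });
            // Clients: one virtual thread per request.
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                String message = "request-" + i;
                pending.add(executor.submit(() -> echo(port, message)));
            }
            List<String> replies = new ArrayList<>();
            for (Future<String> reply : pending) {
                replies.add(reply.get());
            }
            return replies;
        } catch (IOException | InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo(100));
    }
}
```

Nothing in handle or echo refers to virtual threads. The same methods would run on platform threads if the executor were swapped, which is the point the text makes about unchanged program code.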

Continuations

The concept that forms the basis for the implementation of virtual threads is called “delimited continuations”. Most of you will have used a debugger before to set a breakpoint in the code. When this point is reached, the execution is stopped and the current state of the program is displayed in the debugger. It would now be conceivable to “freeze” this exact state. This is the basic idea of a continuation: stop at a point in the flow, take the state (of the current thread, i.e. the call stack, the current position in the code, etc.) and convert it into a function, the “pick up where you left off” function. This can then be called at some later point in time and the process that was started can be resumed. This is exactly what’s needed for virtual threads: The ability to stop a program at any point in time and resume it later, in the remembered state.
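The idea of “state converted into a function” can be imitated by hand in plain Java. The sketch below is only an illustration of the concept, not how Loom implements continuations: each step does a piece of work and returns the function that continues from there, with the captured variables playing the role of the frozen call stack. The names Step, sumUpTo and demo are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ContinuationIdea {

    // A "pick up where you left off" function: calling resume() performs the
    // next piece of work and returns the continuation for the rest,
    // or null when the computation has finished.
    interface Step {
        Step resume();
    }

    static Step sumUpTo(int limit, List<String> log) {
        return from(1, 0, limit, log);
    }

    // The values of 'next' and 'sum' are captured by the returned lambda.
    // They stand in for the state that a real continuation would freeze.
    private static Step from(int next, int sum, int limit, List<String> log) {
        return () -> {
            if (next > limit) {
                log.add("done, sum = " + sum);
                return null;
            }
            log.add("adding " + next);
            return from(next + 1, sum + next, limit, log);
        };
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Step rest = sumUpTo(2, log);
        rest = rest.resume();                 // run the first piece only
        log.add("caller does something else");
        while (rest != null) {                // later: resume until finished
            rest = rest.resume();
        }
        return log;
    }

    public static void main(String[] args) {
        // prints [adding 1, caller does something else, adding 2, done, sum = 3]
        System.out.println(demo());
    }
}
```

The difference from a real continuation is that the stopping points here have to be written out explicitly, whereas a continuation can be captured in the middle of ordinary code.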

Continuations are useful beyond virtual threads and are a powerful construct to influence the flow of a program. Project Loom includes an API for working with continuations, but it’s not meant for application development and is locked away in the jdk.internal.vm package. It’s the low-level construct that makes virtual threads possible. However, those who want to experiment with it have the option, see listing 3.

    // Continuation and ContinuationScope live in jdk.internal.vm. Access to this
    // internal package usually has to be opened explicitly, e.g. with
    // --add-exports java.base/jdk.internal.vm=ALL-UNNAMED
    void continuations() {
        // The scope is a tool that makes nested continuations possible.
        ContinuationScope scope = new ContinuationScope("demo");
        Continuation a = new Continuation(scope, () -> {
            out.print("To be");
            // The function is "frozen" here
            // and gives control back to the caller.
            Continuation.yield(scope);
            out.println("continued!");
        });
        a.run();
        out.print(" ... ");
        // The continuation can be resumed
        // from the point where it was stopped.
        a.run();
        // ...
    }

Listing 3

It’s worth mentioning that virtual threads are a form of “cooperative multitasking”. Native threads are kicked off the CPU by the operating system, regardless of what they’re doing (preemptive multitasking). Even an infinite loop will not block the CPU core this way, others will still get their turn. On the virtual thread level, however, there’s no such preemption - the virtual thread itself must return control to the native thread, which it does when it blocks, for example while waiting for I/O.
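A toy model may help to picture cooperative multitasking. The sketch below is not Loom’s scheduler; it is a hand-written round-robin loop on a single thread, with hypothetical names, in which every task runs one slice of work and then voluntarily returns.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

class CooperativeScheduler {

    // Runs the tasks round-robin on the calling thread. A task does one slice
    // of work per call and returns true if it wants to be scheduled again.
    static void runAll(List<BooleanSupplier> tasks) {
        Deque<BooleanSupplier> runQueue = new ArrayDeque<>(tasks);
        while (!runQueue.isEmpty()) {
            BooleanSupplier task = runQueue.poll();
            // Nothing can interrupt the task here: the loop only continues
            // once the task returns of its own accord.
            if (task.getAsBoolean()) {
                runQueue.add(task);
            }
        }
    }

    // A task that records one entry per slice and finishes after 'slices' slices.
    static BooleanSupplier counter(String name, int slices, List<String> trace) {
        int[] done = {0};
        return () -> {
            trace.add(name + done[0]);
            done[0]++;
            return done[0] < slices;
        };
    }

    static List<String> demo() {
        List<String> trace = new ArrayList<>();
        runAll(Arrays.asList(counter("a", 2, trace), counter("b", 3, trace)));
        return trace;
    }

    public static void main(String[] args) {
        // prints [a0, b0, a1, b1, b2]
        System.out.println(demo());
    }
}
```

A task that looped forever inside its slice would keep all other tasks in this model from running, which is exactly the risk of cooperative scheduling described above.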

Beyond virtual threads

Wasn't there something about "fibers"?
If you've already heard of Project Loom a while ago, you might have come across the term fibers. In the first versions of Project Loom, fiber was the name for the virtual thread. It goes back to a previous project of the current Loom project leader Ron Pressler, the Quasar Fibers. However, the name fiber was discarded at the end of 2019, as was the alternative coroutine, and virtual thread prevailed.

The problems with threads described at the beginning refer to efficiency alone. A completely different challenge hasn’t been considered yet: the communication between threads. Programming with the current Java mechanisms isn’t easy and is consequently error-prone. Threads communicate via shared variables (“shared mutable state”). To avoid race conditions, these must be protected by synchronized or explicit locks. If errors occur here, they’re particularly difficult to find at runtime due to the non-determinism. And even if everything has been done correctly, these locks often represent a “point of contention”, a bottleneck in execution. This is because potentially many threads have to wait for the one that is currently holding the lock.
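A minimal sketch of shared mutable state protected by an explicit lock is shown below. The class is hypothetical. Without the lock, the increments of different threads could interleave and updates would be lost.

```java
import java.util.concurrent.locks.ReentrantLock;

class SharedCounter {

    private final ReentrantLock lock = new ReentrantLock();
    private int value; // shared mutable state

    void increment() {
        lock.lock(); // all other threads have to wait here: the point of contention
        try {
            value++;
        } finally {
            lock.unlock();
        }
    }

    int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Starts several threads that all increment the same counter.
    static int demo(int threads, int incrementsPerThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 10_000)); // prints 40000
    }
}
```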

There are alternative models. In the context of virtual threads, “channels” are particularly worth mentioning here. Kotlin and Clojure offer these as the preferred communication model for their coroutines. Instead of shared, mutable state, they rely on immutable messages that are written (preferably asynchronously) to a channel and received from there by the receiver. Whether channels will become part of Project Loom, however, is still open. Then again, it may not be necessary for Project Loom to solve all problems - any gaps will certainly be filled by new third-party libraries that provide solutions at a higher level of abstraction using virtual threads as a basis. For example, the experimental “Fibry” is an actor library for Loom.
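The JDK has no dedicated channel type, but the idea can be approximated with a BlockingQueue. The following sketch makes that assumption and uses hypothetical names: a producer writes immutable messages into a small bounded queue, and a consumer takes them out until it sees an end marker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ChannelSketch {

    // An immutable message: all fields are final and set once.
    static final class Message {
        final int id;
        final String text;

        Message(int id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    // End marker that tells the consumer to stop.
    private static final Message DONE = new Message(-1, "done");

    static List<String> demo(int count) {
        // The bounded queue plays the role of the channel.
        BlockingQueue<Message> channel = new ArrayBlockingQueue<>(2);
        List<String> received = new ArrayList<>(); // touched only by the consumer until join()

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    channel.put(new Message(i, "msg-" + i)); // blocks while the channel is full
                }
                channel.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                // take() blocks while the channel is empty
                for (Message m = channel.take(); m != DONE; m = channel.take()) {
                    received.add(m.text);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(demo(3)); // prints [msg-0, msg-1, msg-2]
    }
}
```

The two threads never touch a shared variable directly. The only thing they share is the queue, which handles the synchronization internally.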

The Unique Selling Point of Project Loom

A "classic" to learn about concurrency in Java pre "Project Loom" is "Java Concurrency in Practice"

There’s already an established solution for the problem that Project Loom solves: asynchronous I/O, either through callbacks or through reactive frameworks. However, using these means getting involved in a different programming model. Not all developers find it easy to switch to an asynchronous way of thinking. Support in common libraries is also partly lacking: everything that stores data in a “ThreadLocal” suddenly becomes unusable. Tooling suffers as well: debugging asynchronous code often results in several “aha”-experiences in the sense that the code to be examined isn’t executed on the very thread you just let the debugger go through in single steps.
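The ThreadLocal problem can be reproduced in a few lines. In this sketch, with hypothetical names, a value is stored in a ThreadLocal on the calling thread and then read from an asynchronous stage that runs on a different thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalPitfall {

    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Returns what the calling thread sees and what the asynchronous stage sees.
    static String[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            CURRENT_USER.set("alice");
            String onCaller = CURRENT_USER.get();
            // The supplier runs on the pool thread, which has its own (empty) ThreadLocal slot.
            String onPool = CompletableFuture
                    .supplyAsync(() -> String.valueOf(CURRENT_USER.get()), pool)
                    .join();
            return new String[] {onCaller, onPool};
        } finally {
            CURRENT_USER.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println(seen[0] + " / " + seen[1]); // prints alice / null
    }
}
```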

The special sauce of Project Loom is that it makes the changes at the JDK level, so the program code can remain unchanged. A program that is inefficient today, consuming a native thread for each HTTP connection, could run unchanged on the Project Loom JDK and suddenly be efficient and scalable, thanks to the changed java.net/java.io libraries, which then use virtual threads.

The source code examples for this article are available on Codeberg