Musings about Jigsaw – The Java 9 Module System

Hooray, Java™ 9 was released on 21 September 2017! Arguably the most visible and most controversial new feature of the Java platform is Jigsaw, the Java module system. Project Jigsaw was initiated in 2008, and almost 10 years after the kickoff, Java finally got its new and shiny module system as part of the platform.

The main goal of that tremendous effort was to make the Java Development Kit (JDK) easily scalable and thereby make it more attractive to run Java on small devices. Modules allow strong encapsulation within the JDK, which improves security and maintainability. Evolving the Java platform also becomes easier if the standard libraries have clear module boundaries. Furthermore, performance, especially the startup performance of Java applications, can be improved by statically exploiting what is known about modules and the dependencies among them. One of the main challenges for Jigsaw was backwards compatibility. After all, we don't want to throw away the tons of Java code and libraries that were written prior to Java 9. In the end, backwards compatibility is a crucial asset and important for Java's success as a language and as a platform.

Now that Jigsaw is available, basically each and every application and library developer will benefit from well-defined modules. Or won't they?

What is Jigsaw?

Let's have a quick look at the status as of Java 8. Every reasonably complex Java application suffers from this problem: When you start up the Java Virtual Machine (JVM), you have to assemble the classpath for the process. Be it statically as a command line argument or with more sophisticated means at runtime, at some point during the application's lifetime, and usually quite early in the startup procedure, the referenced libraries have to be available and become known to the JVM.

At build time, tools like Maven, Ivy or Gradle help to manage the dependency graph and the versions of the referenced libraries. But as these tools are build tools, they naturally work best at build time. Yes, they can be (ab)used to assemble the classpath for the startup of the application, too, but usually you do not want to depend upon that in a production environment. So what we end up with are library folders with tons of JAR files that will be put on the classpath, and scripts that gather these libs and launch JVMs. The more libraries are referenced, the longer the list and the higher the risk of having duplicate libraries in different versions on the classpath. In many environments, the successful startup of the application all of a sudden depends on the order of the files on the classpath, because multiple versions of the same library are on it. Welcome to JAR hell.

Jigsaw aims at mitigating the issue. The newly introduced module path supersedes the plain classpath. Every module formally declares its dependencies, and the Java virtual machine validates the configuration at startup. Since a module is now a first-class citizen in the Java ecosystem, both the Java compiler and the Java runtime can use the available information to produce detailed feedback in case of misconfiguration. When a class is referenced but not exported from its module, the compiler reports that, and so does the JVM at runtime. Compile time and runtime become symmetric and the risk of misconfiguration is reduced.
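To get a feeling for that validation, the resolution step can also be triggered programmatically through the java.lang.module API. The following is only a minimal sketch: the class name StartupValidationSketch and the module name com.example.missing are made up for illustration, and the JVM of course does much more at startup than this single call.

```java
import java.lang.module.Configuration;
import java.lang.module.FindException;
import java.lang.module.ModuleFinder;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
final class StartupValidationSketch {

    /** Tries to resolve a root module against the modules of the running JVM. */
    static String tryResolve(String rootModule) {
        // The boot layer's configuration describes the modules resolved at JVM startup.
        Configuration parent = ModuleLayer.boot().configuration();
        // Empty finders: nothing is observable beyond the parent configuration.
        ModuleFinder none = ModuleFinder.of();
        try {
            parent.resolve(none, none, Set.of(rootModule));
            return "resolved";
        } catch (FindException e) {
            // Thrown when a root module or one of its dependencies cannot be located.
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryResolve("java.base"));
        // "com.example.missing" is an invented name that should not exist anywhere.
        System.out.println(tryResolve("com.example.missing"));
    }
}
```

The first call succeeds because java.base is already part of the boot layer, the second one fails with a FindException. A missing dependency on the module path surfaces in a comparable way when the JVM starts, instead of as a NoClassDefFoundError much later.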

This feature is made possible by the new module-info files. Each Jigsaw module, which is basically a plain JAR file, contains a module descriptor at its root: module-info.class, compiled from a module-info.java source file. Among a few nifty details it contains

  • the module's name
  • the exported packages that will become visible to clients of the module
  • its dependencies and re-exported dependencies (Jigsaw uses the term requires transitive for that purpose)
  • the consumed and provided services, a clause specific to the ServiceLoader API.

module java.sql.rowset {
  // Required modules; they are re-exported to clients because they are declared transitive
  requires transitive java.logging;
  requires transitive java.naming;
  requires transitive java.sql;

  // Here we list the exported packages. Internal packages are omitted
  exports javax.sql.rowset;
  exports javax.sql.rowset.serial;
  exports javax.sql.rowset.spi;

  // A service that is used by this module. Instances are obtained from service providers via ServiceLoader
  uses javax.sql.rowset.RowSetFactory;
}

The example above, the declaration of the JDK module 'java.sql.rowset', illustrates some of the possible declarations in a module descriptor. The module exports three packages to the public, has three re-exported, transitive dependencies and uses a service lookup for the type RowSetFactory. The module nicely encapsulates all the types from the internal com.sun.rowset.* packages, so clients can no longer couple themselves to those types. But this is just a quick glance at the capabilities of module descriptors. The excellent JavaDoc provides more details on those.
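The same information is available at runtime. As a small sketch, the descriptor of java.sql.rowset can be read back via ModuleFinder.ofSystem(). The class DescriptorInspectionSketch and its helper methods are invented for this post, and the code assumes a runtime image that actually contains the module (a trimmed-down jlink image might not).

```java
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the JDK.
final class DescriptorInspectionSketch {

    /** Looks up a descriptor among the modules of the current runtime image. */
    static Optional<ModuleDescriptor> systemDescriptor(String moduleName) {
        return ModuleFinder.ofSystem().find(moduleName).map(ModuleReference::descriptor);
    }

    /** Packages exported to every module; qualified exports are skipped. */
    static Set<String> publicExports(ModuleDescriptor descriptor) {
        return descriptor.exports().stream()
                .filter(export -> !export.isQualified())
                .map(ModuleDescriptor.Exports::source)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    /** Names of the dependencies declared with 'requires transitive'. */
    static Set<String> transitiveRequires(ModuleDescriptor descriptor) {
        return descriptor.requires().stream()
                .filter(r -> r.modifiers().contains(ModuleDescriptor.Requires.Modifier.TRANSITIVE))
                .map(ModuleDescriptor.Requires::name)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        systemDescriptor("java.sql.rowset").ifPresentOrElse(
                descriptor -> {
                    System.out.println("exports:             " + publicExports(descriptor));
                    System.out.println("requires transitive: " + transitiveRequires(descriptor));
                    System.out.println("uses:                " + descriptor.uses());
                },
                () -> System.out.println("java.sql.rowset is not part of this runtime image"));
    }
}
```

Note that the internal com.sun.rowset.* packages do not show up among the exports. They are listed by descriptor.packages(), but nothing outside the module is allowed to compile against them.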

Since not every JAR on Maven Central contains a module descriptor, we have to talk about backwards compatibility, one of the biggest challenges when you develop a new abstraction like Jigsaw for a widely adopted platform. Compatibility is retained by means of two special module types.

  1. Automatic modules
    All JARs on the module path that do not contain a module descriptor yet will be converted to automatic modules. They fulfil the weakest possible module contract. That is, all packages in the JAR are marked as exported packages and all other modules on the module path are implicitly required by automatic modules. The module name itself is derived from the filename of the JAR. This heuristic approach understands a few common naming patterns.

  2. Unnamed modules
    To retain backwards compatibility with existing startup scripts as well, the classpath is still a valid way to configure the JVM. All libraries that are configured on the classpath, as opposed to the module path, end up in a so-called unnamed module (pun intended). Each class loader in the JVM is associated with one unnamed module. It also exports all its packages and has access to all named modules. This implies that named modules cannot define a dependency on an unnamed module, since it is just not possible to list it as a requirement in the module descriptor. By transitivity, a library cannot be released as a module as long as at least one of its dependencies is not yet available as a module.
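The name derivation for automatic modules can be observed without touching the command line. The sketch below writes a throwaway JAR without a module descriptor into a temp directory and asks ModuleFinder what it makes of it. The file name my-library-1.2.3.jar, the package com.example.lib and the class AutomaticModuleSketch are invented for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical demo class, not part of the JDK.
final class AutomaticModuleSketch {

    /** Writes a JAR without a module descriptor and returns the descriptor derived for it. */
    static ModuleDescriptor describePlainJar(String jarFileName, String classEntry) {
        try {
            Path dir = Files.createTempDirectory("automatic-module-sketch");
            Path jar = dir.resolve(jarFileName);
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jarOut = new JarOutputStream(out)) {
                // Only the entry name matters; the empty "class file" is never loaded.
                jarOut.putNextEntry(new JarEntry(classEntry));
                jarOut.closeEntry();
            }
            try {
                // Scan the JAR the way an entry on the module path would be scanned.
                return ModuleFinder.of(jar).findAll().iterator().next().descriptor();
            } finally {
                Files.delete(jar);
                Files.delete(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ModuleDescriptor descriptor =
                describePlainJar("my-library-1.2.3.jar", "com/example/lib/Placeholder.class");
        System.out.println("name:      " + descriptor.name());
        System.out.println("automatic: " + descriptor.isAutomatic());
        System.out.println("packages:  " + descriptor.packages());
    }
}
```

The version suffix is cut off and the remaining hyphen becomes a dot, so the JAR ends up as the automatic module my.library. The descriptor itself does not list any exports. The "export everything" behaviour is applied by the runtime once the automatic module is resolved.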

The concept of automatic and unnamed modules is a nice compromise between backwards compatibility and strictness. Here you can find a more elaborate explanation of the idea.
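Whether a class lives in a named or in an unnamed module can be checked with Class.getModule(). A minimal sketch, assuming the hypothetical class UnnamedModuleSketch is compiled and started from the classpath rather than from the module path:

```java
// Hypothetical demo class; run it from the classpath, not from the module path.
final class UnnamedModuleSketch {

    /** Returns the module name of a class, or a placeholder for an unnamed module. */
    static String moduleOf(Class<?> type) {
        Module module = type.getModule();
        return module.isNamed() ? module.getName() : "<unnamed>";
    }

    public static void main(String[] args) {
        // JDK classes live in named modules.
        System.out.println(moduleOf(String.class));

        // Classes from the classpath belong to the unnamed module of their class loader.
        Module own = UnnamedModuleSketch.class.getModule();
        ClassLoader loader = UnnamedModuleSketch.class.getClassLoader();
        System.out.println(moduleOf(UnnamedModuleSketch.class));
        System.out.println(own == loader.getUnnamedModule());

        // The unnamed module reads every named module, java.base included.
        System.out.println(own.canRead(String.class.getModule()));
    }
}
```

The opposite direction is the restricted one: a module descriptor has no way to name the unnamed module in a requires clause.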

What is Jigsaw not?

The short answer: A tool to handle multiple library versions on the module path. Versioning is always a pain point when it comes to the correct configuration of the classpath. The more libraries you use, the bigger the risk of conflicting transitive dependencies. One library depends on version X, the other library depends on version Y of the same open source project. Jigsaw will not help to mitigate the issue, at least not out of the box. There are means to implement a solution to this problem on top of modules, but they have to be built by yourself. According to Mark Reinhold, build tools are a better solution to the version selection problem.
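How little the module path cares about versions can be tried out with two throwaway JARs. In the following sketch, the file names lib-1.0.jar, lib-2.0.jar and other-2.0.jar as well as the class DuplicateVersionSketch are invented. Both lib JARs lack a module descriptor and therefore map to the same automatic module name.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.lang.module.FindException;
import java.lang.module.ModuleFinder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the JDK.
final class DuplicateVersionSketch {

    /** Puts plain JARs into one directory and scans it like a module path entry. */
    static String scan(String... jarFileNames) {
        try {
            Path dir = Files.createTempDirectory("duplicate-version-sketch");
            try {
                for (String name : jarFileNames) {
                    writePlainJar(dir.resolve(name));
                }
                return ModuleFinder.of(dir).findAll().stream()
                        .map(reference -> reference.descriptor().name())
                        .sorted()
                        .collect(Collectors.joining(", "));
            } catch (FindException e) {
                // A directory must not contain two modules with the same name.
                return "rejected: " + e.getMessage();
            } finally {
                for (String name : jarFileNames) {
                    Files.deleteIfExists(dir.resolve(name));
                }
                Files.delete(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a JAR that contains a single empty placeholder entry and no module descriptor. */
    private static void writePlainJar(Path jar) throws IOException {
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jarOut = new JarOutputStream(out)) {
            jarOut.putNextEntry(new JarEntry("com/example/lib/Placeholder.class"));
            jarOut.closeEntry();
        }
    }

    public static void main(String[] args) {
        System.out.println(scan("lib-1.0.jar", "other-2.0.jar"));
        System.out.println(scan("lib-1.0.jar", "lib-2.0.jar"));
    }
}
```

The first scan finds the two modules lib and other. The second one is rejected, because both files claim the module name lib. There is no selection logic that would pick the newer version; that decision is left to whoever assembles the module path.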

Exactly that decision probably caused the most grief in the Java community. Existing module systems like OSGi already provide great solutions to this problem, yet it remains unaddressed by what is about to become the standard module system in the Java ecosystem. Due to the unique properties of Jigsaw modules and OSGi modules, it is not even possible to bridge seamlessly between both worlds. Jigsaw modules strictly prohibit circular dependencies, whereas OSGi bundles and bundle fragments are exactly that: a special form of circular dependency. OSGi bundles describe their imported packages or required bundles including a minimum or maximum version number, whereas the module path in Jigsaw does not support version constraints at all. It will be interesting to see how this will evolve in the near future and which compromises will be the most sustainable.

As a side note, we can find some interesting scenarios where the defined semantics of modules have an impact on popular existing projects. We learned that a library can only be released as a module if all of its dependencies are available as modules. Being a proper citizen in the module ecosystem bubbles up from the bottom along the dependency hierarchy. If one of the bottommost dependencies cannot be modularized, none of the projects that depend on it can be converted to Jigsaw modules.

It’s a curiosity that even the JDK itself is affected by this. Consider SWT, the widget toolkit abstraction that is used at Eclipse. At its core, the architecture is based on OSGi bundle fragments to decouple the SWT-API from the bindings for a concrete implementation, let's say for Linux, Windows, or Mac. In other words, we have a circular dependency as the architectural paradigm to implement SWT bindings. Therefore, it appears to be unlikely that we will ever see SWT modules. The integration of JavaFX into SWT (javafx-swt.jar) is a part of JavaFX and of course depends on SWT. While JavaFX as a framework is pretty close to bottommost in a dependency hierarchy, a small part of it will thus never be released as a module.

What can we expect from Jigsaw?

If Jigsaw neither helps with multiple versions of the same library nor provides means to resolve version conflicts, what does it do for us in today's Java projects?

Looking into the crystal ball, I'd assume that we will see some interesting micro-architecture patterns and idioms emerge from the mere existence of the new module system. A proper modular architecture is way easier to enforce with Jigsaw than without it. Since the Java compiler and the runtime system now follow the same semantics, it's more natural to enforce proper encapsulation. A module that only exposes interfaces and no implementation classes makes it just impossible to refer to implementation details by accident. No downcasting, debugger-driven development (DDD) or reflective magic will help to gain access to the types that are not exported from the module.
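The reflection part is easy to try against the JDK itself. The sketch below assumes a JVM that was started without any --add-exports or --add-opens flags. It uses jdk.internal.misc.Unsafe as an example of a public class in a package that java.base does not export to application code, and ReflectionBarrierSketch is again an invented name.

```java
import java.lang.reflect.Method;

// Hypothetical demo class, not part of the JDK.
final class ReflectionBarrierSketch {

    /** Calls a public static no-arg method reflectively and reports what happened. */
    static String tryInvoke(String className, String methodName) {
        try {
            // Loading the class and looking up the method is still allowed ...
            Method method = Class.forName(className).getMethod(methodName);
            // ... but the call is checked against the exports of the declaring module.
            method.invoke(null);
            return "accessible";
        } catch (IllegalAccessException e) {
            return "blocked";
        } catch (ReflectiveOperationException e) {
            return "error: " + e;
        }
    }

    public static void main(String[] args) {
        // java.lang is exported by java.base.
        System.out.println(tryInvoke("java.lang.System", "lineSeparator"));
        // jdk.internal.misc is not exported to code on the classpath.
        System.out.println(tryInvoke("jdk.internal.misc.Unsafe", "getUnsafe"));
    }
}
```

Both methods are public and static, yet only the first call goes through. The second one ends in an IllegalAccessException, even though the class itself can be loaded and inspected.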

My take on this is that the strict form of encapsulation will lead to cleaner interfaces and well-defined abstractions. Library developers are now in a position to really hide implementation details, so it will be easier to evolve the libraries further. There cannot be an accidental reference to internals, since neither the compiler nor the JVM allows this anymore. This will certainly give the opportunity to move forward faster and more sustainably. Library developers, and especially the maintainers of the JDK itself, no longer expose all their internals. Eventually they are free to refactor code, get rid of technical debt or avoid introducing technical debt in the first place. In that sense, Java 9 modules are a nice next step towards a cleaner and more powerful, fast-paced JDK evolution. Default methods in interfaces have only been the beginning.

Whether this prediction will come true or whether it is only wishful thinking, the future will tell. What is your take on this? Is the future with Jigsaw bright? And how do you plan your migration to Java 9? Please let us know on Twitter or in the comments section.

About Sebastian Zarnekow

Sebastian Zarnekow works as a consultant and software architect in Berlin. He has been a committer at Eclipse for 9 years. As a co-architect of the Xtext Framework and the Xtend programming language, he specializes in language design, implementation and IDE development, but is also interested in the latest trends and technologies. Sebastian regularly speaks at international conferences.