A Bird’s-Eye View of Language Servers

To write computer programs, different programming languages are used, often several for a single piece of software. While programs can be written with a very basic text editor like Notepad or vi, developers normally use dedicated programming tools that offer a more integrated and guided way of editing code. IDEs (Integrated Development Environments) usually support several languages at once, or can be extended to support more. Some IDEs are heavyweight desktop applications (e.g. Eclipse IDE, Visual Studio, IntelliJ IDEA), others are more lightweight (e.g. Atom, Sublime Text, Visual Studio Code) or even web-based (Eclipse Che, Cloud9).

All these tools share a common problem: each programming language they support usually needs quite sophisticated integration into the tool to enable rich editing features like syntax coloring, syntactic and semantic code analysis, error reporting, code proposals and many more. For each supported language, for each supported version of these languages, for each supported feature, for each tool! In the past it was sufficient for these tools to specialize in a small subset of languages, e.g. dedicated tools for programming Java or C, or in a small subset of features like syntax coloring. Nowadays they are expected to provide rich support for many languages and features.

The increasing speed of language evolution and the creation of new languages make this even worse. For most languages tool integration is handcrafted, and tool providers struggle to support a broader set of languages. For domain-specific languages, language development frameworks like Xtext are often used. They address this problem through a combination of generic framework services and code generation: the necessary glue code for all supported editing features is produced automatically. However, these frameworks support only a subset of potential IDEs.

Language Servers to the rescue!

In 2015 Microsoft introduced Visual Studio Code (short: VS Code), a modern, open-source, lightweight IDE based on web technology. The core team is led by Erich Gamma, who had already been one of the masterminds behind the Eclipse IDE and knew how to build scalable and extensible tool platforms. And the team faced the same issue again: how to integrate support for multiple languages?

But then they had a great idea: instead of just an API implemented in TypeScript, the main language of VS Code, they defined a technology-agnostic protocol based on JSON-RPC. And they open-sourced the protocol definition. This protocol is known as the Language Server Protocol (LSP).
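
To make “a protocol based on JSON-RPC” a little more concrete: on the wire, an LSP message is a JSON-RPC 2.0 payload preceded by a Content-Length header and a blank line. The following Python sketch shows how such a frame might be built and parsed. The helper names encode_message and decode_message are made up for this illustration and do not belong to any library; the parser assumes the complete frame is already in memory.

```python
import json
from typing import Tuple


def encode_message(payload: dict) -> bytes:
    """Wrap a JSON-RPC payload in an LSP-style frame (header + body)."""
    body = json.dumps(payload).encode("utf-8")
    # Content-Length counts the bytes of the body, not its characters.
    header = "Content-Length: {}\r\n\r\n".format(len(body)).encode("ascii")
    return header + body


def decode_message(data: bytes) -> Tuple[dict, bytes]:
    """Parse one frame; return the payload and any bytes left over."""
    header, separator, rest = data.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("header is incomplete")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("Content-Length header is missing")
    if len(rest) < length:
        raise ValueError("body is incomplete")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


# An "initialize" request, the first message a client sends to a server.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {"processId": None, "rootUri": None, "capabilities": {}},
}
frame = encode_message(request)
decoded, leftover = decode_message(frame)
```

Because the frame is nothing more than text and a byte count, either side can be written in any language that can read and write JSON.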

The protocol itself defines messages and structures that are exchanged between an editor (the client) and a “language smartness provider” (the language server). The information exchanged between the two parties covers a common subset of the interactions that code editors and language tooling typically need: opening, saving and closing files, editing text, validating code, reporting errors or log messages, searching for symbol references, following references and many more.
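
As a rough illustration of such an exchange, the sketch below plays the server’s part for one interaction: the client announces an opened file with a textDocument/didOpen notification, and the server answers with a textDocument/publishDiagnostics notification. Both method names and the message shapes come from the protocol; the function handle_notification and the “flag every TODO” rule are invented purely for this example.

```python
from typing import Optional

WARNING = 2  # LSP DiagnosticSeverity: 1 = Error, 2 = Warning, 3 = Information, 4 = Hint


def handle_notification(message: dict) -> Optional[dict]:
    """Toy server logic: answer didOpen with diagnostics, ignore the rest."""
    if message.get("method") != "textDocument/didOpen":
        return None
    document = message["params"]["textDocument"]
    diagnostics = []
    # Hypothetical validation rule: warn about every line containing "TODO".
    for line_number, line in enumerate(document["text"].splitlines()):
        column = line.find("TODO")
        if column >= 0:
            diagnostics.append({
                "range": {
                    "start": {"line": line_number, "character": column},
                    "end": {"line": line_number, "character": column + 4},
                },
                "severity": WARNING,
                "message": "Unresolved TODO",
            })
    return {
        "jsonrpc": "2.0",
        "method": "textDocument/publishDiagnostics",
        "params": {"uri": document["uri"], "diagnostics": diagnostics},
    }


# What a client might send when the user opens a file (example values).
did_open = {
    "jsonrpc": "2.0",
    "method": "textDocument/didOpen",
    "params": {
        "textDocument": {
            "uri": "file:///tmp/example.py",
            "languageId": "python",
            "version": 1,
            "text": "x = 1\n# TODO fix\n",
        }
    },
}
reply = handle_notification(did_open)
```

The editor only has to know how to render diagnostics; what counts as a problem is decided entirely by the server.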

VS Code was of course the first programming tool that used this protocol, for its own integration of TypeScript, C# and other languages, but the software development community quickly discovered the value of such a protocol for both sides: tool developers could integrate any language that offers the server side of the protocol, and language developers could concentrate on designing and implementing their language without caring much about its integration into particular tools. And due to the technology-agnostic nature of the protocol, both sides could be implemented in any language of their choice without dealing with all the nastiness of integrating programs written in different languages.

A word on the term “server”

When people read “server” they often think of a server process running on a remote machine, maybe even a cloud-based service. The term “language server” can therefore lead to the misunderstanding that a language server could run on any machine and clients just need to connect and talk to it through remote communication. This is usually not the case.

A language server usually runs on the same machine as the client, often but not necessarily in a separate (child) process. Besides the performance that developers expect when editing code, the main reason is that client and server have to share the same file system and work on the same files. When client and server run on different machines, they need to work on a distributed file system. Some solutions follow this approach, but for most implementations today, just think of a language server as running on the same machine and thus working on the same files as the client.
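
The child-process setup can be sketched in a few lines of Python. In this example the “server” is only a stand-in: a second Python interpreter running the small TOY_SERVER script, which reads one framed request from its standard input and writes one framed reply to its standard output. A real client would instead start the actual language server executable and keep the process alive for the whole editing session; the names TOY_SERVER and ask_server are assumptions made for this illustration.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a real language server: reads one framed
# JSON-RPC request from stdin and writes one framed response to stdout.
TOY_SERVER = r'''
import json, sys
header = b""
while not header.endswith(b"\r\n\r\n"):
    chunk = sys.stdin.buffer.read(1)
    if not chunk:
        sys.exit(1)
    header += chunk
length = int(header.split(b":")[1].strip())
request = json.loads(sys.stdin.buffer.read(length))
reply = json.dumps({
    "jsonrpc": "2.0",
    "id": request["id"],
    "result": {"capabilities": {}},
}).encode("utf-8")
sys.stdout.buffer.write(b"Content-Length: %d\r\n\r\n" % len(reply) + reply)
sys.stdout.buffer.flush()
'''


def ask_server(request: dict) -> dict:
    """Start the toy server as a child process and exchange one message."""
    body = json.dumps(request).encode("utf-8")
    frame = b"Content-Length: %d\r\n\r\n" % len(body) + body
    process = subprocess.Popen(
        [sys.executable, "-c", TOY_SERVER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # communicate() sends the frame, closes stdin and waits for the child.
    output, _ = process.communicate(frame, timeout=10)
    _header, _separator, payload = output.partition(b"\r\n\r\n")
    return json.loads(payload)


response = ask_server(
    {"jsonrpc": "2.0", "id": 7, "method": "initialize", "params": {}}
)
```

Since both processes run on the same machine, a file URI sent by the client refers to a file the server can open directly, which is exactly the shared file system assumption described above.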

The separation of the language server process from the editing tool has another great advantage: the tool can operate with far fewer resources and stay responsive while still supporting multiple languages, and the language server’s resources can be scaled independently. Users get a fast editing experience while all the heavy lifting is performed asynchronously in a background process.

Write once – run anywhere!

Although this was not the primary intention of Microsoft’s team initially, the open and modern approach of the protocol convinced many tool and language developers to work on and support this common protocol. It has since evolved into a de facto standard. Java in VS Code, C# in Eclipse, TypeScript in Atom – even domain-specific languages (DSLs) can be supported in any editor now. All it takes is to implement a single protocol! And a bit of tool-specific glue code to integrate a language server into specific clients.

Even if this doesn’t sound that difficult, there is still a lot for providers to do: IDE providers need to generalize their editing facilities and offer pluggable extension points for their tools. While VS Code, as the first provider, naturally supports all features, others still have to implement them.

Eclipse as a major player has various initiatives for supporting the LSP: since Eclipse is implemented in Java, a Java implementation of the protocol is required. The LSP4J project provides a framework that can be used for both Java-based client and server implementations. LSP4E offers a common API to integrate language servers through the Eclipse plug-in mechanism, and the new Generic Editor interacts with language servers connected to the IDE. On the language provider side, the Xtext framework implements the protocol’s server side for textual domain-specific languages, opening up the vast number of Xtext-based languages to completely new editing environments.

Other widely used editing environments like Emacs, Eclipse Che and Sublime Text are working on or already have support for the protocol. However, IntelliJ IDEA, one of the most used IDEs, currently has no plans to support the protocol. A feature request was opened, but so far it does not seem that IntelliJ will work on it in the near future. While IntelliJ did a great job of attracting developers to its IDEs, making them even more popular than Eclipse for Java and web development, it bears the risk of missing the express train that all the other cool guys are already on and partying. With the increasing speed of language evolution, they will have a hard time keeping up and will miss out on support for new languages.

This is a chance for a renaissance of Eclipse, and the Eclipse community is doing a great job of modernizing and stabilizing the core IDE. Other IDEs have the chance to get a bigger piece of the pizza, and developers are free to choose whatever tool and language they favor without tool lock-in.

About Karsten Thoms

Karsten is a software architect at itemis and part of the Xtext team. He strongly believes that Model Driven Software Development helps teams to be more efficient in mission-critical projects.