An Introduction to Modeling and Language Engineering – Part 2

by Dirk Leopold Jan 23, 2023


Now that the basics of modeling have been covered in Part 1, we can look at something else that itemis does regularly to solve tricky problems: Language Engineering.

About Models and Languages

As we have seen, modeling always implies the presence of an underlying metamodel. In the LEGO® world, the metamodel of any set is defined by the types, shapes and colors of the parts included in the box and by the principles of how those parts can be connected. The parts and the connection mechanisms can be considered the "language" of any LEGO® set.
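To make the idea a bit more tangible, here is a minimal sketch of such a metamodel in Python. The part types, the connection rules and the names `Part`, `can_connect` and `validate` are all invented for illustration; a real metamodel would be far richer.

```python
from dataclasses import dataclass

# Hypothetical metamodel: which part types exist and which types may connect.
PART_TYPES = {"brick_2x4", "plate_2x4", "wheel", "axle"}
ALLOWED_CONNECTIONS = {
    ("brick_2x4", "brick_2x4"),
    ("brick_2x4", "plate_2x4"),
    ("plate_2x4", "plate_2x4"),
    ("axle", "wheel"),
}


@dataclass(frozen=True)
class Part:
    name: str
    part_type: str
    color: str


def can_connect(a: Part, b: Part) -> bool:
    """Connections are symmetric: a pair is valid if either order is allowed."""
    pair = (a.part_type, b.part_type)
    return pair in ALLOWED_CONNECTIONS or pair[::-1] in ALLOWED_CONNECTIONS


def validate(parts, connections):
    """Return the metamodel violations of a model; an empty list means it conforms."""
    errors = []
    by_name = {p.name: p for p in parts}
    for p in parts:
        if p.part_type not in PART_TYPES:
            errors.append(f"unknown part type: {p.part_type}")
    for left, right in connections:
        a, b = by_name[left], by_name[right]
        if not can_connect(a, b):
            errors.append(f"cannot connect {a.part_type} to {b.part_type}")
    return errors
```

The model (a concrete list of parts and connections) is only valid if it uses what the metamodel offers. An axle and a wheel fit together here, a brick and a wheel do not.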

In software development, models are often created based on text-based programming languages. In this case, the metamodel or "language" defines the syntax, the grammar and the functions that are available during modeling.

Programming languages differ not only in their syntax and grammar, they also differ at the level of underlying features and concepts.

C is often used on specialized hardware such as microcontrollers. These controllers have limited resources and are often used in vehicles, industrial equipment, or home automation. The concepts and structure of C stay close to the underlying hardware and are optimized for resource efficiency.

Java, on the other hand, is a higher-level, object-oriented programming language. It typically does not run on resource-constrained microcontrollers, but on servers with many resources. Because Java programs are often larger than software written in C, Java incorporates higher-level concepts that help structure and reuse software functions, such as classes with the ability to inherit capabilities.

SQL - Structured Query Language - is another type of programming language. SQL was developed for a specific purpose: handling information stored in relational databases. Therefore, the syntax and the functions provided are tailored to the domain of relational databases.

Although they differ in scope and structure, C and Java both fall into the category of general-purpose languages. They are not designed to solve a specific type of problem, but can be used in many domains. Languages such as C++, C#, or Python fall into the same category.

SQL, on the other hand, is tailored to one set of tasks in a domain: querying and manipulating relational databases. It therefore falls into the category of domain-specific languages or DSLs. Other DSLs include HTML or MATLAB.
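The difference is easiest to see side by side. The following sketch answers the same question twice: once with SQL (run through Python's built-in `sqlite3` module) and once with plain general-purpose code. The table layout, the sample rows and the function names are assumptions made up for this illustration.

```python
import sqlite3

# Hypothetical sample data: (name, color, quantity) of the parts in a box.
ROWS = [("brick", "red", 12), ("plate", "blue", 4), ("brick", "blue", 7)]


def total_bricks_sql(rows):
    """DSL version: state WHAT is wanted; the database decides how to compute it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE parts (name TEXT, color TEXT, quantity INTEGER)")
        conn.executemany("INSERT INTO parts VALUES (?, ?, ?)", rows)
        (total,) = conn.execute(
            "SELECT COALESCE(SUM(quantity), 0) FROM parts WHERE name = 'brick'"
        ).fetchone()
        return total
    finally:
        conn.close()


def total_bricks_python(rows):
    """General-purpose version: spell out HOW to iterate, filter and add."""
    total = 0
    for name, _color, quantity in rows:
        if name == "brick":
            total += quantity
    return total
```

Both functions return the same number. The SQL version uses the vocabulary of its domain (tables, columns, filters, aggregates), while the Python version has to build that vocabulary out of loops and conditions.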

What is the equivalent of general-purpose languages in the world of LEGO®? Well, when I grew up there were only 5 or 6 LEGO® colors and the shapes of the blocks were mostly rectangular. LEGO® figures looked very generic. Building a spaceship required some creativity and some compromise.

Today, a variety of domain-specific LEGO® "DSLs" are available. Pirates with their ships, police officers with their dogs - and even our Millennium Falcon.

(Picture by Kim Do-hyun: https://www.flickr.com/photos/stickkim/6966161674)

Note the custom parts used in the spaceship, as well as the custom figures and their accessories. For a very specific task - building a Millennium Falcon - custom parts like antennas or cockpit windows make the modeling process much more efficient and the results more impressive.

Building a LEGO® Millennium Falcon in 1976 would have been a much more challenging task. I wonder if anyone would have recognized Chewbacca wearing a red top and blue pants ...

So who gets to enjoy the benefits of domain-specific languages - whether in computer languages, systems engineering, or LEGO®? Well, not everyone.

In the past, there had to be a significant number of potential users (= market size) among programmers, engineers or players to justify the development and maintenance of a DSL. Consequently, DSLs often retained a rather broad and technical character - the common denominator of the targeted expert group.

So if you're one of those specialists who has to write queries for relational databases (SQL), or if you're a Star Wars fan, you're in luck: your DSL already exists.

However, if your area of expertise or interest is "niche", you'll have to settle for general-purpose languages and go back to using square and rectangular LEGO® bricks.

Takeaway points:

The modeling language and the metamodel are closely related. They define the functions and concepts that can be expressed by the model.
Domain-specific languages (DSLs) provide customized notations, syntax, and metamodels to efficiently express functions and concepts for specific domains.
In the past, creating DSLs for computer languages or systems was cumbersome and expensive, and therefore limited to larger user groups and/or broad, general domains.

Language Workbenches

Now that the benefits of DSLs are known, the question is how to give more experts and specialists in their fields access to more efficient modeling methods.

Fortunately, we have finally arrived in the age of 3D printing!

Licensing and copyright issues aside, a 3D printer combined with the right software would allow LEGO® players to define and print any LEGO® parts of their choice. Unicorns instead of horses: Just print them out! Star Trek instead of Star Wars figures: Just print!

The new parts can now be constructed exactly to the player's needs.

Fortunately, there has also been considerable progress in recent years with respect to the creation of domain-specific languages in modeling and software engineering.

Martin Fowler's concept of Language Workbenches (LWB) is already more than a decade old. His visions and ideas have now become reality.

Language workbenches are software tools developed to efficiently create new programming languages including their syntax, grammar and underlying concepts (= metamodels).

Language workbenches are the "3D printers of software development". They enable the efficient creation of DSLs - even for very small groups of experts with very specific modeling requirements. The user group for DSLs can now be as small as a team in a department.

A number of language workbenches are available today. One category focuses on a textual modeling approach. A popular example of a textual language workbench is Xtext.
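To get a feeling for what a textual DSL consists of, here is a deliberately tiny, hand-written sketch in Python. It is not how Xtext works (a textual workbench derives the parser and editor from a grammar definition instead of requiring hand-written code). The two statements `part` and `connect`, and the function name `parse`, are invented for this example.

```python
import re

# Hypothetical concrete syntax of a tiny textual DSL:
#   part <name> : <type>
#   connect <name> -> <name>
PART_RE = re.compile(r"part\s+(\w+)\s*:\s*(\w+)")
CONNECT_RE = re.compile(r"connect\s+(\w+)\s*->\s*(\w+)")


def parse(text):
    """Turn DSL text into a model: a dict of parts and a list of connections."""
    parts, connections = {}, []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if m := PART_RE.fullmatch(line):
            parts[m.group(1)] = m.group(2)
        elif m := CONNECT_RE.fullmatch(line):
            connections.append((m.group(1), m.group(2)))
        else:
            raise SyntaxError(f"line {number}: cannot parse {line!r}")
    # Semantic check: connections may only reference declared parts.
    for a, b in connections:
        for name in (a, b):
            if name not in parts:
                raise NameError(f"undeclared part: {name}")
    return parts, connections
```

Even this toy shows the three ingredients mentioned above: a syntax (the two statement forms), a grammar (which lines are accepted), and underlying concepts (parts and connections). Writing and maintaining all of this by hand, plus an editor with completion and error markers, is exactly the effort a language workbench is meant to take off your shoulders.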

Projectional Language Workbenches, on the other hand, are not limited to one type of notation. Instead, they can flexibly project the content of models into any type of representation according to the user's needs and preferences. JetBrains' Meta Programming System (MPS) is the most popular projectional language workbench today.

Projectional LWBs store model data in a tree, the so-called Abstract Syntax Tree (AST). Modeling information is stored in the elements of the AST and can then be projected onto any notation: a text, a table, a mathematical formula, or any form of graphical representation. As part of the projection process, the system can highlight certain aspects of the underlying model and hide other aspects. The user can flexibly select the abstractions relevant to the task at hand - "abstractions on demand".
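The principle can be sketched in a few lines of Python. This is not MPS code, just an illustration under simple assumptions: a hypothetical two-node AST (`Num` and `BinOp`) for arithmetic expressions and two hand-written projections, one onto plain text and one onto a LaTeX formula.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str        # for example "+" or "/"
    left: object
    right: object


def project_text(node):
    """Projection 1: plain infix text."""
    if isinstance(node, Num):
        return str(node.value)
    return f"({project_text(node.left)} {node.op} {project_text(node.right)})"


def project_latex(node):
    """Projection 2: a LaTeX formula; divisions are rendered as fractions."""
    if isinstance(node, Num):
        return str(node.value)
    left, right = project_latex(node.left), project_latex(node.right)
    if node.op == "/":
        return f"\\frac{{{left}}}{{{right}}}"
    return f"({left} {node.op} {right})"


# One model, stored once as a tree ...
tree = BinOp("/", BinOp("+", Num(1), Num(2)), Num(3))
# ... and shown in two notations.
print(project_text(tree))
print(project_latex(tree))
```

There is only one model here. If an element of the tree changes, every projection reflects the change, because the notations are computed from the tree rather than stored next to it.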

Let's take a quick look at a real-world example.

Security experts analyze and evaluate the security properties of technical systems. This can be a vehicle or an IoT device. To do this, they first need to understand the basic structure and main functions of the system. Here, they work closely with the engineering team, which is familiar with the system's components, interfaces and data. Based on this structure, security objectives, attack vectors, damage potential and propagation paths must be modeled iteratively. Identified risks are then mitigated, for example, by adding encryption and the necessary keys. This in turn affects the original architecture and functions of the system.

While it is not important to fully understand every detail of the specific workflow above, hopefully the modeling challenges become clear. We need to define a model that designers, architects, and safety engineers can work on simultaneously while focusing on different aspects of the model. And different tasks within the process benefit from different notations: graphs, tables, text, diagrams, etc.

The figure below shows what a DSL might look like in a custom editor created with a projectional language workbench. Please note that these are different projections of ONE model, emphasizing different abstractions for different steps in the analysis phase, including a specific DSL for the security domain.

Takeaway points:

Language Workbenches are tools for efficiently creating domain-specific languages. They are the "3D printers of software development".
Projectional Language Workbenches offer advanced modeling options as they can provide different views and notations for the same model.

So what is the job of a Language Engineer?

Language engineers solve challenging system or software modeling problems by creating domain-specific languages and models.

Often, these problems are the result of increased complexity and the inability to manage that complexity with established software tools and methods.

The reasons for increased complexity vary: a growing number of features and product variants, more stringent regulatory documentation requirements, or the need to analyze the dependencies of cross-cutting issues such as cost, performance, safety and security along the entire product development process.

Often, weak metamodels (e.g. requirements written in plain text) and a lack of tool support that helps experts "get their models right" are what drive the need for better methods and tools.

Language engineers are usually part of a multidisciplinary team that includes team members and subject matter experts from the client side. This invariably means that language engineers gain deep insights into many different areas: from automotive to industrial automation, medical, insurance or telecommunications.

Language engineers also need to learn and understand how the work results of different user groups on the customer side depend on each other. Together with domain experts, language engineers then help find the right abstractions, including syntax, grammar, and functions, to model all relevant aspects of a domain. Language engineers are also typically involved in designing and building the tools that enable domain experts to create and work with the models.

The DSLs and the tool created for the purpose of the security analysis above are a typical example of language engineers' work. Security methodology experts, potential users from various customers, and itemis' language engineers defined the model, including relevant abstractions, appropriate notations, and projections. itemis then built the editor, in this case based on the projectional LWB MPS from JetBrains, to model, analyze, and document security analysis projects.

If DSLs and language development sound like viable business approaches or exciting personal challenges for 2018 to you, don't hesitate to contact us to learn more.

But for the new year, don't forget to balance out your stressful daily life and play more! Maybe with LEGO®.
