Xtext and Controlled Natural Languages for Software Requirements (Part 1)

Stakeholders usually document requirements informally, i.e. in natural language, often using text processing programs that provide no requirements-specific input assistance and allow no automated validation or post-processing. This leads either to higher effort for cost-intensive and time-consuming human review processes or to reduced quality, which can have a negative impact on subsequent development phases. Various approaches exist to compensate for these disadvantages of natural language in requirements documentation. One of them is to control the use of natural language with templates so that requirements are acceptable as they are written. This series shows how to create a controlled natural language based on sentence templates (we call them 'boilerplates') using Xtext.


The language will allow free text in combination with references to model elements at specific positions in the boilerplates. In comparison to text processing programs, the language will support the user through the usability functions of Xtext and will ensure that the requirements match the boilerplates used.

Part 1 of this series is about the formalization of informal natural language and the realization of the boilerplates in the Xtext grammar. Besides the grammar, we will see how to use strings as cross-references in a convenient and readable way.

Part 2 deals with the validation of the natural language requirements using Natural Language Processing (NLP). This part shows how to include external libraries in an Xtext project and how to use the validation API.

Part 3 will show how to synchronize domain-specific concepts used in specific free text parts of the boilerplates with a glossary using NLP techniques. The focus of this part lies on the QuickFix API of Xtext.

Part 1: Requirement boilerplates

Requirement boilerplates aim to increase the quality of textual requirements by defining a sentence template with placeholders for the specific words or phrases that define the particular requirement. There is a wide range of boilerplates used for requirements documentation, for example User Stories in Agile Software Development:

As a <role>, I want <goal/desire> so that <benefit>

Such boilerplates allow requirements to be documented in a standardized way without knowledge of a specific requirements language and can therefore be used by domain experts as well as requirements engineers. Furthermore, they reduce the risk of ambiguous, inconsistent or vague requirements arising from the use of natural language.
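To make the idea of a sentence template concrete, the following minimal Java sketch checks a sentence against the user story boilerplate above and extracts the three placeholders. It uses a plain regular expression rather than Xtext, and the names UserStoryBoilerplate, UserStory and parse are made up for this illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UserStoryBoilerplate {

    // Fixed words of the template are literals; each placeholder is a named group.
    // "an?" accepts both "a" and "an" in front of the role (an assumption of this sketch).
    private static final Pattern TEMPLATE = Pattern.compile(
        "As an? (?<role>.+?), I want (?<goal>.+?) so that (?<benefit>.+?)\\.?");

    /** Holds the three placeholder values of a matched user story. */
    static final class UserStory {
        final String role;
        final String goal;
        final String benefit;

        UserStory(String role, String goal, String benefit) {
            this.role = role;
            this.goal = goal;
            this.benefit = benefit;
        }

        @Override
        public String toString() {
            return "UserStory[role=" + role + ", goal=" + goal + ", benefit=" + benefit + "]";
        }
    }

    /** Returns the extracted parts, or an empty Optional if the sentence does not fit the template. */
    static Optional<UserStory> parse(String sentence) {
        Matcher m = TEMPLATE.matcher(sentence.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new UserStory(m.group("role"), m.group("goal"), m.group("benefit")));
    }

    public static void main(String[] args) {
        System.out.println(parse("As an analyst, I want to print charts so that I can share them."));
        System.out.println(parse("Printing should be possible."));
    }
}
```

A sentence that does not follow the template is rejected as a whole, which is the kind of control a boilerplate is meant to provide.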

The boilerplate used in this series is similar to the one defined by the International Requirements Engineering Board (IREB).


Figure 1: The used boilerplate.

The following sentences are examples of requirements based on this template. Keywords are bold and references to AST elements are enclosed in quotes.

  • The "printing module" shall "print" "charts" and "documents".
  • If the "analyst" created the "chart" or "document" the "printing module" shall provide the "analyst" with the ability to "print" this "chart" or "document".
  • If the "charts" were created by someone else the "printing module" shall hide the "print button".
  • The "printing module" will provide the "client" with the ability to "print" "documents".
  • If the licence is basic the "printing module" will provide the "client" with the ability to "print" "greyscale documents".

To realize such a language, we have to define an Xtext grammar that looks as close to natural language as possible.


The grammar consists of referable entities and glossary concepts, two types of boilerplates and the rules needed to use free text. It defines the entities Actor and System, which can be referenced in boilerplates by their name.

Actor:
    'Actor' ':' name=Text
    description=Description;

System:
    'System' ':' name=Text
    description=Description;

Description:
    'Description' ':' text+=SentenceWithReferences*;

The type Text is used for the name of an entity, which can consist of multiple words and is not limited by the Java ID conventions. Keywords that should be usable in Text must be added explicitly.

Text: ( 'To' | 'to' | 'A' | 'a' | 'the' | 'The' | WORD | ANY_OTHER)+; 

Here we have to be careful regarding the limitations of the parser. Since the rule Text has no distinct terminator, only keywords that are not used as terminators for Text can be added. To add further information to an entity, the user can create a description for it using the type SentenceWithReferences. Such a sentence consists of a textWithReferences and a punctuation mark.

SentenceWithReferences: textWithReferences=TextWithReferences punctuation=('.' | '!' | '?');

TextWithReferences allows the combined use of references to entities and plain text. In order to recover the plain text representation of TextWithReferences, the rule is defined as follows.

TextWithReferences:
   (onlyRefs+=[Entities|STRING]+ |
   refBefore+=[Entities|STRING]* text+=Text

ReferenceCombination: (refs+=[Entities|STRING]+ text+=Text);
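The interplay of quoted references and free text can also be illustrated outside of Xtext. The following minimal Java sketch splits a sentence into quoted references and plain-text segments and then rebuilds the plain text by dropping the quotes. It only approximates what the grammar rules express; the class ReferenceSplitter and its methods are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReferenceSplitter {

    /** One piece of a sentence: either a quoted reference or a run of free text. */
    static final class Segment {
        final boolean reference;
        final String value;

        Segment(boolean reference, String value) {
            this.reference = reference;
            this.value = value;
        }
    }

    // Either a double-quoted string (group 1 = content) or a run of non-quote characters (group 2).
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|([^\"]+)");

    /** Splits the text into reference and free-text segments; whitespace-only text is skipped. */
    static List<Segment> split(String text) {
        List<Segment> segments = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            if (m.group(1) != null) {
                segments.add(new Segment(true, m.group(1)));
            } else if (!m.group(2).trim().isEmpty()) {
                segments.add(new Segment(false, m.group(2).trim()));
            }
        }
        return segments;
    }

    /** Rebuilds the plain text: references lose their quotes, segments are joined by single spaces. */
    static String plainText(List<Segment> segments) {
        StringBuilder sb = new StringBuilder();
        for (Segment s : segments) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(s.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Segment> parts = split("The \"printing module\" shall \"print\" \"charts\" and \"documents\"");
        for (Segment s : parts) {
            System.out.println((s.reference ? "ref:  " : "text: ") + s.value);
        }
        System.out.println(plainText(parts));
    }
}
```

Keeping references and text in separate, ordered segments is what makes it possible to print the sentence back as ordinary prose.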

References to entities have the type STRING. This allows referencing entities whose names consist of multiple words, for example the System "printing module".

The core syntax elements of the language are the Requirement rules. The following rules are the realization of the boilerplate shown in Figure 1.

Requirement:
    ConditionalRequirement | UnconditionalRequirement;

ConditionalRequirement:
    condition=Precondition system=[System|STRING] liability=Liability end=RequirementEnd;

UnconditionalRequirement:
    the='The' system=[System|STRING] liability=Liability end=RequirementEnd;

Precondition:
    conditional=Conditional condition=TextWithReferences;

enum Liability:
    shall | should | will;

ActorInteraction:
    provide='provide' the1='the'? actor=[Actor|STRING] ^with='with' the2='the' ability='ability' to='to';
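The structure encoded by these rules can be mimicked with a small stand-alone check. The following Java sketch is not generated by Xtext and does not use its API; it approximates the boilerplate with a regular expression to show which parts of a sentence the rules distinguish. It assumes that conditions start with "If", and all names in it (RequirementShape, describe) are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RequirementShape {

    // Approximates the grammar: an optional precondition (everything before the quoted
    // system name), the quoted system, a liability keyword, an optional actor interaction
    // and the remaining requirement end up to the full stop.
    private static final Pattern REQUIREMENT = Pattern.compile(
        "(?:(?<condition>If .+?)|The) \"(?<system>[^\"]+)\" (?<liability>shall|should|will) "
        + "(?:provide (?:the )?\"(?<actor>[^\"]+)\" with the ability to )?(?<end>.+)\\.");

    /** Returns a short description of the matched parts, or "no match". */
    static String describe(String sentence) {
        Matcher m = REQUIREMENT.matcher(sentence);
        if (!m.matches()) {
            return "no match";
        }
        String kind = m.group("condition") != null ? "conditional" : "unconditional";
        String actor = m.group("actor") != null ? m.group("actor") : "-";
        return kind + " | system=" + m.group("system")
            + " | liability=" + m.group("liability")
            + " | actor=" + actor
            + " | end=" + m.group("end");
    }

    public static void main(String[] args) {
        System.out.println(describe(
            "The \"printing module\" shall \"print\" \"charts\" and \"documents\"."));
        System.out.println(describe(
            "If the licence is basic the \"printing module\" will provide the \"client\" "
            + "with the ability to \"print\" \"greyscale documents\"."));
        System.out.println(describe("The \"printing module\" may \"print\"."));
    }
}
```

Unlike this sketch, the grammar also resolves the quoted names against declared System and Actor elements, so a reference to an undeclared entity is reported as an error.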

Since an Independent System Activity only consists of a process phrase, we don't need to create a separate rule for it. Instead, this information is captured in the attribute objectWithDetails of the type TextWithConceptsOrSynonyms in the rule RequirementEnd.

RequirementEnd:
    ai=ActorInteraction? objectWithDetails=TextWithConceptsOrSynonyms '.';

TextWithConceptsOrSynonyms has the same structure as TextWithReferences, but only glossary concepts and their synonyms can be referenced. Such a glossary concept can be a Function representing a "process" or a DomainObject representing an "object" in Figure 1.

Glossary:
    {Glossary} 'Glossary' concepts+=(Concept)*;

Concept:
    Function | DomainObject;

Function:
    'Function' ':' name=Text
    ('Synonyms' ':' synonyms+=FunctionSynonym (',' synonyms+=FunctionSynonym)*)?
    ('Description' ':' description+=SentenceWithReferences*)?;

DomainObject:
    'Object' ':' name=Text
    ('Synonyms' ':' synonyms+=DomainObjectSynonym (',' synonyms+=DomainObjectSynonym)*)?
    ('Description' ':' description+=SentenceWithReferences+)?
    ('Properties' ':' properties+=Property (',' properties+=Property)*)?;



Synonyms are needed to ensure consistent usage and description of terms and to keep the requirements readable when it comes to inflections or singular and plural nouns. They are nested in Concepts. This leads us to the next topic: referencing such nested types.

Cross-References using simple names

By default, Xtext uses qualified names for cross-referencing nested types. For our language, this means that if you want to reference the synonym of a Concept, you have to use dot notation. The following example demonstrates this for the object document and its synonym documents.

The "printing module" will provide the "client" with the ability to "print" "document.documents".

    Object: document
    Synonyms: documents   

The use of such qualified names would decrease readability. To avoid this, we add the SimpleNamesFragment2 to the language section of the .mwe2 generator workflow file in the runtime project.

language = StandardLanguage {
    fragment = exporting.SimpleNamesFragment2 {}
}
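The effect of switching from qualified to simple names can be pictured with a plain lookup table. The following Java sketch is not Xtext code; it only models an index in which a nested synonym is exported either under its qualified name or under its simple name. The class NameIndex and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class NameIndex {

    private final boolean simpleNames;
    // Maps an exported name to the concept it belongs to.
    private final Map<String, String> exported = new HashMap<>();

    NameIndex(boolean simpleNames) {
        this.simpleNames = simpleNames;
    }

    /** Registers a concept together with its nested synonyms. */
    void addConcept(String concept, String... synonyms) {
        exported.put(concept, concept);
        for (String synonym : synonyms) {
            // Qualified: "document.documents"; simple: "documents".
            String name = simpleNames ? synonym : concept + "." + synonym;
            exported.put(name, concept);
        }
    }

    /** Returns the owning concept for a reference text, or null if it cannot be resolved. */
    String resolve(String reference) {
        return exported.get(reference);
    }

    public static void main(String[] args) {
        NameIndex qualified = new NameIndex(false);
        qualified.addConcept("document", "documents");

        NameIndex simple = new NameIndex(true);
        simple.addConcept("document", "documents");

        System.out.println(qualified.resolve("documents"));          // null
        System.out.println(qualified.resolve("document.documents")); // document
        System.out.println(simple.resolve("documents"));             // document
    }
}
```

In this sketch, two concepts that share a synonym would overwrite each other under simple names, so simple names only work well when the exported names are unique.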

Finally, we register the newly introduced class MyAutoEditStrategy in the UI module.

class UiModule extends AbstractUiModule {

    override Class<? extends AbstractEditStrategyProvider> bindAbstractEditStrategyProvider() {
        return MyAutoEditStrategy
    }
}

Summary and outlook

In this post we saw how to realize boilerplates in the Xtext grammar and allow the usage of free text combined with references to entities in these boilerplates. The resulting language controls the use of natural language by defining a grammatical sentence structure and allows the user to document requirements in a standardized, convenient and readable way.

In part two we will see how to further control the usage of natural language, especially in the free text parts of the boilerplates. According to Figure 1, we will make sure that each requirement contains a free text phrase which describes a process or Function of the system and at least one involved DomainObject. Since functions can be described by verbs and domain objects by nouns, we will use NLP techniques to ensure that each boilerplate contains exactly one Function and at least one DomainObject. To this end, we will integrate an external library and define validation rules based on the Xtext validation API.



About The Author

Christoph is a software developer at itemis. He is interested in DSLs and MDSD as well as software engineering in general.