When we talk about “usability”, we usually have in mind the ease of using user interfaces, ranging from desktop to mobile applications. In software development, however, it is also worth asking whether programming languages themselves are usable enough to let developers work efficiently and effectively.
Can we really measure the usability of programming languages? And if so, how?
The short answer is yes, but evaluating a programming language from a usability point of view is much more complex than evaluating a classical user interface. Several aspects need to be considered:
The test users need to be familiar with the language first, especially if it is a new language. This can be time-consuming depending on the complexity of the language and thus makes it more difficult to find test subjects for the evaluation.
We faced exactly these challenges when we set out to evaluate the VistraQ language – a query and analysis language for efficiently recording and using traceability information.
We decided to compare VistraQ with other similar query languages, because a comparison would give us more reliable and complete results. We chose Cypher and SPARQL as the two comparison languages, since both are established query languages that are conceptually close to VistraQ.
The question now is: what do we do with these three languages? How can we say that one of them is more usable than another? In general, following the classic definition of usability, three aspects should be taken into account: effectiveness, efficiency and satisfaction.
But how can we measure that for a programming language?
A common method for measuring the parameters above is to conduct a usability test with real users.
Since classic usability tests usually take a lot of time and require considerable resources, we agreed to run our evaluation as an online questionnaire. This saves us a significant amount of time, simplifies our process and helps us reach as many participants as possible.
For the analysis of the survey results, in our opinion the most important criterion is the time a user spends completing the questionnaire, as well as the time needed to answer each individual question. These measurements can serve as an indicator of how quickly users learn the language (learnability).
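To give a concrete idea of how such an analysis could look, here is a minimal sketch in Python. It assumes a hypothetical CSV export of the survey tool with columns named language, question_id and seconds_spent – the actual export format of our tool may differ.

```python
import csv
from collections import defaultdict
from statistics import mean

# Hypothetical CSV export of the survey tool: one row per answered question,
# with columns "language", "question_id" and "seconds_spent" (assumed names).
def mean_time_per_language(path):
    times = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            times[row["language"]].append(float(row["seconds_spent"]))
    # A lower average answering time is read as a hint towards better learnability.
    return {lang: mean(values) for lang, values in times.items()}

print(mean_time_per_language("survey_times.csv"))
```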
Moreover, a crucial aspect of the analysis is the definition of correct and wrong answers. Questions like “What counts as an error?”, “Is a syntax error as severe as a semantic error?” or “If the correct answer is ‘LINKED TO’ and the user writes ‘Linked to’, how correct or wrong is that?” have to be examined and analysed in depth. Together with the time measurements, the analysis of correct and wrong answers is an indicator of the understandability of the language.
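As an illustration of how such decisions could be operationalised, the following Python sketch scores a given answer against the expected query. The normalisation rules and the partial-credit weights are purely illustrative assumptions, not the final scoring scheme of our evaluation.

```python
import re

def normalize(query: str) -> str:
    # Collapse whitespace and ignore keyword casing, so that "Linked to"
    # and "LINKED TO" count as the same answer.
    return re.sub(r"\s+", " ", query).strip().upper()

def score_answer(given: str, expected: str) -> float:
    # Illustrative rubric (the weights are assumptions): full credit for a
    # match after normalisation, partial credit if only spacing differs,
    # no credit otherwise.
    if normalize(given) == normalize(expected):
        return 1.0
    if normalize(given).replace(" ", "") == normalize(expected).replace(" ", ""):
        return 0.5
    return 0.0

print(score_answer("Linked to", "LINKED TO"))  # -> 1.0
```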
Through these measurements, we want to answer the following questions:
To build our survey, we created different types of questions, based on collected user requirements and needs, to address the points mentioned above. The different types add variety to the survey: users do not always face the same question pattern, which reduces the risk of getting random answers after a while. They also yield a variety of results, which helps us measure different evaluation criteria, e.g. types of errors, understandability and time. The question types we use in our questionnaire are:
Our online survey also includes the System Usability Scale (SUS), in which the participants subjectively assess their experience with each of the three languages. SUS is a cheap and quick method of gathering valid statistical data about user satisfaction, which is why we use it to evaluate this part of the experience.
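For reference, the SUS score is computed from ten answers on a 1-to-5 Likert scale: positively worded (odd-numbered) items contribute their value minus one, negatively worded (even-numbered) items contribute five minus their value, and the sum is multiplied by 2.5, yielding a score between 0 and 100. A small Python helper illustrating this standard calculation:

```python
def sus_score(responses):
    """Compute the System Usability Scale score (0-100) from ten answers
    on a 1-5 Likert scale, given in the standard SUS item order."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    total = 0
    for i, r in enumerate(responses, start=1):
        # Odd-numbered (positively worded) items contribute r - 1,
        # even-numbered (negatively worded) items contribute 5 - r.
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Example: a fairly positive rating pattern
print(sus_score([4, 2, 4, 2, 5, 1, 4, 2, 4, 2]))  # -> 80.0
```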
What we have learned (and are still learning) from the whole process is that evaluating the usability of a programming language (or, in our case, a query language) is a challenging task that differs from the conventional usability evaluation of digital user interfaces.
First of all, difficulties arise because of the complex nature of programming languages and the lack of established usability evaluation methods for them.
It also requires a lot of effort to decide which elements of the language should be tested and to examine whether these are actually representative of the language as a whole. Here, the support of a technical expert is helpful for making such decisions.
Such support also helps when defining the correct answers in the questionnaire: as usability engineers with standard usability expertise, it is hard for us to work out and test the correct query for each question – this takes a lot of time, but it also deepens our technical knowledge of the query languages.
The work is still in progress, and we are excited to see the results of our evaluation. We hope to come to meaningful conclusions about the usability of query languages and the potential to improve them with regard to their “ease of use”.
Stay tuned!