A previous post outlined an example project and a set of key metrics (KPIs) that I want to be able to calculate based on information gathered from multiple different development tools. In this follow up post, I will illustrate how to calculate one of those metrics using traceability information and an example query over the trace graph.
I need information from multiple tools in order to get a complete picture of the project status. In fact, I can produce a diagram of all the necessary information that is required, and how it is interrelated. This is known as the Traceability Information Model (TIM). For my example project, the TIM is shown in the diagram below.
Fig 1: Traceability Information Model
Using a traceability analysis tool (as is being developed under the VisTraQ Project as an extension to YAKINDU Traceability (now itemis ANALYZE)), I can collect data (information), that populates the TIM, on a regular basis (e.g. Weekly or Monthly) and then using appropriate queries over the collected data, I can calculate the values that are used to provide my Project Status Schedule Metric (and of course, the other status values).
The instances of the types defined in this TIM form a graph of Artefacts and Links known as the Traceability Information Graph (TIG). Artefacts are instances of the types defined in the TIM, and Links are instances of relationships between the types.
The Artefacts and Links are created from information that is defined in the various tools that are used by the project team throughout their daily work, as follows:
VisTraQ has defined a query language that is appropriate for writing queries that collect information from such a TIG. The language is based on the same concepts as Cypher (a graph query language) though it has a few extensions and a more natural-language like syntax. The basis of the language is to define a subgraph pattern which is matched in the TIG, and to perform some calculation on the resulting matched subgraphs.
Using this language I will show how to define queries that collect the required information for calculating the example Schedule Metric as defined above.
As a reminder, I defined my Schedule Metric as follows:
The Project can be considered to be 100% on schedule, if every Milestone that is due by the current date has been met. A Milestone is met if all the Features allocated to it have been implemented and tested by the defined Milestone date and a new Binary Artifact has been published. Breaking this down we can state that:
We start by matching the Milestones that are currently due to be complete,
MATCH Milestone AS ms WHERE ms.dueDate <= now
And we compute an aggregation on this to calculate the total number of milestones, as we will need that in order to provide our end metric.
MATCH Milestone AS ms WHERE ms.dueDate <= now RETURN COLUMN Due CONTAINING COUNT ms
Now, for each released milestone we calculate if the Milestone is met. We have to find, for each released Milestone, a subgraph of the other relevant information that is connected to the milestone. And, we must filter those subgraphs with certain constraints.
The required subgraph is most easily understood using a diagram
Fig 2: Diagramatic expression to match a subgraph
Which we can also express using a textual query language MATCH expression, as follows:
We want to match Milestones that have Features assigned and/or a number of Bugs. This gives two branches in the subgraph to match, we connect the alternatives with an (inclusive) ‘OR’ because we want to match either or both branches. A ‘LINKED WITH’ expression defines the link relationships to match (‘LINKED TO’ or ‘LINKED FROM’ would constrain the link to a particular direction, however in this case I am not worried about the direction, just the presence of a link), giving us the following:
MATCH Milestone AS ms ( LINKED 0..* TIMES WITH Bug AS bug OR LINKED 0..* TIMES WITH Feature AS ft )
A Bug is linked to one or more UnitTests, each of which has exactly 1 TestResult (produced by a Build). We could define our MATCH expression to match the path from Bug to UnitTest to TestResult. However, we are not really interested in the UnitTest definition itself, rather we are interested in the TestResult. The result cannot, of course, exist without the definition, however we do not need to explicitly match the UnitTest. Rather we match a path from the Bug to one or more TestResults. The path match is expressed by the dashed line in the diagram, and a ‘USING 1..* LINKS’ part of a ‘LINKED WITH’ expression. Adding this to the overall expression, we get:
MATCH Milestone AS ms ( LINKED 0..* TIMES WITH Bug AS bug LINKED USING 1..* LINKS WITH TestResult AS bResult OR LINKED 0..* TIMES WITH Feature AS ft )
I will not explain the rest in detail, however, the complete MATCH expression is as follows:
MATCH Milestone AS ms ( LINKED 0..* TIMES WITH Bug As bug LINKED USING 1..* LINKS WITH TestResult AS bResult OR LINKED WITH Feature AS ft ( LINKED 0..* TIMES WITH SubTask AS task OR LINKED WITH Requirement AS req ( LINKED USING 1..* LINKS WITH TestResult AS iResult AND LINKED TO CodeFile AS code LINKED USING 1..* LINKS WITH TestResult AS uResult ) ) ) LINKED 1 TIMES TO Build AS build LINKED 1 TIMES TO Binary AS binary
This expression matches all the subgraphs for each Milestone. To check if the Milestone has been met, we add some constraints that will filter out (of the result) those subgraphs where the Milestone has not been met. To do this we add a WHERE clause. The WHERE clause expression is evaluated in the context of each matched Milestone’s subgraph.
Considering each part of the schedule definition in turn, we construct the WHERE clause as follows:
ft.status == 'done' AND task.status == 'done'
The ‘LINKED WITH’ part of the MATCH expression defaults to matching ‘one or more’ (multiplicity 1..*) so the subgraph is only matched if there is at least one Integration Test Result, so we need only add a WHERE constraint to check the result value,
iResult.value == 'pass'
uResult.value == 'pass' AND build.testCoverage >= 80
The ‘LINKED 0..* TIMES WITH Bug’ part of the MATCH expression states that Bugs can be optionally assigned to the milestone. We have to add WHERE constraints to check the Bug Test Result and the status of the Bug
bug.status == 'done' AND bResult.value == 'pass'
The built and published versions should match the intended version defined in the milestone, and they should be built and published before the milestone due date.
build.buildDate < ms.dueDate AND build.version == ms.version AND binary.publishedDate < ms.dueDate AND binary.version == ms.version
And finally, we want to match only those milestones that have been marked as released,
ms.released == true
Putting this all together we have an expression as follows:
MATCH Milestone AS ms WHERE ms.dueDate <= now RETURN COLUMN Due CONTAINING COUNT ms JOIN MATCH Milestone AS ms ( LINKED 0..* TIMES WITH Bug AS bug LINKED USING 1..* LINKS WITH TestResult AS bResult OR LINKED WITH Feature AS ft ( LINKED 0..* TIMES WITH SubTask AS task OR LINKED WITH Requirement AS req ( LINKED USING 1..* LINKS WITH TestResult AS iResult AND LINKED TO CodeFile AS code LINKED USING 1..* LINKS WITH TestResult AS uResult ) ) ) LINKED 1 TIMES TO Build AS build LINKED 1 TIMES TO Binary AS binary WHERE ms.released == true AND ft.status == 'done' AND task.status == 'done' AND iResult.value == 'pass' AND uResult.value == 'pass' AND build.testCoverage >= 80 AND bug.status == 'done' AND bResult.value == 'pass' AND build.buildDate < ms.dueDate AND build.version == ms.version AND binary.publishedDate < ms.dueDate AND binary.version == ms.version RETURN COLUMN Met CONTAINING COUNT ms JOIN RETURN COLUMN Percent CONTAINING (Met / Due)* 100
Which gives a result table:
Due | Met | Percent |
---|---|---|
20 | 19 | 95 |
This result can be expanded to include the time dimension, by adding a TIME clause to the expression,
FOR TIMEPERIOD '01-Jan-2017' UNTIL '31-Dec-2017' EVERY month ...
Giving a table that show the value of my key schedule metric every month for the year 2017,
Timestamp | Due | Met | Percentage |
---|---|---|---|
1-Jan-2017 | 0 | 0 | 0 |
1-Feb-2017 | 2 | 2 | 100 |
1-Mar-2017 | 4 | 3 | 75 |
... | |||
1-Dec-2017 | 20 | 19 | 95 |
The other key metrics can be similarly calculated by navigating a traceability graph, though it would need some additional Artifact Types.
The VisTraQ research project is in the process of defining the query language illustrated above. The project will also produce a prototype web based application that implements a system to provide the capability of analysing traceability data. This could be used to monitor projects, and calculate KPIs using traceability data, rather than relying on manual reporting.