YAKINDU, Project management, Traceability, english

How Visualizing Traceability Data Removes the Worries of Project Management – Part 2

A previous post outlined an example project and a set of key metrics (KPIs) that I want to be able to calculate based on information gathered from multiple different development tools. In this follow up post, I will illustrate how to calculate one of those metrics using traceability information and an example query over the trace graph.

Traceability Information Model

I need information from multiple tools in order to get a complete picture of the project status. In fact, I can produce a diagram of all the necessary information that is required, and how it is interrelated. This is known as the Traceability Information Model (TIM). For my example project, the TIM is shown in the diagram below.

 

tim-traceability-information-model

Fig 1: Traceability Information Model

Using a traceability analysis tool (as is being developed under the VisTraQ Project as an extension to YAKINDU Traceability), I can collect data (information), that populates the TIM, on a regular basis (e.g. Weekly or Monthly) and then using appropriate queries over the collected data, I can calculate the values that are used to provide my Project Status Schedule Metric (and of course, the other status values).

The instances of the types defined in this TIM form a graph of Artefacts and Links known as the Traceability Information Graph (TIG). Artefacts are instances of the types defined in the TIM, and Links are instances of relationships between the types.
The Artefacts and Links are created from information that is defined in the various tools that are used by the project team throughout their daily work, as follows:

  • Milestones, Features, Bugs: come from JIRA.
  • Requirements: come from an Excel Spreadsheet stored in a Git repository.
  • Code Files, Unit Tests, Integration Tests: come from a Git repository
  • Builds, Test Results: come from Jenkins.
  • Publications come from Artifactory

Query Calculations

VisTraQ has defined a query language that is appropriate for writing queries that collect information from such a TIG. The language is based on the same concepts as Cypher (a graph query language) though it has a few extensions and a more natural-language like syntax. The basis of the language is to define a subgraph pattern which is matched in the TIG, and to perform some calculation on the resulting matched subgraphs.

Using this language I will show how to define queries that collect the required information for calculating the example Schedule Metric as defined above.

As a reminder, I defined my Schedule Metric as follows:

The Project can be considered to be 100% on schedule, if every Milestone that is due by the current date has been met. A Milestone is met if all the Features allocated to it have been implemented and tested by the defined Milestone date and a new Binary Artifact has been published. Breaking this down we can state that:

  • All Features for the Milestone are complete (in JIRA), including all their sub-tasks.
  • All Requirements must have at least one Integration Test defined against them, and all those tests must pass.
  • All Code that implements the Requirement must have a Test coverage of 80%, and they must pass.
  • All Bugs that are allocated to the Milestone must be ‘done’ and all associated Unit Tests must pass.
  • There must be a Binary artifact (e.g. jar file) corresponding to a release Build, that is Published.


Schedule Definition

We start by matching the Milestones that are currently due to be complete,

MATCH Milestone AS ms WHERE ms.dueDate <= now

And we compute an aggregation on this to calculate the total number of milestones, as we will need that in order to provide our end metric.

MATCH Milestone AS ms WHERE ms.dueDate <= now
RETURN COLUMN Due CONTAINING COUNT ms


Now, for each released milestone we calculate if the Milestone is met. We have to find, for each released Milestone, a subgraph of the other relevant information that is connected to the milestone. And, we must filter those subgraphs with certain constraints.

The required subgraph is most easily understood using a diagram

diagramatic-expression-match-subgraph


Fig 2: Diagramatic expression to match a subgraph


Which we can also express using a textual query language MATCH expression, as follows:

We want to match Milestones that have Features assigned and/or a number of Bugs. This gives two branches in the subgraph to match, we connect the alternatives with an (inclusive) ‘OR’ because we want to match either or both branches. A ‘LINKED WITH’  expression defines the link relationships to match (‘LINKED TO’ or ‘LINKED FROM’ would constrain the link to a particular direction, however in this case I am not worried about the direction, just the presence of a link), giving us the following:

MATCH Milestone AS ms
( LINKED 0..* TIMES WITH Bug AS bug
OR
  LINKED 0..* TIMES WITH Feature AS ft
)


A Bug is linked to one or more UnitTests, each of which has exactly 1 TestResult (produced by a Build). We could define our MATCH expression to match the path from Bug to UnitTest to TestResult. However, we are not really interested in the UnitTest definition itself, rather we are interested in the TestResult. The result cannot, of course, exist without the definition, however we do not need to explicitly match the UnitTest. Rather we match a path from the Bug to one or more TestResults. The path match is expressed by the dashed line in the diagram, and a ‘USING 1..* LINKS’ part of a ‘LINKED WITH’ expression. Adding this to the overall expression, we get:

MATCH Milestone AS ms
( LINKED 0..* TIMES WITH Bug AS bug
  LINKED USING 1..* LINKS WITH TestResult AS bResult
OR
  LINKED 0..* TIMES WITH Feature AS ft
)

 
I will not explain the rest in detail, however, the complete MATCH expression is as follows:

MATCH Milestone AS ms
( LINKED 0..* TIMES WITH Bug As bug
  LINKED USING 1..* LINKS WITH TestResult AS bResult
OR
  LINKED WITH Feature AS ft
  ( LINKED 0..* TIMES WITH SubTask AS task  
  OR
    LINKED WITH Requirement AS req
    (
      LINKED USING 1..* LINKS WITH TestResult AS iResult
    AND
      LINKED TO CodeFile AS code
      LINKED USING 1..* LINKS WITH TestResult AS uResult
    )
  )
)
LINKED 1 TIMES TO Build AS build
LINKED 1 TIMES TO Binary AS binary

 
This expression matches all the subgraphs for each Milestone. To check if the Milestone has been met, we add some constraints that will filter out (of the result) those subgraphs where the Milestone has not been met. To do this we add a WHERE clause. The WHERE clause expression is evaluated in the context of each matched Milestone’s subgraph.

Considering each part of the schedule definition in turn, we construct the WHERE clause as follows:

  • All Features for the Milestone are complete (in JIRA), including all their sub-tasks.

ft.status == 'done' AND task.status == 'done'

  • All Requirements must have at least one Integration Test defined against them, and all those tests must pass.

The ‘LINKED WITH’ part of the MATCH expression defaults to matching ‘one or more’ (multiplicity 1..*) so the subgraph is only matched if there is at least one Integration Test Result, so we need only add a WHERE constraint to check the result value,

iResult.value == 'pass'

  • All Code that implements the Requirement must have a Unit Test coverage of 80%, and they must pass.

uResult.value == 'pass' AND build.testCoverage >= 80

  • All Bugs that are allocated to the Milestone must be ‘done’ and all associated Unit Tests must pass.

The ‘LINKED 0..* TIMES WITH Bug’ part of the MATCH expression states that Bugs can be optionally assigned to the milestone. We have to add WHERE constraints to check the Bug Test Result and the status of the Bug

bug.status == 'done' AND bResult.value == 'pass'

  • There must be a Binary Artifact (e.g. jar file) corresponding to a release Build, that is Published.

The built and published versions should match the intended version defined in the milestone, and they should be built and published before the milestone due date.

build.buildDate < ms.dueDate
AND build.version == ms.version
AND binary.publishedDate < ms.dueDate
AND binary.version == ms.version


And finally, we want to match only those milestones that have been marked as released,

ms.released == true


Putting this all together we have an expression as follows:

MATCH Milestone AS ms
  WHERE ms.dueDate <= now
  RETURN COLUMN Due CONTAINING COUNT ms
JOIN
  MATCH Milestone AS ms
    ( LINKED 0..* TIMES WITH Bug AS bug
      LINKED USING 1..* LINKS WITH TestResult AS bResult
    OR
      LINKED WITH Feature AS ft
      ( LINKED 0..* TIMES WITH SubTask AS task  
      OR
        LINKED WITH Requirement AS req
        ( LINKED USING 1..* LINKS WITH TestResult AS iResult
        AND
          LINKED TO CodeFile AS code
          LINKED USING 1..* LINKS WITH TestResult AS uResult
        )
      )
    )
    LINKED 1 TIMES TO Build AS build
    LINKED 1 TIMES TO Binary AS binary
  WHERE
    ms.released == true
    AND ft.status == 'done'
    AND task.status == 'done'
    AND iResult.value == 'pass' 
    AND uResult.value == 'pass'
    AND build.testCoverage >= 80
    AND bug.status == 'done'
    AND bResult.value == 'pass'
    AND build.buildDate < ms.dueDate
    AND build.version == ms.version
    AND binary.publishedDate < ms.dueDate
    AND binary.version == ms.version
  RETURN COLUMN Met CONTAINING COUNT ms
JOIN
  RETURN COLUMN Percent CONTAINING (Met / Due)* 100

 
Which gives a result table:

Due Met Percent
20 19 95

This result can be expanded to include the time dimension, by adding a TIME clause to the expression,

FOR TIMEPERIOD '01-Jan-2017' UNTIL '31-Dec-2017' EVERY month
 ...


Giving a table that show the value of my key schedule metric every month for the year 2017,

Timestamp Due Met Percentage
1-Jan-2017 0 0 0
1-Feb-2017 2 2 100
1-Mar-2017 4 3 75
    ...
1-Dec-2017 20 19 95

The other key metrics can be similarly calculated by navigating a traceability graph, though it would need some additional Artifact Types.

Conclusion

The VisTraQ research project is in the process of defining the query language illustrated above. The project will also produce a prototype web based application that implements a system to provide the capability of analysing traceability data. This could be used to monitor projects, and calculate KPIs using traceability data, rather than relying on manual reporting.

    
About David Akehurst

Dr. David H. Akehurst is a highly experienced Software Consultant. He has worked for many years in both research and commercial organisations.