minQL: Development Metrics Made Easy

All Posts
Share this post
Share this post

There are a lot of languages for querying interconnected data (like JQL for Jira tickets) or time-series data (like Prometheus’ PromQL or New Relic’s NRQL).

However, these languages simply don’t work for non-trivial development metrics like “show a chart of pull requests that have been open more than 7 days, grouped by epic” or “plot the average time from reopening a closed ticket to the next related commit.”

Until now, you had to write complicated SQL queries to compute these metrics.

We did too, and decided enough is enough.

We are excited to announce minQL: the first language for easily querying interconnected time-series data from a wide range of sources – including tickets, commits, pull requests, meetings, sprints, deployments, and more.

Why Create a Query Language?

Why did minware – a company focused on development observability – decide to make a new query language?

It’s simple: we respect you, our customer.

That may sound trite, but you know what you want better than we do. Unlike our competitors, we won’t lecture you about how to manage your teams or look at your own data.

Our customers tend to be skilled engineers who need the power and flexibility to compute specialized metrics to better understand their organization. If they don’t find this in the tools they see, then they’ll build it themselves.

They’re in good company too. Every big tech company we’re aware of (Amazon, Cisco, Uber, LinkedIn, to name a few) builds its own in-house development metrics, even though it’s a lot of work and hard to get right.

minQL offers a lifeline to engineering teams like yours with complex needs – giving you full control of your metrics without hiring a team of data engineers and analysts, or distracting your existing team from work that adds value for your customers.

A Basic Example

If you’re here, you’re probably an engineer who wants less talking and more code.

Without further ado, here is a simple minQL statement to show the number of tickets in progress or review over time:

ticket.duration('status').filter(
  value.in('In Progress', 'In Review')
)

It retrieves all of the time intervals for the ticket status property, and filters only those where the status is _In Progress _or In Review. (Here the filter just references the status value, but you can filter by arbitrary functional expressions on the input row.)

minQL then sums the total number of days covered by those intervals by default (e.g., 2 tickets that are in progress for 1 day each = 2).

You put this into a chart as the Y-axis value, and voila:

Screenshot of custom report with work in progress by team

Joins Made Easy

Default Join Paths

You may notice that the chart above also breaks the data down by team. How did we do that? With this minQL formula:

person.get('team')

Wait a second, we didn’t specify a join, how does it know how to get the person?

minQL includes default join paths. So, if you don’t specify the joins, then it uses the most common path rather than showing you empty values. In this case, that’s the current ticket assignee during the time interval.

This means that if a ticket is reassigned in the middle of a time interval, the default join will automatically allocate it to the right person, without any additional code!

Nested Joins

What if you don’t want the assignee, and instead want the reporter’s team? That’s easy too:

ticket.get('reporter').get('team')

You can nest references this way as deeply as you want. You can even pick an item out of a list or flatten a list into multiple rows for a to-many join.

Historical Joins

You may also wonder whether the above statement shows the reporter’s current team or the team they were on while the work was in progress. The default is the current team, but you can show the reporter’s team while the ticket was in progress like this:

ticket.getOrig('reporter').getOrig('team')

This will also use the original reporter during the time interval if it happens to change.

Joins in Functional Expressions

Going back to the original example, you can also do joins in functional expressions.

What if you want to see time in review, but one team uses Waiting to indicate that a task is in review, while others use In Review, and you can’t include both statuses because Waiting means something different to them? (This is the type of complexity that comes up in real life.)

ticket.duration('status').filter(
  eq(
    value,
    iff(
      ticket
        .getOrig('assignee')
        .getOrig('team')
        .in('Team A'),
      'Waiting',
      'In Review',
    )
  )
)

If you’re unfamiliar with SQL, the query to intersect, filter, and aggregate multiple intervals like this is difficult to write and test. Even if you are familiar with SQL, it’s still a lot of work, and there are probably better ways to spend your time.

Complex Event Sequences, Simplified

One major limitation of time-series query languages like NewRelic’s NRQL is that you can only apply time series analysis to a single data source.

With software development metrics, it’s common to have event sequences spanning tickets, epics, commits, pull requests, and reviews.

We won’t get too deep into advanced functionality in this article, but it’s important to illustrate the power of minQL’s event sequences.

Multi-Channel Sequence Example

This example query counts the number of times that there is a rejected review after a ticket is reopened and before it is closed again:

merge(
  ticket.onChange('statusCategory')
    .filter(oldValue.in('Done')),
  review.event('created')
    .filter(review.get('status').in('rejected'))
    .join(ticket, review.get('branch').get('ticket')),
  ticket.onChange('statusCategory')
    .filter(newValue.in('Done'))
).stateMachine(
  state.on(0, 1),
  state.on(1, 2).on(2, 0),
  state.ouputEvent().on(2, 0)
)

There are a few things to explain for this query, which we cover in the following sections.

Event Types

First, we only saw durations previously, but minQL exposes a variety of object events, as well as change events for updates to any field.

You can see references in the above example to ticket status category changes and pull request review creation.

On-change events include both the old and new values in the scope for functional expressions, which we use here to filter changes away from the Done status category and back to the Done status category.

Merging Events from Different Object Types

Next, we want to merge these ticket and review events into one stream. To do that, we need a way of linking them together. In this case, we specify a join path for the review event to get its ticket. This lets minQL correctly link it using the default join key, which is the ID of the first event’s object type (ticket ID in this case).

You can also specify join keys for all inputs using arbitrary expressions. For example, you could instead join both the ticket and review to the assignee and PR author, respectively. This would show the time from any ticket being reopened to any rejected review for the same person.

State Machines

Once all the input events have been merged, minQL uses a state machine to generate output events. If you’re not familiar with the term state machine, it’s just a fancy way of saying multi-step process. The state machine above works as follows (the indexes in minQL are 0-based, but we describe them here as 1-based for simplicity):

  1. When the ticket status category changes away from Done (input event 1), go to step 2. (Ignore other inputs while in this step.)
  2. When a rejected review occurs (input event 2), go to step 3. If the ticket is closed again first (input event 3), then go back to step 1.
  3. Emit an output event when arriving at this step. When the ticket is closed again (input event 3), go back to step 1.

As you can see, this state machine will emit at most one output event each time a ticket is reopened.

If you instead want to only emit an event the first time a ticket is reopened, you could drop the .on(2, 0) from the last state. If you wanted to output an event for each rejected review while the ticket was reopened, you could add .on(1, 2) to the last state.

Also, note that minQL will run a separate state machine instance for each join key specified in the previous merge step (ticket ID in this example).

Count, Value, Duration, and Time-Length Sequence Outputs

The example sequence above just outputs a count of events. minQL also lets you output values from other fields, such as the story point estimate. You would do this by adding .value(ticket.get('story points')) to the review event above.

Sequences can also output durations. To output a time duration from ticket reopening to rejection, you’d simply change outputEvent() to outputDurationSinceLast(1), indicating that the state machine should emit a duration starting at the time when it last reached step 2 and ending when it arrives at step 3.

This can be useful to know how much time was spent between these events rather than just counting the number of occurrences.

With durations, you can further break down these durations by any other values (e.g., current assignee, ticket status) and minQL will allocate the time accordingly.

Finally, you may want to output the length of the time interval between two events, which you can do here by changing outputEvent() to outputTimeSinceLast(1). The subtle difference compared to a duration is that this will output the total interval length at the end of the interval to prevent it from splitting if attributes change during the interval.

This is helpful for measuring cycle times, where cutting the cycle time value in half if a ticket is reassigned and allocating half to each assignee wouldn’t make sense. The time-length output will just report the whole interval length using other attributes as of the end of the interval.

The Bottom Line: minQL vs. SQL

In this article, we’ve walked through increasingly complex examples showing how minQL offers simple and flexible queries for interconnected time-series data sources like tickets, code reviews, and more.

Implementing simple scenarios like computing in-progress tickets by team is somewhat easier in minSQL than SQL. This can put self-service queries within reach for a non-engineer or save thirty minutes for someone proficient in SQL.

However, the state machine example in this article (which is still a lot simpler than more advanced minQL scenarios) could totally blow up a sprint estimate if done in SQL even by an expert. If you were to replicate this logic with joins for each step – a difficult task in its own right – you might run into severe query performance issues and have to bring in a data engineer to create a custom user-defined table function. (If you don’t know what this means, trust us, it’s bad.)

Any minQL query is possible to write in SQL, and SQL can do things that minQL can’t. But, if you’re looking for the best way to implement non-trivial engineering metrics for your organization, give minQL a try.

Engineers First
Measure the business impact of things that slow down engineering