Exploratory data analysis with Power BI

Compilation of works and projects on data analysis

Background

The support department has been receiving complaints from customers about ticket handling speed and budget overruns. The project and support departments share resources (developers), which never seem to be enough to cover the allocated work. Management dismisses the customer complaints on these grounds, arguing that enough resources are already allocated to perform the upcoming (forecasted) work.

In this scenario, the first question posed is whether the workload forecasted by the developers is actually accurate. Knowing how reliable the time estimates are helps predict the coming weeks’ workload and, as a consequence, allocate resources better, allowing the company to keep the promises made to customers on completion time and/or budget.

INDEX

  1. Accuracy variation
  2. Ticket bottleneck analysis
  3. Ticket categorization
  4. Conclusions and recommendations

Accuracy variation

The first metric chosen to analyze the scenario is the MAPE (Mean Absolute Percentage Error), i.e. the average of the absolute percentage deviations between the hours actually booked and the hours estimated:

The absolute part of the calculation has been implemented in Power BI as a DAX calculated column with the following code:

Accuracy variation =
ABS (
    1
        - DIVIDE (
            RELATED ( vwHoursBookedPerTicket[SumBillableHours] ),
            Tickets[Total Estimation],
            0
        )
)

Then, an average function can be called on different filtered tables to answer any business question:

Accuracy variation =
AVERAGE ( Tickets[Accuracy variation] )
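As a plain-Python sketch of the same logic (the ticket data and field names below are illustrative, not taken from the actual model):

```python
# Sketch of the DAX Accuracy variation column and its average, in plain Python.
def accuracy_variation(booked_hours, estimated_hours):
    """Absolute percentage deviation of booked hours from the estimate.
    Mirrors ABS(1 - DIVIDE(booked, estimate, 0)): DIVIDE returns the
    alternate result 0 when the estimate is zero."""
    ratio = booked_hours / estimated_hours if estimated_hours else 0
    return abs(1 - ratio)

# Illustrative tickets: one overestimate, one underestimate
tickets = [
    {"booked": 12.0, "estimate": 10.0},  # booked 20% more than estimated
    {"booked": 4.0,  "estimate": 8.0},   # booked 50% less than estimated
]

variations = [accuracy_variation(t["booked"], t["estimate"]) for t in tickets]
mape = sum(variations) / len(variations)  # the AVERAGE measure over the column
```

Evaluating the average under different filters (per team, per manager) is what the measure does in each visual's filter context.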

Different visualizations were then built to illustrate the magnitude of the issue according to different criteria.

Globally, per development team and per each support manager:

Dashboard 1

A detailed view (on drill-down) for any of the given categories, breaking down the missed estimates into overestimates and underestimates:

Dashboard 2

In addition to the Accuracy Variation KPI, a dashboard showing how much of the work had actually been estimated was needed to understand how well the Accuracy Variation reflects the actual situation in support.

Dashboard 3

The above data shows the magnitude of the situation in support: 739 hours were estimated while over 2,000 hours were actually worked (1.3k + 718.7 hours). This means the real work performed by support and the developers is roughly 2.7 times (≈ 2,018.7 / 739) what was estimated.

Ticket bottleneck analysis

The second request posed by management was to understand where the incoming work was getting stuck and what could be done to clear possible bottlenecks, if any.

To follow a ticket’s activity in real time, a new column counting running stops had to be implemented (one stop is counted every time a ticket has to wait in a queue, e.g. developer support, hosting, support managers):

AccumulatedStops =
// Running total of stops for the current row's ticket: sums the change flag
// over all history rows for the same ticket up to and including this date
VAR Ticket = TicketHistory[Work Item Id]
VAR CurrentDate = TicketHistory[Date]
VAR FilteredTable =
    FILTER (
        TicketHistory,
        TicketHistory[Work Item Id] = Ticket
            && TicketHistory[Date] <= CurrentDate
    )
RETURN
    CALCULATE ( SUM ( TicketHistory[Change column] ), FilteredTable )
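The running count can be sketched in Python as a per-ticket cumulative sum (the history rows below are illustrative; `change` stands in for the change flag that marks a new queue entry):

```python
def accumulated_stops(history):
    """Running total of queue entries per ticket, ordered by date.
    Mirrors the DAX filter: same Work Item Id, Date <= current row's Date."""
    totals = {}
    result = []
    for row in sorted(history, key=lambda r: (r["ticket"], r["date"])):
        totals[row["ticket"]] = totals.get(row["ticket"], 0) + row["change"]
        result.append({**row, "accumulated_stops": totals[row["ticket"]]})
    return result

# Illustrative history: the ticket enters a queue on days 1 and 3
history = [
    {"ticket": 1, "date": 1, "change": 1},
    {"ticket": 1, "date": 2, "change": 0},
    {"ticket": 1, "date": 3, "change": 1},
]
rows = accumulated_stops(history)
# rows[-1]["accumulated_stops"] == 2
```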

Then, the current number of stops needs to be updated in the tickets table:

CurrentStops =
CALCULATE (
    MAX ( TicketHistory[AccumulatedStops] ),
    FILTER ( TicketHistory, TicketHistory[Work Item Id] = Tickets[Work Item Id] )
)

Finally, it is possible to count how many tickets there are for each number of stops, e.g. how many tickets required three stops to be resolved?

Count =
CALCULATE (
    COUNT ( Tickets[Work Item Id] ),
    FILTER ( Tickets, Tickets[CurrentStops] = CountofTicketStops[Number of stops] )
)
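In Python terms, the histogram input is simply a frequency count over each ticket's final stop count (the values below are illustrative):

```python
from collections import Counter

# CurrentStops per resolved ticket, illustrative values
current_stops = [3, 1, 3, 2, 3, 1]

# Number of tickets per number of stops, i.e. the histogram series
count_per_stops = Counter(current_stops)
# e.g. count_per_stops[3] answers "how many tickets required three stops?"
```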

When plotting the data in a histogram, the following distribution was revealed:

Dashboard 4

Complementing the above, it was necessary to know where these tickets were actually waiting; in other words, which queues seemed to be the slowest and were blocking progress towards resolution. For this, the distributions of waiting times per queue had to be analyzed.

It was necessary to categorize the queues (or buckets) and to classify each stop of each ticket accordingly:

Queue =
IF (
    // Rows with an SPP SC priority set are waiting on the development team
    TicketHistory[SPP SC priority] <> BLANK (),
    "Dev team",
    SWITCH (
        TicketHistory[Assigned To],
        "hosting", "Hosting",
        BLANK (), "Service Consultant",
        "Customer", "Customer",
        "AX-FO-Support", "ERP Support",
        "B1-Support", "ERP Support",
        "ECC-Support", "ERP Support",
        "GP-Support", "ERP Support",
        "NAV-BC-Support", "ERP Support",
        "Add-on support", "Add-On Support",
        "Core support", "Core support",
        // If the assignee is a service manager, the IF returns the assignee's own
        // name so this branch matches; otherwise it returns BLANK and falls through
        IF (
            CONTAINS ( Tickets, Tickets[ServiceManager], TicketHistory[Assigned To] ),
            TicketHistory[Assigned To]
        ), "Service Consultant",
        "Other"
    )
)
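The branching above can be sketched as a plain Python function. The field values mirror the DAX; the service-manager names are hypothetical stand-ins for the contents of Tickets[ServiceManager]:

```python
ERP_TEAMS = {"AX-FO-Support", "B1-Support", "ECC-Support",
             "GP-Support", "NAV-BC-Support"}
SERVICE_MANAGERS = {"Alice", "Bob"}  # hypothetical; from Tickets[ServiceManager]

def queue(spp_sc_priority, assigned_to):
    """Classify a ticket-history row into a waiting queue,
    mirroring the DAX IF/SWITCH above."""
    if spp_sc_priority is not None:
        return "Dev team"
    if assigned_to is None:
        return "Service Consultant"
    if assigned_to == "hosting":
        return "Hosting"
    if assigned_to == "Customer":
        return "Customer"
    if assigned_to in ERP_TEAMS:
        return "ERP Support"
    if assigned_to == "Add-on support":
        return "Add-On Support"
    if assigned_to == "Core support":
        return "Core support"
    if assigned_to in SERVICE_MANAGERS:  # the CONTAINS lookup in the DAX
        return "Service Consultant"
    return "Other"
```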

The next step is to aggregate the data and to plot it:

Dashboard 5

Ticket categorization

Another question raised by the business was to better understand the work performed in support. Since this information was not readily available, customizations were made to the Azure DevOps platform to gather the missing data, and a new project was started.

Every support manager was requested to fill in two new fields added at ticket level to help categorize and describe the work being done on support.

After a few weeks, the first descriptive metrics could be presented:

Dashboard 6

Drilling down on the data:

Dashboard 7

And finally, an overall description of the most frequent issues being handled in support:

Dashboard 8

Conclusions and recommendations

A set of new KPIs has been proposed to help track resource efficiency and workload prediction: