The importance of benchmarks
Late last year, the Aha! Develop team added support for team line-level reporting. During a team demo in the run-up to the release, we discovered one of our internal sprint reports was taking 15 minutes to load, almost freezing the browser in the process. The affected report contained almost no data — it was for a single, nearly defunct team with a forgotten active sprint that was 1,000 days long. That oddity kick-started our curiosity and triggered a deep dive into how we were rendering sprint reports, resulting in a more than 100-fold performance increase.
Aha! Develop reports work by first loading raw record events, then converting those into report-specific data (as documented in this blog post). We would expect load time to be directly correlated with the number of events, but this report contained far fewer events than other reports that loaded fine. Digging further, we quickly found the issue wasn't isolated to a specific report — it was affecting all the sprint reports for the team (retrospective, burndown, burnup, and velocity). This was our first smoking gun. Sprint reports are very tightly coupled, so they all use the same sprint progress format to ensure consistency. Because the only reports affected used that format, we knew the issue had to be coming from the progress-parsing step. But where?
For a quick primer, the intermediate data structure looked a bit like this at the time of the investigation:
// All information needed to render a chart as-at a particular point in time
// Performance tracks committed/remaining/completed against records/estimate/effort
class ParsedSprintProgressDay {
start: Date;
events: RecordEvent[];
users: { [userId: string] : [User, Performance] };
team: Performance;
records: Set<Feature|Requirement>;
constructor(start: Date) { ... }
clone(newStart: Date) { ... }
}
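The Performance type isn't shown in the post. Judging from the example report data later on, a plausible sketch (field and bucket names are inferred, not taken from the actual Aha! code) might look like this:

```typescript
// Hypothetical sketch of the Performance type used above. Field names
// are inferred from the example report data, not from the actual code.
interface PerformanceBucket {
  count: number;             // number of records in this bucket
  initialEstimate: number;   // sum of those records' initial estimates
  workDone: number;          // effort logged so far
  remainingEstimate: number; // estimate still outstanding
}

// Buckets are grouped by how records relate to the sprint
interface Performance {
  committed?: PerformanceBucket; // records in the sprint at its start
  added?: PerformanceBucket;     // all records, including committed ones
  completed?: PerformanceBucket; // records finished during the sprint
}
```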
The basic algorithm for parsing progress followed roughly these steps:
// Initialize the range and populate it from the start events
let currentDay = new ParsedSprintProgressDay(sprintStartDate);
const sprintData: ParsedSprintProgressDay[] = [currentDay];
filteredSprintStartEvents.forEach(event => {
processEvent(event, currentDay);
});
// Iterate through the remaining valid events within the sprint period
remainingEvents.forEach(event => {
// Fill the range until the event falls inside the current day
while (event.createdAt > moment(currentDay.start).endOf('day')) {
const newStart = moment(currentDay.start).add(1, 'day');
currentDay = currentDay.clone(newStart.toDate());
sprintData.push(currentDay);
}
processEvent(event, currentDay);
});
return sprintData;
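The processEvent helper is elided above. A hypothetical sketch of what it might do, folding a single event into the current day's team totals (bucket names mirror the example data that follows, but this is not the actual implementation):

```typescript
// Hypothetical sketch of processEvent: folds one event into the
// running totals for the current day. Names are simplified and
// assumed, not taken from the real report code.
type Bucket = { count: number; initialEstimate: number };
type SprintEvent = { kind: string; initialEstimate: number };
type ProgressDay = { team: { [kind: string]: Bucket } };

function processEvent(event: SprintEvent, day: ProgressDay): void {
  // e.g. a "Committed" event updates the "committed" bucket
  const key = event.kind.toLowerCase();
  const bucket = (day.team[key] ??= { count: 0, initialEstimate: 0 });
  bucket.count += 1;
  bucket.initialEstimate += event.initialEstimate;
}
```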
The resulting sprint data would look something like this:
// Example values not representative of actual id/kind values
[{
start: "2026/03/02",
events: [
{ kind: "Committed", record: { id: "Feature-1", ... }, initialEstimate: 3, status: { id: "Status-Pending" } },
{ kind: "Committed", record: { id: "Feature-2", ... }, initialEstimate: 1, status: { id: "Status-InProgress" } },
],
users: {},
team: { committed: { count: 2, initialEstimate: 4, workDone: 0, completed: 0, remainingEstimate: 4 } },
records: [{ id: "Feature-1", ... }, { id: "Feature-2", ... }]
},
{
start: "2026/03/03",
events: [],
users: {},
team: { committed: { count: 2, initialEstimate: 4, workDone: 0, completed: 0, remainingEstimate: 4 } },
records: [{ id: "Feature-1", ... }, { id: "Feature-2", ... }]
},
{
start: "2026/03/04",
events: [
{ kind: "Added", record: { id: "Requirement-1", ... }, initialEstimate: 2, status: { id: "Status-Pending" } },
{ kind: "AssignedToUser", record: { id: "Feature-2", ... }, user: { id: "User-1", ... }, initialEstimate: 1 },
{ kind: "Completed", record: { id: "Feature-2", ... }, initialEstimate: 1, status: { id: "Status-Done" } },
],
users: {
"User-1": {
completed: { count: 1, initialEstimate: 1, workDone: 2, remainingEstimate: 0 },
}
},
team: {
committed: { count: 2, initialEstimate: 4, workDone: 0, remainingEstimate: 4 },
added: { count: 3, initialEstimate: 6, workDone: 0, remainingEstimate: 6 }, // added includes committed records
completed: { count: 1, initialEstimate: 1, workDone: 2, remainingEstimate: 0 },
},
records: [{ id: "Feature-1", ... }, { id: "Feature-2", ... }, { id: "Requirement-1", ...}]
}]
Because the problem report had relatively few events but a huge number of days, it followed that the issue had to lie in how we were managing the days. From the code above, we can see that we clone the data for every day in the sprint, whether or not that day contains any events. Changing that behavior to skip empty days would stop this report from failing — but why, exactly, was it as slow as it was?
Solution #1: Improving the algorithm
I'd been working in this area shortly before the investigation started, and I was confident the root cause was a memory problem. The data structure was highly inefficient, duplicating days without events and storing far more information than necessary (we only ever took the users and records from the last bucket). My hypothesis was this:
- Memory usage was scaling with the number of days. Later days were more expensive; as time went on and new records were added to the sprint, they were stored for that day and all later days.
- At some number of days, the level of memory usage was becoming an issue and causing the browser slowdown we were seeing. Each new day was both slower to clone and made every day after it slower to clone.
- This would explain the issue we were seeing: that the same number of events over 1,000 days was orders of magnitude slower than over seven, especially when the number of events was far more than the number of days in either case.
I couldn't see any other explanation for the slowness, and this would fix the original problem — the number of nonempty days in the report was on the same scale as multi-team reports, which we knew could parse successfully. Most importantly, this hypothesis gave me an excuse to tidy some code I'd had my eyes on for a while. I jumped in while my colleagues kept investigating.
The resulting structure looked something like this:
// As before, but without records or users (those move to the container)
class ParsedSprintProgressDay {
start: Date;
events: RecordEvent[];
userPerformances: { [userId: string]: Performance };
team: Performance;
constructor(start: Date) { ... }
clone(newStart: Date) { ... }
}
// Top level container for progress days - has at least one day
// Days without events are stored as null
class ParsedSprintProgressData {
start: Date;
days: [
ParsedSprintProgressDay,
...(ParsedSprintProgressDay | null)[]
];
records: Set<Feature|Requirement>;
users: { [userId: string]: User };
// Returns the last non-null day with an index <= the given index
getDataAt: (index: number) => ParsedSprintProgressDay;
// Creates and returns a new day, adding intervening nulls
getEventDay: (eventDate: Date) => ParsedSprintProgressDay;
}
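A minimal, self-contained sketch of how those two accessors might work (day contents are reduced to an events array for brevity, and names here are illustrative; the real code also carries performance data and handles time zones):

```typescript
// Simplified sketch of the container's accessors. Empty days cost a
// single null entry instead of a cloned day object.
type SprintDay = { start: Date; events: string[] };

const DAY_MS = 24 * 60 * 60 * 1000;

class ProgressDataSketch {
  days: [SprintDay, ...(SprintDay | null)[]];

  constructor(public start: Date) {
    this.days = [{ start, events: [] }];
  }

  // Returns the last non-null day with an index <= the given index
  getDataAt(index: number): SprintDay {
    for (let i = Math.min(index, this.days.length - 1); i > 0; i--) {
      const day = this.days[i];
      if (day != null) return day;
    }
    return this.days[0];
  }

  // Returns the day containing the event date, creating it on demand
  // and padding any gap with cheap nulls
  getEventDay(eventDate: Date): SprintDay {
    const index = Math.floor((eventDate.getTime() - this.start.getTime()) / DAY_MS);
    while (this.days.length <= index) this.days.push(null);
    let day = this.days[index];
    if (day == null) {
      day = { start: new Date(this.start.getTime() + index * DAY_MS), events: [] };
      this.days[index] = day;
    }
    return day;
  }
}
```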
This was conceptually a fairly small change, but I quickly got caught in the weeds. Changing the structure so drastically led to cascading effects downstream, and the real code had a lot of complexities that took a while to hammer out. When it was done, I had a fix for the failing report — but my change didn't actually prove or disprove the hypothesis it was meant to solve. Fortunately, my team hadn't been idle.
Testing the hypothesis
While I was working on that change, my colleague had built a simple benchmarking suite. It generated random datasets at different scales, ran the parsing code against them, and output the results, showing both total runtime and approximate heap memory usage. This gave us a way to verify that any changes we made outperformed what was already there. It also helped us make sure we weren't improving behavior in one case (few events over many days) at the cost of others.
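The suite itself isn't published, but a bare-bones version of the idea in Node.js might look like this (the parse function and dataset shape here are hypothetical stand-ins, not the real report code):

```typescript
// Minimal benchmark harness sketch: times a parse function against a
// generated dataset and reports elapsed time and approximate heap growth.
type BenchEvent = { createdAt: Date; kind: string };

function generateEvents(count: number, days: number): BenchEvent[] {
  // Spread `count` events across `days` calendar days
  return Array.from({ length: count }, (_, i) => ({
    createdAt: new Date(Date.UTC(2026, 0, 1 + (i % days))),
    kind: "Added",
  }));
}

function benchmark<T>(name: string, parse: (events: BenchEvent[]) => T, events: BenchEvent[]) {
  (globalThis as any).gc?.(); // stabilize the heap; run node with --expose-gc
  const heapBefore = process.memoryUsage().heapUsed;
  const startTime = performance.now();
  const result = parse(events);
  const elapsedMs = performance.now() - startTime;
  const heapDelta = process.memoryUsage().heapUsed - heapBefore;
  console.log(`${name}: ${elapsedMs.toFixed(1)}ms, ~${Math.round(heapDelta / 1024)}KiB heap`);
  return { elapsedMs, heapDelta, result };
}
```

Running something like this against both parsers, with datasets ranging from a few days to a thousand, is what made the comparisons below possible.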
The initial results for my change were mixed. Memory usage was down by up to 90% on the larger datasets, but that translated into only a 20% improvement in runtime, and that improvement held steady regardless of the number of days. The new algorithm was only meaningfully faster in the case where most days were empty. So it did solve the original problem, but my hypothesis was dead in the water.
However, my colleague had his own change. Besides building the benchmarking suite, he'd also used the browser profiling tools on the parsing code, looking for easy fixes that could improve performance (something I hadn't actually done before jumping into my change). What he found was that instead of the profile we would expect — most of the work being done in event parsing, and some in the day cloning — the algorithm actually spent almost all of its total runtime inside two seemingly innocuous helper methods.
Solution #2: Speeding up hot paths
The first was lodash.cloneDeep(). We used it to simplify copying the nested performance objects within the day objects. These were typed objects with a known structure, so cloneDeep was purely a convenience; we hadn't realized how much of a performance overhead it introduced. I had inadvertently reduced the number of calls to cloneDeep when I edited the data structure to remove redundancy, which is the most likely cause of the small performance improvement my algorithm introduced.
The other was even more surprising: moment.format. We use moment to make sure we correctly handle users with different time zone preferences when rendering reports. We know moment is expensive, so we take care to limit how often we instantiate new moments or perform operations on them. However, we had used moment.format throughout the parsing code to convert timestamps to human-readable strings, unaware that it was just as expensive. Because we formatted the timestamp of every event in the data, this overhead had a drastic impact, even on sprints with relatively few days.
For cloneDeep, the replacement was simple. Sprint data had a known shape without arbitrary nesting, so we could replace calls to cloneDeep with explicit duplication:
clone(newStart: Date): ParsedSprintProgressDay {
const newClone = new ParsedSprintProgressDay(newStart);
// Performance.clone() explicitly creates a new performance and
// copies its fields with the spread operator
newClone.team = this.team.clone();
Object.entries(this.userPerformances).forEach(([userId, performance]) => {
newClone.userPerformances[userId] = performance.clone();
});
// No need to clone records or users — they live on the container
return newClone;
}
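The Performance.clone() call above isn't shown in the post. Since each bucket is a flat object, a shallow copy with the spread operator is enough; a plausible sketch (class and field names here are assumptions, not the real code):

```typescript
// Hypothetical sketch of a spread-based Performance.clone(). Each
// bucket is flat, so no recursive deep clone is needed.
type Bucket = { count: number; initialEstimate: number; workDone: number; remainingEstimate: number };

class PerformanceSketch {
  buckets: { [kind: string]: Bucket } = {};

  clone(): PerformanceSketch {
    const copy = new PerformanceSketch();
    for (const [kind, bucket] of Object.entries(this.buckets)) {
      copy.buckets[kind] = { ...bucket }; // one new flat object per bucket
    }
    return copy;
  }
}
```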
Because we were only using an ISO 8601 date format, replacing moment.format was equally trivial. We created our own explicit formatting method:
function formatDate(date: Date | string): string {
const d = typeof date === 'string' ? new Date(date) : date;
const year = d.getFullYear();
const month = String(d.getMonth() + 1).padStart(2, '0');
const day = String(d.getDate()).padStart(2, '0');
return `${year}-${month}-${day}`;
}
// Before:
const dateString = moment(event.date).format('YYYY-MM-DD');
// After:
const dateString = formatDate(event.date);
Our preferred solution: Combined approaches
Fortunately, this wasn't a case where we had to choose one approach or the other. We could easily fold the removal of those expensive methods into the improved data structure, giving us code with the best of both: the combined algorithm uses less memory, is consistently faster in every case, and spends almost no time on days without events.
The report that first triggered this investigation now finishes parsing in a fraction of a second — an improvement of three orders of magnitude. Unfortunately, it still takes a few seconds to render overall because it has to load the raw events. I have some improvements in mind for that as well, but you can bet that I'll be profiling everything first this time.
If you like deep dives and making things go fast, join Aha! See our open engineering roles.
