
Why Is My Sprint Unpredictable? (And How to Fix the Spillover)
A practical walkthrough — from install to your first data-backed retro.
Your sprints keep missing their commitments. Stories spill over into the next sprint. Some work flies through in a day, other items drag on for a week, and nobody can explain why. You've tried longer planning sessions, stricter acceptance criteria, even cutting scope — but the spillover keeps happening.
The problem usually isn't effort. It's visibility. You can't fix what you can't see — and most Jira setups don't show you where work actually loses time or why certain work types are consistently unpredictable.
That's what Flow Intelligence is for. It's a reporting and analytics module built into Smart Guess's Jira apps — available in Predictive Planning Poker, Time in Status, and the Free Planning Poker app. It analyzes your team's delivery data — cycle times, queues, variation patterns — and pairs it with an AI agile coach that identifies root causes, explains why certain items take longer than expected, and suggests experiments your team can try. Your sprint data, your team's delivery patterns, and actionable guidance — all in one place.
This walkthrough covers how to go from first open to running your first data-backed retrospective.
Step 1: Read the Dashboard Summary
Open Flow Intelligence and start at the top of the dashboard. The Predictability section looks at your last six sprints and gives you a verdict — something like "Your Delivery Process is Moderately Predictable" — with guidance on where to focus improvements. Below that, you'll see the Overall Process Variation (CV) as a single percentage, along with a Variation by Metric breakdown showing CV for Throughput, Work in Progress, and Cycle Time individually. You can toggle between Story Points and Work Type views to see where the variation lives.
Below the predictability section, four Key Flow Metrics cards show the sprint-level numbers: Cycle Time, Work In Progress, Throughput, and Flow Efficiency.
Each one shows the current value, a sparkline trend, and a percentage change versus the previous sprint.
- Cycle Time — average time to complete work. If it's rising, something is slowing down.
- Work In Progress — average items in progress at the same time. High WIP is the most common cause of rising cycle time. When the team juggles too many things, everything takes longer.
- Throughput — issues completed. This is the output metric. It should be stable or rising — but only if cycle time isn't rising with it.
- Flow Efficiency — the percentage of time work was actively being worked on versus sitting in a queue. If this number is 30%, items spend 70% of the time waiting to be worked on. That's where the leverage is (see the sketch after this list).
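To make that arithmetic concrete, here's a minimal sketch of the flow efficiency calculation. The numbers are illustrative, not pulled from the app:

```python
def flow_efficiency(working_days: float, waiting_days: float) -> float:
    """Fraction of total elapsed time the item was actively worked on."""
    total = working_days + waiting_days
    return working_days / total if total else 0.0

# An item that took 10 days end to end but was only actively worked for 3:
print(f"{flow_efficiency(working_days=3, waiting_days=7):.0%}")  # 30% -- it waited the other 70%
```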
What to look for
The overall CV number. This is your headline metric. Flow Intelligence uses Coefficient of Variation (CV) to measure how consistent your delivery is — lower means more predictable. Here's how to read it:
- Below 30% — Highly Predictable. Consistent delivery times. Your forecasts are reliable.
- 30–50% — Moderately Predictable. Some variation, but forecasting still works.
- 50–70% — Low Predictability. Significant inconsistency. Forecasts become less reliable.
- Above 70% — Unpredictable. Delivery times vary wildly. Estimates at this level are essentially guesses.
In practical terms: at 25% CV, if your 85th percentile says "done within 3 days," you can trust that number. At 70% CV, that same "3 days" forecast is unreliable — cycle time data is typically right-skewed, meaning a few items drag out far longer than the rest. High CV doesn't mean everything is slow. It means a handful of items take much longer than expected, and you can't predict which ones.
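If you're curious how that single number is computed: CV is just the standard deviation of your cycle times divided by their mean. A minimal sketch with made-up data (whether the app uses population or sample standard deviation is an implementation detail; the idea is the same):

```python
import statistics

def coefficient_of_variation(cycle_times_days: list[float]) -> float:
    """CV = standard deviation / mean. Lower means more consistent delivery."""
    return statistics.pstdev(cycle_times_days) / statistics.mean(cycle_times_days)

consistent = [2.0, 2.5, 3.0, 2.5, 3.0, 2.0]  # every item lands in a tight band
erratic = [0.5, 1.0, 6.0, 0.5, 8.0, 1.0]     # a few items drag out far longer

print(f"{coefficient_of_variation(consistent):.0%}")  # 16% -- highly predictable
print(f"{coefficient_of_variation(erratic):.0%}")     # 106% -- essentially guesses
```

Note that both teams average roughly the same cycle time; only the spread differs. That spread is exactly what CV isolates.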
Which metric is driving the variation. If Throughput CV is 82% but Cycle Time CV is only 16%, the problem isn't that individual items are slow — it's that the number of items completed swings wildly sprint to sprint. That usually points to capacity changes, spillover, or unplanned work disrupting the sprint — not execution speed.
The trend across sprints. Is your overall CV improving? Or has it plateaued? This is the number you bring to your retro — not to blame anyone, but to track whether your process changes are actually working.
Step 2: See the Bigger Picture with Trends
The dashboard summary shows you this sprint. The Trends tab shows you the last six sprints side by side — so you can see whether things are getting better, worse, or stuck.
Open the Trends tab and you'll see a single chart plotting all four key metrics — Cycle Time, Throughput, WIP, and Flow Efficiency — across your recent sprints. Each metric gets its own line, so you can see how they move relative to each other.
What to look for
Throughput dropping while WIP stays flat (or rises). The team is starting work but not finishing it. This usually means too many things in progress at once, or work getting blocked in late-stage queues.
Cycle time climbing while throughput holds. Items are taking longer, but the team is still pushing the same number through. That's unsustainable — either quality is slipping or people are working longer hours to compensate.
Flow efficiency trending down. Work is spending more time waiting. This points to growing queues — review or testing bottlenecks, or work getting stuck in handoffs between team members.
A sudden spike or drop in any metric. Something changed. Maybe a team member went on leave. Maybe a wave of bugs landed. Maybe mid-sprint scope changes disrupted the plan. The spike itself isn't the insight — the conversation about why it happened is.
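Many of these patterns trace back to Little's Law, which ties the three metrics together: average WIP equals throughput multiplied by average cycle time. A quick sketch of what that implies, with illustrative numbers:

```python
# Little's Law: avg WIP = throughput * avg cycle time.
# Rearranged: cycle time = WIP / throughput.

def expected_cycle_time_days(avg_wip: float, throughput_per_day: float) -> float:
    return avg_wip / throughput_per_day

print(expected_cycle_time_days(avg_wip=6, throughput_per_day=2))   # 3.0 days
print(expected_cycle_time_days(avg_wip=12, throughput_per_day=2))  # 6.0 days: double the WIP, double the wait
```

That's why rising WIP with flat throughput shows up as climbing cycle time a sprint or two later.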
The Trends tab is the one you pull up in retrospectives to show the team what's actually changing over time, rather than relying on memory of how the last few sprints felt.
Step 3: Drill into Predictability
The dashboard summary gave you the overall CV. The Predictability tab lets you drill into which work is predictable and which isn't — and why.
The heading asks: Cycle Time Predictability — Which work types are predictable? You can toggle between Story Points and Work Type to see the breakdown differently.
Start with the Story Points view. This is where you should focus your improvement efforts. Work types will naturally have different cycle times — bugs behave differently from stories, and that's expected. But within a given story point size, your team should be delivering consistently. If 1-point items all take roughly the same time, your estimation is working. If they don't, the problem isn't the work — it's the process around it.
You'll often see a striking difference between the two views. The same data might show 16% CV by Story Points (Highly Predictable) but 60% CV by Work Type (Low Predictability).
That gap tells you something important: when your team sizes work with story points, items of the same size get delivered consistently. But within a work type — stories, bugs, tasks — cycle times are all over the place. Some bugs take half a day, others take five. That's expected, because work types mix different sizes and complexities together. Story point grouping strips that noise away and shows you whether your process is actually consistent.
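Here's a small sketch of how that gap arises when the same items are grouped two different ways. The data and field names are hypothetical:

```python
import statistics
from collections import defaultdict

def cv(values: list[float]) -> float:
    return statistics.pstdev(values) / statistics.mean(values)

# (cycle_time_days, story_points, work_type) -- illustrative, not a real export
items = [
    (1.0, 1, "Bug"), (1.2, 1, "Task"), (0.9, 1, "Story"),
    (2.1, 2, "Story"), (2.0, 2, "Task"), (2.3, 2, "Bug"),
    (4.0, 3, "Story"), (4.2, 3, "Story"), (3.8, 3, "Bug"),
]

by_points, by_type = defaultdict(list), defaultdict(list)
for days, points, work_type in items:
    by_points[points].append(days)
    by_type[work_type].append(days)

for points, durations in sorted(by_points.items()):
    print(f"{points}-point items: CV {cv(durations):.0%}")  # 4-12% -- same size, similar duration
for work_type, durations in sorted(by_type.items()):
    print(f"{work_type}: CV {cv(durations):.0%}")           # 25-49% -- each type mixes sizes
```

Same nine items, two very different verdicts; the grouping, not the work, changes the picture.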
The summary panel
At the top you'll see the Cycle Time Predictability (CV) as a percentage with a color indicator — green for highly predictable, yellow for moderate, red for unpredictable.
Next to it, a forecast panel shows How long does it typically take? with three numbers:
- Likely — the median duration (50th percentile)
- Plan for — a safer estimate (85th percentile), shown as a multiple of the median
- Worst — the pessimistic bound (95th percentile)
These forecasts are only as trustworthy as the CV suggests. A "Plan for 3.1 days" at 16% CV is solid. The same number at 70% CV is unreliable.
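Those three numbers are just percentiles of your cycle time distribution. A minimal sketch with illustrative data, only to show where they come from:

```python
import statistics

cycle_times_days = [1.5, 2.0, 2.2, 2.5, 2.8, 3.0, 3.4, 4.1, 5.5, 9.0]  # note the right skew

# Cut points at every percentile; indexes 49/84/94 hold the 50th/85th/95th.
pct = statistics.quantiles(cycle_times_days, n=100, method="inclusive")
likely, plan_for, worst = pct[49], pct[84], pct[94]

print(f"Likely:   {likely:.1f} days")                                      # 2.9
print(f"Plan for: {plan_for:.1f} days ({plan_for / likely:.1f}x median)")  # 5.0 (1.7x)
print(f"Worst:    {worst:.1f} days")                                       # 7.4
```

One right-skewed item (the 9-day outlier) is enough to push "Worst" to more than double the median, which is why the skew matters.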
This is where you find actionable patterns. You might see 1-point items at 46% CV contributing 25% of total variation, while 2-point and 3-point items sit comfortably in the green. That tells you the smallest items — the ones the team assumes are quick — are actually the least consistent. Maybe some "1-pointers" are genuinely trivial while others hide unexpected complexity. That's a conversation worth having in refinement.
To understand what's causing the variation, drill down. Click the group with the highest CV in the "Driven by" breakdown. If 1-point items are at 46% CV, click into them.
Now the outliers become meaningful — and they go both ways. A dot above the 95th percentile line is an item that should have been quick but wasn't — maybe it hit a blocker or was undersized in refinement. But look for dots near zero too. An item completed in minutes that was estimated the same as others taking a full day is just as much an outlier — it might have skipped steps in the workflow, or it was trivially small and shouldn't have been sized the same. Both extremes widen the variation.
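Flagging both tails programmatically is straightforward. A toy version, where the thresholds are assumptions to tune for your workflow, not app defaults:

```python
one_point_items_days = [0.1, 0.9, 1.0, 1.1, 1.2, 1.0, 6.5]  # hypothetical 1-point cycle times

p95 = 5.0      # e.g. the 95th-percentile line from the scatter plot
floor = 0.25   # "done in minutes" -- may have skipped workflow steps

slow_outliers = [d for d in one_point_items_days if d > p95]
instant_outliers = [d for d in one_point_items_days if d < floor]
print(slow_outliers, instant_outliers)  # [6.5] [0.1] -- both widen the variation
```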
The Time in Status panel
Click any dot in the scatter plot to open the issue and its Time in Status panel. From there you'll see a detailed breakdown of where time was actually spent.
You'll see a summary with cycle time split into working time and wait time, plus how long the item sat in the backlog before anyone touched it. Below that, a status flow shows the path the issue took — which statuses it moved through and how long it spent in each. A timeline at the bottom shows every transition with timestamps.
That's how you go from "this item is an outlier" to "this is why it's an outlier."
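Conceptually, that breakdown is derived from the transition timestamps. A sketch of the idea, using a hypothetical transition log rather than the app's actual data format:

```python
from collections import defaultdict
from datetime import datetime

# (status, entered_at) for one issue -- hypothetical, for illustration only
transitions = [
    ("To Do",       datetime(2024, 5, 1, 9, 0)),
    ("In Progress", datetime(2024, 5, 2, 9, 0)),
    ("In Review",   datetime(2024, 5, 2, 15, 0)),
    ("In Progress", datetime(2024, 5, 6, 9, 0)),
    ("Done",        datetime(2024, 5, 6, 13, 0)),
]

time_in_status = defaultdict(float)
for (status, entered), (_, left) in zip(transitions, transitions[1:]):
    time_in_status[status] += (left - entered).total_seconds() / 3600

for status, hours in time_in_status.items():
    print(f"{status:<12} {hours:.0f}h")
# To Do 24h, In Progress 10h, In Review 90h -> the review queue ate this item's cycle time
```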
Step 4: Find the Root Cause — Plan the Next Experiment
You've seen the data. You can see the patterns. Now you need to decide what to do about it.
This is where you bring in Noesis — an AI agile coach that opens in the sidebar. Every report tab has an Ask Noesis button. Click it, and Noesis works with exactly the data you're looking at: it gives you instant analysis, identifies root causes across your sprint data, translates the statistics into plain language your whole team can follow, and suggests concrete experiments with measurable outcomes.
How it works
Say you've drilled into 1-point items on the Predictability tab and see 46% CV. Click Ask Noesis and it automatically builds a detailed prompt with your data — the CV, the median cycle time, the item count — and asks what specific actions you should take to improve consistency. No typing required.
Noesis responds with a structured analysis: which work types within that group are driving the variation, what the root cause likely is, and concrete recommendations. For example, it might find that Tasks and Stories share the same median cycle time but Tasks are far more scattered — suggesting an inconsistent definition of "done" for Tasks rather than actual complexity differences. Some tasks complete in minutes while others take over a day, and those probably shouldn't share the same size.
The recommendations are specific and measurable: items completing in under half a day probably shouldn't share a sizing bucket with items that take 1.5 days — split those into different sizes, define explicit criteria for what qualifies as 1-point work, or grow your sample size before drawing firm conclusions.
You can ask follow-up questions to dig deeper. Noesis keeps a persistent conversation history per project and per board, so you can pick up where you left off next sprint.
What if my team isn't familiar with flow concepts or the statistics?
Every report tab has a Learn more button next to Ask Noesis. This isn't a link to a help doc — it opens a guided learning path powered by Noesis, tailored to the report you're looking at.
On the Trends tab, it teaches you how to read trend patterns, what healthy versus concerning trends look like, and how metrics like cycle time, throughput, and WIP relate to each other through Little's Law. On the Predictability tab, it explains how to read the scatter plot and what high versus low variation means for different work types. On the Cumulative Flow tab, it walks you through how to spot bottlenecks and WIP accumulation from the shape of the chart.
Each learning path starts with the fundamentals and then goes deeper — from reading your data, to taking your first improvement actions, to making the case for change with stakeholders. The team doesn't need prior training in flow metrics. They just need to click Learn more on whatever they're looking at, and Noesis meets them where they are.
Step 5: Run Your First Data-Backed Retro
This is where it all comes together. Instead of walking into the retrospective with opinions about what went wrong, you walk in with data.
Before the retro - See what's wrong
Open Flow Intelligence. Check the predictability verdict and the four flow metrics at the top — note any changes from last sprint. Open the Trends tab — are the metrics moving in the right direction? Switch to Predictability — which story point sizes or work types are driving the variation? Then click Ask Noesis on the tab with the most interesting findings and review the suggested experiments.
You now have three things most retros lack: where the bottleneck is, what's causing it, and what to try next.
During the retro - Fix it
Share the predictability score or flow efficiency with the team. It's a single number that captures sprint health without pointing fingers. Then show the Trends chart — let the team see the trajectory over the last six sprints. Finally, share one or two Noesis suggestions as discussion starters.
The conversation shifts from "I felt like we were slow this sprint" to statements grounded in data: "our throughput dropped 40% while WIP stayed the same," or "our reviews take 3+ days, pushing cycle time beyond a week," or "testing changes in part X of our code consistently extends beyond 2 days." Each one points to a specific bottleneck with a specific experiment to try.
After the retro - Repeat
Pick at least one experiment. Run it for two sprints. Then come back to Flow Intelligence and check: did the predictability score improve? Did spillover decrease? Did the bottleneck shift? The Trends tab will show you whether the change made a difference — this is how you close the loop.
Beyond These Two Reports
This walkthrough focused on Trends and Predictability — the two reports that give you the clearest picture of sprint health and delivery consistency. But Flow Intelligence has three more report tabs worth exploring as you get comfortable:
Time in Status — shows where every issue spent its time, broken down by status. This is where you find systemic queues and waiting time. We cover this in depth in our Time in Status walkthrough.
Cumulative Flow — a stacked area chart that shows how work moves through stages over the course of a sprint. Widening bands mean work is piling up in a stage faster than it's being drained.
Sprint Health — an automatic health check that flags planning issues (scope added mid-sprint), execution issues (long cycle times, WIP overload), and delivery issues. Especially useful mid-sprint to catch problems before they cause spillover.