
Why Is My Sprint Unpredictable? (And How to Fix the Spillover)
A practical walkthrough — from install to your first data-backed retro.
Your sprints keep missing their commitments. Stories spill over into the next sprint. Some work flies through in a day, other items drag on for a week, and nobody can explain why. You've tried longer planning sessions, stricter acceptance criteria, even cutting scope — but the spillover keeps happening.
The problem usually isn't effort. It's visibility. You can't fix what you can't see — and most Jira setups don't show you where work actually loses time or why certain work types are consistently unpredictable.
That's what Flow Intelligence is for. It's a reporting and analytics module built into Smart Guess's Jira apps — available in Predictive Planning Poker, Time in Status, and the Free Planning Poker app. It analyzes your team's delivery data — cycle times, queues, variation patterns — and pairs it with an AI agile coach that identifies root causes, explains why certain items take longer than expected, and suggests experiments your team can try. Your sprint data, your team's delivery patterns, and actionable guidance — all in one place.
This walkthrough covers how to go from first open to running your first data-backed retrospective.
Step 1: Read the Dashboard Summary
Open Flow Intelligence and start at the top of the dashboard. The Predictability section looks at your last six sprints and gives you a verdict — something like "Your Delivery Process is Moderately Predictable" — with guidance on where to focus improvements. Below that, you'll see the Overall Process Variation (CV) as a single percentage, along with a Variation by Metric breakdown showing CV for Throughput, Work in Progress, and Cycle Time individually. You can toggle between Story Points and Work Type views to see where the variation lives.
Below the predictability section, four Key Flow Metrics cards show the sprint-level numbers: Cycle Time, Work In Progress, Throughput, and Flow Efficiency.
Each one shows the current value, a sparkline trend, and a percentage change versus the previous sprint.
- Cycle Time — average time to complete work. If it's rising, something is slowing down.
- Work In Progress — average items in progress at the same time. High WIP is the most common cause of rising cycle time. When the team juggles too many things, everything takes longer.
- Throughput — issues completed. This is the output metric. It should be stable or rising — but only if cycle time isn't rising with it.
- Flow Efficiency — the percentage of time work was actively being worked on versus sitting in a queue. If this number is 30%, items spend 70% of the time waiting to be worked on. That's where the leverage is (see the sketch after this list).
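To make that arithmetic concrete, here's a minimal sketch of the flow efficiency calculation. The numbers are illustrative, not pulled from the app:

```python
def flow_efficiency(working_days: float, waiting_days: float) -> float:
    """Fraction of total elapsed time the item was actively worked on."""
    total = working_days + waiting_days
    return working_days / total if total else 0.0

# An item that took 10 days end to end but was only actively worked for 3:
print(f"{flow_efficiency(working_days=3, waiting_days=7):.0%}")  # 30% -- it waited the other 70%
```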
What to look for
The overall CV number. This is your headline metric. Flow Intelligence uses Coefficient of Variation (CV) to measure how consistent your delivery is — lower means more predictable. Here's how to read it:
- Below 30% — Highly Predictable. Consistent delivery times. Your forecasts are reliable.
- 30–50% — Moderately Predictable. Some variation, but forecasting still works.
- 50–70% — Low Predictability. Significant inconsistency. Forecasts become less reliable.
- Above 70% — Unpredictable. Delivery times vary wildly. Estimates at this level are essentially guesses.
In practical terms: at 25% CV, if your 85th percentile says "done within 3 days," you can trust that number. At 70% CV, that same "3 days" forecast is unreliable — cycle time data is typically right-skewed, meaning a few items drag out far longer than the rest. High CV doesn't mean everything is slow. It means a handful of items take much longer than expected, and you can't predict which ones.
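If you're curious how that single number is computed: CV is just the standard deviation of your cycle times divided by their mean. A minimal sketch with made-up data (whether the app uses population or sample standard deviation is an implementation detail; the idea is the same):

```python
import statistics

def coefficient_of_variation(cycle_times_days: list[float]) -> float:
    """CV = standard deviation / mean. Lower means more consistent delivery."""
    return statistics.pstdev(cycle_times_days) / statistics.mean(cycle_times_days)

consistent = [2.0, 2.5, 3.0, 2.5, 3.0, 2.0]  # every item lands in a tight band
erratic = [0.5, 1.0, 6.0, 0.5, 8.0, 1.0]     # a few items drag out far longer

print(f"{coefficient_of_variation(consistent):.0%}")  # 16% -- highly predictable
print(f"{coefficient_of_variation(erratic):.0%}")     # 106% -- essentially guesses
```

Note that both teams average roughly the same cycle time; only the spread differs. That spread is exactly what CV isolates.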
Which metric is driving the variation. If Throughput CV is 82% but Cycle Time CV is only 16%, the problem isn't that individual items are slow — it's that the number of items completed swings wildly sprint to sprint. That usually points to capacity changes, spillover, or unplanned work disrupting the sprint — not execution speed.
The trend across sprints. Is your overall CV improving? Or has it plateaued? This is the number you bring to your retro — not to blame anyone, but to track whether your process changes are actually working.
Step 2: See the Bigger Picture with Trends
The dashboard summary shows you this sprint. The Trends tab shows you the last six sprints side by side — so you can see whether things are getting better, worse, or stuck.
Open the Trends tab and you'll see a single chart plotting all four key metrics — Cycle Time, Throughput, WIP, and Flow Efficiency — across your recent sprints. Each metric gets its own line, so you can see how they move relative to each other.
What to look for
Throughput dropping while WIP stays flat (or rises). The team is starting work but not finishing it. This usually means too many things in progress at once, or work getting blocked in late-stage queues.
Cycle time climbing while throughput holds. Items are taking longer, but the team is still pushing the same number through. That's unsustainable — either quality is slipping or people are working longer hours to compensate.
Flow efficiency trending down. Work is spending more time waiting. This points to growing queues — review or testing bottlenecks, or work getting stuck in handoffs between team members.
A sudden spike or drop in any metric. Something changed. Maybe a team member went on leave. Maybe a wave of bugs landed. Maybe mid-sprint scope changes disrupted the plan. The spike itself isn't the insight — the conversation about why it happened is.
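Many of these patterns trace back to Little's Law, which ties the three metrics together: average WIP equals throughput multiplied by average cycle time. A quick sketch of what that implies, with illustrative numbers:

```python
# Little's Law: avg WIP = throughput * avg cycle time.
# Rearranged: cycle time = WIP / throughput.

def expected_cycle_time_days(avg_wip: float, throughput_per_day: float) -> float:
    return avg_wip / throughput_per_day

print(expected_cycle_time_days(avg_wip=6, throughput_per_day=2))   # 3.0 days
print(expected_cycle_time_days(avg_wip=12, throughput_per_day=2))  # 6.0 days: double the WIP, double the wait
```

That's why rising WIP with flat throughput shows up as climbing cycle time a sprint or two later.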
The Trends tab is the one you pull up in retrospectives to show the team what's actually changing over time, rather than relying on memory of how the last few sprints felt.
Step 3: Drill into Predictability
The dashboard summary gave you the overall CV. The Predictability tab lets you drill into which work is predictable and which isn't — and why.
The heading asks: Cycle Time Predictability — Which work types are predictable? You can toggle between Story Points and Work Type to see the breakdown differently.
Start with the Story Points view. This is where you should focus your improvement efforts. Work types will naturally have different cycle times — bugs behave differently from stories, and that's expected. But within a given story point size, your team should be delivering consistently. If 1-point items all take roughly the same time, your estimation is working. If they don't, the problem isn't the work — it's the process around it.
You'll often see a striking difference between the two views. The same data might show 16% CV by Story Points (Highly Predictable) but 60% CV by Work Type (Low Predictability).
That gap tells you something important: when your team sizes work with story points, items of the same size get delivered consistently. But within a work type — stories, bugs, tasks — cycle times are all over the place. Some bugs take half a day, others take five. That's expected, because work types mix different sizes and complexities together. Story point grouping strips that noise away and shows you whether your process is actually consistent.
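Here's a small sketch of how that gap arises when the same items are grouped two different ways. The data and field names are hypothetical:

```python
import statistics
from collections import defaultdict

def cv(values: list[float]) -> float:
    return statistics.pstdev(values) / statistics.mean(values)

# (cycle_time_days, story_points, work_type) -- illustrative, not a real export
items = [
    (1.0, 1, "Bug"), (1.2, 1, "Task"), (0.9, 1, "Story"),
    (2.1, 2, "Story"), (2.0, 2, "Task"), (2.3, 2, "Bug"),
    (4.0, 3, "Story"), (4.2, 3, "Story"), (3.8, 3, "Bug"),
]

by_points, by_type = defaultdict(list), defaultdict(list)
for days, points, work_type in items:
    by_points[points].append(days)
    by_type[work_type].append(days)

for points, durations in sorted(by_points.items()):
    print(f"{points}-point items: CV {cv(durations):.0%}")  # 4-12% -- same size, similar duration
for work_type, durations in sorted(by_type.items()):
    print(f"{work_type}: CV {cv(durations):.0%}")           # 25-49% -- each type mixes sizes
```

Same nine items, two very different verdicts; the grouping, not the work, changes the picture.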
The summary panel
At the top you'll see the Cycle Time Predictability (CV) as a percentage with a color indicator — green for highly predictable, yellow for moderate, red for unpredictable.
Next to it, a forecast panel shows How long does it typically take? with three numbers:
- Likely — the median duration (50th percentile)
- Plan for — a safer estimate (85th percentile), shown as a multiple of the median
- Worst — the pessimistic bound (95th percentile)
These forecasts are only as trustworthy as the CV suggests. A "Plan for 3.1 days" at 16% CV is solid. The same number at 70% CV is unreliable.
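Those three numbers are just percentiles of your cycle time distribution. A minimal sketch with illustrative data, only to show where they come from:

```python
import statistics

cycle_times_days = [1.5, 2.0, 2.2, 2.5, 2.8, 3.0, 3.4, 4.1, 5.5, 9.0]  # note the right skew

# Cut points at every percentile; indexes 49/84/94 hold the 50th/85th/95th.
pct = statistics.quantiles(cycle_times_days, n=100, method="inclusive")
likely, plan_for, worst = pct[49], pct[84], pct[94]

print(f"Likely:   {likely:.1f} days")                                      # 2.9
print(f"Plan for: {plan_for:.1f} days ({plan_for / likely:.1f}x median)")  # 5.0 (1.7x)
print(f"Worst:    {worst:.1f} days")                                       # 7.4
```

One right-skewed item (the 9-day outlier) is enough to push "Worst" to more than double the median, which is why the skew matters.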
This is where you find actionable patterns. You might see 1-point items at 46% CV contributing 25% of total variation, while 2-point and 3-point items sit comfortably in the green. That tells you the smallest items — the ones the team assumes are quick — are actually the least consistent. Maybe some "1-pointers" are genuinely trivial while others hide unexpected complexity. That's a conversation worth having in refinement.
To understand what's causing the variation, drill down. Click the group with the highest CV in the "Driven by" breakdown. If 1-point items are at 46% CV, click into them.
Now the outliers become meaningful — and they go both ways. A dot above the 95th percentile line is an item that should have been quick but wasn't — maybe it hit a blocker or was undersized in refinement. But look for dots near zero too. An item completed in minutes that was estimated the same as others taking a full day is just as much an outlier — it might have skipped steps in the workflow, or it was trivially small and shouldn't have been sized the same. Both extremes widen the variation.
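Flagging both tails programmatically is straightforward. A toy version, where the thresholds are assumptions to tune for your workflow, not app defaults:

```python
one_point_items_days = [0.1, 0.9, 1.0, 1.1, 1.2, 1.0, 6.5]  # hypothetical 1-point cycle times

p95 = 5.0      # e.g. the 95th-percentile line from the scatter plot
floor = 0.25   # "done in minutes" -- may have skipped workflow steps

slow_outliers = [d for d in one_point_items_days if d > p95]
instant_outliers = [d for d in one_point_items_days if d < floor]
print(slow_outliers, instant_outliers)  # [6.5] [0.1] -- both widen the variation
```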
The Time in Status panel
Click any dot in the scatter plot to open the issue and its Time in Status panel. From there you'll see a detailed breakdown of where time was actually spent.
You'll see a summary with cycle time split into working time and wait time, plus how long the item sat in the backlog before anyone touched it. Below that, a status flow shows the path the issue took — which statuses it moved through and how long it spent in each. A timeline at the bottom shows every transition with timestamps.
That's how you go from "this item is an outlier" to "this is why it's an outlier."
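Conceptually, that breakdown is derived from the transition timestamps. A sketch of the idea, using a hypothetical transition log rather than the app's actual data format:

```python
from collections import defaultdict
from datetime import datetime

# (status, entered_at) for one issue -- hypothetical, for illustration only
transitions = [
    ("To Do",       datetime(2024, 5, 1, 9, 0)),
    ("In Progress", datetime(2024, 5, 2, 9, 0)),
    ("In Review",   datetime(2024, 5, 2, 15, 0)),
    ("In Progress", datetime(2024, 5, 6, 9, 0)),
    ("Done",        datetime(2024, 5, 6, 13, 0)),
]

time_in_status = defaultdict(float)
for (status, entered), (_, left) in zip(transitions, transitions[1:]):
    time_in_status[status] += (left - entered).total_seconds() / 3600

for status, hours in time_in_status.items():
    print(f"{status:<12} {hours:.0f}h")
# To Do 24h, In Progress 10h, In Review 90h -> the review queue ate this item's cycle time
```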
Step 4: Find the Root Cause — Plan the Next Experiment
You've seen the data. You can see the patterns. Now you need to decide what to do about it.
This is where you bring in Noesis — an AI agile coach that opens in the sidebar. Every report tab has an Ask Noesis button. Click it, and Noesis works with exactly the data you're looking at: it gives you instant analysis, identifies root causes across your sprint data, translates the statistics into plain language your whole team can follow, and suggests concrete experiments with measurable outcomes.
How it works
Say you've drilled into 1-point items on the Predictability tab and see 46% CV. Click Ask Noesis and it automatically builds a detailed prompt with your data — the CV, the median cycle time, the item count — and asks what specific actions you should take to improve consistency. No typing required.
Noesis responds with a structured analysis: which work types within that group are driving the variation, what the root cause likely is, and concrete recommendations. For example, it might find that Tasks and Stories share the same median cycle time but Tasks are far more scattered — suggesting an inconsistent definition of "done" for Tasks rather than actual complexity differences. Some tasks complete in minutes while others take over a day, and those probably shouldn't share the same size.
The recommendations are specific and measurable: items completing in under half a day probably shouldn't share a sizing bucket with items that take 1.5 days — split those into different sizes, define explicit criteria for what qualifies as 1-point work, or grow your sample size before drawing firm conclusions.
You can ask follow-up questions to dig deeper. Noesis keeps a persistent conversation history per project and per board, so you can pick up where you left off next sprint.
What if my team isn't familiar with flow concepts or the statistics?
Every report tab has a Learn more button next to Ask Noesis. This isn't a link to a help doc — it opens a guided learning path powered by Noesis, tailored to the report you're looking at.
On the Trends tab, it teaches you how to read trend patterns, what healthy versus concerning trends look like, and how metrics like cycle time, throughput, and WIP relate to each other through Little's Law. On the Predictability tab, it explains how to read the scatter plot and what high versus low variation means for different work types. On the Cumulative Flow tab, it walks you through how to spot bottlenecks and WIP accumulation from the shape of the chart.
Each learning path starts with the fundamentals and then goes deeper — from reading your data, to taking your first improvement actions, to making the case for change with stakeholders. The team doesn't need prior training in flow metrics. They just need to click Learn more on whatever they're looking at, and Noesis meets them where they are.
Step 5: Run Your First Data-Backed Retro
This is where it all comes together. Instead of walking into the retrospective with opinions about what went wrong, you walk in with data.
Before the retro - See what's wrong
Open Flow Intelligence. Check the predictability verdict and the four flow metrics at the top — note any changes from last sprint. Open the Trends tab — are the metrics moving in the right direction? Switch to Predictability — which story point sizes or work types are driving the variation? Then click Ask Noesis on the tab with the most interesting findings and review the suggested experiments.
You now have three things most retros lack: where the bottleneck is, what's causing it, and what to try next.
During the retro - Fix it
Share the predictability score or flow efficiency with the team. It's a single number that captures sprint health without pointing fingers. Then show the Trends chart — let the team see the trajectory over the last six sprints. Finally, share one or two Noesis suggestions as discussion starters.
The conversation shifts from "I felt like we were slow this sprint" to statements grounded in data: "our throughput dropped 40% while WIP stayed the same," or "our reviews take 3+ days, pushing cycle time beyond a week," or "testing changes in part X of our code consistently extends beyond 2 days." Each one points to a specific bottleneck with a specific experiment to try.
After the retro - Repeat
Pick at least one experiment. Run it for two sprints. Then come back to Flow Intelligence and check: did the predictability score improve? Did spillover decrease? Did the bottleneck shift? The Trends tab will show you whether the change made a difference — this is how you close the loop.
Beyond These Two Reports
This walkthrough focused on Trends and Predictability — the two reports that give you the clearest picture of sprint health and delivery consistency. But Flow Intelligence has three more report tabs worth exploring as you get comfortable:
Time in Status — shows where every issue spent its time, broken down by status. This is where you find systemic queues and waiting time. We cover this in depth in our Time in Status walkthrough.
Cumulative Flow — a stacked area chart that shows how work moves through stages over the course of a sprint. Widening bands mean work is piling up in a stage faster than it's being drained.
Sprint Health — an automatic health check that flags planning issues (scope added mid-sprint), execution issues (long cycle times, WIP overload), and delivery issues. Especially useful mid-sprint to catch problems before they cause spillover.