If you can’t reproduce it next week, it isn’t sign-off evidence: reproducible formal verification results and why formal needs history

If you can’t reproduce it next week, it isn’t sign-off evidence: Why formal needs history

A single green run is not an argument. Reproducible formal verification results are the foundation of sign-off evidence: a durable record of what was proven, under which assumptions, on which revision, and what remains open. That is the difference between running formal and using formal to support a sign-off decision.

The failure mode: green does not mean proven

Formal can look like progress in a snapshot. Properties are written, proofs go green, activity is visible, and the dashboard appears healthy. Then the RTL changes, an assumption is tightened, a constraint file moves, or an abstraction is adjusted. At that point, the question that matters is no longer how much formal was run. It is whether anyone can still say with confidence what remains true in the current revision.

Teams typically get into trouble at this point. Stale formal results rarely fail loudly. They stay green while the design moves underneath them. A proof valid last week can still be quoted in a review this week, even if the design intent, proof boundary, or verification environment has changed. By the time a mismatch is discovered, the project is often already close to sign-off. At that point, resolving the issue requires re-running large parts of the verification flow, introducing schedule disruption and non-trivial overhead. More importantly, it exposes a deeper problem: decisions are being made on evidence that no longer reflects the current design state.

The consequence is not just technical rework. It is decisions being made on stale or misaligned evidence. That is why late surprises are so damaging. The team is not simply finding more bugs. It is discovering that its evidence base has drifted. At that point, the risk is no longer technical correctness, but decision confidence at the program level.

The uncomfortable truth behind many weak sign-off discussions: a one-time formal result is not the same thing as sign-off evidence.

Why does this show up now?

The issue is becoming more visible because the conditions around formal verification have changed. Verification teams are expected to provide earlier feedback while designs continue to evolve during active development. Specifications do not remain fixed; multiple designers can modify the same block over time, and proofs must be re-run regularly because changes in the design, specifications, tools, and libraries can invalidate earlier results. Under those conditions, one-time formal success quickly loses value. If results do not survive revision churn in a controlled and interpretable way, they cease to function as evidence and become snapshots. [1]

This need for reproducible and revision-aware evidence becomes critical as formal is no longer used only as a specialist bug-hunting technique. As teams move closer to formal design sign-off, the quality of the surrounding process becomes as important as the quality of the proof itself.

A result must remain explicitly connected to the property, the constraint set, the design revision, the tool configuration, and the engineering decision it supports. Without that connection, formal may still find useful bugs, but it cannot reliably support sign-off decisions.

The shift: from expert skill to operating model

There was a time when the key question was whether the team had a formal expert/tool. Today, that question is too narrow. The better question is whether the organisation has a repeatable way to run formal that reduces uncertainty week by week.

Industrial experience has been pointing in that direction for years. In the published Intel Core i7 execution-engine work, the authors state that formal verification was used as the primary validation vehicle for that cluster, that coverage-driven testing was dropped entirely for that component, and that full formal verification of a design component is a feasible and industrially viable validation approach under the right conditions. [2]

The lesson is not that every block should be handled exactly that way. The lesson is that once formal starts carrying real validation weight, the surrounding workflow matters just as much as the solver. A proof is only as valuable as its traceability, its assumptions, and its ability to remain meaningful as the design changes.

That is where the shift from expert skill to operating model begins. Formal cannot support sign-off if results live only as isolated tool outputs, local scripts, or personal knowledge held by individual engineers. It needs a managed record of what was proven, under which assumptions, on which revision, with which tool setup, and what remains unresolved.

In practical terms, CI triggers, regression history, failure visibility, convergence tracking, and consistent artefact capture are not administrative extras. They are the mechanisms that keep formal evidence connected to the evolving RTL. They allow teams to distinguish current evidence from stale evidence, stable proofs from fragile proofs, and genuine closure from temporary tool success.

This operating model does not replace formal expertise. It preserves the value of that expertise so it can be reviewed, repeated, and trusted beyond a single run, a single engineer, or a single design revision.

What repeatability under change requires

A result that matters at sign-off must remain interpretable as the design changes. In practice, that requires three kinds of discipline.

A

CI for formal

Trigger-based re-run on RTL, env, or constraint change. Prevents: silent drift - "green that doesn't mean anything anymore."

B

Regression history

Per-property deltas with classified root cause — not anecdotes. Prevents: invisible non-convergence and progress theatre.

C

Debug traceability

Every failure owned: revision, property, assumptions, status. Prevents: tribal knowledge and re-debugging the same CEX.

A. CI for formal

Continuous integration (CI) for formal does not mean re-running every proof on every commit. It means explicitly deciding when old evidence ceases to be trustworthy.

That sounds obvious, but this is where many teams fail. A property remains green in the database, so it continues to appear in reports. Meanwhile, the covered logic changed two merges ago, assumptions changed to suppress noise, or the abstraction boundary moved. The visible result did not fail, but its meaning changed. From a sign-off perspective, that is already a broken flow. The value of CI here is not automation for its own sake. It is about maintaining reproducible formal verification results as the design evolves.

B. Regression history: reproducible formal verification results, not anecdotes

A mature formal flow does not report only pass counts. It reports movement.

The meaningful questions are not “How many properties are green?” but “What changed since the last trusted run?” “What stopped converging?”, and “What became less trustworthy because the proof context changed?” Without that history, teams end up telling stories about progress rather than showing it.

For this reason, regression history matters. A block that moves from inconclusive to proven on named properties, while another set of properties stalls for named reasons, gives managers something actionable. A flat green summary does not.

C. Debug tracking and traceability

Formal loses a great deal of value when counterexamples are resolved informally instead of structurally.

Engineers remember that a failure was “already understood,” or that a certain assumption was “needed to get convergence,” but the rationale lives in chat, memory, or a notebook. Weeks later, the same issue reappears under a slightly different revision, and the team debugs it again. That wastes effort, but more importantly, it erodes the auditability of the formal result itself.

A sign-off-capable flow requires that every important failure be reproducible and owned. The property must be identifiable. The revision must be identifiable. Assumptions must be reviewable, and dispositions must be visible. Assumption discipline becomes critical at this point: assumptions are not just a convenience to complete a proof; they define the meaning of the result. If assumptions are not tracked, the scope of the proof is not actually understood.

The LUBIS EDA Formal verification flow insight. For reproducible formal verification results
Figure 1: Formal verification as a structured, repeatable sign-off workflow [3]

A formal-ready verification environment organizes planning, execution, and sign-off into a repeatable process, ensuring that results remain reproducible, traceable, and aligned with evolving design revisions.

What good looks like week to week

Sign-off confidence is not built by occasional bursts of formal activity. It is built on reproducible formal verification results that remain meaningful week after week.

A good weekly review should answer three questions clearly:

  1. What uncertainty did we remove this week?
  2. What is not converging, and what is the plan?
  3. What could still surprise us later if we do nothing now?

Those questions may seem simple, but they require the right behavior. The first forces property-level deltas instead of generic status language. The second forces non-convergence to become visible before it becomes urgent. The third forces the team to state residual risk while there is still time to act on it.

That is also consistent with how LUBIS EDA describes its own service model: identify high-risk blocks, define verification targets and KPIs, execute against the plan, provide weekly feedback and on-the-spot bug reports, then hand over Assertion IP that integrates with regression tests so code changes can be checked against existing functionality. In other words, the customer-visible output is not just demonstrating that formal was run. It is progress that can be interpreted and managed. [4]

What good looks like, week by week, is steady deltas instead of random bursts of “progress,” early risk flags with clear ownership, and a visible path to closure per block with crisp definitions of done.

Turning infrastructure into sign-off evidence

Formal becomes credible at sign-off when it produces reproducible formal verification results that can be reviewed, challenged, and defended. Weekly status matters, but sign-off requires something more durable: a clear record of what was proven, under which assumptions, on which revision, and with what residual risk still open.

That record has two purposes. First, it gives project leaders visibility they can manage: what is complete, what is blocked, and what still needs action before closure. Second, it creates a defensible sign-off package that ties formal results to actual design risk rather than to isolated tool output.

At a minimum, that package should state five things explicitly:

  • What was proven
  • Under which assumptions
  • On which revision
  • How it maps to actual design risk
  • What remains open or bounded, with residual risk stated explicitly

This is the point at which formal stops being only a technical activity and becomes a decision input. If those elements are missing, results may still be useful for finding bugs, but they are not strong enough to justify schedule, quality, or tape-out decisions. LUBIS EDA’s own delivery model aligns with that requirement by combining verification planning, weekly feedback, Assertion IP, and regression-aware execution into a structure that supports sign-off rather than one-time proof activity. [4]

The claim and the practical checklist

Reproducible formal verification results must survive design churn and remain interpretable. Otherwise they are one-time snapshots, not sign-off evidence.

This distinction must be made clearly. The problem is not that teams need more ad hoc formal effort. The problem is that they need formal results that remain valid when the design moves.

A short checklist is usually enough to expose whether the flow is real:

  1. What triggers reruns?
  2. What is tracked as a weekly delta?
  3. Where do assumptions live, and how are they reviewed?
  4. How are failures triaged and prevented from recurring?
  5. What must be true for “done” at sign-off?

If those answers are unclear, then the project may still be getting value from formal, but it is not yet getting sign-off-grade value.

References

[1] S. Goel, A. Slobodova, R. Sumners, and S. Swords, “Balancing Automation and Control for Formal Verification of Microprocessors,” in Computer Aided Verification (CAV 2021), LNCS 12759, Springer, 2021, pp. 26–45. Available:https://link.springer.com/chapter/10.1007/978-3-030-81685-8_2

[2] R. Kaivola et al., “Replacing Testing with Formal Verification in Intel Core i7 Processor Execution Engine Validation,” in Computer Aided Verification (CAV 2009), LNCS 5643, Springer, 2009, pp. 414–429. Available: https://link.springer.com/chapter/10.1007/978-3-642-02658-4_32

[3] LUBIS EDA, “Formal-Ready Verification: What Makes It Essential?” Available: https://lubis-eda.com/formal-ready-verification-environment-lubis-eda/

[4] LUBIS EDA, “Formal Verification Sign-off as a Service.” [Online]. Available: https://lubis-eda.com/services/

Training Topics

  1. Abstraction vectors (time, functionality)
  2. AIP for protocols
  3. AIP orchestration
  4. BMC & IPG, invariants
  5. Codestyle
  6. Completeness
  7. Liveness property, safety property
  8. Non-determinism (abstraction technique)
  9. Response generation (abstraction technique)
  10. Scoreboard (abstraction technique)
  11. Signal cutting, blackboxing
  12. State space explosion and mitigation techniques
  13. SVA fundamentals
  14. Whitebox checking, blackbox checking, greybox checking
  15. Witness, vacuity, reachability

Become a leader in formal verification