It has been over 18 years since I first published a mnemonic I created to help me test error handling and reporting in software systems: FAILURE.
Bad error reporting does not merely inconvenience users. It transfers the system’s uncertainty, ambiguity, and failure onto the user — in the form of wasted time, anxiety, mistrust, unnecessary work, and sometimes real harm.
Ambiguous and incorrect error messages send users on blind puzzle-solving side-quests.
They try again. They change inputs. They restart things. They search forums. They contact support. They wonder what they did wrong.
Often, the system made them do unnecessary work.
Often, some layer of the system already knew enough to say something more accurate, more timely, and more helpful.
That does not mean exposing raw internal errors, stack traces, implementation details, or sensitive information.
It often takes deliberate design and engineering work to preserve useful context, bubble it up through layers of the system, translate it into terms the user can act on, and decide what should or should not be revealed.
Sometimes opacity is appropriate. There are good reasons to avoid transparency when it would make systems, accounts, or the data entrusted to them less secure.
But “we cannot reveal everything” should not become an excuse for “we will reveal nothing useful.”
Usually, there is still something helpful we can say.
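As a rough illustration of preserving context across layers and translating it at the edge, here is a minimal Python sketch. Everything in it is hypothetical and exists only for the example: the `AppError` class, the `charge_card` and `place_order` layers, the `card_expired` code, and the message wording. The point is only that the lowest layer records what it knows, the middle layer adds to it instead of swallowing it, and the user-facing layer decides what is safe and useful to say.

```python
class AppError(Exception):
    """Hypothetical internal error that carries context up through the layers."""

    def __init__(self, code: str, detail: str, **context):
        super().__init__(f"{code}: {detail}")
        self.code = code          # stable identifier used to pick a user message
        self.detail = detail      # internal-only detail, never shown to the user
        self.context = context    # extra facts added by each layer


def charge_card(card_token: str, amount_cents: int) -> None:
    """Lowest layer: it knows precisely why the operation failed."""
    # Simulated failure for the sketch: pretend the gateway reported an expired card.
    raise AppError("card_expired", f"gateway rejected token {card_token}",
                   amount_cents=amount_cents)


def place_order(order_id: str, card_token: str, amount_cents: int) -> None:
    """Middle layer: adds what it knows and re-raises instead of hiding the cause."""
    try:
        charge_card(card_token, amount_cents)
    except AppError as err:
        err.context["order_id"] = order_id
        err.context["order_saved"] = True
        raise


# User-facing wording, keyed by internal code. Internal details stay out of it.
USER_MESSAGES = {
    "card_expired": (
        "Your card has expired, so the payment did not go through. "
        "Your order is saved. Update your card details to complete it."
    ),
}
GENERIC_MESSAGE = (
    "Something went wrong on our side and we could not complete your request."
)


def to_user_message(err: AppError) -> str:
    """Edge layer: translate internal knowledge into something the user can act on."""
    return USER_MESSAGES.get(err.code, GENERIC_MESSAGE)


if __name__ == "__main__":
    try:
        place_order("A-100", "tok_secret", 2500)
    except AppError as err:
        print(to_user_message(err))
```

The token and gateway detail remain available to logs and support through `err.detail` and `err.context`, but never reach the message the user sees.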
When software wastes people’s time, it is not wasting an abstract resource. It is consuming their life.
Five minutes here. An hour there. An afternoon lost to guessing what the system could have explained.
Ambiguous errors can become little Langoliers, eating chunks of people’s lives while they try to solve a puzzle they should never have been given.
Incorrect error reporting can also provoke needless anxiety and stress.
A message that implies data may have been lost, money may have been charged, access may have been revoked, or the user may have done something wrong can have a real emotional impact — especially when the message is inaccurate or incomplete.
And telling users to “try again” when some part of the stack already knows that what is being tried will never work is not optimism.
It is disrespect.
Good error handling should help people understand:
- What happened?
- What was affected?
- What was not affected?
- Can I recover?
- What should I do next?
- Is this my problem, your problem, or nobody’s fault?
This is not merely a technical problem.
It is a human interaction problem.
It is a relationship problem.
When something goes wrong, the system is still communicating. It may communicate respect or indifference. Clarity or confusion. Accountability or evasion. Helpfulness or abandonment.
So I do not want to ask only, “Was an error message displayed?”
I want to ask:
- What did the system make the user do?
- What did it make them wonder?
- What did it make them feel?
- What did it cost them?
- What did it help them recover?
- What did it make them trust less?
That is why I use the FAILURE heuristic:
F — Functional
- When something goes wrong, does the system still behave like a reliable partner?
- Does it detect the problem before the user wastes effort?
- Does it prevent the user from making the situation worse?
- Does it preserve the user’s work, choices, money, access, and trust?
- Does the handling of the error work for the user’s actual situation, not just for the system’s internal state?
A — Appropriate
- Does the system interrupt the user only when interruption is justified?
- Does it communicate at the moment the user needs to know, not merely when the system happens to notice?
- Does it choose a channel, placement, and level of urgency that fit the user’s context?
- Is the system appropriately opaque where transparency would create risk, without being needlessly vague where helpfulness is possible?
- Does it avoid making a small problem feel catastrophic?
- Does it avoid hiding a serious problem behind quiet failure, vague status, or false reassurance?
I — Impact
- Does the user understand what this means for them?
- Do they know whether their work, data, money, access, privacy, time, or reputation was affected?
- Do they know what was not affected?
- Do they know whether the action failed, succeeded, partially succeeded, or is still unknown?
- Does the system relieve uncertainty where it can, instead of making the user carry it?
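The Impact questions can be made concrete by having the system report an explicit outcome along with what was and was not affected, instead of a bare "error". The sketch below is one possible shape, not a prescribed design. The `Outcome` enum, the `ImpactReport` type, and the wording in `HEADLINES` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """The four states a user needs to tell apart."""
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    PARTIAL = "partial"
    UNKNOWN = "unknown"


# Illustrative wording only; real messages would name the user's actual goal.
HEADLINES = {
    Outcome.SUCCEEDED: "Your request completed.",
    Outcome.FAILED: "Your request did not complete.",
    Outcome.PARTIAL: "Part of your request completed.",
    Outcome.UNKNOWN: "We do not yet know whether your request completed.",
}


@dataclass(frozen=True)
class ImpactReport:
    outcome: Outcome
    affected: tuple[str, ...] = ()       # things the failure touched
    not_affected: tuple[str, ...] = ()   # things the user can stop worrying about


def describe(report: ImpactReport) -> str:
    """Build a short statement covering the outcome and its scope."""
    parts = [HEADLINES[report.outcome]]
    if report.affected:
        parts.append("Affected: " + ", ".join(report.affected) + ".")
    if report.not_affected:
        parts.append("Not affected: " + ", ".join(report.not_affected) + ".")
    return " ".join(parts)


if __name__ == "__main__":
    report = ImpactReport(
        Outcome.UNKNOWN,
        affected=("payment status",),
        not_affected=("saved cart",),
    )
    print(describe(report))
```

Making `UNKNOWN` a first-class outcome lets the system say honestly that it does not know yet, so the user is not left to infer success or failure from silence.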
L — Log
- Does the system remember enough about what happened so the user does not have to become the evidence collector?
- Can support or engineering investigate without asking the user to repeat painful, risky, or time-consuming steps?
- Are the right people able to see the right information at the right time?
- Is the user’s privacy respected while the problem is being diagnosed?
- Do the logs help the organization learn from the failure, or merely prove that something broke?
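One common way to keep the user from becoming the evidence collector is to log a structured record under a short reference ID and show the user only that ID. The following sketch assumes Python's standard `logging`, `json`, and `uuid` modules. The `record_failure` and `redact` helpers, the `app.errors` logger name, and the `SENSITIVE_KEYS` set are hypothetical names chosen for the example, and a real redaction policy would need far more care than a key list.

```python
import json
import logging
import uuid

logger = logging.getLogger("app.errors")

# Assumed list of keys that must never be written to logs in clear text.
SENSITIVE_KEYS = {"password", "card_number", "token"}


def redact(context: dict) -> dict:
    """Return a copy of the context with sensitive values masked."""
    return {
        key: ("[redacted]" if key in SENSITIVE_KEYS else value)
        for key, value in context.items()
    }


def record_failure(code: str, context: dict) -> str:
    """Log a structured record and return a short reference the user can quote."""
    reference = uuid.uuid4().hex[:8]
    record = {"reference": reference, "code": code, "context": redact(context)}
    # default=str keeps logging from failing on values JSON cannot encode.
    logger.error(json.dumps(record, default=str))
    return reference


if __name__ == "__main__":
    ref = record_failure(
        "upload_failed",
        {"order_id": "A-100", "token": "tok_secret", "step": "virus_scan"},
    )
    print(f"We could not finish your upload. Reference: {ref}")
```

With the reference in hand, support can look up the failing step and order without asking the user to reproduce the problem or describe it from memory.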
U — UI / Understandable
- Is the user expected to be clairvoyant?
- Does the system speak to the user as a person, not as a stack trace reader?
- Does it explain the situation in terms of the user’s goal?
- Does the system translate internal knowledge into safe, useful user-facing information?
- Does it avoid making the user decode internal concepts, internal labels, or implementation details?
- Does it avoid blaming the user when the system does not know who or what caused the problem?
- Would the message still be understandable to someone who is tired, stressed, distracted, rushed, or unfamiliar with the system?
R — Recovery
- Does the user know what they can do next?
- Is the suggested next step realistic, available, and likely to help?
- Does the system avoid telling the user to retry when it already knows retrying will not work?
- Can the user save, undo, resume, escalate, or get help?
- When the user cannot fix the problem, does the system say who can?
- Does recovery feel like guidance, or like abandonment?
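The Recovery questions about retries can be sketched as a small decision function: if the system already knows a failure is permanent, it should say so and offer a different next step. The error codes, the `RecoveryAdvice` type, and the message text below are hypothetical and stand in for whatever a real system knows about its own failure modes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryAdvice:
    can_retry: bool   # whether repeating the same action could plausibly work
    next_step: str    # what to tell the user to do


# Failures where repeating the same action will never succeed (assumed codes).
PERMANENT = {
    "file_too_large": "Choose a file under the size limit and upload it again.",
    "account_suspended": "Contact your account administrator, who can restore access.",
}

# Failures that are plausibly temporary (assumed codes).
TRANSIENT = {"timeout", "service_unavailable"}


def advise(code: str) -> RecoveryAdvice:
    """Only suggest a retry when the system has reason to think it could work."""
    if code in PERMANENT:
        return RecoveryAdvice(False, PERMANENT[code])
    if code in TRANSIENT:
        return RecoveryAdvice(True, "This is usually temporary. Try again in a minute.")
    # Unknown cause: do not promise that a retry will help.
    return RecoveryAdvice(False, "Contact support so we can look into this.")


if __name__ == "__main__":
    for code in ("file_too_large", "timeout", "something_else"):
        advice = advise(code)
        print(code, advice.can_retry, advice.next_step)
```

The fallback branch deliberately avoids "try again": when the cause is unknown, pointing the user to someone who can investigate is more honest than inviting them to repeat a step that may never succeed.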
E — Emotions
- How is the system making the user feel at the moment something has gone wrong?
- Does it reduce confusion, anxiety, urgency, shame, and self-blame?
- Does it preserve dignity?
- Does it communicate honesty without panic?
- Does it acknowledge the user’s time and effort?
- Does it leave the user feeling helped, stranded, blamed, dismissed, manipulated, or respected?
A technically accurate error message can still be a poor user experience.
But an inaccurate, untimely, ambiguous, or unrecoverable error message can be worse than poor UX. It can waste people’s lives, increase their stress, erode their trust, and make them feel responsible for failures they did not cause.
The next time you write code to handle or report errors, or test error handling and reporting, do not stop at “was an error message displayed?”
Ask whether the error handling respects the user.