how do you manage testing across NVDA, JAWS, and VoiceOver without losing your mind? 🫠

I’m a QA engineer in the Netherlands working on a government portal that has to comply with EN 301 549 (Europe’s accessibility standard). This means I can’t just test with one screen reader — I need coverage across NVDA, JAWS, and VoiceOver at minimum.
The problem is the inconsistency is absolutely wild.
Specific things driving me crazy right now:
- `aria-live` regions that announce correctly in NVDA but are completely silent in JAWS
- VoiceOver on iOS treating `role="button"` differently depending on whether it’s a `<div>` or a `<span>`
- Focus management after modal close works in one, breaks in another
I’ve got a Windows VM for NVDA and JAWS testing and a physical iPhone for VoiceOver. The setup works but the context-switching is exhausting.
Has anyone found a sustainable workflow for multi-screen-reader regression testing?
I’m also wondering how much of this is reasonable to automate vs just accepting it needs skilled manual testers. Would love to hear from people doing this at scale.
The `aria-live` inconsistency you’re describing is a known pain point. JAWS has its own internal logic for deciding what’s “worth” announcing from live regions and it doesn’t always respect `aria-atomic` the way the spec intends.

What’s worked for us: wrapping live region content in a visually-hidden but real `<p>` tag rather than relying purely on ARIA attributes. More semantic HTML tends to behave more predictably across readers than heavy ARIA decoration.

`<p class="sr-only" aria-live="polite">Form submitted successfully</p>`

It feels old-fashioned but it’s more reliable in practice.
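For completeness, a rough sketch of the update side in TypeScript. The id `status-message` and the `.sr-only` class are assumptions (use whatever visually-hidden utility class you already have); the important part is that the element exists empty from page load and only its text changes.

```typescript
// Minimal sketch: reuse one live region that is present in the initial markup,
// e.g. <p id="status-message" class="sr-only" aria-live="polite"></p>.
// The id and class name are made up for this example.
function announce(message: string): void {
  const region = document.getElementById('status-message');
  if (!region) return;

  // Clear first so repeating the same message is still treated as a change
  // in the accessibility tree.
  region.textContent = '';
  window.setTimeout(() => {
    region.textContent = message;
  }, 100);
}

// Usage after a successful form submit:
// announce('Form submitted successfully');
```

Live regions that are injected into the DOM at the moment of the announcement are often missed entirely, which may also be part of the JAWS silence you’re seeing.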
thanks!
Quick question: how do you know this claim is true?
JAWS has its own internal logic for deciding what’s “worth” announcing from live regions
Have they stated this themselves?
Fellow EU tester here (based in Poland, also dealing with public sector a11y compliance). The inconsistency between JAWS and NVDA is genuinely one of the most frustrating parts of this work.
The workflow that’s helped us:
- We define a priority screen reader matrix based on our actual user analytics (rough sketch of how we write this down below). For our user base, NVDA + Chrome is dominant, so that gets the most rigorous testing.
- JAWS gets a second pass focused only on critical user journeys — login, form submission, key navigation flows
- VoiceOver + Safari gets tested at the end, on real hardware, not a simulator
We stopped trying to achieve identical behavior across all readers and shifted to “no journey-blocking failures in any of them.” Subtle announcement differences are logged but not always blockers.
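In case it’s useful, here’s roughly what such a matrix looks like written down as data that a test plan or report script can read. Everything below (readers, browsers, priority labels) is illustrative, not our actual matrix.

```typescript
// Illustrative only: encode the screen reader priority matrix as data so the
// test plan, tagging, and reporting all read from one place.
type Priority = 'full-regression' | 'critical-journeys' | 'spot-check';

interface ScreenReaderTarget {
  reader: 'NVDA' | 'JAWS' | 'VoiceOver';
  browser: string;
  platform: string;
  priority: Priority;
}

const matrix: ScreenReaderTarget[] = [
  { reader: 'NVDA', browser: 'Chrome', platform: 'Windows', priority: 'full-regression' },
  { reader: 'JAWS', browser: 'Chrome', platform: 'Windows', priority: 'critical-journeys' },
  { reader: 'VoiceOver', browser: 'Safari', platform: 'iOS', priority: 'critical-journeys' },
];

// A release checklist or report generator can iterate over `matrix`
// instead of the priorities living only in someone's head.
```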
Also — if you’re not already using it, the Accessibility Insights for Web extension is great for structured manual audits. Free, from Microsoft, actually good.
I’ll look into this thanks!!!
I’m a QA consultant based in Canada specialising in a11y. Honest answer to your question:
Most of it needs to stay manual. Automated tools (Axe, Lighthouse etc.) catch maybe 30–40% of real accessibility issues — the rest requires human judgment and actual assistive technology.
What you can automate:
- Axe-core integrated into your Playwright suite catches low-hanging fruit on every PR (see the sketch after this list)
- Custom linting rules for missing alt text, empty labels, bad heading structure
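For the Playwright piece, a minimal sketch using `@axe-core/playwright`; the URL and the WCAG tag filter are placeholders to adapt to your portal:

```typescript
// Minimal sketch: fail the PR check if axe-core reports any violations.
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('no detectable WCAG A/AA violations on the login page', async ({ page }) => {
  await page.goto('https://example.org/login'); // placeholder URL

  const results = await new AxeBuilder({ page })
    .withTags(['wcag2a', 'wcag2aa']) // scope to WCAG 2.x A/AA rules
    .analyze();

  expect(results.violations).toEqual([]);
});
```

Worth repeating that this only catches the machine-detectable subset; it doesn’t replace listening to the page with an actual screen reader.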
What you cannot automate:
- Whether a screen reader actually conveys the right meaning
- Logical focus order
- Whether error messages are actually helpful when announced
Your exhaustion is valid. This work is skilled and time-intensive. Push back on anyone who tells you an overlay or an automated scanner “handles” accessibility.
I needed the validation 🥲 thank you, solid points
