How to do accessibility testing: a step-by-step guide

Accessibility testing is the process of systematically evaluating a digital product to find and remove barriers that prevent people with disabilities from using it effectively. A complete accessibility testing programme combines three layers: automated scanning, structured expert review, and live sessions with real users who rely on assistive technologies.

This guide walks through each layer in order, explains which WCAG criteria each layer covers, and gives practical guidance on recruiting participants for real-user sessions.

Why accessibility testing requires more than an automated scan

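Automated tools are fast and consistent. Run axe, WAVE, or Lighthouse on any page and you will get a list of flagged violations within seconds. The problem is coverage: commonly cited estimates put automated detection at roughly 30-40% of WCAG issues. Deque Systems, the company behind the axe engine, reports a higher share for its own tooling (about 57% of issues by volume), but even that figure leaves a large portion that only a human can find.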

The issues that automated tools miss are often the most disruptive: a button whose accessible name is technically present but tells a screen reader user nothing about what it does, a modal that traps keyboard focus, a form error message that appears visually but is never announced to a screen reader user. Catching these requires a human tester, ideally someone who actually uses the product with an assistive technology.
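To illustrate why rule-based checks have this ceiling, here is a minimal sketch of one such rule written with Python's standard-library HTML parser. It is not how axe, WAVE, or Lighthouse are implemented; the function name and the sample markup are made up for the example.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Flags <img> tags that have no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.flagged = []  # src values of images with no alt attribute

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.flagged.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src of every image that lacks an alt attribute."""
    checker = MissingAltChecker()
    checker.feed(html)
    checker.close()
    return checker.flagged


# Hypothetical page fragment with three images.
page = """
<img src="logo.png">
<img src="chart.png" alt="IMG_0042.png">
<img src="divider.png" alt="">
"""

print(images_missing_alt(page))
```

Only logo.png is flagged. The chart image passes the rule even though its alt text is a file name that describes nothing, and the rule cannot tell whether the divider is really decorative. Those judgments are what the expert and real-user layers are for.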

A three-layer approach gives you the best coverage:

Layer	What it catches	Time investment
Automated scan	Missing alt text, contrast failures, duplicate IDs, empty links	1-2 hours per release
Expert audit	Keyboard traps, ARIA misuse, logical reading order, focus management	1-3 days
Real-user sessions	Task failure points, workaround strategies, cognitive load	1-2 weeks including recruitment

Step 1: Run an automated scan

Start every accessibility testing cycle with an automated scan. Choose one or two tools and run them consistently so you can track regressions over time.

Recommended tools:

axe DevTools (Chrome extension or CI integration): industry standard, low false-positive rate
WAVE (WebAIM): visual overlay that shows issues in context, good for quick spot checks
Lighthouse (built into Chrome DevTools): covers accessibility alongside performance and SEO

What to scan:

Every distinct page template, not just the home page
Forms, modals, and dynamic content loaded via JavaScript
PDF documents and embedded media

What to document:

Record every issue with the WCAG success criterion it violates, its severity (critical: blocks task completion; serious: impedes it; moderate: creates friction), and a screenshot. This gives developers a clear, prioritized backlog.
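One possible shape for such a record, sketched in Python. The class name, field names, file paths, and sample issues are all assumptions made for illustration; adapt them to whatever tracker your team already uses.

```python
from dataclasses import dataclass

# Severity levels from the text, most urgent first.
SEVERITY_ORDER = ("critical", "serious", "moderate")


@dataclass(frozen=True)
class ScanIssue:
    criterion: str   # WCAG success criterion number, e.g. "1.4.3"
    severity: str    # one of SEVERITY_ORDER
    page: str        # where the issue was found
    screenshot: str  # path to the supporting screenshot
    note: str        # short description for developers

    def __post_init__(self):
        if self.severity not in SEVERITY_ORDER:
            raise ValueError(f"unknown severity: {self.severity!r}")


def as_backlog(issues):
    """Order issues so the most severe come first (stable within a level)."""
    return sorted(issues, key=lambda issue: SEVERITY_ORDER.index(issue.severity))


# Hypothetical findings from one scan.
issues = [
    ScanIssue("1.4.3", "moderate", "/pricing", "shots/pricing-footer.png",
              "Footer text contrast too low"),
    ScanIssue("1.1.1", "serious", "/home", "shots/home-hero.png",
              "Hero image has no alt text"),
    ScanIssue("4.1.2", "critical", "/checkout", "shots/checkout-pay.png",
              "Pay button has no accessible name"),
]

for issue in as_backlog(issues):
    print(f"[{issue.severity}] WCAG {issue.criterion} {issue.page}: "
          f"{issue.note} ({issue.screenshot})")
```

Validating the severity at creation time keeps free-text values such as "high" or "P1" out of the backlog, which makes regressions easier to compare between releases.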

Step 2: Conduct a structured expert audit

An expert audit means a trained tester working through the product manually using keyboard navigation and one or more screen readers. This layer catches issues that automated tools cannot evaluate: whether content structure makes sense when read linearly, whether interactive components communicate their state to assistive technology, and whether the focus indicator is visible throughout a workflow.

Keyboard navigation checklist:

Can every interactive element be reached using Tab?
Does focus move in a logical order that matches the visual layout?
Are modals and drawers trapping focus correctly (focus stays inside while open, returns to trigger when closed)?
Is there a skip navigation link at the top of each page?
Does pressing Escape close modals and dropdowns?
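A common reason focus order diverges from the visual layout is a positive tabindex value. The sketch below approximates how browsers order tab stops: elements with a positive tabindex come first in ascending order, then everything else in document order, and negative values are skipped. It is a simplification (it ignores visibility, shadow DOM, and other focusable elements), and the sample markup and ids are invented.

```python
from html.parser import HTMLParser

# Simplified set of elements that are tab stops without a tabindex attribute.
NATIVELY_FOCUSABLE = {"button", "input", "select", "textarea"}


def parse_tabindex(attributes, native):
    """Return the effective tabindex, or None if the element is not a tab stop."""
    try:
        return int(attributes.get("tabindex"))
    except (TypeError, ValueError):
        # Attribute missing or not a number: fall back to native behaviour.
        return 0 if native else None


class TabStopCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stops = []  # (tabindex, document position, label)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "disabled" in attributes or attributes.get("type") == "hidden":
            return
        native = tag in NATIVELY_FOCUSABLE or (tag == "a" and "href" in attributes)
        tabindex = parse_tabindex(attributes, native)
        if tabindex is None or tabindex < 0:
            return
        label = attributes.get("id") or tag
        self.stops.append((tabindex, len(self.stops), label))


def tab_order(html):
    """Return element labels in the order Tab would visit them."""
    collector = TabStopCollector()
    collector.feed(html)
    collector.close()
    ordered = sorted(
        collector.stops,
        key=lambda stop: (0 if stop[0] > 0 else 1, stop[0], stop[1]),
    )
    return [label for _, _, label in ordered]


# Hypothetical page fragment.
page = """
<a href="/" id="home">Home</a>
<input id="search">
<button id="buy" tabindex="2">Buy</button>
<div id="card" tabindex="0"></div>
<button id="help" tabindex="1">Help</button>
<span id="tooltip" tabindex="-1"></span>
"""

print(tab_order(page))
```

Here the Help and Buy buttons jump ahead of the Home link and the search field even though they appear later on the page. A check like this can point a tester at suspicious pages, but confirming the real order still means pressing Tab through the workflow.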

Screen reader checklist:

Use NVDA with Firefox on Windows and VoiceOver with Safari on macOS as your primary combinations. JAWS remains one of the most widely used screen readers among people with visual impairments and should be included for any product targeting enterprise or government audiences.

Are images described meaningfully or marked as decorative?
Do form inputs have associated labels (not just visible placeholder text)?
Are error messages announced when they appear?
Do custom components (date pickers, accordions, carousels) expose the correct ARIA roles and states?
Are live regions used for content that updates dynamically?
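The form-label item on this checklist can be partly pre-screened before the manual pass. The following sketch uses Python's standard-library parser to list form controls that have no label element, aria-label, or aria-labelledby. It is a simplified assumption of what counts as labelled (it ignores the title attribute, for example), and the field names are hypothetical.

```python
from html.parser import HTMLParser

LABELLABLE = {"input", "select", "textarea"}
# Input types that get their name from a value or alt attribute instead.
NO_LABEL_NEEDED = {"hidden", "submit", "reset", "button", "image"}


class LabelAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.label_targets = set()  # ids referenced by <label for="...">
        self.open_labels = 0        # greater than 0 while inside a <label>
        self.controls = []          # (report name, id, labelled on its own)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "label":
            self.open_labels += 1
            if attributes.get("for"):
                self.label_targets.add(attributes["for"])
        elif tag in LABELLABLE and attributes.get("type", "text") not in NO_LABEL_NEEDED:
            labelled = bool(
                attributes.get("aria-label")
                or attributes.get("aria-labelledby")
                or self.open_labels  # control is wrapped by a <label>
            )
            name = attributes.get("name") or attributes.get("id") or tag
            self.controls.append((name, attributes.get("id"), labelled))

    def handle_endtag(self, tag):
        if tag == "label" and self.open_labels:
            self.open_labels -= 1


def unlabelled_controls(html):
    """Return the names of form controls with no programmatic label."""
    audit = LabelAudit()
    audit.feed(html)
    audit.close()
    return [
        name
        for name, control_id, labelled in audit.controls
        if not labelled and control_id not in audit.label_targets
    ]


# Hypothetical form fragment.
form = """
<label for="email">Email</label><input id="email" name="email">
<input name="phone" placeholder="Phone number">
<label>Postcode <input name="postcode"></label>
<input name="q" aria-label="Search">
"""

print(unlabelled_controls(form))
```

Only the phone field is reported, because a placeholder is its sole label. Whether the labels that do exist are meaningful when read aloud still has to be checked with a screen reader.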

A heuristic evaluation is a useful complement to a screen reader audit: it surfaces issues with information hierarchy and predictability that affect users with cognitive disabilities even when ARIA is implemented correctly.

Step 3: Test with real users who have disabilities

Expert audits expose technical compliance failures. Real-user sessions expose the actual impact on task completion. These are not the same thing. A product can pass a screen reader audit and still be nearly unusable for a person who relies on a screen reader every day.

Plan separate sessions by assistive technology type:

Screen reader users (low vision or blind)
Keyboard-only users (motor impairments, no mouse)
Users with cognitive disabilities (ADHD, dyslexia, acquired brain injury)
Switch access or voice control users for mobile or desktop apps

Session design:

Keep task-based accessibility sessions to 45-60 minutes. Write tasks in outcome terms (“find the pricing for the Pro plan and start a trial”) rather than steps (“click the pricing tab”). Do not describe which controls to use. Observe where participants pause, hesitate, or abandon.

Record the session with screen reader audio on so you capture exactly what the user hears alongside what you see on screen. Transcribe the audio and annotate it against the screen recording so the two can be reviewed together.

Session size:

Five to eight participants per assistive technology category is sufficient to surface the main barriers in each group. Running fewer than five risks missing patterns. Running more than eight with the same AT type produces diminishing returns before you have fixed the issues from the first round.

For guidance on recruiting participants for live sessions, see how to recruit users for usability testing.

Step 4: Prioritise and fix issues

After all three layers, you will have issues from three sources, often describing the same barrier in different words. Consolidate them into a single backlog organized by severity:

Critical: Blocks task completion for one or more user groups. Fix before release.
Serious: Significantly impedes task completion. Fix in next sprint.
Moderate: Creates friction but task is completable with effort. Fix in current quarter.
Minor: Best-practice improvements. Add to backlog.

Map every issue to the WCAG 2.1 or 2.2 success criterion it violates. This makes it easier to brief developers and to communicate compliance status to legal or procurement stakeholders.
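A minimal sketch of that consolidation step, assuming each finding is a dictionary with a WCAG criterion, a location, and a severity. The key names, the merge rule (same criterion and location counts as one issue, keeping the worst severity), and the sample findings are assumptions for illustration.

```python
# Lower rank means more urgent.
SEVERITY_RANK = {"critical": 0, "serious": 1, "moderate": 2, "minor": 3}


def consolidate(*sources):
    """Merge (source_name, findings) pairs into one severity-ordered backlog."""
    merged = {}
    for source_name, findings in sources:
        for finding in findings:
            key = (finding["criterion"], finding["location"])
            entry = merged.setdefault(key, {**finding, "sources": []})
            # When layers disagree, keep the more severe rating.
            if SEVERITY_RANK[finding["severity"]] < SEVERITY_RANK[entry["severity"]]:
                entry["severity"] = finding["severity"]
            entry["sources"].append(source_name)
    return sorted(merged.values(), key=lambda e: SEVERITY_RANK[e["severity"]])


# Hypothetical findings from the three layers.
scan = [{"criterion": "1.4.3", "location": "/pricing", "severity": "moderate"}]
audit = [
    {"criterion": "2.1.2", "location": "/checkout modal", "severity": "critical"},
    {"criterion": "1.4.3", "location": "/pricing", "severity": "serious"},
]
sessions = [{"criterion": "3.3.1", "location": "/signup", "severity": "serious"}]

backlog = consolidate(("scan", scan), ("audit", audit), ("sessions", sessions))
for entry in backlog:
    print(entry["severity"], entry["criterion"], entry["location"], entry["sources"])
```

Keeping the list of sources on each entry shows which issues were confirmed by more than one layer, which can help when deciding what to fix first within a severity level.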

Step 5: Retest and build accessibility into your cycle

A single accessibility audit is not a programme. Issues recur as new features ship. Build accessibility checks into your development workflow:

Add axe-core to your CI/CD pipeline so automated violations block merges
Include keyboard and screen reader smoke tests in your QA checklist
Schedule expert audits quarterly or after major feature releases
Include at least one accessibility-focused moderated session in each major research cycle
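As a sketch of the first item, the gate below decides whether a build should fail based on an axe results object. It assumes the scan step has already produced the results in the shape axe-core returns (a "violations" list whose entries carry "id", "impact", and "nodes"). The choice to block only on critical and serious impacts is an assumption, not a rule from axe itself.

```python
import json

# Impact levels that should fail the build (an assumed team policy).
BLOCKING_IMPACTS = {"critical", "serious"}


def blocking_violations(results):
    """Return the violations whose impact is severe enough to block a merge."""
    return [
        violation
        for violation in results.get("violations", [])
        if violation.get("impact") in BLOCKING_IMPACTS
    ]


def exit_code(results):
    """Non-zero exit code tells the CI runner to fail the job."""
    return 1 if blocking_violations(results) else 0


# Hypothetical results, inlined here instead of being read from a file.
sample = json.loads("""
{"violations": [
  {"id": "color-contrast", "impact": "serious", "nodes": [{}, {}]},
  {"id": "region", "impact": "moderate", "nodes": [{}]}
]}
""")

for violation in blocking_violations(sample):
    print(f"{violation['impact']}: {violation['id']} "
          f"on {len(violation['nodes'])} element(s)")
print("exit code:", exit_code(sample))
```

In a real pipeline the results would be loaded from the file written by the scan step, and the value from exit_code would be passed to sys.exit so the merge check fails.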

Moderated usability testing works well for accessibility because you need to observe the user’s environment and assistive technology setup in real time. Unmoderated testing is harder to run accessibly because many self-serve platforms are not themselves accessible to screen reader users.

Recruiting participants with disabilities

Finding participants who use assistive technologies is the most commonly cited barrier to running real-user accessibility sessions. Standard consumer panels rarely have enough screened panelists who actively use screen readers or switch access.

Practical recruitment channels:

Disability advocacy organizations and networks (National Federation of the Blind, American Council of the Blind, AHEAD for higher education)
Assistive technology user forums and communities
Vocational rehabilitation services
Specialized research panels that screen for assistive technology use

Screener questions should focus on the specific assistive technology, not the diagnosis. Ask: “Do you use a screen reader to browse the web? Which screen reader do you use most often?” rather than “Do you have a visual impairment?”

Over-recruit by 20-30% for accessibility studies. No-show and rescheduling rates are higher when participants have complex scheduling needs or when the product itself turns out not to work with their setup.
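The arithmetic is simple but easy to round the wrong way, so here is a small worked sketch. The function name and the per-group session targets are hypothetical; the 20-30% range comes from the guidance above.

```python
def invites_needed(target_sessions, over_recruit_pct):
    """Round target * (1 + pct/100) up to a whole participant.

    Integer arithmetic avoids floating-point surprises when rounding up.
    """
    return -(-target_sessions * (100 + over_recruit_pct) // 100)


# Hypothetical session targets per assistive technology group.
targets = {"screen reader": 8, "keyboard-only": 5, "switch or voice control": 5}

for group, target in targets.items():
    low = invites_needed(target, 20)
    high = invites_needed(target, 30)
    print(f"{group}: aim for {target} sessions, recruit {low}-{high}")
```

Rounding up matters with small groups: recruiting for five sessions at 30% means seven confirmed participants, not six.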

CleverX maintains a verified panel across 150+ countries with detailed profiling that includes assistive technology use, making it easier to source screened participants for accessibility sessions without starting from scratch each time.

Which WCAG criteria matter most

If you are prioritising where to start, focus on the criteria that cause the most task failures for the largest number of users:

WCAG criterion	What it covers	Why it matters
1.1.1 Non-text content	Alt text for images	Screen readers cannot convey images without it
1.3.1 Info and relationships	Semantic HTML structure	Defines landmarks, headings, and lists for AT users
1.4.3 Contrast (minimum)	4.5:1 ratio for normal text, 3:1 for large text	Affects low-vision users and outdoor mobile use
2.1.1 Keyboard	All functionality via keyboard	Essential for users with motor impairments and for screen reader users
2.4.3 Focus order	Logical tab sequence	Prevents confusion when navigating without a mouse
4.1.2 Name, role, value	ARIA labels and states	Lets AT announce what controls do and their current state
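Of the criteria in this table, 1.4.3 is the one that reduces to a formula. The sketch below follows the relative luminance and contrast ratio definitions in WCAG 2.x; the helper names and the sample colours are chosen for illustration.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    r, g, b = (_linear(value) for value in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Return the contrast ratio between two colours, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def hex_to_rgb(value):
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def passes_aa(foreground_hex, background_hex, large_text=False):
    """Check a colour pair against the 1.4.3 thresholds (4.5:1, or 3:1 for large text)."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(hex_to_rgb(foreground_hex), hex_to_rgb(background_hex)) >= threshold


white = "#ffffff"
for grey in ("#767676", "#777777"):
    print(grey, "on white passes AA for normal text:", passes_aa(grey, white))
```

The two greys differ by a single step per channel, yet one passes for normal text and the other does not, which is why contrast is better measured than judged by eye.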

For the full standard, refer to the WCAG 2.2 guidelines published by the W3C.

Additional guidance on running inclusive research processes, including making your consent forms and screeners accessible, is covered in the user research accessibility compliance guide.

For broader UX audit methodology that can sit alongside accessibility testing, see the UX audit checklist.

Frequently asked questions

What is accessibility testing?

Accessibility testing is the process of evaluating a digital product to identify barriers that prevent people with disabilities from using it. It combines automated scanning, expert review, and sessions with real users who rely on assistive technologies such as screen readers, switch access, or voice control.

What is the difference between automated and manual accessibility testing?

Automated tools like axe or WAVE scan HTML and flag rule-based violations such as missing alt text or low color contrast. They catch roughly 30-40% of WCAG issues. Manual testing covers the remaining issues that require human judgment: logical reading order, meaningful link text, and whether interactive components behave correctly with a keyboard or screen reader.

How many participants do I need for accessibility testing?

Most teams run five to eight sessions per assistive technology type. Because users with different disabilities interact with products in distinct ways, plan separate mini-studies for screen reader users, keyboard-only users, and users with cognitive or motor impairments rather than combining them into a single session set.

Which WCAG level should I test against?

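WCAG 2.1 Level AA is the most common target. It is referenced by EN 301 549, the European standard commonly used to show conformance with the European Accessibility Act, and by most organizational policies. Section 508 in the United States formally incorporates the older WCAG 2.0 Level AA, which a product meeting 2.1 AA also satisfies. WCAG 2.2 builds on 2.1 and adds criteria for cognitive accessibility and touch interaction. Start with AA; treat AAA criteria as stretch goals.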

Can I do accessibility testing without disabled participants?

Automated tools and expert audits using assistive technologies give you a baseline, but they do not replace sessions with real users. People with disabilities develop personal workarounds and use assistive technology in ways that auditors do not predict. Real-user sessions consistently surface barriers that automated and expert methods miss.

How do I recruit participants with disabilities for research?

Screen for assistive technology use and functional abilities rather than diagnoses. Recruit from disability advocacy groups, assistive technology communities, and specialized panels. Over-recruit by 20-30% to account for higher no-show rates. Work with a recruitment partner that has verified panelists who use screen readers, switch access, and other assistive technologies.