Card sorting vs tree testing vs first-click testing

Card sorting, tree testing, and first-click testing each answer a different question about information architecture. Card sorting reveals how users mentally group content. Tree testing checks whether items can be found in a proposed navigation structure. First-click testing checks whether users start in the right place on a designed interface. Used together, they cover the full design process from discovery to validation.

This guide explains what each method measures, when to use it, and how the three fit together.

What each method actually measures

The clearest way to distinguish these methods is by what they evaluate and when in the design process they apply.

Card sorting is a generative method. Participants receive a set of content items (on physical or digital cards) and arrange them into groups that make sense to them. In an open card sort, participants create and name the groups themselves. In a closed card sort, groups are predefined and participants sort items into them. The output reveals users’ mental models: how they expect content to be organized, what they call categories, and which items they associate with each other. The Nielsen Norman Group’s card sorting guide provides a solid theoretical foundation for both open and closed variants.
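To make the card sort output concrete, here is a minimal sketch of how a similarity matrix can be derived from open card sort results. The data layout (one list of groups per participant) and the card names are hypothetical, and dedicated tools may normalize or weight the scores differently.

```python
from collections import defaultdict
from itertools import combinations


def similarity_matrix(sorts):
    """Return {(item_a, item_b): share of participants who grouped the pair together}.

    `sorts` is assumed to hold one entry per participant, each entry being
    a list of groups, and each group a list of card names.
    """
    if not sorts:
        raise ValueError("at least one participant sort is required")
    counts = defaultdict(int)
    for groups in sorts:
        for group in groups:
            # Sort names so each pair is stored under one consistent key.
            for a, b in combinations(sorted(group), 2):
                counts[(a, b)] += 1
    n = len(sorts)
    return {pair: count / n for pair, count in counts.items()}


# Hypothetical results from three participants.
sorts = [
    [["Invoices", "Payments"], ["Profile", "Password"]],
    [["Invoices", "Payments", "Profile"], ["Password"]],
    [["Invoices", "Payments"], ["Profile", "Password"]],
]
sim = similarity_matrix(sorts)
for pair, score in sorted(sim.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

Pairs with scores near 1.0 are strong candidates for sharing a category. Pairs that never appear in the result were never grouped together by anyone.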

Tree testing (also called reverse card sorting) is evaluative. You give participants a text-only version of your proposed navigation hierarchy and ask them to find specific items using only that structure. The method records whether they find the correct item, how long it takes, and where they go wrong. Because tree testing strips out all visual design, it isolates labeling and structural problems. Nielsen Norman Group’s tree testing overview explains directness and success scoring in detail.

First-click testing adds the visual dimension back. Participants see a screenshot or clickable prototype and are asked to complete a task by clicking where they would go first. The test records click location and time-to-first-click. Research by Bob Bailey found that users who click correctly on the first attempt complete the full task 87% of the time, which makes that first click a strong predictor of overall navigation success. The usability.gov first-click testing overview covers the method’s origins and core protocol.
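As an illustration of the two measurements named above, the sketch below computes click accuracy and median time-to-first-click from raw click records. The record format, the rectangular target region, and the sample values are assumptions for the example. Real tools typically let you draw one or more hotspots of any shape.

```python
from statistics import median


def first_click_summary(clicks, target):
    """Summarize a first-click task.

    `clicks` is assumed to be a list of (x, y, seconds) tuples, one per
    participant. `target` is a hypothetical (left, top, right, bottom)
    rectangle in the same pixel coordinates as the screenshot.
    """
    if not clicks:
        raise ValueError("at least one click is required")
    left, top, right, bottom = target
    hits = [left <= x <= right and top <= y <= bottom for x, y, _ in clicks]
    return {
        "accuracy": sum(hits) / len(clicks),
        "median_seconds": median(t for _, _, t in clicks),
    }


# Hypothetical data: four participants, target is a nav link near the top left.
clicks = [(120, 40, 2.1), (130, 45, 3.4), (400, 300, 6.0), (118, 52, 1.8)]
summary = first_click_summary(clicks, target=(100, 30, 180, 60))
print(summary)
```

In this made-up sample, three of the four clicks land inside the target, so accuracy is 0.75.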

Side-by-side comparison

Dimension	Card sorting	Tree testing	First-click testing
Method type	Generative	Evaluative	Evaluative
Stage in design	Early (discovery)	Mid (validation)	Late (confirmation)
What it tests	Mental models, groupings	Structure and labels	Layout, labels, visual hierarchy
Visual design required	No	No	Yes (screenshot or prototype)
Participants needed	15 to 30	50+ per task	50 to 100 per task
Output format	Similarity matrix, dendrogram	Success rate, directness score	Heatmap, click accuracy, time-to-click
Time to run	1 to 2 weeks	1 to 2 weeks	3 to 7 days
Best for	Building navigation	Validating navigation	Confirming implementation

When to use card sorting

Run card sorting when you are starting from scratch or when you suspect your existing navigation does not match how users think about your content. Common triggers include:

Redesigning a website, app, or portal with more than a few top-level categories
Expanding a product into new feature areas with unclear placement
Discovering through analytics that users cannot find content that already exists
Merging two products or content areas with different organizational logic

Open card sorting is best for exploring mental models when you have no existing structure to test. Closed card sorting is better for checking whether specific items fit into a category structure you have already defined. Many teams run open first, then closed once they have a draft structure.

The card sorting tutorial for IA covers the step-by-step process and analysis in detail.

When to use tree testing

Tree testing is the right choice once you have a proposed structure and want to know whether it actually works before you invest in visual design. It is faster to set up than a prototype and more rigorous than informal feedback.

Use tree testing when:

You have completed card sorting and have a draft navigation hierarchy to validate
You want to compare two or more structural options (A/B tree testing)
You need to identify which specific labels or placements are causing confusion
You are auditing an existing navigation to find problem areas before a redesign

The text-only format is tree testing’s main advantage: because participants have no visual cues to fall back on, every navigation decision reveals exactly how well the structure and labels work on their own. Directness scores (the percentage of participants who go straight to the correct answer without backtracking) are particularly useful for identifying confusing navigation paths.
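The two headline tree test metrics can be sketched as follows. This assumes each participant's path is recorded as the full list of nodes visited, so that backtracking shows up as a repeated node, and it follows the definition of directness given above (direct and correct). Node names are hypothetical, and some tools report directness independently of success.

```python
def score_tree_task(paths, correct):
    """Return success rate and directness for one tree test task.

    `paths` is assumed to hold one list of visited node names per
    participant, ending with the node they chose as their answer.
    """
    if not paths:
        raise ValueError("at least one participant path is required")
    n = len(paths)
    successes = [p for p in paths if p and p[-1] == correct]
    # Direct means no node was visited twice, i.e. no backtracking.
    direct = [p for p in successes if len(set(p)) == len(p)]
    return {"success_rate": len(successes) / n, "directness": len(direct) / n}


# Hypothetical paths from four participants looking for "Invoices".
paths = [
    ["Home", "Account", "Invoices"],                     # direct success
    ["Home", "Billing", "Home", "Account", "Invoices"],  # success after backtracking
    ["Home", "Billing", "Payments"],                     # wrong answer
    ["Home", "Account", "Invoices"],                     # direct success
]
result = score_tree_task(paths, correct="Invoices")
print(result)
```

A large gap between success rate and directness, as in this sample, suggests that participants eventually find the item but a label higher in the tree is sending them the wrong way first.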

Explore the best tree testing tools for platform options at different budget levels. Optimal Workshop’s Treejack and Maze’s tree testing feature are among the most widely used dedicated tools.

When to use first-click testing

First-click testing belongs at the end of the process, when you have a design to test. It is best suited to:

Checking that a new navigation layout carries the validated structure into a working visual interface
Testing competing homepage or landing page designs before launch
Checking whether labels that tested well in a tree test still work when rendered in a specific typeface, size, and position
Auditing live pages that have poor task completion rates without running full moderated sessions

Because first-click testing uses rendered designs (screenshots or prototypes), it catches problems that tree testing cannot: labels that are visually overshadowed by nearby elements, calls-to-action that draw the eye away from navigation, or category names that read differently in bold heading format than they do in a plain text list.

The first-click testing methods and benchmarks guide covers accuracy benchmarks and how to interpret heatmap results.

How to sequence the three methods in practice

Most IA projects benefit from running these methods in sequence rather than in isolation. A typical flow looks like this:

Phase 1: Discovery. Run an open card sort to understand how users organize your content. Analyze the similarity matrix to identify strong groupings and outliers. Use the results to draft a navigation structure.

Phase 2: Refinement. Run a closed card sort (optional) to test whether specific items fit your proposed categories. Then move to tree testing to validate the full hierarchy. Use tree test results to fix labeling issues and restructure problem areas.

Phase 3: Confirmation. Run first-click testing on your high-fidelity designs or live screenshots to confirm that the validated structure translates correctly into the visual interface. Address any layout or visual hierarchy problems before launch.

This three-phase approach is especially valuable for large-scale redesigns, complex B2B portals, and any product where navigation failure has a direct impact on user outcomes or revenue.

Common mistakes when choosing between these methods

Running tree testing without card sorting first. Tree testing can tell you that your structure fails, but it cannot tell you what structure would work better. Card sorting fills that gap. Skipping straight to tree testing often leads teams to iterate on variations of the same flawed structure.

Using first-click testing instead of tree testing. First-click testing requires a visual design to exist, which makes iteration expensive. Tree testing is faster and cheaper for evaluating structural problems, and it should almost always precede first-click testing.

Treating card sort output as a finished navigation. Card sort data shows how users think, not a ready-to-ship navigation structure. Dendrogram clusters often contain practical conflicts (items that group together conceptually but cannot coexist in a navigation menu) that require design judgment to resolve.

Using too few participants for tree testing or first-click testing. Card sorting can yield useful patterns with 15 to 30 participants. Tree testing and first-click testing need at least 50 participants per task to produce reliable success rates and heatmaps. Underpowered studies produce results that look more certain than they are.
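One way to see why sample size matters for success rates is to put a confidence interval around the observed figure. The sketch below uses the Wilson score interval, a standard choice for proportions. It is offered as an illustration, not as the method any particular tool uses, and the participant counts are made up.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for an observed success rate."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Hypothetical task with the same observed 70% success rate at two sample sizes.
small = wilson_interval(14, 20)
large = wilson_interval(35, 50)
print("n=20:", [round(v, 2) for v in small])
print("n=50:", [round(v, 2) for v in large])
```

With 20 participants the interval runs from roughly 0.48 to 0.85, so the true success rate could plausibly be anywhere from failing to strong. With 50 participants it narrows to roughly 0.56 to 0.81.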

Recruiting participants for IA studies

All three methods require participants who match your target audience. Generic panel respondents often do not have the mental models, vocabulary, or task context that your actual users bring to navigation research.

For B2B products, recruiting verified professionals with specific job roles and seniority levels matters more than volume. For consumer products, demographic and behavioral filters (device type, frequency of use, product category familiarity) determine whether your card sort data reflects your real users or an irrelevant population.

Platforms like CleverX give access to an 8M+ panel of verified B2B and B2C participants across 150+ countries, which is useful for IA studies that need role-specific or sector-specific respondents rather than broad consumer samples.

For broader guidance on choosing the right method for your project, see usability testing methods: how to choose the right framework.

Frequently asked questions

What is the difference between card sorting and tree testing?

Card sorting is a generative method: participants group items into categories they define or choose, revealing how users mentally organize information. Tree testing is an evaluative method: participants navigate a pre-built text-only hierarchy to find specific items, revealing whether your proposed structure actually works. Use card sorting to build your navigation, then use tree testing to validate it.

What is the difference between tree testing and first-click testing?

Tree testing evaluates the underlying information architecture as a plain text hierarchy, with no visual design. First-click testing adds a visual layer (a screenshot or prototype) and tests whether users can identify the correct starting point on an actual page. Tree testing isolates labeling and structure problems; first-click testing reveals whether layout, visual hierarchy, and label placement work together.

Can you run all three methods on the same project?

Yes, and this is common practice in IA projects. The typical sequence is: card sorting to generate structure, tree testing to validate the structure, and first-click testing to confirm the design implementation. Running all three in sequence gives you a complete picture from concept through execution, with each method checking the work of the previous one.

How many participants do you need for each method?

Card sorting: 15 to 30 participants is usually enough to identify strong grouping patterns; open card sorts benefit from 20 to 30 for dendrogram clarity. Tree testing: 50 or more participants per task scenario gives reliable directness and success scores. First-click testing: 50 to 100 participants per task is the standard recommendation for detecting meaningful patterns.

Which method is best for navigation redesign?

For a full navigation redesign, use all three in sequence. Start with open card sorting to understand your users’ mental models, run closed card sorting to test a proposed category structure, validate with tree testing, and confirm the final design with first-click testing. If you can only run one method, tree testing gives the most actionable structural feedback for redesign projects.

Are these methods suitable for mobile navigation research?

All three methods work for mobile navigation research, but the setup differs slightly. Card sorting and tree testing translate directly to mobile contexts because they test structure and labels, not visual layout. First-click testing requires mobile-specific screenshots or prototypes rather than desktop ones. For complex mobile navigation patterns such as bottom tabs and hamburger menus, first-click testing is particularly valuable for validating placement decisions.