Are PURE Evaluations Better Than Heuristic Reviews?

A Real World Comparative Analysis

Jun 07, 2023

Summary: PURE evaluations are often a better method than heuristic reviews for assessing UX work. They can track progress over time, simplify UX benchmarking, and provide a competitive edge.

Picture this: You're fine-tuning your UX processes with a forward-thinking, data-rich approach that spots potential usability issues and yields numerical, comparable scores. This isn't a pipe dream; it's an actual research method called Pragmatic Usability Rating by Experts (PURE). Unlike heuristic reviews, which tend to be static, one-off exercises, PURE evaluations provide actionable insights that let you track usability improvements over time and even help you stand out in the competitive landscape.

What Are PURE Evaluations?

PURE is a methodology used in UX research. It is a predictive, quantitative usability inspection method that seeks to evaluate and score a product or interface based on the expected difficulty level a user might face while completing specific tasks.

A software interface under examination with a sequence of different faces, each representing various tasks being evaluated in a PURE audit, illustrating the task-based approach of PURE evaluations.

PURE provides a measure of task difficulty based on cognitive load and other sources of usability friction. It involves breaking a task down into a series of steps; experts then rate each step on a user friction scale. (I have some exciting updates to share in future posts about my custom PURE scoring scale. Here's a sneak peek: You've seen examples using a 3-point scale, but for the past seven years, I've been crafting a 5-point scale to better address complex interfaces. I can't wait to dive into the details with you. So stay tuned - there's so much more to discuss about PURE!)
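To make the step-rating idea concrete, here's a minimal sketch in Python. It assumes the 3-point scale mentioned above (1 = low, 2 = medium, 3 = high friction) and the commonly described convention that a task's score is the sum of its step ratings while its color follows the worst step. The `Step` class, the `score_task` helper, and the checkout steps are all made up for illustration, not an official PURE tool.

```python
from dataclasses import dataclass

# Assumed 3-point friction scale: 1 = low, 2 = medium, 3 = high.
COLORS = {1: "green", 2: "yellow", 3: "red"}


@dataclass
class Step:
    name: str
    rating: int  # the rating the expert panel agreed on for this step


def score_task(steps):
    """Return (task_score, task_color) for a non-empty list of rated steps.

    Assumption: task score = sum of step ratings, and the task color
    is taken from the single worst (highest-friction) step.
    """
    if not steps:
        raise ValueError("a task needs at least one step")
    for step in steps:
        if step.rating not in COLORS:
            raise ValueError(f"rating out of range for {step.name!r}: {step.rating}")
    total = sum(step.rating for step in steps)
    worst = max(step.rating for step in steps)
    return total, COLORS[worst]


# Hypothetical task, broken into steps and rated by the panel.
checkout = [
    Step("Open cart", 1),
    Step("Enter shipping address", 2),
    Step("Enter payment details", 3),
    Step("Confirm order", 1),
]

score, color = score_task(checkout)
print(score, color)  # 7 red
```

With this convention, a lower total means less expected friction, and a single red step is enough to flag the whole task for attention.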

A PURE evaluation can help highlight potential usability issues in the early stages of design and can guide modifications to improve the overall user experience. It's pragmatic because it is grounded in actual user tasks and behaviors rather than only usability principles and best practices.

The Power of Quantitative Data in UX

Three vertical bars representing varying levels of usability friction in a PURE evaluation. The first bar is red, signifying a high level of usability friction; the second bar is yellow, indicating a medium level of usability friction; and the third bar is green, denoting a low level of usability friction.

At the heart of PURE evaluations lies the strength of quantitative data. Unlike the qualitative feedback often generated by heuristic reviews, PURE evaluations provide a numerical score that allows you to monitor usability improvements across different design iterations. This continuous loop of design, assess, improve, and re-assess brings dynamic progression into your UX design process. (Does this iterative loop sound familiar? Hahaha)

PURE Evaluations for UX Benchmarking

By comparing PURE scores over time, you can quantify the progress your product makes in usability. This not only validates design changes but also sets benchmarks for future designs. It's a level of longitudinal analysis and strategic improvement that heuristic reviews simply cannot match.
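As a rough illustration of benchmarking over time, the sketch below compares total PURE scores across releases. The release labels and scores are invented, the `score_changes` helper is hypothetical, and it assumes a scoring scheme where lower totals mean less friction.

```python
def score_changes(history):
    """Compare consecutive releases.

    history: list of (release_label, total_pure_score) in chronological order.
    Returns a list of (from_label, to_label, delta, percent_change).
    Assumption: lower scores mean less friction, so a negative delta
    is an improvement.
    """
    changes = []
    for (prev_label, prev), (label, cur) in zip(history, history[1:]):
        if prev <= 0:
            raise ValueError(f"score for {prev_label!r} must be positive")
        delta = cur - prev
        pct = round(100.0 * delta / prev, 1)
        changes.append((prev_label, label, delta, pct))
    return changes


# Made-up totals for three releases of the same task set.
history = [("v1.0", 42), ("v1.1", 36), ("v2.0", 30)]

for prev_label, label, delta, pct in score_changes(history):
    print(f"{prev_label} -> {label}: {delta:+d} ({pct:+.1f}%)")
# v1.0 -> v1.1: -6 (-14.3%)
# v1.1 -> v2.0: -6 (-16.7%)
```

The comparison only holds if every release is scored against the same tasks and the same scale, so it helps to freeze the task list before the first benchmark.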

Stay Competitive with PURE

A cheerful clock, a satisfied bundle of dollar bills, and a thumbs-up icon, symbolizing the time efficiency, cost-effectiveness, and overall positive business impact of implementing PURE evaluations.

But the power of PURE evaluations doesn't stop at self-improvement; it extends to competitive analysis as well. By comparing your product's PURE scores with those of competitors, you gain valuable insights into your product's usability standing in the market. This competitive edge can guide strategic design decisions to ensure your product isn't just usable but the best in its category. Think of PURE as a version of what the Baymard Institute does, except customized to you and your competitors.
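One simple way to read a competitive comparison is to look, task by task, at how far your score sits from the best score in the set. The sketch below does that with invented products, tasks, and numbers; `gaps_to_best` is a hypothetical helper, and it again assumes lower scores mean less friction.

```python
# Made-up PURE task scores for three products rated on the same tasks.
scores = {
    "Our product": {"Sign up": 5, "Search": 4, "Checkout": 9},
    "Competitor A": {"Sign up": 6, "Search": 4, "Checkout": 7},
    "Competitor B": {"Sign up": 4, "Search": 6, "Checkout": 8},
}


def gaps_to_best(scores, ours):
    """For each task, return how many points `ours` sits above the
    lowest (best) score among all products. 0 means we tie for best."""
    gaps = {}
    for task, our_score in scores[ours].items():
        best = min(product[task] for product in scores.values())
        gaps[task] = our_score - best
    return gaps


print(gaps_to_best(scores, "Our product"))
# {'Sign up': 1, 'Search': 0, 'Checkout': 2}
```

In this made-up data, checkout shows the largest gap, which is the kind of signal that can help decide where design effort goes first.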

Research-Backed Benefits of PURE

Studies indicate that clear, comparative data from PURE evaluations significantly speed up decision-making during the design process. The Nielsen Norman Group reports that the PURE methodology's predictiveness has been shown to reduce subsequent user testing iterations by up to 50%. Detecting usability issues early in the design process is undeniably beneficial: by spotting potential trouble spots early on, we can address them proactively rather than reactively, minimizing design rework and making the entire design workflow more efficient. From my professional perspective, adopting PURE evaluations not only saves UX teams time and money but also lets them keep pace with Agile development, so they are not perceived as a bottleneck.

Are PURE Scores Reliable?

The scores that come from a PURE evaluation are directionally accurate and demonstrate reasonable validity and reliability. A comparison of PURE results with metrics from a usability-benchmarking study on the same product revealed statistically significant correlations with widely used ease-of-use survey measures such as SEQ and SUS, with correlation values of 0.5 (p < 0.05) and 0.4 (p < 0.01), respectively. These figures indicate that PURE has reasonable validity when compared with standard quantitative metrics.
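If you want to run this kind of validity check on your own data, the core calculation is a Pearson correlation between PURE task scores and a survey measure for the same tasks. Here's a minimal sketch; the numbers are invented, `pearson_r` is a hand-rolled helper, and it does not compute a p-value (that would need a statistics package and is not shown here).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (spread_x * spread_y)


# Made-up data: one PURE score and one mean SEQ rating per task.
pure_scores = [4, 6, 7, 9, 12]
mean_seq = [6.5, 6.1, 5.2, 5.6, 4.0]

r = pearson_r(pure_scores, mean_seq)
print(f"r = {r:.2f}")
```

One thing to watch: if higher PURE scores mean more friction while higher SEQ ratings mean an easier task, the raw coefficient comes out negative, so the sign depends on how each scale is oriented. With only a handful of tasks the estimate is also very noisy.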

Calculations of interrater reliability for PURE have ranged from 0.5 to 0.9, with scores typically high (above 0.8) once expert raters have been trained on the method. The 2016 article "Practical Usability Rating by Experts (PURE): A Pragmatic Approach for Scoring Product Usability" goes into more depth on this and details the first documented case study of the PURE method in use.

Anecdotally, I recently conducted a PURE evaluation at Bowery Valuation, where the team of three expert evaluators achieved an impressive interrater reliability score of 0.98, signifying 98% agreement across fifteen real-world user tasks. Over the past seven years, during which I've regularly employed PURE, high-scoring results have been consistent and expected.
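For readers who want a quick agreement check on their own panel, here's a minimal sketch of one simple option: the average share of steps on which each pair of raters gave the same rating. This is an assumption on my part about how to illustrate agreement; it is not necessarily the statistic behind the figures quoted above, and chance-corrected coefficients can give different values. The rater names and ratings are invented.

```python
from itertools import combinations


def pairwise_agreement(ratings_by_rater):
    """Mean share of steps on which each pair of raters gave the
    exact same rating. 1.0 means every pair agreed on every step."""
    raters = list(ratings_by_rater.values())
    if len(raters) < 2:
        raise ValueError("need at least two raters")
    n_steps = len(raters[0])
    if n_steps == 0 or any(len(r) != n_steps for r in raters):
        raise ValueError("all raters must rate the same non-empty set of steps")
    pair_scores = []
    for a, b in combinations(raters, 2):
        matches = sum(1 for x, y in zip(a, b) if x == y)
        pair_scores.append(matches / n_steps)
    return sum(pair_scores) / len(pair_scores)


# Made-up independent ratings from three experts for six steps.
ratings = {
    "rater_1": [1, 2, 3, 1, 2, 1],
    "rater_2": [1, 2, 3, 1, 1, 1],
    "rater_3": [1, 2, 3, 2, 2, 1],
}

print(f"{pairwise_agreement(ratings):.2f}")  # 0.78
```

Running this on the independent ratings, before the panel discusses and reconciles them, shows how consistently the raters apply the scale.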

Conclusion

PURE evaluations, with their quantitative, comparative, and longitudinal capabilities, offer a more comprehensive and effective usability evaluation method than heuristic reviews. In the rapidly evolving UX landscape, staying ahead means embracing methods that not only analyze but also predict, improve, and outperform. PURE evaluations are well positioned to guide you on that path. So, it's time to move beyond heuristic reviews and step into the world of PURE evaluations for UX success.

Keep in mind, though, as with any expert-driven method, PURE isn't a substitute for direct user testing. Instead, it's a valuable tool to inform and streamline the design process, making subsequent user testing more efficient and effective.

Teaser

This post was designed to introduce my thoughts on PURE. I can't wait to share even more with you in upcoming posts. I'll be unveiling an expanded scoring system for PURE that's already in use by over 10 UX teams and will share some creative ways to use this scoring system. Read this article about the expanded PURE scoring system. And that's not all - I'll also provide handy templates and a step-by-step tutorial video to help you incorporate all these insights into your own work. So, stick around – there's so much more to come!