Chapter 15: Evaluation Studies: Controlled & Natural Settings


ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.


Usability testing is introduced as a method typically carried out in specialized labs or temporary controlled locations, with an emphasis on measuring a product's usability, effectiveness, and user satisfaction. The key quantitative performance metrics are the time users take to complete predefined tasks and the number and types of errors they make. These data are commonly gathered through video recording, logs of user interactions, the "think aloud" technique, and user satisfaction questionnaires. Testing environments are supported by specialized equipment, including observation rooms, mobile evaluation tools, portable eye-tracking devices, and facial recognition systems, sometimes packaged as a "lab-in-a-box". The case study of early testing of the Apple iPad illustrates how usability evaluation adapts to real-world constraints such as tight deadlines; it revealed common problems with small target areas, difficult navigation, and changes in screen orientation.

Moving towards greater rigor, experiments are conducted primarily in research labs to test specific hypotheses about user performance. Experimental design centers on manipulating independent variables (the features being tested) to measure their effect on dependent variables (the resulting performance metrics). Researchers state both a null hypothesis (predicting no effect) and an alternative hypothesis (predicting a difference), which may be one-tailed (the direction of the difference is predicted) or two-tailed (only the existence of a difference is predicted). Controlling external factors is vital, and there are three common ways of allocating participants to conditions: between-subjects design (different participants for each condition), within-subjects design (the same participants take part in all conditions, which requires counterbalancing the order of conditions to offset order effects), and matched-pairs design (participants are paired on relevant characteristics, such as expertise, and the members of each pair are assigned to different conditions).
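As a minimal sketch of how the between-subjects and within-subjects allocations described above might be set up, the Python below assigns participants to conditions. The function names, participant IDs, and condition labels are hypothetical and chosen only for illustration; a real study would adapt the procedure to its own design.

```python
import random
from itertools import permutations


def between_subjects(participants, conditions, seed=0):
    """Assign each participant to exactly one condition.

    Participants are shuffled (with a fixed seed so the allocation is
    reproducible) and then dealt out in turn, which keeps group sizes
    as equal as possible.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}


def within_subjects(participants, conditions):
    """Give every participant all conditions, counterbalancing the order.

    Successive participants cycle through every possible ordering of the
    conditions, so each ordering is used equally often when the number of
    participants is a multiple of the number of orderings.
    """
    orders = list(permutations(conditions))
    return {p: list(orders[i % len(orders)]) for i, p in enumerate(participants)}


# Hypothetical participant IDs and condition labels.
participants = ["P1", "P2", "P3", "P4"]
conditions = ["A", "B"]

print(between_subjects(participants, conditions))
print(within_subjects(participants, conditions))
```

Cycling through every ordering is practical only for a small number of conditions, because the number of orderings grows factorially. With more conditions, a partial scheme such as a Latin square is a common alternative.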
Statistical tests, such as t-tests, are then applied to determine whether the measured differences between conditions are statistically significant. By convention, a result is treated as significant when the probability value is below 0.05 (p < 0.05), meaning that a difference at least this large would be expected less than 5% of the time if the null hypothesis were true.

Finally, field studies are the least controlled approach. They take place in natural contexts such as homes or hospitals, which makes them suited to evaluating mobile, ambient, and IoT technologies. These studies often produce qualitative accounts of how technologies are used, adapted, and integrated into messy everyday routines. Studies that deploy prototypes in natural settings for sustained periods are specifically termed in-the-wild studies. The Painpad case study demonstrated a field evaluation focused on patient compliance and on the contextual factors that influenced the use of a pain-monitoring device after surgery in a hospital setting.
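To make the significance testing described earlier more concrete, the sketch below computes an independent-samples t statistic for a between-subjects comparison, using only the Python standard library. The completion times are invented purely for illustration and are not taken from any study; the function name is hypothetical, and the pooled-variance (Student's) form of the test is an assumption about which t-test variant applies.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Invented task completion times in seconds (illustration only).
design_a = [12.0, 14.0, 11.0, 13.0, 15.0]
design_b = [17.0, 19.0, 16.0, 18.0, 20.0]

t_stat, df = independent_t(design_a, design_b)

# Two-tailed critical value of t for df = 8 at the 0.05 level,
# taken from a standard t table.
T_CRITICAL_DF8 = 2.306
significant = abs(t_stat) > T_CRITICAL_DF8

print(f"t({df}) = {t_stat:.2f}, significant at p < 0.05: {significant}")
```

Comparing the absolute value of t against a table value is equivalent to checking p < 0.05 for a two-tailed test. A statistics package would normally report the exact p-value instead, and a within-subjects design would call for a paired t-test rather than the independent-samples form shown here.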