Chapter 10: Data at Scale in Interaction Design

While data at scale holds vast potential for societal benefits, such as improving healthcare, city planning, and the understanding of complex human behaviors, its collection raises serious user concerns regarding privacy, transparency, and fairness. Approaches for collecting and analyzing these massive datasets include scraping information from the web and mining "second source" data, such as Google Search terms, which can indirectly reveal intimate insights into human concerns; interpreting such results requires formulating well-honed questions.

The quantified-self (QS) movement involves self-tracking of personal data (e.g., health, mood, sleep) using wearables and apps, which can increase self-awareness and provide early warnings. Designers, however, must carefully translate high-frequency data streams (such as heart rate or EEG data) into informative, low-anxiety interfaces.

Crowdsourced data, used in large-scale citizen science projects such as eBird and iNaturalist, involves millions of contributors amassing billions of observations. This introduces challenges concerning data ownership, contributor privacy, and the need to obscure the locations of endangered species.

Specialized analytical methods include sentiment analysis, which scores text or facial expressions along a negative-to-positive continuum to infer group feelings, although it is considered a heuristic because it struggles to capture human nuance such as slang. Social network analysis (SNA), rooted in social network theory, uses algorithms and centrality metrics to evaluate the strength of social ties among nodes (people or topics) connected by edges (links or ties), revealing group interactions or political polarization.

Comprehensive studies, such as the StudentLife project, integrate subjective reporting (surveys) with automatic sensing data (e.g., sleep duration, conversation frequency) from multiple sources to achieve a more complete understanding of complex domains like student mental health.

Data visualization is essential for making sense of data at scale, aiming to amplify human cognition by letting users discover patterns and anomalies following the mantra "overview first, zoom and filter, then details on demand". Visual tools include canonical representations, treemaps (often used as "market maps"), and spectrograms (for visualizing sound recordings). Customized, interactive dashboards, panels containing coordinated visualizations and controls, are crucial for analysts and managers to explore data and make informed decisions.

The human-data design approach, demonstrated by the Physikit project, focuses on transforming abstract sensed data into a meaningful physical ambient presence to increase user engagement, such as using a rotating cube to visualize humidity levels.

A significant ethical challenge is balancing societal benefits (e.g., security, reduced congestion) against the sacrifice of individual privacy, which sometimes calls for provocative probes, like the Quantified Toilets project, to stimulate public debate. UX designers are encouraged to adhere to ethical principles, including adopting privacy by design (limiting initial data collection) and storing personal Internet of Things (IoT) data locally on a home-fenced databox instead of in the cloud, to enhance trustworthiness. The short code sketches that follow illustrate several of these techniques.
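As a taste of the web-scraping approach, here is a minimal Python sketch that fetches one page and pulls out its headings. The URL is a stand-in, and a real scrape would also need to respect robots.txt, rate limits, and the site's terms of service.

```python
import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Fetch a single page and extract heading text.
resp = requests.get("https://example.com", timeout=10)
soup = BeautifulSoup(resp.text, "html.parser")
for heading in soup.find_all(["h1", "h2"]):
    print(heading.get_text(strip=True))
```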
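For the quantified-self point about turning high-frequency streams into low-anxiety interfaces, the following sketch smooths a noisy heart-rate stream with a rolling mean and reports a calm textual status instead of a jittery number. The readings and thresholds are invented for illustration.

```python
from statistics import mean

def smooth(samples, window=30):
    """Rolling mean over the last `window` samples of a 1 Hz stream."""
    return [mean(samples[max(0, i - window + 1):i + 1])
            for i in range(len(samples))]

def calm_summary(bpm):
    """Map a smoothed heart rate to a low-anxiety status message."""
    if bpm < 100:
        return "Heart rate in your usual resting range."
    if bpm < 140:
        return "Heart rate elevated - typical during activity."
    return "Heart rate unusually high for a while - consider resting."

raw = [72, 74, 180, 73, 75, 76, 74, 71, 73, 75]  # one spike of sensor noise
smoothed = smooth(raw, window=5)
print(calm_summary(smoothed[-1]))  # the spike no longer alarms the user
```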
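The citizen-science concern about endangered species can be addressed by coordinate obscuring. This hypothetical sketch snaps a sighting to the centre of a coarse grid cell so the exact location cannot be recovered; the grid size is an illustrative choice, not what eBird or iNaturalist actually use.

```python
import math

def obscure_location(lat, lon, grid_deg=0.2):
    """Return the centre of the grid cell containing (lat, lon).
    A 0.2-degree cell is roughly 22 km tall, coarse enough to hide a nest site."""
    def snap(x):
        return math.floor(x / grid_deg) * grid_deg + grid_deg / 2
    return round(snap(lat), 4), round(snap(lon), 4)

print(obscure_location(51.50722, -0.12750))  # -> (51.5, -0.1)
```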
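Sentiment analysis is, at its simplest, lexicon-based. The toy scorer below (with an invented six-word lexicon; real tools score thousands of terms) places text on a negative-to-positive scale, and also shows why the method is only a heuristic: slang praise like "sick" scores as neutral.

```python
# Toy lexicon mapping words to scores on a negative-to-positive continuum.
LEXICON = {"love": 2.0, "great": 1.5, "good": 1.0,
           "bad": -1.0, "awful": -1.5, "hate": -2.0}

def sentiment(text):
    """Mean lexicon score of known words; 0.0 if no word is recognized."""
    words = text.lower().split()
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

print(sentiment("I love this, it is great"))  # 1.75 - clearly positive
print(sentiment("this update is sick"))       # 0.0 - slang praise is missed
```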
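A social network analysis sketch using the widely used networkx library: people are nodes, ties are weighted edges, and two standard centrality metrics show who is well connected and who bridges subgroups. The graph data is invented for illustration.

```python
import networkx as nx  # pip install networkx

# People as nodes, ties as edges; edge weight approximates tie strength.
G = nx.Graph()
G.add_weighted_edges_from([
    ("ana", "ben", 5), ("ana", "cal", 3), ("ben", "cal", 4),
    ("cal", "dee", 1), ("dee", "eve", 2), ("dee", "fay", 2),
])

# Degree centrality: the share of possible ties each person holds.
print(nx.degree_centrality(G))
# Betweenness centrality: "cal" and "dee" bridge the two clusters.
print(nx.betweenness_centrality(G))
```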
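StudentLife-style integration of subjective and sensed data amounts to joining the two streams on participant and day. The pandas sketch below uses hypothetical columns and values; the real dataset is far richer.

```python
import pandas as pd

# Hypothetical daily self-reports (surveys).
surveys = pd.DataFrame({
    "participant": ["p1", "p1", "p2", "p2"],
    "day": ["2024-05-01", "2024-05-02", "2024-05-01", "2024-05-02"],
    "stress_self_report": [3, 4, 2, 5],
})
# Hypothetical automatically sensed data from the same days.
sensed = pd.DataFrame({
    "participant": ["p1", "p1", "p2", "p2"],
    "day": ["2024-05-01", "2024-05-02", "2024-05-01", "2024-05-02"],
    "sleep_hours": [7.5, 5.0, 8.0, 4.5],
    "conversations": [12, 6, 15, 3],
})

# Join subjective and sensed streams, then relate stress to sleep.
merged = surveys.merge(sensed, on=["participant", "day"])
print(merged.groupby("stress_self_report")["sleep_hours"].mean())
```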
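A treemap "market map" can be drawn in a few lines with matplotlib plus the third-party squarify helper; the sectors and sizes here are invented.

```python
import matplotlib.pyplot as plt
import squarify  # pip install squarify

# Each rectangle's area is proportional to its sector's size.
sizes = [500, 300, 120, 80]
labels = ["Tech", "Health", "Energy", "Retail"]

squarify.plot(sizes=sizes, label=labels, pad=True)
plt.axis("off")
plt.title("Market map as a treemap")
plt.show()
```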
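In software terms, the Physikit idea of ambient physical display reduces to mapping a sensed value onto an actuation parameter. This sketch maps relative humidity to a cube's rotation speed; the ranges and the linear mapping are assumptions, not the project's actual values.

```python
def humidity_to_rpm(humidity, min_rpm=0.5, max_rpm=6.0):
    """Linearly map relative humidity (0-100%) to a cube rotation speed:
    dry air gives a slow, calm spin; humid air a brisk one."""
    h = max(0.0, min(100.0, humidity)) / 100.0
    return min_rpm + h * (max_rpm - min_rpm)

for h in (20, 55, 90):
    print(f"{h}% humidity -> {humidity_to_rpm(h):.1f} rpm")
```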
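The databox principle of keeping IoT data home-fenced can be approximated with a local database file. This sketch logs readings into SQLite on the local device rather than a cloud service; the table layout and sensor name are illustrative.

```python
import sqlite3
import time

# Local "databox": readings stay in a file on the home network, not the cloud.
con = sqlite3.connect("databox.sqlite")
con.execute("""CREATE TABLE IF NOT EXISTS readings
               (ts REAL, sensor TEXT, value REAL)""")

def store_locally(sensor, value):
    con.execute("INSERT INTO readings VALUES (?, ?, ?)",
                (time.time(), sensor, value))
    con.commit()

store_locally("living_room_temp", 21.4)
print(con.execute("SELECT COUNT(*) FROM readings").fetchone()[0])
```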
For systems that combine AI algorithms with data at scale, the core ethical principles of the FATE framework are advocated: Fairness, which addresses dataset bias and ensures impartial treatment; Accountability, which ensures automated systems can explain their decisions and defines who is responsible for them; Transparency, which makes system decisions visible rather than leaving them inside "black boxes"; and Explainability, which provides rationales that laypeople can understand, especially for high-stakes outcomes such as loan applications. Bias in training datasets, illustrated by higher facial recognition error rates for darker-skinned individuals, highlights the urgent need to develop fairer and more accountable algorithms.
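To make explainability concrete: for a linear scoring model, per-feature contributions already yield a rationale a layperson can read. The loan model below, with invented weights and threshold, prints which factors helped or hurt the decision.

```python
# Hypothetical linear loan score; per-feature contributions give a
# layperson-readable rationale rather than a black-box verdict.
WEIGHTS = {"income_k": 0.04, "debt_ratio": -2.5, "years_employed": 0.1}
BIAS = -1.0
THRESHOLD = 0.0

def explain_decision(applicant):
    contributions = {f: WEIGHTS[f] * applicant[f] for f in WEIGHTS}
    score = BIAS + sum(contributions.values())
    verdict = "approved" if score >= THRESHOLD else "declined"
    print(f"Loan {verdict} (score {score:.2f}). Main factors:")
    # List factors from most to least influential, with their direction.
    for feat, c in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
        direction = "helped" if c > 0 else "hurt"
        print(f"  {feat} = {applicant[feat]} {direction} ({c:+.2f})")

explain_decision({"income_k": 42, "debt_ratio": 0.6, "years_employed": 3})
```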