LipLearner: Customizable Silent Speech Interactions on Mobile Devices
Zixiong Su, Shitao Fang, and Jun Rekimoto
In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23) 🏆 Best Paper Award
Silent speech interfaces are a promising technology that enables private communication in natural language. However, previous approaches only support a small and inflexible vocabulary, which leads to limited expressiveness. We leverage contrastive learning to learn efficient lipreading representations, enabling few-shot command customization with minimal user effort. Our model exhibits high robustness to different lighting, posture, and gesture conditions on an in-the-wild dataset. For 25-command classification, an F1-score of 0.8947 is achievable using only one shot per command, and performance can be further boosted by adaptively learning from more data. This generalizability allowed us to develop a mobile silent speech interface empowered with on-device fine-tuning and visual keyword spotting. A user study demonstrated that with LipLearner, users could define their own commands with high reliability, guaranteed by an online incremental learning scheme. Subjective feedback indicated that our system provides essential functionalities for customizable silent speech interactions with high usability and learnability.
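The abstract describes few-shot command customization on top of contrastive lipreading representations with online incremental learning. As a rough illustration only (the paper's actual implementation is not reproduced here), the sketch below shows one common way such few-shot enrollment and classification could work: a nearest-prototype classifier over L2-normalized encoder embeddings, compared by cosine similarity. All names, the 128-dimensional embedding size, and the use of random stand-in vectors are assumptions for illustration.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """L2-normalize an embedding along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

class FewShotCommandClassifier:
    """Hypothetical nearest-prototype classifier over lipreading encoder embeddings."""

    def __init__(self):
        # User-defined command label -> list of enrolled (embedded) shots.
        self._shots: dict[str, list[np.ndarray]] = {}

    def enroll(self, label: str, embedding: np.ndarray) -> None:
        """Add one embedded utterance for a command; repeated calls refine it incrementally."""
        self._shots.setdefault(label, []).append(l2_normalize(embedding))

    def predict(self, query: np.ndarray) -> str:
        """Return the enrolled command whose mean prototype is most similar to the query."""
        q = l2_normalize(query)
        best_label, best_sim = None, -np.inf
        for label, shots in self._shots.items():
            prototype = l2_normalize(np.mean(shots, axis=0))
            sim = float(q @ prototype)  # cosine similarity (both vectors are unit length)
            if sim > best_sim:
                best_label, best_sim = label, sim
        return best_label

# Usage with random stand-in vectors; a real system would obtain embeddings
# from the pretrained contrastive lipreading encoder.
rng = np.random.default_rng(0)
clf = FewShotCommandClassifier()
clf.enroll("take a photo", rng.normal(size=128))   # one-shot enrollment
clf.enroll("play music", rng.normal(size=128))
clf.enroll("play music", rng.normal(size=128))     # additional shots refine the prototype
print(clf.predict(rng.normal(size=128)))
```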
Existing research confirms that the appearance and affordances of avatars can influence users' perception, attitudes, and behavior. However, such studies focus on the perceptual and behavioral changes of the user who directly controls the given avatar. As a result, the social and societal implications of avatar design are still underexplored. We argue that the emerging platform of social VR would enable further exploration of avatars' effects in the context of interaction with other users, potentially opening up a new research horizon. In this paper, we describe our research direction and discuss potential research questions.