I am currently investigating how a human’s mismatched understanding of the domain dynamics affects the quality of their feedback, and how this in turn impacts an agent’s reward learning.
My background includes identification and prevention of misuse of teleoperated robots, with Takayuki Kanda and Dražen Brščić at Kyoto University, and engineering low-cost socially assistive robots for autism therapy, mentored by Hashim Raza Khan at NED University of Engineering and Technology.
Reward learning via human feedback is a crucial capability for beneficial AI. Current methods are built on decision-making theories that assume a matched dynamics model between the learning agent and the feedback provider. However, humans often form imperfect internal dynamics models, and their feedback reflects these misconceptions. While this relationship has long been hypothesised, its manifestation in sequential decision-making remains largely an assumption. Our work provides the first comprehensive empirical investigation of this relationship through a randomized controlled trial (N=211). We followed a two-stage design where we first initialized the participants’ understanding of the dynamics in a grid-world navigation domain and then manipulated it using text-based instructions. Causal mediation analysis revealed that humans’ internal models play a mediating role in feedback behaviour. We show that this relationship is invariant across visual contexts and is robust to three common feedback types: pairwise preferences, trajectory corrections, and off-switch interventions. These findings confirm a critical limitation of current reward learning methods and establish the missing psychological foundation for approaches that incorporate dynamics understanding.
@inproceedings{shaheen2026criticalpitfall,
  title={Empirical Evidence and Analysis of a Critical Pitfall in Reward Learning from Human Feedback},
  author={Shaheen, Taha and West, Stephen G. and Zhang, Yu},
  booktitle={Proceedings of the 35th International Joint Conference on Artificial Intelligence (IJCAI-ECAI 2026)},
  year={2026},
  month=aug,
  url={https://par.nsf.gov/servlets/purl/10683099},
}
ACM THRI
Investigation of Low-Moral Actions by Malicious Anonymous Operators of Avatar Robots
Taha Shaheen, Dražen Brščić, and Takayuki Kanda
ACM Transactions on Human-Robot Interaction, Sep 2024
Avatar robots allow a teleoperator to interact with the people and environment of a remote place. Malicious operators can use this technology to perpetrate malicious or low-moral actions. In this study, we used hazard identification workshops to identify low-moral actions that are possible through the locomotor movement, cameras, and microphones of an avatar robot. We conducted three workshops, each with four potential future users of avatars, to brainstorm possible low-moral actions. As avatars are not yet widespread, we gave participants experience with this technology by having them control both a simulated avatar and a real avatar as a malicious anonymous operator in a variety of situations. They also experienced sharing space with an avatar controlled by a malicious anonymous operator. We categorized the ideas generated from the workshops using affinity diagram analysis and identified four major categories: violate privacy and security, inhibit, annoy, and destroy or hurt. We also identified subcategories for each. In the second half of this study, we discuss all low-moral action subcategories in terms of their detection, mitigation, and prevention by studying literature from autonomous, social, teleoperated, and telepresence robots as well as other fields where relevant.
@article{shaheen2024lowmoralactions,
  author={Shaheen, Taha and Br\v{s}\v{c}i\'{c}, Dra\v{z}en and Kanda, Takayuki},
  title={Investigation of Low-Moral Actions by Malicious Anonymous Operators of Avatar Robots},
  year={2024},
  publisher={Association for Computing Machinery},
  address={New York, NY, USA},
  url={https://doi.org/10.1145/3696466},
  doi={10.1145/3696466},
  journal={ACM Transactions on Human-Robot Interaction},
  month=sep,
  keywords={avatar robots, low-moral actions, hazard identification, malicious users, ethics},
}