Abstract

A prerequisite for social coordination is bidirectional communication between teammates, each playing two roles simultaneously: as receptive listeners and expressive speakers. For robots working with humans in complex situations with multiple goals that differ in importance, failure to fulfill the expectation of either role could undermine group performance due to misalignment of values between humans and robots. Specifically, a robot needs to serve as an effective listener to infer human users’ intents from instructions and feedback and as an expressive speaker to explain its decision processes to users. Here, we investigate how to foster effective bidirectional human-robot communications in the context of value alignment—collaborative robots and users form an aligned understanding of the importance of possible task goals. We propose an explainable artificial intelligence (XAI) system in which a group of robots predicts users’ values by taking in situ feedback into consideration while communicating their decision processes to users through explanations. To learn from human feedback, our XAI system integrates a cooperative communication model for inferring human values associated with multiple desirable goals. To be interpretable to humans, the system simulates human mental dynamics and predicts optimal explanations using graphical models. We conducted psychological experiments to examine the core components of the proposed computational framework. Our results show that real-time human-robot mutual understanding in complex cooperative tasks is achievable with a learning model based on bidirectional communication. We believe that this interaction framework can shed light on bidirectional value alignment in communicative XAI systems and, more broadly, in future human-machine teaming systems.
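The abstract's core idea on the "listener" side, a robot inferring the relative importance a human assigns to multiple goals from in situ feedback, can be illustrated with a minimal Bayesian sketch. Everything below is a hypothetical stand-in, not the paper's actual model: the discretized grid of candidate weight vectors, the Boltzmann-rational acceptance likelihood, and the `beta=5.0` rationality parameter are all illustrative assumptions.

```python
import numpy as np

# Hypothetical setup: three task goals whose relative importance (a weight
# vector w on the simplex) the robot must infer from the human's
# accept/reject feedback on proposed plans.

# Discretized belief space: candidate weight vectors.
candidates = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [1/3, 1/3, 1/3],
])
belief = np.full(len(candidates), 1.0 / len(candidates))  # uniform prior

def feedback_likelihood(w, plan_value, accepted, beta=5.0):
    """P(feedback | w): a Boltzmann-rational human accepts a plan with
    probability sigmoid(beta * (w . plan_value - 0.5))."""
    p_accept = 1.0 / (1.0 + np.exp(-beta * (w @ plan_value - 0.5)))
    return p_accept if accepted else 1.0 - p_accept

def update_belief(belief, plan_value, accepted):
    """One Bayesian update of the robot's belief over weight vectors."""
    lik = np.array([feedback_likelihood(w, plan_value, accepted)
                    for w in candidates])
    posterior = belief * lik
    return posterior / posterior.sum()

# Simulated interaction: the human's (unobserved) values favor goal 0, so
# they accept a plan serving goal 0 and reject a plan serving goal 2.
belief = update_belief(belief, np.array([0.9, 0.05, 0.05]), accepted=True)
belief = update_belief(belief, np.array([0.05, 0.05, 0.9]), accepted=False)

print(candidates[np.argmax(belief)])  # belief concentrates on [0.8, 0.1, 0.1]
```

After only two rounds of feedback the posterior already favors the goal-0-heavy hypothesis; the paper's framework additionally closes the loop on the "speaker" side by choosing explanations that shape the human's model of the robot, which this sketch does not attempt.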


Supplementary Materials

This PDF file includes:

Supplementary Methods
Figs. S1 to S3
Table S1
References (53, 54)

Other Supplementary Material for this manuscript includes the following:

Movies S1 and S2
MDAR Reproducibility Checklist

REFERENCES AND NOTES

1
N. Wiener, Some moral and technical consequences of automation. Science 131, 1355–1358 (1960).
2
R. Klimoski, S. Mohammed, Team mental model: Construct or metaphor? J. Manage. 20, 403–437 (1994).
3
V. Groom, C. Nass, Can robots be teammates? Benchmarks in human–robot teams. Interact. Stud. 8, 483–500 (2007).
4
S. H. Schwartz, Advances in Experimental Social Psychology (Elsevier, 1992), vol. 25, pp. 1–65.
5
W. B. Rouse, J. A. Cannon-Bowers, E. Salas, The role of mental models in team performance in complex systems. IEEE Trans. Syst. Man Cybern. 22, 1296–1308 (1992).
6
J. MacMillan, E. E. Entin, D. Serfaty, Communication overhead: The hidden cost of team cognition, in Team Cognition: Understanding the Factors That Drive Process and Performance, E. Salas, S. M. Fiore, Eds. (American Psychological Association, 2004).
7
A. Butchibabu, C. Sparano-Huiban, L. Sonenberg, J. Shah, Implicit coordination strategies for effective team communication. Hum. Factors 58, 595–610 (2016).
8
V. V. Unhelkar, S. Li, J. A. Shah, Decision-making for bidirectional communication in sequential human-robot collaborative tasks, in Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction (HRI) (IEEE, 2020), pp. 329–341.
9
P. Abbeel, A. Y. Ng, Apprenticeship learning via inverse reinforcement learning, in Proceedings of the Twenty-First International Conference on Machine Learning (ICML) (Association for Computing Machinery, 2004).
10
W. B. Knox, P. Stone, Interactively shaping agents via human reinforcement: The TAMER framework, in Proceedings of the Fifth International Conference on Knowledge Capture (Association for Computing Machinery, 2009).
11
S. Griffith, K. Subramanian, J. Scholz, C. L. Isbell, A. L. Thomaz, Policy shaping: Integrating human feedback with reinforcement learning, in Proceedings of Advances in Neural Information Processing Systems (NeurIPS) (IEEE, 2013).
12
M. Edmonds, F. Gao, H. Liu, X. Xie, S. Qi, B. Rothrock, Y. Zhu, Y. N. Wu, H. Lu, S.-C. Zhu, A tale of two explanations: Enhancing human trust by explaining robot behavior. Sci. Robot. 4, eaay4663 (2019).
13
M. T. Ribeiro, S. Singh, C. Guestrin, “Why should I trust you?”: Explaining the predictions of any classifier, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery, 2016).
14
H. Liu, Y. Zhang, W. Si, X. Xie, Y. Zhu, S.-C. Zhu, Interactive robot knowledge patching using augmented reality, in Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, 2018).
15
Q. Zhang, X. Wang, Y. N. Wu, H. Zhou, S.-C. Zhu, Interpretable CNNs for object classification. IEEE Trans. Pattern Anal. Mach. Intell. 43, 3416–3431 (2020).
16
Z. Zhang, Y. Zhu, S.-C. Zhu, Graph-based hierarchical knowledge representation for robot task transfer from virtual to physical world, in Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE, 2021).
17
T. Chakraborti, S. Sreedharan, Y. Zhang, S. Kambhampati, Plan explanations as model reconciliation: Moving beyond explanation as soliloquy, in Proceedings of International Joint Conference on Artificial Intelligence (IJCAI) (AAAI Press, 2017), pp. 156–163.
18
Z. Gong, Y. Zhang, Behavior explanation as intention signaling in human-robot teaming, in Proceedings of International Symposium on Robot and Human Interactive Communication (RO-MAN) (IEEE, 2018), pp. 1005–1011.
19
A. Tabrez, S. Agrawal, B. Hayes, Explanation-based reward coaching to improve human performance via reinforcement learning, in Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI) (IEEE, 2019).
20
H. S. Huang, D. Held, P. Abbeel, A. Dragan, Enabling robots to communicate their objectives. Autonom. Robots 43, 309–326 (2019).
21
T. Yuan, H. Liu, L. Fan, Z. Zheng, T. Gao, Y. Zhu, S.-C. Zhu, Joint inference of states, robot knowledge, and human (false-) beliefs, in Proceedings of International Conference on Robotics and Automation (ICRA) (IEEE, 2020).
22
X. Gao, R. Gong, Y. Zhao, S. Wang, T. Shu, S.-C. Zhu, Joint mind modeling for explanation generation in complex human-robot collaborative tasks, in 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) (IEEE, 2020), pp. 1119–1126.
23
S. Russell, Human Compatible: Artificial Intelligence and the Problem of Control (Penguin, 2019).
24
H. H. Clark, Using Language (Cambridge Univ. Press, 1996).
25
P. A. Samuelson, A note on the pure theory of consumer’s behaviour. Economica 5, 61 (1938).
26
B. A. Huberman, The ecology of computation, in Digest of Papers. COMPCON Spring 89. Thirty-Fourth IEEE Computer Society International Conference: Intellectual Leverage (IEEE, 1988), p. 362.
27
M. K. Ho, M. Littman, J. MacGlashan, F. Cushman, J. L. Austerweil, Showing versus doing: Teaching by demonstration, in Proceedings of Advances in Neural Information Processing Systems (NeurIPS) (IEEE, 2016).
28
H. P. Grice, in Speech Acts (Brill, 1975), pp. 41–58.
29
N. D. Goodman, A. Stuhlmüller, Knowledge and implicature: Modeling language understanding as social cognition. Top. Cogn. Sci. 5, 173–184 (2013).
30
P. Shafto, N. D. Goodman, T. L. Griffiths, A rational account of pedagogical reasoning: Teaching by, and learning from, examples. Cogn. Psychol. 71, 55–89 (2014).
31
L. Yuan, D. Zhou, J. Shen, J. Gao, J. L. Chen, Q. Gu, Y. N. Wu, S.-C. Zhu, Iterative teacher-aware learning, in Proceedings of Advances in Neural Information Processing Systems (NeurIPS) (IEEE, 2021).
32
J. A. Simpson, Psychological foundations of trust. Curr. Dir. Psychol. Sci. 16, 264–268 (2007).
33
M. Rheu, J. Y. Shin, W. Peng, J. Huh-Yoo, Systematic review: Trust-building factors and implications for conversational agent design. Int. J. Human Comput. Interact. 37, 81–96 (2021).
34
M. Johnson, A. Vera, No AI is an island: The case for teaming intelligence. AI Mag. 40, 16–28 (2019).
35
A. Barreto, W. Dabney, R. Munos, J. J. Hunt, T. Schaul, H. P. van Hasselt, D. Silver, Successor features for transfer in reinforcement learning, in Proceedings of Advances in Neural Information Processing Systems (NeurIPS) (IEEE, 2017).
36
N. Wang, D. V. Pynadath, S. G. Hill, Trust calibration within a human-robot team: Comparing automatically generated explanations, in Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI) (IEEE, 2016).
37
J. C. Licklider, R. W. Taylor, The computer as a communication device. Sci. Technol. 76, 21–31 (1968).
38
L. B. Resnick, J. M. Levine, S. Behrend, Socially Shared Cognition (American Psychological Association, 1991).
39
S. Arora, P. Doshi, A survey of inverse reinforcement learning: Challenges, methods and progress. Artif. Intell. 297, 103500 (2021).
40
D. Silver, J. Veness, Monte-Carlo planning in large POMDPs, in Proceedings of Advances in Neural Information Processing Systems (NeurIPS) (IEEE, 2010).
41
A. Rothe, B. M. Lake, T. M. Gureckis, Do people ask good questions? Comput. Brain Behav. 1, 69–89 (2018).
42
M. Chen, S. Nikolaidis, H. Soh, D. Hsu, S. Srinivasa, Planning with trust for human-robot collaboration, in Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI) (IEEE, 2018), pp. 307–315.
43
N. J. Smith, N. Goodman, M. Frank, Learning and using language via recursive pragmatic reasoning about other agents, in Proceedings of Advances in Neural Information Processing Systems (NeurIPS) (IEEE, 2013).
44
R. Carston, Informativeness, relevance and scalar implicature, in Pragmatics And Beyond New Series (John Benjamins Publishing Company, 1998), pp. 179–238.
45
A. Vogel, M. Bodoia, C. Potts, D. Jurafsky, Emergence of Gricean maxims from multi-agent decision theory, in Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) (ACM, 2013).
46
W. Liu, B. Dai, Z. Li, Z. Liu, J. Rehg, L. Song, Towards black-box iterative machine teaching, in Proceedings of International Conference on Machine Learning (ICML) (PMLR, 2018).
47
P. Wang, J. Wang, P. Paranamana, P. Shafto, A mathematical theory of cooperative communication, in Proceedings of Advances in Neural Information Processing Systems (NeurIPS) (IEEE, 2020).
48
H. de Weerd, R. Verbrugge, B. Verheij, Negotiating with other minds: The role of recursive theory of mind in negotiation with incomplete information. Autonom. Agents Multi Agent Syst. 31, 250–287 (2017).
49
T. Peltola, M. M. Çelikok, P. Daee, S. Kaski, Machine teaching of active sequential learners, in Proceedings of Advances in Neural Information Processing Systems (NeurIPS) (IEEE, 2019).
50
A. Lock, The Guided Reinvention of Language (Academic Press, 1980).
51
M. Tomasello, J. Call, Primate Cognition (Oxford Univ. Press, 1997).
52
M. Tomasello, Do apes ape?, in Social Learning in Animals: The Roots of Culture, C. M. Heyes, B. G. Galef Jr., Eds. (Academic Press, 1996), pp. 319–346.
53
Z. Tu, S.-C. Zhu, Image segmentation by data-driven Markov chain Monte Carlo. IEEE Trans. Pattern Anal. Mach. Intell. 24, 657 (2002).
54
N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, E. Teller, Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 1087 (1953).

Information & Authors

Information

Published In

Science Robotics
Volume 7 | Issue 68
July 2022

Submission history

Received: 18 November 2021
Accepted: 21 June 2022


Acknowledgments

The protocol for human study was reviewed and approved by the UCLA North IRB (ID no. 20-001767).
Funding: This work was supported by DARPA XAI N66001-17-2-4029.
Author contributions: L.Y.: devising algorithms, coding, designing and running participant study, analyzing data, and writing. X.G.: building the game interface, devising algorithms, coding, designing participant study, analyzing data, and writing. Z.Z.: devising algorithms, coding, designing participant study, analyzing data, and writing. M.E.: building the game interface, devising algorithms, coding, designing participant study, analyzing data, and writing. Y.N.W.: devising algorithms, writing, and providing the environment and the funding support for conducting this research. F.R.: designing and running participant study and writing. H.L.: designing participant study, examining data processing, and writing. Y.Z.: designing participant study, examining data processing, and writing. S.-C.Z.: setting research direction and providing the environment and the funding support for conducting this research.
Competing interests: S.-C.Z. has affiliations with Beijing Institute for General Artificial Intelligence (BIGAI), Peking University, and Tsinghua University; Y.Z. is currently an employee of Peking University; M.E. is currently an employee of Cruise Automation; Z.Z. is currently an employee of BIGAI; X.G. has affiliations with Amazon Inc. The other authors declare that they have no competing interests. The research presented in this article was funded primarily by the DARPA XAI project and conducted primarily at UCLA and UCSD; later analysis and writing were carried out in part while Y.Z., Z.Z., and S.-C.Z. were with Peking University and BIGAI.
Data and materials availability: All data and software needed to evaluate the conclusions of this paper are available in the paper or the Supplementary Materials. The code and data for this work have been deposited in the Dryad database (https://doi.org/10.5068/D1XT3V).

Authors

Affiliations

Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing - original draft, and Writing - review & editing.
Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing - original draft, and Writing - review & editing.
Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Beijing Institute for General Artificial Intelligence (BIGAI), Beijing 100080, China.
Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing - original draft, and Writing - review & editing.
Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Roles: Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing - original draft, and Writing - review & editing.
Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Roles: Conceptualization, Methodology, and Writing - review & editing.
Department of Cognitive Science, University of California, San Diego, San Diego, CA 92093, USA.
Roles: Conceptualization, Investigation, Methodology, Project administration, Resources, Supervision, and Writing - review & editing.
Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Department of Psychology, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Roles: Methodology, Resources, Supervision, and Writing - review & editing.
Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Beijing Institute for General Artificial Intelligence (BIGAI), Beijing 100080, China.
Institute for Artificial Intelligence, Peking University, Beijing 100871, China.
Roles: Conceptualization, Investigation, Methodology, Project administration, Supervision, Validation, and Writing - original draft.
Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Beijing Institute for General Artificial Intelligence (BIGAI), Beijing 100080, China.
Institute for Artificial Intelligence, Peking University, Beijing 100871, China.
Roles: Conceptualization, Funding acquisition, Methodology, Project administration, and Supervision.

Funding Information

Defense Advanced Research Projects Agency: DARPA XAI N66001-17-2-4029

Notes

*
Corresponding author. Email: [email protected] (L.Y.); [email protected] (M.E.); [email protected] (H.L.); [email protected] (Y.Z.); [email protected] (S.-C.Z.)
These authors contributed equally to this work.
