“Help me” Baseline:
“I think if the robot was clearer or I saw it assemble the desk before, I would know more about what it was asking me.”
“Did not really feel like ‘working together as a team’ – For more complex furniture it would be more efficient for robot to say what action the human should do?”
“The difficulty is not so much working together but the robots not being able to communicate the actual problem they have. Also it is unclear which ones of the robots has the problem.”

G³ Inverse Semantics with S²:
“More fun than working alone.”
“I was focused on my own task but could hear the robot when it needed help.... However, I like the idea of having a robot help you multitask.”
“There was a sense of being much more productive than I would have been on my own.”

Figure 8: Comments from participants in our study.
users. Our results suggest that improving the nonverbal communication that happens during handoff would significantly improve the overall effectiveness of our system. Second, a significant limitation of the overall system was the frequent intervention by the experimenters to deal with unexpected failures. Both of these limitations might be addressed by a more nuanced model of the help that a human teammate could provide. For example, if the robots could predict that handoffs are challenging for people to successfully complete, they might ask for a different action, such as placing the part on the ground near the robot. Similarly, if the robots were able to model the ability of different people to provide targeted help, they might direct some requests to untrained users and other requests to “level 2” tech support. The different types of interventions provided by the experimenters compared to the subjects point to a need for the robots to model the specific types of help that different people can provide, as in Rosenthal et al. [2011].
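One way such a helper model could be realized is sketched below in Python. This is an illustrative assumption, not part of our system: the helper names, action types, and success probabilities are all hypothetical. The idea is to score each candidate helper by an estimated probability of successfully completing the requested action, and to ask for a different, easier action when no helper is likely enough to succeed:

```python
# Hypothetical sketch: route a help request to the person most likely
# to complete it, given per-person success estimates per action type.
# All names and probabilities below are illustrative assumptions.
HELPER_MODEL = {
    "untrained user": {
        "hand part to robot": 0.55,
        "place part near robot": 0.90,
    },
    "level 2 tech support": {
        "hand part to robot": 0.85,
        "place part near robot": 0.95,
    },
}

def route_request(action, helpers=HELPER_MODEL, threshold=0.6):
    """Pick the helper with the highest estimated success probability.

    Returns (helper, probability), or None when no helper clears the
    threshold, signaling that the robot should request a different
    action (e.g., placing the part on the ground) instead.
    """
    name, model = max(helpers.items(),
                      key=lambda kv: kv[1].get(action, 0.0))
    p = model.get(action, 0.0)
    return (name, p) if p >= threshold else None
```

Under this sketch, a risky handoff would be directed to “level 2” tech support, while an easier placement request could go to anyone; a request no one is likely to satisfy would be reformulated before being spoken.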
5.3 Conclusion
The goal of our evaluation was to assess the effectiveness of various approaches for generating requests for help. The corpus-based evaluation compares the inverse semantics method to several baselines in an online evaluation, demonstrating that the inverse semantics algorithm significantly improves the accuracy of a human’s response to a natural language request for help compared to baselines. Our end-to-end evaluation demonstrates that this improvement can be realized in the context of a real-world robotic team interacting with minimally trained human users. This work represents a step toward the goal of mixed-initiative human-robot cooperative assembly.
Our end-to-end evaluation highlights the strengths of the system, but also its weaknesses. First, the robots used a single model of a person’s ability to act in the environment; in reality, different people have different abilities and willingness to help the robot. Second, because the robots spoke to people when requesting help, some subjects responded by asking clarifying questions. Developing a dialog system capable of answering questions from people in real time could provide disambiguation when people fail to understand the robot’s request. As we move from robot-initiated to mixed-initiative communication, the reliance on common ground and context increases significantly. Since our models can be expected to remain imperfect, the demand for unambiguous sentences becomes less satisfiable. In the long term, we aim to develop robots with increased task-robustness in a variety of domains by leveraging the ability and willingness of human partners to assist robots in recovering from a wide variety of failures.
6. ACKNOWLEDGMENTS
This work was supported in part by the Boeing Company, and in part by the U.S. Army Research Laboratory under the Robotics Collaborative Technology Alliance.
The authors thank Dishaan Ahuja and Andrew Spielberg for their assistance in conducting the experiments.
References
M. Bollini, S. Tellex, T. Thompson, N. Roy, and D. Rus. Interpreting and executing recipes with a cooking robot. In 13th International Symposium on Experimental Robotics, 2012.
D. L. Chen and R. J. Mooney. Learning to interpret natural language navigation instructions from observations. In Proc. AAAI, 2011.
G. Dorais, R. Bonasso, D. Kortenkamp, B. Pell, and D. Schreckenghost. Adjustable autonomy for human-centered autonomous systems on Mars, 1998.
A. Dragan and S. Srinivasa. Generating legible motion. In Robotics: Science and Systems, June 2013.
J. Dzifcak, M. Scheutz, C. Baral, and P. Schermerhorn. What to do and how to do it: Translating natural language directives into temporal and dynamic logic representation for goal management and action execution. In Proc. IEEE Int’l Conf. on Robotics and Automation, pages 4163–4168, 2009.
T. Fong, C. Thorpe, and C. Baur. Robot, asker of questions. Journal of Robotics and Autonomous Systems, 42:235–243, 2003.
K. Garoufi and A. Koller. Combining symbolic and corpus-based approaches for the generation of successful referring expressions. In Proceedings of the 13th European Workshop on Natural Language Generation, pages 121–131. Association for Computational Linguistics, 2011.
D. Golland, P. Liang, and D. Klein. A game-theoretic approach to generating spatial descriptions. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 410–419. Association for Computational Linguistics, 2010.
N. D. Goodman and A. Stuhlmüller. Knowledge and implicature: Modeling language understanding as social cognition. Topics in Cognitive Science, 5(1):173–184, 2013.
D. Jurafsky and J. H. Martin. Speech and Language Processing. Pearson Prentice Hall, 2nd edition, May 2008. ISBN 0131873210.
R. A. Knepper, T. Layton, J. Romanishin, and D. Rus. IkeaBot: An autonomous multi-robot coordinated furniture assembly system. In Proc. IEEE Int’l Conf. on Robotics and Automation, Karlsruhe, Germany, May 2013.
T. Kollar, S. Tellex, D. Roy, and N. Roy. Toward understanding natural language directions. In Proc. ACM/IEEE Int’l Conf. on Human-Robot Interaction, pages 259–266, 2010.
M. MacMahon, B. Stankiewicz, and B. Kuipers. Walk the talk: Connecting language, knowledge, and action in route instructions. In Proc. Nat’l Conf. on Artificial Intelligence (AAAI), pages 1475–1482, 2006.
J. Maitin-Shepard, J. Lei, M. Cusumano-Towner, and P. Abbeel. Cloth grasp point detection based on multiple-view geometric cues with application to robotic towel folding. In Proc. IEEE Int’l Conf. on Robotics and Automation, Anchorage, Alaska, USA, May 2010.
C. Matuszek, N. FitzGerald, L. Zettlemoyer, L. Bo, and D. Fox. A joint model of language and perception for grounded attribute learning. arXiv preprint arXiv:1206.6423, 2012.
E. Reiter and R. Dale. Building Natural Language Generation Systems. Cambridge University Press, Jan. 2000. ISBN 9780521620369.
S. Rosenthal, M. Veloso, and A. K. Dey. Learning accuracy and availability of humans who help mobile robots. In Proc. AAAI, 2011.
D. Roy. A trainable visually-grounded spoken language generation system. In Proceedings of the International Conference of Spoken Language Processing, 2002.
R. Simmons, S. Singh, F. Heger, L. M. Hiatt, S. C. Koterba, N. Melchior, and B. P. Sellner. Human-robot teams for large-scale assembly. In Proceedings of the NASA Science Technology Conference, May 2007.
K. Striegnitz, A. Denis, A. Gargett, K. Garoufi, A. Koller, and M. Theune. Report on the second second challenge on generating instructions in virtual environments (GIVE-2.5). In Proceedings of the 13th European Workshop on Natural Language Generation, pages 270–279. Association for Computational Linguistics, 2011.
S. Tellex, T. Kollar, S. Dickerson, M. Walter, A. Banerjee, S. Teller, and N. Roy. Understanding natural language commands for robotic navigation and mobile manipulation. In Proc. AAAI, 2011.
A. Vogel, M. Bodoia, C. Potts, and D. Jurafsky. Emergence of Gricean maxims from multi-agent decision theory. In Proceedings of NAACL 2013, 2013a.
A. Vogel, C. Potts, and D. Jurafsky. Implicatures and nested beliefs in approximate Decentralized-POMDPs. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, August 2013b. Association for Computational Linguistics.
R. Wilson. Minimizing user queries in interactive assembly planning. IEEE Transactions on Robotics and Automation, 11(2), April 1995.