Humans Predict Action using Grammar-like Structures

Sci Rep. 2020 Mar 4;10(1):3999. doi: 10.1038/s41598-020-60923-5.

Abstract

Efficient action prediction is of central importance for fluent cooperation between humans and equally so for human-robot interaction. To achieve prediction, actions can be algorithmically encoded as a series of events, where every event corresponds to a change in a (static or dynamic) relation between some of the objects in the scene. These structures are similar to a context-free grammar and, importantly, within this framework the actual objects are irrelevant for prediction; only their relational changes matter. Manipulation actions, among others, can be uniquely encoded this way. Using a virtual-reality setup and testing several different manipulation actions, here we show that humans predict actions in an event-based manner, following the sequence of relational changes. Testing this with chained actions, we measure the percentage of predictive temporal gain for humans and compare it to that of action chains performed by robots, showing that the gains are approximately equal. Event-based and, thus, object-independent action recognition and prediction may be important for cognitively deducing properties of unknown objects seen in action, helping to address the bootstrapping of object knowledge, especially in infants.
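To make the encoding concrete, the following is a minimal Python sketch (not the authors' implementation) of event-based action representation and prediction. Actions are assumed to be observed as snapshots of pairwise object relations; each event is a change of one such relation, object names are replaced by role placeholders (since only relational changes matter for prediction), and an observed event prefix is matched against a hypothetical library of known action chains. All relation names, actions, and chains below are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# An event is the change of one pairwise relation, e.g. ("hand", "cup",
# "touching") means the hand-cup relation just became "touching".
Event = Tuple[str, str, str]
# A frame is a snapshot of all current pairwise relations in the scene.
Frame = Dict[Tuple[str, str], str]


def relational_events(frames: List[Frame]) -> List[Event]:
    """List the events (relational changes) between consecutive frames."""
    events: List[Event] = []
    for prev, curr in zip(frames, frames[1:]):
        for (a, b), relation in curr.items():
            if prev.get((a, b)) != relation:  # relation appeared or changed
                events.append((a, b, relation))
    return events


def abstract_roles(events: List[Event]) -> List[Event]:
    """Replace concrete object names with role placeholders (O1, O2, ...);
    within this framework the actual objects are irrelevant for prediction,
    only their relational changes matter."""
    roles: Dict[str, str] = {}
    abstracted: List[Event] = []
    for a, b, relation in events:
        for obj in (a, b):
            roles.setdefault(obj, f"O{len(roles) + 1}")
        abstracted.append((roles[a], roles[b], relation))
    return abstracted


def predict(observed: List[Event], library: Dict[str, List[Event]]) -> List[str]:
    """Return all library actions whose event chain starts with the observed,
    role-abstracted prefix; once only one candidate remains, the action is
    predicted before it is complete."""
    prefix = abstract_roles(observed)
    return [name for name, chain in library.items()
            if chain[:len(prefix)] == prefix]


# Hypothetical, role-abstracted event chains for two actions.
library = {
    "pick-and-place": [("O1", "O2", "touching"),
                       ("O2", "O3", "apart"),
                       ("O2", "O4", "touching")],
    "pushing":        [("O1", "O2", "touching"),
                       ("O2", "O3", "moving-together")],
}

# Two observed relational changes: the hand touches the cup, then the cup
# leaves the table. This prefix already singles out one action.
observed = [("hand", "cup", "touching"), ("cup", "table", "apart")]
print(predict(observed, library))  # -> ['pick-and-place']
```

Because the event chains of different actions diverge after only a few events, the matching action can often be identified well before the observed movement is complete, which is the source of the predictive temporal gain measured in the study.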

Publication types

  • Clinical Trial
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Female
  • Humans
  • Knowledge
  • Linguistics*
  • Male
  • Recognition, Psychology / physiology*
  • Virtual Reality*
  • Visual Perception / physiology*