Design and evaluation of a global workspace agent embodied in a realistic multimodal environment

Rousslan Fernand Julien Dossa; Kai Arulkumaran; Arthur Juliani; Shuntaro Sasai; Ryota Kanai

doi:10.3389/fncom.2024.1352685

Front Comput Neurosci. 2024 Jun 14:18:1352685. doi: 10.3389/fncom.2024.1352685. eCollection 2024.

Authors

Rousslan Fernand Julien Dossa¹, Kai Arulkumaran¹, Arthur Juliani², Shuntaro Sasai¹, Ryota Kanai¹

Affiliations

¹ Araya Inc., Tokyo, Japan.
² Microsoft Research, New York, NY, United States.

Abstract

As the apparent intelligence of artificial neural networks (ANNs) advances, they are increasingly likened to the functional networks and information processing capabilities of the human brain. Such comparisons have typically focused on particular modalities, such as vision or language. The next frontier is to use the latest advances in ANNs to design and investigate scalable models of higher-level cognitive processes, such as conscious information access, which have historically lacked concrete and specific hypotheses for scientific evaluation. In this work, we propose and then empirically assess an embodied agent with a structure based on global workspace theory (GWT) as specified in the recently proposed "indicator properties" of consciousness. In contrast to prior work on GWT, which used single modalities, our agent is trained to navigate 3D environments based on realistic audiovisual inputs. We find that the global workspace architecture performs better and more robustly than a standard recurrent architecture at smaller working memory sizes. Beyond performance, we conduct a series of analyses on the learned representations of our architecture and share findings that point to task complexity and regularization being essential for feature learning and the development of meaningful attentional patterns within the workspace.

Keywords: artificial neural networks; attention; embodiment; global workspace theory; imitation learning.

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was supported by JST, Moonshot R&D Grant Number JPMJMS2012.