- A robot developed through the RIKEN Guardian Robot Project*. It is equipped with recognition and dialogue functions created within the project to enable robots to autonomously provide unobtrusive support.
- The two aiai robots converse with each other in the Osaka dialect.
- The stone pedestal each aiai sits on is equipped with a monitor that displays real-time English translations of their conversation.
- Degrees of freedom: 8 in total, consisting of 2 in the head, 2 in each arm, 1 in the waist, and 1 for mouth opening/closing
- Sensors: One depth camera, two LiDAR sensors, and two 16-channel microphone arrays mounted around the robot's body
- Appearance: Realistic monkey-like form, including a monkey’s face and full-body fur
* Guardian Robot Project
Launched in April 2019, this initiative promotes the development and social implementation of next-generation robots that combine neuroscience with AI technologies, with the aim of creating a future society where humans, AI, and robots can coexist seamlessly. The project integrates strengths from psychology, neuroscience, cognitive science, and AI research. Whereas conventional robots typically perform only limited tasks based on detailed human instructions, this project envisions robots capable of autonomously perceiving their environment and the condition of the humans they assist, engaging appropriately without undermining human autonomy, and delivering unobtrusive support.
(1) Recognition of human position and speech content
The two LiDAR sensors measure the positions of nearby people. Using this positional data together with the audio captured by the two microphone arrays, the system isolates human speech from the surrounding noise, and a recognition system based on a large language model allows the robot to understand what is being said.
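As a rough sketch of how the LiDAR-derived speaker position could steer the microphone arrays, the Python snippet below implements a basic delay-and-sum beamformer. The array geometry, sample rate, and function names are illustrative assumptions, not details published by the project.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
SAMPLE_RATE = 16_000    # Hz (assumed)

def delay_and_sum(frames: np.ndarray, mic_positions: np.ndarray,
                  speaker_position: np.ndarray) -> np.ndarray:
    """Steer an array toward a speaker position (e.g. measured by LiDAR).

    frames: (n_mics, n_samples) time-domain signals, one row per microphone.
    mic_positions: (n_mics, 3) and speaker_position: (3,), in metres.
    """
    # Relative arrival delay at each microphone, converted to samples.
    distances = np.linalg.norm(mic_positions - speaker_position, axis=1)
    delays = (distances - distances.min()) / SPEED_OF_SOUND * SAMPLE_RATE

    # Shift each channel back by its delay (wrap-around ignored for brevity),
    # then average so the speaker's signal adds coherently and noise does not.
    aligned = np.stack([np.roll(ch, -int(round(d)))
                        for ch, d in zip(frames, delays)])
    return aligned.mean(axis=0)

# Hypothetical usage: a 16-microphone ring of radius 10 cm, speaker at 1.5 m.
angles = np.linspace(0, 2 * np.pi, 16, endpoint=False)
mics = np.stack([0.1 * np.cos(angles), 0.1 * np.sin(angles), np.zeros(16)],
                axis=1)
enhanced = delay_and_sum(np.random.randn(16, SAMPLE_RATE), mics,
                         np.array([1.5, 0.0, 1.0]))
```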
(2) Recognition of environmental context
With its depth camera and a recognition system trained on large-scale data, aiai can perceive contextual details about nearby people, such as the color of the clothes they are wearing or what they are carrying.
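A minimal sketch of one such contextual detail, naming the dominant clothing color inside a detected person region; the color palette, function name, and mask source are assumptions made for illustration.

```python
import numpy as np

# Reference palette for coarse color naming (assumed, not from the project).
PALETTE = {
    "red":    (200, 40, 40),
    "green":  (40, 160, 60),
    "blue":   (40, 70, 200),
    "yellow": (220, 200, 50),
    "black":  (20, 20, 20),
    "white":  (235, 235, 235),
}

def dominant_clothing_color(rgb: np.ndarray, person_mask: np.ndarray) -> str:
    """Name the dominant color inside a person mask (e.g. produced by a
    detector running on the depth camera's RGB stream)."""
    pixels = rgb[person_mask].astype(float)  # (n_pixels, 3)
    names = list(PALETTE)
    refs = np.array([PALETTE[n] for n in names], dtype=float)
    # Assign each pixel to its nearest palette color, then take the mode.
    dists = np.linalg.norm(pixels[:, None, :] - refs[None, :, :], axis=2)
    counts = np.bincount(dists.argmin(axis=1), minlength=len(names))
    return names[counts.argmax()]

# Hypothetical usage: a synthetic 4x4 patch of mostly blue pixels.
img = np.full((4, 4, 3), (50, 80, 190), dtype=np.uint8)
mask = np.ones((4, 4), dtype=bool)
print(dominant_clothing_color(img, mask))  # -> "blue"
```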
(3) Dialogue function
The robot recognizes the appearance and speech of visitors while observing its surroundings. Using a large language model, it generates conversation that takes into account events happening at that moment, giving visitors a sense of sharing the space with the robot.
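One plausible way to fold live observations into an LLM prompt is sketched below; the Observation fields, the prompt wording, and the llm_generate stub are hypothetical placeholders, not the project's actual interface.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """What the robot currently perceives (fields are illustrative)."""
    visitor_description: str  # e.g. output of the vision pipeline
    last_utterance: str       # e.g. output of the speech pipeline

def build_prompt(obs: Observation) -> str:
    """Fold live observations into the prompt so the model's reply can
    reference what is actually happening in front of the robot."""
    return (
        "You are aiai, a monkey robot chatting with visitors in Osaka dialect.\n"
        f"You can see: {obs.visitor_description}\n"
        f'The visitor just said: "{obs.last_utterance}"\n'
        "Reply briefly, mentioning something you can currently see."
    )

def llm_generate(prompt: str) -> str:
    """Placeholder for a real large-language-model call."""
    return "(model reply would appear here)"

obs = Observation("a visitor in a blue jacket holding a camera",
                  "Hello! What are you two talking about?")
print(llm_generate(build_prompt(obs)))
```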
(4) Two-robot dialogue function
Rather than directly addressing the visitors, the two robots engage in a conversation with each other based on their perception of the environment. In doing so, they draw the visitors into the contextual world generated by the robots themselves.
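A toy sketch of such a robot-to-robot exchange, alternating turns between two agents that comment on a shared observation of the scene; the speaker names, prompt format, and llm_generate stub are again assumptions for illustration.

```python
def llm_generate(prompt: str) -> str:
    """Placeholder for a real large-language-model call."""
    return "(model reply would appear here)"

def two_robot_dialogue(observation: str, turns: int = 4) -> list[str]:
    """Alternate turns between two agents that converse with each other
    about a shared observation instead of addressing the visitor."""
    transcript: list[str] = []
    for t in range(turns):
        speaker = ("aiai-1", "aiai-2")[t % 2]
        # Each turn sees the shared observation plus the dialogue so far,
        # so the conversation stays grounded in the scene around the robots.
        prompt = (
            f"You are {speaker}, a monkey robot chatting with the other robot "
            f"in Osaka dialect about this scene: {observation}\n"
            "Conversation so far:\n" + "\n".join(transcript) +
            f"\n{speaker}:"
        )
        transcript.append(f"{speaker}: {llm_generate(prompt)}")
    return transcript

print("\n".join(two_robot_dialogue("a visitor in a blue jacket walks up")))
```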