Core Concepts
A CNN-based system is developed to accurately detect the positional shifts and rotations of all figures on a semi-automated foosball table, providing the required game state data for future reinforcement learning and imitation learning experiments.
Abstract
The paper presents a CNN-based game state detection system for a semi-automated foosball table, where the black team is controlled by motors and the white team is controlled by human players.
The key highlights are:
Creation and verification of a ground truth dataset for training the CNN-based detection models. The dataset includes the shifts and rotations of both the black and white figures, measured using a combination of motor data, traditional computer vision, and accelerometers.
Development of an end-to-end regression model that detects the positional shifts and rotations of all figures on the table directly, without the intermediate object detection step used in previous work.
Evaluation of different CNN backbones (ResNet, MobileNet, EfficientNet) as feature extractors for the regression model. The ResNet18-based model achieved the best performance, with mean absolute errors of 3.88 mm for position and 5.93 degrees for rotation.
Proposal of a data provisioning system based on ZeroMQ to enable real-time access to the game state data by multiple clients, such as reinforcement learning or imitation learning agents.
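The ZeroMQ-based data provisioning idea in the last highlight can be illustrated with a publish/subscribe pattern, in which one detector process publishes game states and any number of learning agents subscribe. The following is a minimal sketch, not the authors' implementation: the `GameState` fields, the JSON encoding, the port `5556`, and the `state` topic are all assumptions made for illustration. It uses the `pyzmq` package, imported lazily so that the encoding helpers run without it.

```python
import json
from dataclasses import dataclass, asdict
from typing import Iterable, Iterator, List, Tuple


@dataclass
class GameState:
    """One game state sample (field names are illustrative assumptions)."""
    timestamp: float                 # capture time in seconds
    rod_shifts_mm: List[float]       # positional shift of each rod
    rod_rotations_deg: List[float]   # rotation of each rod
    ball_xy_mm: Tuple[float, float]  # ball position on the table


def encode_state(state: GameState) -> bytes:
    """Serialize a game state to UTF-8 JSON bytes."""
    return json.dumps(asdict(state)).encode("utf-8")


def decode_state(payload: bytes) -> GameState:
    """Inverse of encode_state; JSON turns the tuple into a list, so restore it."""
    data = json.loads(payload.decode("utf-8"))
    data["ball_xy_mm"] = tuple(data["ball_xy_mm"])
    return GameState(**data)


def publish_states(states: Iterable[GameState],
                   endpoint: str = "tcp://*:5556",
                   topic: bytes = b"state") -> None:
    """Publish each state as a two-frame message: [topic, payload]."""
    import zmq  # pyzmq, imported lazily (assumed to be installed)
    sock = zmq.Context.instance().socket(zmq.PUB)
    sock.bind(endpoint)
    try:
        for state in states:
            sock.send_multipart([topic, encode_state(state)])
    finally:
        sock.close()


def subscribe_states(endpoint: str = "tcp://localhost:5556",
                     topic: bytes = b"state") -> Iterator[GameState]:
    """Yield game states as they arrive; several clients can run this at once."""
    import zmq  # pyzmq, imported lazily (assumed to be installed)
    sock = zmq.Context.instance().socket(zmq.SUB)
    sock.connect(endpoint)
    sock.setsockopt(zmq.SUBSCRIBE, topic)
    try:
        while True:
            _topic, payload = sock.recv_multipart()
            yield decode_state(payload)
    finally:
        sock.close()
```

With publish/subscribe sockets, a subscriber only receives messages sent after it has connected, which suits a live stream of game states where stale samples are of little use.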
The authors discuss the limitations of the current system, such as the dependence on lighting conditions and image blur, and outline future research directions to improve the real-time capabilities and robustness of the system. The ultimate goal is to employ the game state detection system for capturing human-played foosball matches, which can then be used to train reinforcement learning agents through imitation learning.
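For reference, the mean absolute errors reported for the ResNet18-based model (3.88 mm for position, 5.93 degrees for rotation) are averages of per-sample absolute differences between prediction and ground truth. The sketch below shows that computation. The wrap-around handling for angles is an assumption; the summary does not say whether the authors treat, for example, 350 and 10 degrees as 20 degrees apart.

```python
from typing import Callable, Sequence


def absolute_error(pred: float, true: float) -> float:
    """Plain absolute difference, suitable for shifts in millimetres."""
    return abs(pred - true)


def angular_error_deg(pred: float, true: float) -> float:
    """Smallest absolute difference between two angles, in [0, 180] degrees.

    Assumption: angles wrap around at 360 degrees.
    """
    diff = (pred - true) % 360.0
    return min(diff, 360.0 - diff)


def mean_absolute_error(
    preds: Sequence[float],
    targets: Sequence[float],
    error_fn: Callable[[float, float], float] = absolute_error,
) -> float:
    """Average of error_fn over paired predictions and targets."""
    if len(preds) != len(targets) or len(preds) == 0:
        raise ValueError("preds and targets must be non-empty and equally long")
    return sum(error_fn(p, t) for p, t in zip(preds, targets)) / len(preds)
```

Passing `angular_error_deg` as `error_fn` gives the rotation error, while the default gives the position error.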
Stats
The motors controlling the black figures report their own shift and rotation, and these values are used as ground truth.
The positional shifts of the white figures are calculated using traditional computer vision techniques.
The rotations of the white figures are measured using accelerometers mounted on the rods.
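One common way to obtain a rotation angle from a rod-mounted accelerometer is to read the direction of gravity in the two sensor axes perpendicular to the rod. The sketch below illustrates that idea only. The axis naming (`a_y`, `a_z`), the zero-angle convention, and the assumption that the rod is nearly at rest so that gravity dominates the reading are all hypothetical; the summary does not describe how the authors convert accelerometer data into angles.

```python
import math


def rod_rotation_deg(a_y: float, a_z: float) -> float:
    """Estimate rotation about the rod axis from gravity, in [-180, 180] degrees.

    a_y, a_z: accelerations along the two sensor axes perpendicular to the rod
    (any consistent unit). Assumes the sensor measures mostly gravity, i.e. the
    rod is not being accelerated strongly at the moment of the reading.
    """
    if a_y == 0.0 and a_z == 0.0:
        raise ValueError("no gravity component measured on either axis")
    return math.degrees(math.atan2(a_y, a_z))
```

During fast shots the rod's own acceleration adds to the reading, so an estimate of this kind would be most trustworthy for slow or stationary rods.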
Quotes
"The game state can be defined as the positional shift and the rotations of the figure rods plus the position of the ball as a function of time."
"Our system developed in this work is able to detect the game state of all figures (black and white) of the Foosball table using Deep CNN and Computer Vision."
"By providing data for both black and white teams, the presented system is intended to provide the required data for future developments of Imitation Learning techniques w.r.t. to observing human players."