Core Concepts
This paper proposes the "must-finish" idiom to enable the direct specification of liveness requirements in Behavioral Programming (BP), an executable specification paradigm. It presents two execution mechanisms, one based on Generalized Büchi Automata and another based on Markov Decision Processes, to enforce liveness properties during BP program execution. It also shows that the MDP-based approach, combined with deep reinforcement learning, can scale to systems with large state spaces.
Abstract
The paper addresses a limitation of the Behavioral Programming (BP) paradigm: it lacks direct support for expressing and enforcing liveness requirements, properties specifying that "something good will eventually occur." It introduces a new "must-finish" idiom that lets BP users model liveness requirements directly in their specifications.
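To make the idiom concrete, the sketch below mimics it in a few lines of Python. This is not the paper's implementation or any BP library's API; the engine (`run_bp`), the `must_finish` flag, and the b-threads are all hypothetical names. B-threads are generators that declare requested, waited-for, and blocked events, plus a flag marking "hot" states that a live run is obligated to eventually leave.

```python
def run_bp(bthreads, max_steps=10):
    """Minimal BP engine sketch (hypothetical API, not the paper's).
    B-threads yield dicts with 'request', 'wait', 'block' event sets
    and a 'must_finish' flag marking hot states a live run must leave."""
    stmts = {bt: next(bt) for bt in bthreads}
    trace = []
    for _ in range(max_steps):
        requested = set().union(*(s.get("request", set()) for s in stmts.values()))
        blocked = set().union(*(s.get("block", set()) for s in stmts.values()))
        selectable = sorted(requested - blocked)
        if not selectable:
            break
        event = min(selectable)  # deterministic choice, for reproducibility
        trace.append(event)
        # Advance every b-thread that requested or waited for the event.
        for bt, s in list(stmts.items()):
            if event in s.get("request", set()) | s.get("wait", set()):
                try:
                    stmts[bt] = next(bt)
                except StopIteration:
                    del stmts[bt]
    # must-finish obligations still pending when the run stops:
    pending = [s for s in stmts.values() if s.get("must_finish")]
    return trace, pending

def gate_bt():
    yield {"request": {"approach"}, "must_finish": False}
    # Hot state: once a train approaches, 'cross' must eventually occur.
    yield {"wait": {"cross"}, "must_finish": True}

def train_bt():
    while True:
        yield {"request": {"cross"}, "must_finish": False}

trace, pending = run_bp([gate_bt(), train_bt()], max_steps=4)
# trace == ['approach', 'cross', 'cross', 'cross'], pending == []
```

Here the obligation happens to be discharged, but nothing in this naive engine guarantees it; making the event selection strategy itself enforce such guarantees is exactly the role of the execution mechanisms the paper proposes.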
The paper presents two execution mechanisms to enforce liveness properties during BP program execution:
GBA-based approach: The BP program is transformed into a Generalized Büchi Automaton (GBA), and a game-theoretic solution of the resulting automaton guides the event selection mechanism so that liveness is guaranteed.
MDP-based approach: The BP program is formulated as a Markov Decision Process (MDP), and a reward function is designed to capture the desired liveness behavior. The optimal action-value function of the MDP is then used to define a liveness-preserving event selection strategy.
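A simplified, hypothetical sketch of the GBA-based idea: if we assume a single acceptance set and treat event selection as fully controlling every choice (a real treatment must handle multiple acceptance sets per the GBA definition, and possibly adversarial environment moves), then enforcing the Büchi condition reduces to confining runs to states from which an accepting state can be revisited forever, computable as a fixed point.

```python
def reachers(transitions, targets, within):
    """States in `within` that can reach `targets` (>= 0 steps)
    without ever leaving `within`."""
    reach = set(targets) & within
    changed = True
    while changed:
        changed = False
        for s in within - reach:
            if any(t in reach for t in transitions[s].values()):
                reach.add(s)
                changed = True
    return reach

def buchi_live_states(transitions, accepting):
    """One-player Buechi winning region: states from which some
    accepting state can be visited infinitely often."""
    live = set(transitions)
    while True:
        to_acc = reachers(transitions, accepting & live, live)
        # accepting states that can return to accepting in >= 1 step
        recur = {f for f in accepting & live
                 if any(t in to_acc for t in transitions[f].values())}
        new_live = reachers(transitions, recur, live)
        if new_live == live:
            return live
        live = new_live

# Toy program graph: choosing "b" at s0 falls into a trap from which
# the acceptance condition can never be satisfied again.
T = {"s0": {"a": "s1", "b": "trap"},
     "s1": {"done": "s0"},
     "trap": {"loop": "trap"}}
live = buchi_live_states(T, {"s1"})  # {"s0", "s1"}

def safe_events(state):
    """Liveness-preserving selection: allow only events that keep
    the run inside the winning region."""
    return [e for e, nxt in T[state].items() if nxt in live]
```

At `s0`, `safe_events` permits only `"a"`, which is the essence of using the game-theoretic solution to restrict the event selection mechanism.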
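The MDP-based mechanism can likewise be illustrated with tabular Q-value iteration on a toy deterministic model. The model, the reward function, and all names here are invented for illustration; the paper's actual MDP formulation and reward design may differ. The reward pays off when an event discharges a must-finish obligation, and the greedy policy over the resulting action-value function serves as the event selection strategy.

```python
def q_iteration(transitions, reward, gamma=0.9, iters=200):
    """Q-value iteration on a deterministic toy MDP.
    transitions: state -> {event: next_state}."""
    Q = {s: {e: 0.0 for e in acts} for s, acts in transitions.items()}
    for _ in range(iters):
        # Synchronous Bellman optimality update over all (state, event) pairs.
        Q = {s: {e: reward(s, e) + gamma * (max(Q[t].values()) if Q[t] else 0.0)
                 for e, t in acts.items()}
             for s, acts in transitions.items()}
    return Q

# Toy model: 'waiting' is a must-finish (hot) state; 'finish'
# discharges the obligation, while 'stall' postpones it forever.
T = {"idle": {"start": "waiting"},
     "waiting": {"finish": "idle", "stall": "waiting"}}

def R(s, e):
    return 1.0 if e == "finish" else 0.0  # reward for discharging

Q = q_iteration(T, R)

def select(state):
    """Liveness-preserving selection: greedy in the Q-values."""
    return max(Q[state], key=Q[state].get)
```

Discounting makes indefinite stalling strictly worse than finishing, so `select("waiting")` returns `"finish"`. For large state spaces the table is replaced by a learned approximation of Q, which is how the paper applies deep reinforcement learning to scale this mechanism.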
The paper also demonstrates the potential of the MDP-based approach in handling large state spaces by leveraging deep reinforcement learning techniques. It evaluates the scalability of this approach using a parameterized version of the level-crossing benchmark.
The key contributions of the paper are:
Introducing the "must-finish" idiom to enable direct specification of liveness requirements in BP.
Proposing two execution mechanisms, GBA-based and MDP-based, to enforce liveness properties during BP program execution.
Showcasing the scalability of the MDP-based approach using deep reinforcement learning for systems with large state spaces.