Core Concept
A framework and method for automatic recalibration of quantum devices using reinforcement learning techniques, which can adapt the device configuration to changing experimental conditions.
Abstract
The authors present a framework and method for the automatic recalibration of quantum devices using reinforcement learning (RL) techniques. The key aspects are:
Device Configuration: The device is controlled by M continuous parameters θ = {θ_1, ..., θ_M} that define its configuration.
Score Function: The quality of a device configuration θ is evaluated by a score function S_E(θ), which depends on the current experimental conditions E. However, a detailed model of the environment E is often computationally unaffordable or costly to obtain.
Effective Score Function: The authors introduce an effective score function S̃_E(θ), an approximate model of the true score function S_E(θ). This effective model is used to initialize the RL agent's value function estimates.
Reinforcement Learning: The RL agent interacts with the device by selecting parameter values θ, observing measurement outcomes, and receiving rewards based on the accuracy of its final actions. It uses this feedback to fine-tune the device configuration.
De-calibration Witness: The agent monitors a de-calibration witness W_d, an experimentally accessible quantity that detects when the device has drifted out of calibration and triggers a recalibration.
Automatic Recalibration: The method combines the effective score function, RL, and the de-calibration witness to automatically recalibrate the device when the experimental conditions change, without requiring full knowledge of the environment model.
The authors showcase the proposed method using a numerical example of a Kennedy receiver-based long-distance quantum communication protocol, demonstrating the ability to adapt the device configuration to changing environmental conditions.
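The interplay of the effective score function, the RL agent, and the de-calibration witness can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a discretised set of candidate parameter values, an epsilon-greedy agent with a constant learning rate, and a witness defined as the shortfall of the recent mean reward relative to the best score predicted by the effective model. The names `init_values`, `select`, `witness_triggered`, `run`, `device`, and `recalibrate` are hypothetical.

```python
import random
from collections import deque


def init_values(thetas, effective_score):
    """Warm-start one value estimate per candidate theta from the effective model."""
    return {theta: effective_score(theta) for theta in thetas}


def select(q, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else take the best estimate."""
    if rng.random() < epsilon:
        return rng.choice(list(q))
    return max(q, key=q.get)


def witness_triggered(recent, expected, tolerance):
    """Hypothetical witness: the recent mean reward falls short of the expected score."""
    return expected - sum(recent) / len(recent) > tolerance


def run(device, thetas, effective_score, recalibrate, steps,
        epsilon=0.1, lr=0.05, window=50, tolerance=0.1, seed=0):
    """Run the loop; return (greedy theta at the end, number of recalibrations)."""
    rng = random.Random(seed)
    q = init_values(thetas, effective_score)
    expected = max(q.values())
    recent = deque(maxlen=window)
    recalibrations = 0
    for step in range(steps):
        theta = select(q, epsilon, rng)
        reward = device(theta, step)          # e.g. 1 for a correct final action, else 0
        q[theta] += lr * (reward - q[theta])  # constant-step value update
        recent.append(reward)
        if len(recent) == window and witness_triggered(recent, expected, tolerance):
            # Assumed recalibration step: rebuild the effective model (for example
            # after re-estimating the conditions) and restart the estimates from it.
            q = init_values(thetas, recalibrate())
            expected = max(q.values())
            recent.clear()
            recalibrations += 1
    return max(q, key=q.get), recalibrations
```

Here `device(theta, step)` stands in for one run of the experiment and `recalibrate()` for whatever procedure produces an updated effective model. A constant learning rate is chosen so that the estimates keep tracking slow drifts between recalibrations.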
Statistics
The success probability of the communication protocol depends on the displacement value θ and the guessing rule k̂(θ, n).
The intensity |α|^2 of the transmitted signals is initially estimated using N_eff repetitions of the experiment.
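The two quantities above can be made concrete with a small sketch of a textbook Kennedy-type receiver. It rests on assumptions not spelled out in this summary: binary signals ±α with real α and equal priors, a displacement that shifts the amplitude to ±α + θ, an ideal on/off photodetector, and a maximum-likelihood guessing rule. The intensity estimator is likewise only one simple possibility, based on undisplaced on/off shots. All function names are hypothetical.

```python
import math


def outcome_probability(n, signal, theta):
    """P(n | signal, theta) for an ideal on/off detector; n = 0 means no click."""
    p_no_click = math.exp(-(signal + theta) ** 2)
    return p_no_click if n == 0 else 1.0 - p_no_click


def guess(theta, n, alpha):
    """Maximum-likelihood guessing rule: index k of the more likely signal (-1)^k * alpha."""
    likelihoods = [outcome_probability(n, (-1) ** k * alpha, theta) for k in (0, 1)]
    return 0 if likelihoods[0] >= likelihoods[1] else 1


def success_probability(theta, alpha):
    """Probability that the guess matches the transmitted signal, with equal priors."""
    total = 0.0
    for k in (0, 1):
        signal = (-1) ** k * alpha
        for n in (0, 1):
            if guess(theta, n, alpha) == k:
                total += 0.5 * outcome_probability(n, signal, theta)
    return total


def estimate_intensity(no_click_count, n_eff):
    """Estimate |alpha|^2 from n_eff undisplaced shots via P(no click) = exp(-|alpha|^2)."""
    if not 0 < no_click_count <= n_eff:
        raise ValueError("need at least one no-click outcome and no_click_count <= n_eff")
    return -math.log(no_click_count / n_eff)
```

Under these assumptions, the nulling displacement θ = -α gives a success probability of 1 - exp(-4|α|^2)/2, while θ = 0 gives 1/2, since an undisplaced on/off detector cannot distinguish the two signals.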
Quotes
"While not expected to be fully accurate — does not retrieve the exact device configuration for each specific experimental condition—, it shall be thought as an ansatz for controls initialization."
"The success of our method hinges on the capabilities of devising an approximate model of setting's dependence with respect to changes in its surroundings (which we indistinctly call environment)."