Aligning large language models directly with real-time online human behavior, avoiding the limitations of predefined preference signals and human annotations.