Key concepts
Large language models (LLMs) can effectively leverage contextual information and physiological data from wearable sensors to deliver accurate predictions across diverse consumer health tasks.
Summary
The paper presents a comprehensive evaluation of eight state-of-the-art LLMs (Med-Alpaca, PMC-Llama, Asclepius, ClinicalCamel, Flan-T5, Palmyra-Med, GPT-3.5, and GPT-4) on thirteen health prediction tasks drawn from six public health datasets. The tasks cover mental health, activity tracking, metabolism, sleep, and cardiology.
The experiments proceed in four steps: (i) zero-shot prompting, (ii) few-shot prompting with chain-of-thought (CoT) and self-consistency (SC) techniques, (iii) instructional fine-tuning, and (iv) ablation studies with context enhancement strategies.
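Self-consistency in step (ii) is commonly realized by sampling several reasoning paths for the same prompt and keeping the most frequent final answer. The sketch below illustrates only that voting step; it assumes the final answers have already been extracted as strings, and the function name `self_consistent_answer` is hypothetical rather than taken from the paper.

```python
from collections import Counter


def self_consistent_answer(samples):
    """Return the most frequent answer among several sampled model outputs.

    `samples` is assumed to hold one extracted final answer per sampled
    reasoning path; how answers are sampled and extracted is out of scope.
    """
    # Normalize whitespace and case so that "3" and " 3 " count as one answer.
    cleaned = [s.strip().lower() for s in samples if s and s.strip()]
    if not cleaned:
        raise ValueError("no non-empty samples to vote on")
    # most_common(1) returns the answer with the highest vote count.
    return Counter(cleaned).most_common(1)[0][0]
```

For example, `self_consistent_answer(["3", "4", "3 ", "3"])` keeps the answer that three of the four sampled paths agree on.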
The key findings are:
- Zero-shot prompting shows comparable results to task-specific baseline models, indicating LLMs' inherent capabilities for health prediction.
- Few-shot prompting with larger LLMs like GPT-3.5 and GPT-4 can effectively ground numerical time-series data, leading to significant improvements over zero-shot and fine-tuned models.
- The fine-tuned Health-Alpaca model exhibits the best performance in 5 out of 13 tasks, despite being substantially smaller than GPT-3.5 and GPT-4.
- The context enhancement strategy, which adds user profile, health knowledge, and temporal information to the prompts, can yield up to 23.8% performance improvement, highlighting the importance of contextual information for LLMs in the healthcare domain.
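The context enhancement idea from the last finding can be pictured as assembling a prompt from optional context blocks placed ahead of the sensor data and the question. The following is a minimal sketch under stated assumptions: the function `build_prompt`, its parameter names, and the ordering of the blocks are all hypothetical and are not the paper's actual template.

```python
def build_prompt(question, sensor_text, user_profile=None,
                 health_knowledge=None, temporal_context=None):
    """Assemble a prompt from optional context blocks plus the core query.

    Each context argument is a plain string; a block left as None is simply
    omitted, which makes it easy to ablate one block at a time.
    """
    parts = []
    # Optional context blocks (assumed order: knowledge, profile, temporal).
    for block in (health_knowledge, user_profile, temporal_context):
        if block:
            parts.append(block.strip())
    # The sensor readings and the question are always included.
    parts.append(sensor_text.strip())
    parts.append(question.strip())
    return "\n".join(parts)
```

Calling it with and without a given block yields the paired prompts that an ablation study would compare.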
Statistics
Steps: {812.0} steps, Burned Calories: {97.0} calories, Resting Heart Rate: {66.54} beats/min, Sleep Minutes: {487.0} minutes
Steps: {"NaN, 991.0, ..., NaN"} steps, Burned Calories: {"NaN, 94.0, ..., NaN"} calories, Resting Heart Rate: {"69.32, 67.72, ..., 64.55"} beats/min, Sleep Minutes: {"534.0, 455.0, ..., 405.0"} minutes
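The two lines above show sensor readings rendered as text, with missing values written as NaN. A rough sketch of how such a line could be produced from raw readings follows; the helper names `format_series` and `render_features` are hypothetical, and the output format only approximates the examples above rather than reproducing the paper's exact template.

```python
import math


def format_series(values):
    """Render numeric readings as a comma-separated string, keeping gaps as NaN."""
    return ", ".join(
        "NaN" if math.isnan(v) else str(float(v)) for v in values
    )


def render_features(features):
    """Render (name, values, unit) triples as one comma-separated prompt line."""
    return ", ".join(
        f"{name}: {format_series(values)} {unit}"
        for name, values, unit in features
    )
```

A single-day reading is just a series of length one, so the same helpers cover both the single-value and the multi-day form.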
Quotes
"Readiness score is an indicator of how prepared our body is for physical activity. It is decided by activity, recent sleep and heart rate variability."
"The user is 23-year-old male with 182 cm."