The author introduces Semantic Multi-Object Tracking (SMOT), which integrates the "where" of tracking (object trajectories) with the "what" (the semantics of those trajectories), aiming for more comprehensive video analysis.