A new approach for estimating the 6D pose of novel objects from a textual prompt alone, without requiring object models or video sequences.