ChatGPT, a large language model, can outperform specialized graph neural network (GNN) models on the CLRS algorithmic reasoning benchmark by directly executing classical algorithms in Python.
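To make "directly executing a classical algorithm in Python" concrete, here is a minimal sketch of the kind of program meant. Insertion sort is chosen here only for illustration; the function name and the sample input are assumptions, not taken from the summarized work.

```python
def insertion_sort(values):
    """Return a sorted copy of `values` using textbook insertion sort."""
    result = list(values)  # copy so the caller's list is left untouched
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift every element larger than `key` one slot to the right.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


# Illustrative input (an assumption, not benchmark data).
print(insertion_sort([5, 2, 4, 6, 1, 3]))  # prints [1, 2, 3, 4, 5, 6]
```

Running code like this gives an exact answer by construction, whereas a learned model has to approximate each step of the algorithm.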
Decomposing a language model's reasoning into two steps can significantly improve its performance on various algorithmic reasoning tasks: THINK discovers the task-level logic and expresses it in pseudocode, and EXECUTE tailors that pseudocode to each instance and simulates its execution.
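The two-step decomposition can be sketched as a small pipeline. This is a rough illustration under stated assumptions, not the authors' implementation: `LLM` stands for any hypothetical prompt-in, text-out callable, and the names `think`, `execute`, and `solve_all`, as well as the prompt wording, are invented for this example.

```python
from typing import Callable, List

# Hypothetical interface: a function that maps a prompt to a completion.
LLM = Callable[[str], str]


def think(llm: LLM, task_description: str) -> str:
    """THINK: ask for pseudocode that captures the task-level logic."""
    prompt = (
        "Write pseudocode that solves every instance of this task.\n"
        f"Task: {task_description}"
    )
    return llm(prompt)


def execute(llm: LLM, pseudocode: str, instance: str) -> str:
    """EXECUTE: tailor the pseudocode to one input and simulate running it."""
    prompt = (
        "Adapt the pseudocode to the input, simulate it step by step, "
        "and report the final output.\n"
        f"Pseudocode:\n{pseudocode}\n"
        f"Input: {instance}"
    )
    return llm(prompt)


def solve_all(llm: LLM, task_description: str, instances: List[str]) -> List[str]:
    """Run THINK once for the task, then EXECUTE once per instance."""
    pseudocode = think(llm, task_description)
    return [execute(llm, pseudocode, instance) for instance in instances]
```

In this sketch the pseudocode is generated once and reused across instances, so only the EXECUTE prompt changes from one input to the next.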