Enhancing Multimodal Large Language Models with Unified Structure Learning for Improved OCR-free Document Understanding
Improving OCR-free Document Understanding through Unified Structure Learning