PALO: A Polyglot Large Multimodal Model for 5B People
Core Concepts
PALO is a fully open-source multilingual large multimodal model (LMM) that provides visual reasoning in ten major languages, which together cover about 5B people, or roughly 65% of the world's population.
"Propelled by advancements in generative AI, Large Multimodal Models (LMMs) have emerged as a pivotal advancement in the field, seamlessly bridging the gap between vision and language tasks."
"Our work addresses this disparity by developing the first fully open-source multilingual LMM called PALO, which encompasses ten major languages covering 65% of the global population."
"PALO offers visual reasoning capabilities in 10 major languages that span a total of ∼5B people (65% of the world population)."
"The resulting polyglot LMMs demonstrate performance gains on diverse language tasks with substantial improvements in understanding and generating content for low-resource languages."
"We introduce PALO, a polyglot LLM for 5B people, covering almost two-thirds of the world’s population."