Efficient Optimization of Large Vision-Language Models with the FastV Method
The work identifies inefficient visual attention in large vision-language models (LVLMs) and proposes FastV, a method that ranks visual tokens by the attention they receive and prunes the low-scoring ones, reducing FLOPs without compromising performance.
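
The pruning idea summarized above can be illustrated with a minimal sketch. It is not the FastV implementation: the function name `prune_visual_tokens`, the `keep_ratio` parameter, and the choice to score each visual token by the mean attention it receives across all query positions are assumptions made for illustration, and the attention matrix is taken to be already averaged over heads.

```python
from typing import List, Sequence


def prune_visual_tokens(
    attn: Sequence[Sequence[float]],
    visual_idx: Sequence[int],
    keep_ratio: float,
) -> List[int]:
    """Return the sequence positions that survive pruning.

    attn[q][k] is the attention weight from query position q to key
    position k (assumed to be averaged over heads beforehand).
    visual_idx lists the positions occupied by visual tokens.
    keep_ratio is the fraction of visual tokens to keep (hypothetical knob).
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    seq_len = len(attn)

    # Score each visual token by the mean attention it receives.
    scores = {
        k: sum(attn[q][k] for q in range(seq_len)) / seq_len
        for k in visual_idx
    }

    # Keep the highest-scoring visual tokens (at least one).
    n_keep = max(1, int(len(visual_idx) * keep_ratio))
    ranked = sorted(visual_idx, key=lambda k: scores[k], reverse=True)
    kept_visual = set(ranked[:n_keep])

    # Non-visual tokens (system prompt, instruction text) are never pruned.
    visual_set = set(visual_idx)
    return [
        i for i in range(seq_len)
        if i not in visual_set or i in kept_visual
    ]


# Toy example: 5 positions, of which 1, 2 and 3 are visual tokens.
attn = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.4, 0.1, 0.5, 0.0, 0.0],
    [0.3, 0.1, 0.4, 0.2, 0.0],
    [0.3, 0.05, 0.4, 0.05, 0.2],
]
kept = prune_visual_tokens(attn, visual_idx=[1, 2, 3], keep_ratio=0.5)
print(kept)  # [0, 2, 4]
```

In the toy example only position 2 survives among the visual tokens, because it receives the most attention. Later layers would then operate on three positions instead of five, which is where the FLOPs saving in such a scheme would come from.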