@_akhaliq@x.good.news
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

We introduce InternVideo2, a new video foundation model (ViFM) that achieves state-of-the-art performance in action recognition, video-text tasks, and video-centric dialogue. Our approach