NVIDIA: Cosmos 3: Omnimodal World Models for Physical AI, Technical Report (English version).pdf
Resource summary
We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and action sequences within a unified mixture-of-transformers architecture. By supporting highly flexible input-output configurations, Cosmos 3 seamlessly unifies critical modalities for Physical AI, effectively subsuming vision-language models, video generators, world simulators, and world-action models into a single framework. Our eval
arXiv:2606.02800v4 [cs.CV] 23 Jun 2026



