LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models, introduces a new method for generating 3D models using large language models (LLMs). The authors address the challenge of tokenizing 3D mesh data for LLMs by representing the mesh data as plain text using the OBJ file format, a standard text-based format for 3D models. This approach allows for direct integration with LLMs without modifying the vocabulary or tokenizers, minimizing additional training overhead. The study then introduces LLAMA-MESH, a fine-tuned LLaMA model that can generate 3D meshes from textual prompts, produce interleaved text and 3D mesh outputs, and understand and interpret 3D meshes. LLAMA-MESH achieves comparable mesh generation quality to models trained from scratch while maintaining strong text generation abilities, demonstrating the potential for LLMs to become universal generative tools for multiple modalities.
資訊
- 節目
- 頻率每日更新
- 發佈時間2024年11月21日 下午12:32 [UTC]
- 長度19 分鐘
- 年齡分級兒少適宜