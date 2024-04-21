xAI Grok-1.5 Vision Leverages Tesla FSD to Understand the Real World

Grok-1.5V is competitive with existing frontier multimodal models in a number of domains, ranging from multi-disciplinary reasoning to understanding documents, science diagrams, charts, screenshots, and photographs. It can also reason about the physical world: according to xAI, Grok outperforms its peers on the company's new RealWorldQA benchmark, which measures real-world spatial understanding. For all reported datasets, xAI evaluated Grok in a zero-shot setting without chain-of-thought prompting. This lead in real-world understanding builds on Tesla data and work on FSD (Full Self-Driving).
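To make "zero-shot without chain-of-thought" concrete, here is a minimal sketch of how such an evaluation could be scored. It is not xAI's evaluation harness: the prompt wording, the `model` callable, the answer normalization, and the exact-match scoring rule are all assumptions chosen for illustration.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical item layout: (image_path, question, reference_answer).
Item = Tuple[str, str, str]


def build_zero_shot_prompt(question: str) -> str:
    """Build a prompt with no worked examples (zero-shot) that asks for
    the answer only, with no step-by-step reasoning (no chain-of-thought)."""
    return f"Question: {question}\nAnswer with a single word or number only."


def normalize(text: str) -> str:
    """Lowercase and trim so that 'Left.' and 'left' compare as equal."""
    return text.strip().lower().rstrip(".")


def zero_shot_accuracy(model: Callable[[str, str], str],
                       items: Iterable[Item]) -> float:
    """Fraction of items whose normalized model answer exactly matches the
    reference. `model` is an assumed callable: (image_path, prompt) -> text."""
    items = list(items)
    if not items:
        raise ValueError("no evaluation items given")
    correct = sum(
        normalize(model(image, build_zero_shot_prompt(question))) == normalize(answer)
        for image, question, answer in items
    )
    return correct / len(items)
```

Any multimodal model can be plugged in by wrapping it in a function that takes an image path and a prompt and returns text. Real benchmarks often use more forgiving answer matching than the exact-match rule assumed here.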

Writing Code from a Hand-Drawn Flowchart

Calories from Food Labels

