Quick answer
AI Summary: Multimodal models such as CLIP naturally develop 'multimodal neurons' that respond to the same concept whether it appears as an image, as text, or as a sketch. This reveals both similarities to how the human brain represents concepts and new adversarial vulnerabilities.