Alibaba Launches Advanced AI Models
While Chinese AI lab DeepSeek grabs headlines this week, Alibaba has unveiled its own ambitious project. On Monday, the company’s Qwen team introduced the Qwen2.5-VL series, a family of AI models that can perform complex text and image analysis tasks. These models can parse documents, interpret videos, count objects in images, and even control a PC, echoing the functionality of OpenAI’s Operator.
In benchmark tests, the flagship Qwen2.5-VL model outperformed rivals such as OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 2.0 Flash on evaluations covering video understanding, math, document analysis, and question answering.
Impressive Capabilities and Accessibility
Qwen2.5-VL’s advanced features include the ability to analyze charts, extract data from invoices and forms, and interpret hours-long videos. Additionally, the model can recognize intellectual property from films and TV shows, as well as a wide range of products, hinting that training data may include copyrighted material.
The model’s interactive capabilities extend beyond data analysis. A demonstration by Philipp Schmid, a technical lead at Hugging Face, showed Qwen2.5-VL booking flights and interacting with apps on both Android and Linux platforms. Its performance in desktop environments left room for improvement, however: it scored low on OSWorld, a benchmark that mimics real computer operations.
Developers can test Qwen2.5-VL through Alibaba’s Qwen Chat app or download it from Hugging Face. The smaller models in the series, Qwen2.5-VL-3B and Qwen2.5-VL-7B, are available under a permissive license. The flagship model, Qwen2.5-VL-72B, is more restricted: companies with more than 100 million monthly active users need special permission to deploy it commercially.
Regulatory Constraints in China
As with many Chinese AI systems, Qwen2.5-VL operates within the boundaries of strict regulatory oversight. For instance, when asked to discuss sensitive topics like “Xi Jinping’s mistakes,” the model refused to respond, reflecting the influence of China’s internet regulator, which enforces AI adherence to “core socialist values.”
Despite these constraints, Alibaba’s Qwen2.5-VL series positions the company as a formidable competitor in the global AI race, offering a glimpse into China’s rapidly advancing AI ecosystem.