Alibaba’s Qwen team releases AI models that can control PCs and phones
Alibaba unveils Qwen2.5-VL, a family of AI models for text and image analysis that challenges OpenAI, Google, and Anthropic. Alibaba says the models excel at video understanding, math, and document processing, and can also operate software on PCs and phones.

Alibaba Unveils Qwen2.5-VL AI Models to Compete with OpenAI and Google
Alibaba is ramping up its AI efforts with the launch of Qwen2.5-VL, a new family of AI models designed for text and image analysis. The release comes as competition in the AI space heats up, with Alibaba positioning itself as a serious contender against OpenAI, Google, and Anthropic.
According to Alibaba’s Qwen team, Qwen2.5-VL can analyze documents, process videos, extract data from images, and even control PCs and mobile devices. The company claims the model outperforms OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 2.0 Flash in video understanding, math, and document analysis.
The models are available for testing through Alibaba’s Qwen Chat app, and developers can download them from Hugging Face. A standout feature is their ability to interact with software. Hugging Face technical lead Philipp Schmid recently shared a video showing Qwen2.5-VL booking a flight on an Android device using the Booking.com app.
However, like other AI models developed in China, Qwen2.5-VL comes with content restrictions. When asked about sensitive topics, such as “Xi Jinping’s mistakes,” the system declined to respond. China’s strict internet regulations require AI models to align with state-approved guidelines, limiting discussions on politically sensitive matters.
Alibaba’s latest move signals its ambition to solidify its place in the AI industry. As global competition intensifies, Qwen2.5-VL positions the company as a key player in the race to develop next-generation artificial intelligence.
Source: TechCrunch