Visualize Thread by @Ali_TongyiLab

Tongyi Lab

@Ali_TongyiLab

1/10 🚀 Qwen3.5-Omni is here! Scaling up to a native omni-modal AGI.
Meet the next generation of Qwen, designed for native text, image, audio, and video understanding, with major advances in both intelligence and real-time interaction.
A standout feature:
Audio-Visual Vibe Coding: Describe your vision to the camera, and Qwen3.5-Omni instantly builds a functional website or game for you.
Highlights:
Script-Level Captioning: Generate detailed video scripts with timestamps, scene cuts & speaker mapping.
SOTA Performance: Qwen3.5-Omni has secured 215 SOTA scores across various sub-tasks, matching the top-tier text/vision capabilities of the Qwen3.5 series.
Audio-Visual Understanding: From auto-segmentation to fine-grained script generation, it understands the relationship between characters and their environment like never before.
Seamless Interaction: With native API support for Semantic Interruption, voice conversations feel human-like and stay robust to background noise.
Global Multilingual Mastery: Pioneering support for 74 languages in speech recognition and 29 languages in expressive speech generation, breaking down global communication barriers.
Autonomous Intelligence: Native support for WebSearch and complex Function Calling—the model now independently decides when to pull real-time data.
Qwen3.5-Omni is built to be the backbone of next-gen AI applications, empowering developers and users alike with true multimodal reasoning.