Not true at all. Qwen has a VLM (Qwen2-VL-Instruct), which is the backbone of ByteDance's TARS computer-use model. Both Alibaba (Qwen) and ByteDance are Chinese.
Also, DeepSeek got a ton of attention for their OCR paper a month ago, which was an explicit example of using images rather than text.