ByteDance: UI-TARS 7B

Model Details

Company: ByteDance

Created: 12/11/2025

Description

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Developed by ByteDance, it extends the UI-TARS framework with reinforcement-learning-based reasoning, enabling robust action planning and execution across virtual interfaces. The model achieves state-of-the-art results on a range of interactive and grounding benchmarks, including OSWorld, WebVoyager, AndroidWorld, and ScreenSpot. It also demonstrates perfect task completion across diverse Poki games and outperforms prior models on Minecraft agent tasks. UI-TARS-1.5 supports thought decomposition during inference and scales well across variants, with the 1.5 version notably exceeding the performance of the earlier 72B and 7B checkpoints.

Technical Specifications

Context Window

128k tokens

Max Output

2k tokens

Pricing (Input / Output)

$0.0001 / $0.0002 per 1M tokens
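
As a rough illustration of how the listed rates translate into a per-request cost, the sketch below multiplies token counts by the per-1M prices. The function name `estimate_cost_usd` is hypothetical, and the rates are taken at face value from the listing (rounded to $0.0001 input and $0.0002 output per 1M tokens); treat the result as an estimate, not a billing figure.

```python
from decimal import Decimal

# Rates as shown in the listing (USD per 1M tokens); assumed, not verified.
INPUT_USD_PER_1M = Decimal("0.0001")
OUTPUT_USD_PER_1M = Decimal("0.0002")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Return the estimated cost in USD for one request."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(prompt_tokens) * INPUT_USD_PER_1M / TOKENS_PER_UNIT
    output_cost = Decimal(completion_tokens) * OUTPUT_USD_PER_1M / TOKENS_PER_UNIT
    return input_cost + output_cost
```

`Decimal` is used here instead of `float` so that very small per-token amounts are not distorted by binary floating-point rounding.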

Architecture

transformer

Modality

text+image->text

API Usage

Example API Call

# Assumes NEURALHUB_API_KEY is exported in your shell environment.
curl -X POST https://api.neuralhub.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEURALHUB_API_KEY" \
  -d '{
  "model": "bytedance/ui-tars-1.5-7b",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "YOUR_PROMPT_HERE" }
  ],
  "temperature": 0.7,
  "max_tokens": 500,
  "top_p": 0.9
}'
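
Because the model's modality is text+image->text, a request will usually carry a screenshot alongside the prompt. The following is a minimal Python sketch of the same call with an inline image, using only the standard library. It assumes the endpoint accepts OpenAI-style `image_url` content parts with a base64 data URI, which this listing does not confirm; the helper names `build_payload`, `build_request`, and `send_request` are hypothetical.

```python
import base64
import json
import urllib.request

API_URL = "https://api.neuralhub.xyz/v1/chat/completions"  # from the curl example
MODEL_ID = "bytedance/ui-tars-1.5-7b"  # from the curl example


def build_payload(prompt: str, png_bytes: bytes, max_tokens: int = 500) -> dict:
    """Build a chat payload with one text part and one inline PNG part.

    Assumption: the API accepts OpenAI-style "image_url" content parts.
    """
    encoded = base64.b64encode(png_bytes).decode("ascii")
    data_uri = "data:image/png;base64," + encoded
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": data_uri}},
                ],
            }
        ],
        "temperature": 0.7,
        "max_tokens": max_tokens,
        "top_p": 0.9,
    }


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Wrap the payload in a POST request with the curl example's headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_request(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A caller would read the screenshot bytes from disk, pass them to `build_payload`, and hand the result of `build_request` to `send_request`. Keeping payload construction separate from the network call makes the request easy to inspect before anything is sent.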

Response Format

The API returns an OpenAI-compatible response. Example:

{
  "id": "chatcmpl-<uuid>",
  "object": "chat.completion",
  "created": 1765590423,
  "model": "bytedance/ui-tars-1.5-7b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The answer to life, the universe, and everything is famously 42..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 26,
    "completion_tokens": 169,
    "total_tokens": 195
  }
}
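
To show how a client might consume this structure, here is a short Python sketch that pulls out the assistant text, the finish reason, and the token usage. The function name `summarize_response` is hypothetical, and treating a `finish_reason` of `"length"` as truncation follows the OpenAI convention; it is an assumption that this API behaves the same way.

```python
import json


def summarize_response(body: str) -> dict:
    """Extract the fields most clients need from a chat completion body."""
    data = json.loads(body)
    choice = data["choices"][0]
    usage = data.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # Assumption: "length" means the output hit max_tokens (OpenAI convention).
        "truncated": choice["finish_reason"] == "length",
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


# Sample body trimmed from the example response in this section.
SAMPLE_BODY = json.dumps(
    {
        "object": "chat.completion",
        "model": "bytedance/ui-tars-1.5-7b",
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": "..."},
                "finish_reason": "stop",
            }
        ],
        "usage": {"prompt_tokens": 26, "completion_tokens": 169, "total_tokens": 195},
    }
)

summary = summarize_response(SAMPLE_BODY)
```

With a 2k-token output limit, checking the `truncated` flag before acting on a response is a cheap way to catch replies that were cut off mid-action.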