Architecture Gallery: The context length for Qwen3 may be incorrect #979

@yihan35

The context lengths shown in the architecture diagrams may be incorrect: the Qwen3-4B diagram states "41k tokens", and the Qwen3-235B-A22B diagram states "128k tokens". However, the official model card states: "Context Length: 32,768 natively and 131,072 tokens with YaRN." That is 32k natively and 128k with YaRN, so the 41k label matches neither figure, and the 128k label corresponds only to the YaRN-extended length rather than the native one. Could you take a look and verify whether this is correct?
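For reference, here is a quick sanity check of the arithmetic. It is only a sketch: the two token counts are the ones quoted from the model card above, while the helper names (`to_k_label`, `classify`) and the assumption that the diagrams use "k" to mean 1024 tokens are mine, not something the gallery documents.

```python
# Token counts quoted from the model card in this issue.
NATIVE_TOKENS = 32_768
YARN_TOKENS = 131_072

# Labels as they appear in the two architecture diagrams.
DIAGRAM_LABELS = {"Qwen3-4B": "41k", "Qwen3-235B-A22B": "128k"}


def to_k_label(tokens: int, base: int = 1024) -> str:
    """Format a token count as a rounded 'Nk' label.

    Assumption: 'k' means 1024 tokens; pass base=1000 to test the other reading.
    """
    return f"{round(tokens / base)}k"


# Map each model-card figure to the label it would produce.
CARD_LABELS = {
    to_k_label(NATIVE_TOKENS): "native",
    to_k_label(YARN_TOKENS): "YaRN",
}


def classify(label: str) -> str:
    """Say which model-card figure a diagram label corresponds to, if any."""
    return CARD_LABELS.get(label, "no match")


if __name__ == "__main__":
    for model, label in DIAGRAM_LABELS.items():
        print(f"{model}: {label} -> {classify(label)}")
    # Ratio between the YaRN-extended and native lengths from the card.
    print("extension ratio:", YARN_TOKENS // NATIVE_TOKENS)
```

Under these assumptions, "128k" lines up with the YaRN-extended figure and "41k" lines up with neither, which is the discrepancy I am asking about.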
