Skip to content

Feature Request: Add support for TOON (Token-Oriented Object Notation) to optimize LLM context usage #2633

@0warning0error

Description

@0warning0error

Please describe your feature request.
I wish I could use yq to convert standard YAML/JSON configurations into TOON (Token-Oriented Object Notation) format directly from the CLI.

Currently, developers preprocessing data for Large Language Models (LLMs) must use separate tools or custom scripts to reduce token costs. Adding native TOON support (-o toon) would allow yq to act as a universal bridge, transforming verbose YAML/JSON into highly compact, schema-aware TOON streams. This solves the problem of high API costs and context window limits by reducing token usage by ~40-60% while maintaining human readability and even improving LLM parsing accuracy (benchmarks show ~1.4% accuracy gain over JSON).

Describe the solution you'd like
If we have data1.yml like:

users:
  - id: 1
    name: Alice
    role: admin
  - id: 2
    name: Bob
    role: user

And we run a command:

yq -o toon '.' data1.yml

it could output:

users[2]{id,name,role}:
1,Alice,admin
2,Bob,user

(Note: This reduces token count by ~50% compared to JSON output by declaring the schema once.)

Describe alternatives you've considered

  1. Using the standalone TOON CLI: Requires installing an extra Node.js/Python/Rust tool (npx @toon-format/cli), breaking the simplicity of a single-binary yq workflow.
  2. Custom Python/Go scripts: Developers currently write ad-hoc scripts to flatten JSON for LLMs, which are hard to maintain, lack testing, and don't integrate with yq's powerful filtering expressions.
  3. Sticking with JSON/YAML: Accepting 40-60% higher token costs and potentially lower model accuracy due to syntactic noise.

Additional context

  • Why TOON? TOON is an emerging standard designed specifically for LLM efficiency. Official benchmarks across 4 major models (Claude, GPT, Gemini, Grok) confirm it uses significantly fewer tokens while yielding higher response accuracy than JSON.
  • Implementation Plan: I have reviewed CONTRIBUTING.md. I have implemented this as a new encoder plugin (non-breaking), and ensure all linting/security checks pass.
  • Reference: Format specification and benchmarks: https://github.com/toon-format/toon
  • Goal: To make yq the go-to tool for "AI-ready" data preprocessing, just as it is for standard config management.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions