-
-
Notifications
You must be signed in to change notification settings - Fork 753
Description
Please describe your feature request.
I wish I could use yq to convert standard YAML/JSON configurations into TOON (Token-Oriented Object Notation) format directly from the CLI.
Currently, developers preprocessing data for Large Language Models (LLMs) must use separate tools or custom scripts to reduce token costs. Adding native TOON support (-o toon) would allow yq to act as a universal bridge, transforming verbose YAML/JSON into highly compact, schema-aware TOON streams. This solves the problem of high API costs and context window limits by reducing token usage by ~40-60% while maintaining human readability and even improving LLM parsing accuracy (benchmarks show ~1.4% accuracy gain over JSON).
Describe the solution you'd like
If we have data1.yml like:
users:
- id: 1
name: Alice
role: admin
- id: 2
name: Bob
role: userAnd we run a command:
yq -o toon '.' data1.ymlit could output:
users[2]{id,name,role}:
1,Alice,admin
2,Bob,user
(Note: This reduces token count by ~50% compared to JSON output by declaring the schema once.)
Describe alternatives you've considered
- Using the standalone TOON CLI: Requires installing an extra Node.js/Python/Rust tool (
npx @toon-format/cli), breaking the simplicity of a single-binaryyqworkflow. - Custom Python/Go scripts: Developers currently write ad-hoc scripts to flatten JSON for LLMs, which are hard to maintain, lack testing, and don't integrate with
yq's powerful filtering expressions. - Sticking with JSON/YAML: Accepting 40-60% higher token costs and potentially lower model accuracy due to syntactic noise.
Additional context
- Why TOON? TOON is an emerging standard designed specifically for LLM efficiency. Official benchmarks across 4 major models (Claude, GPT, Gemini, Grok) confirm it uses significantly fewer tokens while yielding higher response accuracy than JSON.
- Implementation Plan: I have reviewed
CONTRIBUTING.md. I have implemented this as a new encoder plugin (non-breaking), and ensure all linting/security checks pass. - Reference: Format specification and benchmarks: https://github.com/toon-format/toon
- Goal: To make
yqthe go-to tool for "AI-ready" data preprocessing, just as it is for standard config management.