Optimizing for Token Cost & Performance

Why Optimization Matters

V7 Go uses LLMs (e.g. GPT-5, Gemini, Claude) under the hood, and every interaction with a model consumes tokens, which translates into cost and latency.

The key to scalable AI workflows? Do more with fewer calls.

This section breaks down practical strategies to reduce token usage while improving speed and accuracy.

Common Token Optimization Tactics

1. Bundle Fields Using JSON Properties

Instead of extracting each field with a separate AI call:

Group related fields (e.g. start_date, end_date, rent_amount)
Use a single JSON prompt with clear schema definition
Parse fields later using Python

Reduces the number of model calls

Improves accuracy by anchoring related context
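
As a rough illustration of the bundling pattern above, the sketch below parses one bundled JSON response into separate fields. The sample payload and the split_fields helper are hypothetical; the field names are taken from the example above, and a real workflow would receive the JSON from the model rather than from a hard-coded string.

```python
import json

# Hypothetical raw output of a single JSON property that bundles three lease fields.
raw_output = '{"start_date": "2024-01-01", "end_date": "2024-12-31", "rent_amount": 2500.0}'

# The schema the prompt asked for; anything missing comes back as None.
EXPECTED_FIELDS = ("start_date", "end_date", "rent_amount")

def split_fields(raw):
    """Parse the bundled JSON and return one value per expected field."""
    data = json.loads(raw)
    return {name: data.get(name) for name in EXPECTED_FIELDS}

fields = split_fields(raw_output)
start_date = fields["start_date"]
end_date = fields["end_date"]
rent_amount = fields["rent_amount"]
```

Returning None for a missing key (instead of raising) keeps one incomplete response from breaking every downstream field; whether that is the right trade-off depends on the workflow.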

2. Use Python for Logic + Field Splitting

Parse values from JSON into discrete properties
Run calculations (e.g. monthly_interest = principal * annual_rate / 12)
Handle conditionals (“If A = X, then B = Y”)

Zero token cost

Great for deterministic logic
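
A minimal sketch of this kind of deterministic post-processing follows. The input values, the field names, and the "variable means review" rule are all made up for illustration; the calculation is a simple interest-only estimate, not a full amortization formula.

```python
import json

# Hypothetical bundled output from an earlier JSON property.
bundled = json.loads('{"principal": 120000, "annual_rate": 0.06, "loan_type": "fixed"}')

# Split the JSON into discrete values.
principal = bundled["principal"]
annual_rate = bundled["annual_rate"]
loan_type = bundled["loan_type"]

# Calculation: interest due per month, rounded to cents.
monthly_interest = round(principal * annual_rate / 12, 2)

# Conditional ("If A = X, then B = Y"): flag variable-rate loans for review.
review_required = "yes" if loan_type == "variable" else "no"
```

None of these steps needs a model call, so they add no token cost.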

3. Classify with Selects, Not Open-Ended Text

Replace free-text answers with a Single Select property
The LLM chooses from a fixed set of values (e.g. “Low / Medium / High Risk”)

Reduces hallucination

Shorter prompts, shorter outputs = lower cost
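
To show why a fixed option set is easier to work with than free text, here is a small plain-Python sketch that maps a raw answer onto an allowed value. It is an analogy only and not how V7 Go implements Single Select; RISK_OPTIONS and normalize_choice are hypothetical names.

```python
# The fixed set of values the classifier may return.
RISK_OPTIONS = ("Low", "Medium", "High")

def normalize_choice(raw, options=RISK_OPTIONS):
    """Return the matching option (ignoring case and surrounding spaces), or None."""
    cleaned = raw.strip().casefold()
    for option in options:
        if option.casefold() == cleaned:
            return option
    return None  # anything outside the fixed set is rejected

accepted = normalize_choice(" high ")
rejected = normalize_choice("Probably fine, but check clause 4")
```

With a closed set, an unexpected answer is detectable immediately, which is much harder with open-ended text.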

4. Avoid Redundant Calls with Shared Properties

When multiple downstream fields depend on the same source info:

Extract that info once in JSON, Collections, or Structured Text
Use that property as an input for downstream properties

Avoids reprocessing long documents
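
The pattern above can be sketched as follows: one step reads the long document, and every downstream value is computed from its small result. The extract_lease_terms stub stands in for a single model call and returns fixed, made-up values; all names here are hypothetical.

```python
call_log = []  # records each simulated model call, for illustration only

def extract_lease_terms(document_text):
    """Stand-in for one model call over the full document (hypothetical)."""
    call_log.append(len(document_text))
    # A real extraction would come from the model; this stub returns fixed values.
    return {"tenant": "Jane Doe", "term_months": 24, "rent_amount": 2500.0}

def total_contract_value(terms):
    """Downstream property: uses the shared result, not the document."""
    return terms["term_months"] * terms["rent_amount"]

def is_long_term(terms):
    """Downstream property: another consumer of the same shared result."""
    return terms["term_months"] >= 12

long_document = "lease text " * 10_000
terms = extract_lease_terms(long_document)  # the only step that reads the document
contract_value = total_contract_value(terms)
long_term = is_long_term(terms)
```

However many downstream properties are added, the document itself is processed once.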

5. Split Long Docs with Page Splitter

Automatically break PDFs into smaller chunks
Run LLM extraction on each section, then roll up results

Avoids exceeding context limits

More stable performance for large files
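
The split-then-roll-up flow can be sketched in plain Python as below. This is not the Page Splitter itself: split_pages and extract_amounts are hypothetical helpers, the page texts are invented, and a regular expression stands in for the per-chunk model extraction.

```python
import re

def split_pages(pages, chunk_size):
    """Group a list of page texts into chunks of at most chunk_size pages."""
    return [pages[i:i + chunk_size] for i in range(0, len(pages), chunk_size)]

def extract_amounts(chunk):
    """Stand-in for per-chunk extraction: finds numbers that follow 'amount:'."""
    text = "\n".join(chunk)
    return [float(m) for m in re.findall(r"amount:\s*(\d+(?:\.\d+)?)", text)]

pages = [
    "Intro page",
    "Invoice amount: 100",
    "Invoice amount: 250.5",
    "Appendix",
    "Late fee amount: 20",
]

chunks = split_pages(pages, chunk_size=2)              # smaller units of work
per_chunk = [extract_amounts(c) for c in chunks]       # one extraction per chunk
total = sum(a for amounts in per_chunk for a in amounts)  # roll up the results
```

Each extraction only ever sees chunk_size pages, so the input size stays bounded regardless of how long the source file is.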

6. Match Model to Task Complexity

Choose models based on the job, not just defaults:

Task Type	Suggested Model
Simple extraction	GPT-4 (standard)
Logic-heavy reasoning	Gemini 2.5 Pro
Writing-heavy tasks	Claude Sonnet / Gemini 2.5 Pro
Cheap classification	GPT-4 nano / 3.5

Don’t overpay for simple prompts

Use best-in-class where accuracy matters
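
One way to make this choice explicit is a small lookup keyed by task type, sketched below. The task keys, pick_model, and the fallback default are hypothetical; the model labels are copied from the table above and are display names, not API identifiers.

```python
# Task type -> model label, following the table above (first listed option per row).
MODEL_BY_TASK = {
    "simple_extraction": "GPT-4 (standard)",
    "logic_heavy_reasoning": "Gemini 2.5 Pro",
    "writing_heavy": "Claude Sonnet",
    "cheap_classification": "GPT-4 nano",
}

# Assumed fallback for task types that are not listed.
DEFAULT_MODEL = "GPT-4 (standard)"

def pick_model(task_type):
    """Return the suggested model label for a task type, or the default."""
    return MODEL_BY_TASK.get(task_type, DEFAULT_MODEL)

classification_model = pick_model("cheap_classification")
fallback_model = pick_model("something_else")
```

Keeping the mapping in one place makes it easy to review which tasks are paying for a larger model.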

Key Takeaway

Smart agents aren’t just accurate; they’re efficient.

Use:

JSON to batch tasks
Python to reduce calls
Selects to enforce structure
Page Splitter to manage long docs
Model selection to balance quality vs. cost