Section 10: Optimizing for Token Cost & Performance

Why Optimization Matters

V7 Go uses LLMs (e.g. GPT-5, Gemini, Claude) under the hood — and every interaction with a model consumes tokens, which translates to cost and latency.

The key to scalable AI workflows? Do more with fewer calls.

This section breaks down practical strategies to reduce token usage while improving speed and accuracy.


Common Token Optimization Tactics

1. Bundle Fields Using JSON Properties

Instead of extracting each field with a separate AI call:

  • Group related fields (e.g. start_date, end_date, rent_amount)
  • Use a single JSON prompt with clear schema definition
  • Parse fields later using Python

Reduces the number of model calls

Improves accuracy by anchoring related context
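
As a minimal sketch of the "parse fields later" step, assume the single JSON property returns a string like the one below. The field names come from the example above; the sample values and the parse_lease_fields helper are hypothetical and not part of V7 Go.

```python
import json
from datetime import date

# Hypothetical output of one bundled JSON property (sample values are made up).
raw_response = (
    '{"start_date": "2024-01-01", "end_date": "2024-12-31", "rent_amount": 1500.0}'
)


def parse_lease_fields(raw: str) -> dict:
    """Split one bundled JSON answer into typed, discrete fields."""
    data = json.loads(raw)
    return {
        "start_date": date.fromisoformat(data["start_date"]),
        "end_date": date.fromisoformat(data["end_date"]),
        "rent_amount": float(data["rent_amount"]),
    }


fields = parse_lease_fields(raw_response)
```

One model call produces all three values; the split into separate fields happens in code, which consumes no tokens.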


2. Use Python for Logic + Field Splitting

  • Parse values from JSON into discrete properties
  • Run calculations (e.g. monthly_interest = principal * annual_rate / 12)
  • Handle conditionals (“If A = X, then B = Y”)

Zero token cost

Great for deterministic logic
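
The two kinds of logic listed above might look like this in practice. This is only a sketch: the function names, the simple-interest formula, and the "Overdue" / "Review" values are illustrative assumptions, not taken from a real workflow.

```python
def monthly_interest(principal: float, annual_rate: float) -> float:
    """Interest accrued in one month at a simple annual rate (assumed formula)."""
    return principal * annual_rate / 12


def review_flag(payment_status: str) -> str:
    """Conditional rule of the form 'If A = X, then B = Y'."""
    return "Review" if payment_status == "Overdue" else "OK"


interest = monthly_interest(principal=120_000, annual_rate=0.06)
flag = review_flag("Overdue")
```

Both functions are deterministic, so the same inputs always give the same result and no model call is needed.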


3. Classify with Selects, Not Open-Ended Text

  • Replace free-text answers with a Single Select property
  • LLM chooses from a fixed set of values (e.g., “Low / Medium / High Risk”)

Reduces hallucination

Shorter prompts, shorter outputs = lower cost
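
A fixed set of values also keeps downstream logic simple, because each option can be handled with a plain lookup. The sketch below assumes the select returns one of three labels; the labels and the actions they map to are hypothetical.

```python
# Hypothetical mapping from a Single Select value to a follow-up action.
RISK_ACTIONS = {
    "Low": "auto-approve",
    "Medium": "spot-check",
    "High": "manual review",
}


def action_for(risk_level: str) -> str:
    """Look up the action for a select value; reject anything outside the set."""
    try:
        return RISK_ACTIONS[risk_level]
    except KeyError:
        raise ValueError(f"Unexpected risk level: {risk_level!r}") from None
```

With free text, the same step would need fuzzy matching or another model call to interpret the answer.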


4. Avoid Redundant Calls with Shared Properties

When multiple downstream fields depend on the same source info:

  • Extract that info once in JSON, Collections, or Structured Text
  • Use that property as an input for downstream properties

Avoids reprocessing long documents
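
A rough back-of-the-envelope comparison shows where the saving comes from. All numbers here are assumptions chosen for illustration (a 20,000-token document, a 500-token shared extract, five downstream fields), and the count ignores prompt and output tokens.

```python
def input_tokens_without_sharing(doc_tokens: int, n_fields: int) -> int:
    """Every downstream field re-reads the full document."""
    return doc_tokens * n_fields


def input_tokens_with_sharing(doc_tokens: int, extract_tokens: int, n_fields: int) -> int:
    """The document is read once; each field then reads only the shared extract."""
    return doc_tokens + extract_tokens * n_fields


separate = input_tokens_without_sharing(doc_tokens=20_000, n_fields=5)
shared = input_tokens_with_sharing(doc_tokens=20_000, extract_tokens=500, n_fields=5)
```

Under these assumptions the shared approach reads 22,500 input tokens instead of 100,000.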


5. Split Long Docs with Page Splitter

  • Automatically break PDFs into smaller chunks
  • Run LLM extraction on each section, then roll up results

Avoids exceeding context limits

More stable performance for large files
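
The split-then-roll-up pattern can be sketched in plain Python. This is not how Page Splitter itself is implemented; split_pages and roll_up are hypothetical helpers, and the roll-up rule (first non-empty value wins) is one assumption among several reasonable choices.

```python
def split_pages(pages: list, chunk_size: int) -> list:
    """Break a list of pages into consecutive chunks of at most chunk_size."""
    return [pages[i:i + chunk_size] for i in range(0, len(pages), chunk_size)]


def roll_up(chunk_results: list) -> dict:
    """Merge per-chunk extractions, keeping the first non-empty value per field."""
    merged = {}
    for result in chunk_results:
        for key, value in result.items():
            if value is not None and key not in merged:
                merged[key] = value
    return merged


chunks = split_pages(list(range(1, 8)), chunk_size=3)
combined = roll_up([
    {"tenant": None, "rent_amount": 1500},    # made-up result for chunk 1
    {"tenant": "Acme Ltd", "rent_amount": 1600},  # made-up result for chunk 2
])
```

Each chunk stays well inside the context limit, and the roll-up step runs in code rather than in another model call.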


6. Match Model to Task Complexity

Choose models based on the job, not just defaults:

Task Type                Suggested Model
Simple extraction        GPT-4 (standard)
Logic-heavy reasoning    Gemini 2.5 Pro
Writing-heavy tasks      Claude Sonnet / Gemini 2.5 Pro
Cheap classification     GPT-4 nano / 3.5

Don’t overpay for simple prompts

Use best-in-class where accuracy matters
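
If model choice is decided in code or configuration, the guidance above can be expressed as a simple lookup. The sketch below copies the model labels as written in this section (picking one where two are listed); the task-type keys, the default, and the pick_model helper are hypothetical.

```python
# Labels copied from this section's suggestions; keys are hypothetical task names.
MODEL_BY_TASK = {
    "simple_extraction": "GPT-4 (standard)",
    "logic_heavy_reasoning": "Gemini 2.5 Pro",
    "writing_heavy": "Claude Sonnet",
    "cheap_classification": "GPT-4 nano",
}

# Assumed fallback for task types that are not listed.
DEFAULT_MODEL = "GPT-4 (standard)"


def pick_model(task_type: str) -> str:
    """Return the suggested model for a task type, or the default if unknown."""
    return MODEL_BY_TASK.get(task_type, DEFAULT_MODEL)
```

Keeping the mapping in one place makes it easy to revisit as model pricing and quality change.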


Key Takeaway

Smart agents aren’t just accurate — they’re efficient.

Use:

  • JSON to batch tasks
  • Python to reduce calls
  • Selects to enforce structure
  • Shared properties to avoid reprocessing documents
  • Page Splitter to manage long docs
  • Model selection to balance quality vs. cost