Section 10: Optimizing for Token Cost & Performance
Why Optimization Matters
V7 Go uses LLMs (e.g. GPT-5, Gemini, Claude) under the hood — and every interaction with a model consumes tokens, which translates to cost and latency.
The key to scalable AI workflows? Do more with fewer calls.
This section breaks down practical strategies to reduce token usage while improving speed and accuracy.
Common Token Optimization Tactics
1. Bundle Fields Using JSON Properties
Instead of extracting each field with a separate AI call:
- Group related fields (e.g. start_date, end_date, rent_amount)
- Use a single JSON prompt with clear schema definition
- Parse fields later using Python
Reduces the number of model calls and total tokens
Improves accuracy by anchoring related context in one prompt (see the sketch below)
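A minimal sketch of the pattern in Python, assuming a hypothetical `call_llm` helper that stands in for whatever model call your workflow makes (it is not part of V7 Go's API):

```python
import json

def call_llm(prompt: str) -> str:
    # Placeholder for the real model call; returns a canned response here
    # so the example runs end to end.
    return '{"start_date": "2024-01-01", "end_date": "2025-01-01", "rent_amount": 1500}'

# One schema, one prompt, one call, instead of three separate extractions.
LEASE_SCHEMA = {
    "start_date": "ISO 8601 date",
    "end_date": "ISO 8601 date",
    "rent_amount": "number, monthly rent",
}

def extract_lease_fields(document_text: str) -> dict:
    prompt = (
        "Extract these fields from the lease below and reply with JSON "
        f"matching this schema: {json.dumps(LEASE_SCHEMA)}\n\n{document_text}"
    )
    return json.loads(call_llm(prompt))

fields = extract_lease_fields("...lease text...")
print(fields["rent_amount"])  # 1500
```

One call returns all three values; splitting them into individual fields then happens in Python (next tactic) at no extra token cost.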
2. Use Python for Logic + Field Splitting
- Parse values from JSON into discrete properties
- Run calculations (e.g. `monthly_payment = principal * rate`)
- Handle conditionals (“If A = X, then B = Y”)
Zero token cost
Great for deterministic logic
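Continuing the example above, a sketch of the zero-token Python step, with illustrative values standing in for the bundled JSON:

```python
import json

# Bundled JSON produced by a single extraction call (values are illustrative).
bundled = '{"principal": 200000, "rate": 0.005, "region": "EU"}'
fields = json.loads(bundled)

# Split the bundle into discrete properties.
principal = float(fields["principal"])
rate = float(fields["rate"])

# Deterministic calculation: zero tokens spent.
monthly_payment = principal * rate

# Conditional logic ("If A = X, then B = Y") handled in code, not in a prompt.
currency = "EUR" if fields["region"] == "EU" else "USD"

print(monthly_payment, currency)  # 1000.0 EUR
```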
3. Classify with Selects, Not Open-Ended Text
- Replace free-text logic with Single Select
- LLM chooses from a fixed set of values (e.g., “Low / Medium / High Risk”)
Reduces hallucination
Shorter prompts, shorter outputs = lower cost
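A rough sketch of the pattern; the `call_llm` parameter is again a hypothetical stand-in for the model call, and the label set is only an example:

```python
ALLOWED_LABELS = {"Low Risk", "Medium Risk", "High Risk"}

def classify_risk(clause: str, call_llm) -> str:
    # Short prompt, short answer: the model only has to pick one label.
    prompt = (
        "Classify the risk level of the clause below. "
        f"Answer with exactly one of: {', '.join(sorted(ALLOWED_LABELS))}.\n\n{clause}"
    )
    answer = call_llm(prompt).strip()
    # Reject anything outside the allowed set instead of trusting free text.
    if answer not in ALLOWED_LABELS:
        raise ValueError(f"Unexpected label: {answer!r}")
    return answer
```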
4. Avoid Redundant Calls with Shared Properties
When multiple downstream fields depend on the same source info:
- Extract that info once in JSON, Collections, or Structured Text
- Use that property as an input for downstream properties
Avoids reprocessing long documents
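Purely as an illustration of the idea, a sketch that caches the shared extraction so downstream properties reuse it instead of re-reading the document (the `call_llm` stub is a placeholder, not a real API):

```python
from functools import lru_cache

def call_llm(prompt: str) -> str:
    # Placeholder for the real model call.
    raise NotImplementedError

@lru_cache(maxsize=None)
def extract_parties(document_text: str) -> str:
    # One call over the long document; the result is cached and reused.
    return call_llm(f"List the contracting parties as JSON:\n\n{document_text}")

def landlord(document_text: str) -> str:
    parties = extract_parties(document_text)   # cached after the first call
    return call_llm(f"Which of these parties is the landlord? {parties}")

def tenant(document_text: str) -> str:
    parties = extract_parties(document_text)   # the long document is not reprocessed
    return call_llm(f"Which of these parties is the tenant? {parties}")
```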
5. Split Long Docs with Page Splitter
- Automatically break PDFs into smaller chunks
- Run LLM extraction on each section, then roll up results
Avoids exceeding context limits
More stable performance for large files
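V7 Go's Page Splitter does this inside the platform; the sketch below only illustrates the underlying chunk-then-roll-up pattern, using the third-party `pypdf` library and a placeholder `call_llm` function, both assumptions rather than part of V7 Go:

```python
from pypdf import PdfReader  # assumed third-party dependency, for illustration only

def chunk_pages(path: str, pages_per_chunk: int = 10) -> list[str]:
    """Split a PDF into text chunks small enough to stay within a model's context window."""
    reader = PdfReader(path)
    texts = [page.extract_text() or "" for page in reader.pages]
    return [
        "\n".join(texts[i : i + pages_per_chunk])
        for i in range(0, len(texts), pages_per_chunk)
    ]

def extract_and_roll_up(path: str, call_llm) -> str:
    # Run extraction on each chunk, then roll the partial answers up into one result.
    partials = [
        call_llm(f"Extract the key terms from this section:\n\n{chunk}")
        for chunk in chunk_pages(path)
    ]
    return call_llm(
        "Merge these partial extractions into one result:\n\n" + "\n---\n".join(partials)
    )
```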
6. Match Model to Task Complexity
Choose models based on the job, not just defaults:
| Task Type | Suggested Model |
|---|---|
| Simple extraction | GPT-4 (standard) |
| Logic-heavy reasoning | Gemini 2.5 Pro |
| Writing-heavy tasks | Claude Sonnet/Gemini 2.5 Pro |
| Cheap classification | GPT-4 nano / GPT-3.5 |
Don’t overpay for simple prompts
Use best-in-class where accuracy matters
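One way to make that choice explicit is a small routing table; the task keys and model identifiers below are illustrative placeholders, not exact API names:

```python
# Illustrative routing table based on the guidance above; swap in the model
# identifiers your workspace actually exposes.
MODEL_BY_TASK = {
    "simple_extraction": "gpt-4",
    "logic_heavy_reasoning": "gemini-2.5-pro",
    "writing_heavy": "claude-sonnet",
    "cheap_classification": "gpt-4-nano",
}

def pick_model(task_type: str) -> str:
    # Default to the cheapest option rather than the most expensive one.
    return MODEL_BY_TASK.get(task_type, MODEL_BY_TASK["cheap_classification"])

print(pick_model("logic_heavy_reasoning"))  # gemini-2.5-pro
```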
Key Takeaway
Smart agents aren’t just accurate — they’re efficient.
Use:
- JSON to batch tasks
- Python to reduce calls
- Selects to enforce structure
- Page Splitter to manage long docs
- Model selection to balance quality vs. cost
