Coze
Official siteLow-code agent workflow platform for fast automation delivery.
OpenAutomatically select Gemini's Flex or Priority inference tier based on budget and latency requirements, ensuring each query is cost‑effective while meeting performance targets.
Use {goal}, {cost_limit}, {latency_threshold}, {query_text} placeholders, not double braces.This section only explains placeholders. It is not an input form on this website. Copy the prompt, then replace variables in Coze / Dify / ChatGPT.
{goal}The business objective of the query, e.g., "generate product description" or "answer customer question"
Filling hint: replace this with your real business context.
{cost_limit}Maximum allowed cost per request in USD
Filling hint: replace this with your real business context.
{latency_threshold}Desired maximum response latency in milliseconds
Filling hint: replace this with your real business context.
{query_text}The raw text or prompt to send to Gemini
Filling hint: replace this with your real business context.
Fill variables below to generate a ready-to-run prompt in your browser.
{goal}The business objective of the query, e.g., "generate product description" or "answer customer question"
{cost_limit}Maximum allowed cost per request in USD
{latency_threshold}Desired maximum response latency in milliseconds
{query_text}The raw text or prompt to send to Gemini
Generated Prompt Preview
Use {goal}, {cost_limit}, {latency_threshold}, {query_text} placeholders, not double braces.Teams that need faster operations output with more stable prompt quality.
Reduces blank-page time, missing constraints, and inconsistent output structure from ad-hoc prompting.
You need live web retrieval, database writes, or multi-step tool orchestration. Use full workflow automation for that.
Keep exploring with similar templates and matching tools.
No recent items yet.
1. Read {cost_limit} and {latency_threshold} to assess budget and performance needs.
2. If {cost_limit} is below the Flex cost threshold and {latency_threshold} is above the Flex latency threshold, choose Flex; otherwise choose Priority.
3. Build the Gemini API request, setting the tier parameter to the selected inference layer.
4. Send the request and wait for the response.
5. Log actual cost and latency; if they exceed thresholds, trigger an alert or automatically downgrade the tier.
Operations
Tools that work well with this template.
Low-code agent workflow platform for fast automation delivery.
Open