Version: ChatLead

OpenRouter Integration Feature Documentation



Author(s)

  • Alapan Das
  • ...

Last Updated Date

2025-08-20


SRS References


Version History

Version | Date       | Changes       | Author
1.0     | 2025-08-20 | Initial draft | Alapan Das
...     | ...        | ...           | ...

Overview

This feature integrates the OpenRouter API into the BotProxyService, enabling advanced LLM-powered chat capabilities with support for tool/function calling. The integration allows the agent to dynamically invoke backend tools (e.g., fetch dealership info, parse dates, save leads) in response to user queries.


Key Features

  • LLM Routing: Uses OpenRouter's model router to select the best model for each chat session.
  • Tool Calling: Supports OpenAI-style tool calls (tools and tool_choice parameters) for dynamic backend actions.
  • Flexible Toolset: Includes tools for dealership info, contact details, appointment checks, inventory queries, and lead extraction.
  • System Prompt Customization: Prompts instruct the agent to use tools for accurate, real-time information.
  • Session Management: Each chat session is tracked and managed for context and memory.

Implementation Details

1. Model Configuration

  • Uses ChatOpenAI from LangChain, pointed at the OpenRouter endpoint via base_url.
  • Tool calling is enabled by binding the tools and setting tool_choice to "auto".
  • Example:
    from langchain_openai import ChatOpenAI

    model = ChatOpenAI(
        temperature=0.1,
        model="@preset/model-router",
        base_url="https://openrouter.ai/api/v1",
        api_key=settings.OPENROUTER_API_KEY,
        # Let the model decide when to call a tool.
        model_kwargs={"tool_choice": "auto"},
    ).bind(tools=openai_tools)  # openai_tools: list of OpenAI-style tool schemas

2. Tool Definition

  • Tools are defined using LangChain's @tool decorator.
  • Example tool:
    from langchain_core.tools import tool

    @tool
    def get_dealership_phone() -> str:
        """Fetch the phone number of the dealership"""
        # sub_reg comes from the enclosing scope (not shown here).
        return sub_reg.contact.phone1
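
For reference, a parameterless tool such as get_dealership_phone would look roughly like this in the OpenAI-style tools format that is sent to the model. This is a hand-written sketch: the actual contents of openai_tools in the service are not shown in this document, and in practice the schema is usually generated from the decorated function rather than typed by hand.

```python
import json

# Hand-written OpenAI-style schema for the example tool above.
# Illustrative only: the real openai_tools list is built elsewhere in the service.
get_dealership_phone_schema = {
    "type": "function",
    "function": {
        "name": "get_dealership_phone",
        "description": "Fetch the phone number of the dealership",
        # The tool takes no arguments, so the parameter object is empty.
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}

openai_tools = [get_dealership_phone_schema]
print(json.dumps(openai_tools, indent=2))
```

The description field is taken from the function's docstring, which is why each tool's docstring should state clearly what the tool returns.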

3. Agent Execution

  • The agent is created with explicit instructions to use tools for specific queries.
  • Intermediate steps and tool calls are logged for debugging.
  • Conversation memory is managed to retain context and tool results.
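
The tool-execution step can be sketched without any framework, which may help when debugging what the agent does between model turns. The registry, the placeholder phone number, and the function name run_tool_calls are assumptions made for illustration; the service itself relies on the LangChain agent to perform this loop.

```python
import json
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("agent.tools")

# Hypothetical registry mapping tool names to plain Python callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "get_dealership_phone": lambda: "555-0100",  # placeholder value
}


def run_tool_calls(assistant_message: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Execute each tool call in an assistant message and return tool-role messages."""
    results: List[Dict[str, Any]] = []
    for call in assistant_message.get("tool_calls") or []:
        name = call["function"]["name"]
        # Arguments arrive as a JSON-encoded string in the OpenAI-style format.
        args = json.loads(call["function"].get("arguments") or "{}")
        logger.info("tool call %s args=%s", name, args)
        func = TOOL_REGISTRY.get(name)
        if func is None:
            content = f"Unknown tool: {name}"
        else:
            try:
                content = str(func(**args))
            except Exception as exc:  # report the failure back to the model
                content = f"Tool {name} failed: {exc}"
        # The tool_call_id ties each result to the call that requested it.
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": content}
        )
    return results
```

The returned messages are appended to the conversation before the next model call, which is how tool results end up in the retained context.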

4. API Endpoints

  • /api/v1/chat/get-reply/{product_name}/{sub_id}/{chat_id}: Handles chat requests and invokes the agent.
  • /api/v1/chat/extract-lead/{product_name}/{sub_id}/{chat_id}: Extracts lead information from chat history.
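
A small helper for building these paths on the client side might look like the sketch below. The helper name and the sample identifiers are hypothetical, and the HTTP method and request body are not specified in this document, so only the path is constructed.

```python
from urllib.parse import quote

API_PREFIX = "/api/v1/chat"
ACTIONS = ("get-reply", "extract-lead")


def chat_endpoint(action: str, product_name: str, sub_id: str, chat_id: str) -> str:
    """Build the path for one of the two chat endpoints listed above."""
    if action not in ACTIONS:
        raise ValueError(f"Unsupported action: {action}")
    # Percent-encode each path parameter so unusual IDs cannot break the route.
    parts = [quote(str(p), safe="") for p in (product_name, sub_id, chat_id)]
    return f"{API_PREFIX}/{action}/" + "/".join(parts)


# Sample identifiers below are made up for illustration.
print(chat_endpoint("get-reply", "chatlead", "42", "abc123"))
print(chat_endpoint("extract-lead", "chatlead", "42", "abc123"))
```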

Deployment

  • Docker-based deployment pipeline (dev-deploy.yml) builds, saves, and transfers images to the remote server.
  • Uses Docker Compose for multi-service orchestration.
  • SSH tasks automate image transfer and context setup.

Troubleshooting

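  • Ensure the deprecated functions and function_call parameters are replaced with tools and tool_choice in API requests.
  • If tool calls are not executed, check that the system prompt tells the agent when to use each tool and that the tools are actually bound to the model.
  • Address memory warnings by specifying output_key in ConversationBufferMemory (for example, output_key="output" when the agent also returns intermediate steps).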
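
The first item can be illustrated with a small payload converter. This is a sketch that assumes raw OpenAI-style request dictionaries; the function name migrate_function_params is hypothetical and not part of the service.

```python
import copy
from typing import Any, Dict


def migrate_function_params(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the payload with functions/function_call rewritten
    to the tools/tool_choice form."""
    new_payload = copy.deepcopy(payload)
    functions = new_payload.pop("functions", None)
    function_call = new_payload.pop("function_call", None)

    if functions is not None:
        # Each legacy function definition is wrapped in a typed tool entry.
        new_payload["tools"] = [
            {"type": "function", "function": fn} for fn in functions
        ]

    if function_call is not None:
        if isinstance(function_call, str):
            # "auto" and "none" carry over unchanged.
            new_payload["tool_choice"] = function_call
        else:
            # Forcing a specific function uses a nested object in the new form.
            new_payload["tool_choice"] = {
                "type": "function",
                "function": {"name": function_call["name"]},
            }
    return new_payload
```

The input dictionary is left unmodified, so the converter can be applied to logged requests when comparing old and new payloads.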

Use Cases

  • Migrate from OpenAI to OpenRouter to gain vendor flexibility, lower costs, model routing, and first-class support for OpenAI-style tool calling while preserving existing agent/tool workflows.

  • Primary benefits

    • Model routing: automatic selection of the best model per session (performance/cost tradeoff) without hard-coding model IDs.
    • Tool/function parity: OpenRouter supports OpenAI-style tool calls (tools + tool_choice) so existing LangChain agents can keep using tools.
    • Vendor resilience: avoid single-vendor lock-in and enable fallback policies across multiple LLM providers.
    • Cost and latency control: route chats to smaller, cheaper models when appropriate and to higher-capability models when needed.
  • Model generalization strategy

    • Abstract model names in configuration (e.g., logical profiles: "high-accuracy", "low-latency", "balanced") instead of provider-specific model IDs.
    • Maintain a provider map that resolves logical profiles to provider/model tuples at runtime (e.g., "balanced" -> OpenRouter:@preset/model-router; fallback -> OpenAI:gpt-4).
    • Use capability tags (accuracy, latency, context-size, cost) on each model profile to match session requirements programmatically.
    • Implement a routing policy: static mapping + dynamic overrides based on user subscription, conversation intent, or system load.
  • Risks and mitigations

    • Model behavior differences: run parallel A/B tests and keep an OpenAI fallback for failing intents.
    • Tool-call semantics: validate that tool arguments and responses remain stable; add integration tests.
    • Compliance and availability: ensure OpenRouter terms and availability meet requirements; keep multi-provider fallback.
  • When to prefer OpenRouter vs staying on OpenAI

    • Prefer OpenRouter when you need routing flexibility, lower-cost inference, or multi-model workflows.
    • Stay on OpenAI for strictly certified models or when regulatory/contract constraints require it.
  • Acceptance criteria for migration

    • No regression in tool-call accuracy for core flows.
    • Cost per conversation meets target thresholds.
    • Observability in place for routing decisions and fallbacks.
    • Rollout plan with gradual traffic shifting and quick rollback.
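
The provider map and fallback idea from the model generalization strategy can be sketched as follows. Only the "balanced" profile is filled in, mirroring the example given above; the class and function names are assumptions, and a real policy would also weigh capability tags, subscription tier, and system load.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class ModelTarget:
    provider: str
    model: str


# Hypothetical provider map: each logical profile lists targets in priority order.
PROVIDER_MAP: Dict[str, List[ModelTarget]] = {
    "balanced": [
        ModelTarget("OpenRouter", "@preset/model-router"),
        ModelTarget("OpenAI", "gpt-4"),  # fallback
    ],
}


def resolve_profile(
    profile: str,
    unavailable: Optional[Set[str]] = None,
    override: Optional[ModelTarget] = None,
) -> ModelTarget:
    """Resolve a logical profile to a provider/model pair at runtime."""
    if override is not None:
        # Dynamic override, e.g. chosen from subscription tier or intent.
        return override
    if profile not in PROVIDER_MAP:
        raise KeyError(f"Unknown model profile: {profile}")
    down = unavailable or set()
    for target in PROVIDER_MAP[profile]:
        if target.provider not in down:
            return target
    raise LookupError(f"No available provider for profile: {profile}")


print(resolve_profile("balanced"))
print(resolve_profile("balanced", unavailable={"OpenRouter"}))
```

Logging the resolved target for each session would cover the observability criterion for routing decisions and fallbacks.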

References


Change Log

  • 2025-08-31: Initial OpenRouter integration documentation.