Research

Ongoing frontier research for enterprise AI.

We treat routing, reasoning, caching, protocols, and multi-agent runtimes as foundational systems work for enterprise AI.


Routing, Reasoning, Runtime, Protocols

Frontier technical research turned into deployable infrastructure for enterprise AI.

Research Focus

We use frontier machine learning and systems research to make AI deployable in the real world.

We care not just about stronger models, but about learned decision mechanisms and runtime foundations that let AI operate safely, reliably, and sustainably in production.

Learning-based control

Routing systems

We study signal learning, model selection, and inference policy so routing becomes a learnable, optimizable, and auditable ML problem rather than a pile of hand-written rules.


Capability, cost, and boundaries

Reasoning and efficiency

We study adaptive reasoning, semantic caching, and efficiency frontiers to learn better trade-offs among capability, latency, and cost, making stronger models truly deployable.

Deployment, protocols, and runtime

Runtime and protocols

We study runtime isolation, multi-agent execution, and open protocol layers that give models, tools, and enterprise systems a stable way to work together in production.

Publications

Papers, protocols, and open work.

Across routing, reasoning, caching, runtime, and protocol design, we publish papers, drafts, and open implementations.

arXiv 2026 (featured paper)

vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models

A systems-level treatment of signal-driven routing for mixture-of-modality systems, covering semantic policy, model selection, and controllable inference paths.


NeurIPS MLForSys 2025

When to Reason: Semantic Router for vLLM

Routes harder questions to stronger reasoning models instead of paying the same reasoning cost on every request.
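To make the idea concrete, here is a minimal sketch of difficulty-based routing. It is an illustration only, not the method from the paper: the cue-word list, the scoring weights, the threshold, and the model names `small-fast-model` and `large-reasoning-model` are all hypothetical, and a real router would likely replace the heuristic with a learned classifier.

```python
from dataclasses import dataclass

# Hypothetical cue phrases (assumption); matched as plain substrings.
REASONING_CUES = ("prove", "derive", "step by step", "why", "compare", "optimize")


@dataclass(frozen=True)
class Route:
    model: str
    use_reasoning: bool


def difficulty_score(query: str) -> float:
    """Toy difficulty estimate in [0, 1] built from cue phrases and query length."""
    text = query.lower()
    cue_hits = sum(cue in text for cue in REASONING_CUES)
    length_term = min(len(text.split()) / 50.0, 1.0)
    return min(0.3 * cue_hits + 0.4 * length_term, 1.0)


def route(query: str, threshold: float = 0.5) -> Route:
    """Send a query to the reasoning model only when it looks hard enough."""
    if difficulty_score(query) >= threshold:
        return Route(model="large-reasoning-model", use_reasoning=True)
    return Route(model="small-fast-model", use_reasoning=False)


easy = route("What is the capital of France?")
hard = route("Prove step by step why the algorithm is optimal and compare it to greedy")
```

In this sketch the short factual question stays on the cheaper model, while the multi-cue question crosses the threshold and is routed to the reasoning model.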


arXiv 2025

Category-Aware Semantic Caching

Uses dynamic cache thresholds and TTL policy to improve reuse while staying aligned with query type.
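As a rough illustration of how per-category thresholds and TTLs might interact, the sketch below keeps a small in-memory cache. Everything here is an assumption made for the example: the category names, the policy values, the `CategoryAwareCache` class, and the bag-of-words `embed` function, which stands in for a real embedding model. The policy table is also static, whereas the paper describes dynamic thresholds.

```python
import math
import time
from collections import Counter
from dataclasses import dataclass

# Hypothetical policy (assumption): category -> (similarity threshold, TTL seconds).
POLICY = {
    "code": (0.95, 3600.0),  # strict match, long-lived answers
    "news": (0.80, 60.0),    # looser match, answers go stale quickly
}
DEFAULT_POLICY = (0.90, 600.0)


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Entry:
    category: str
    vector: Counter
    response: str
    stored_at: float


class CategoryAwareCache:
    def __init__(self, clock=time.monotonic):
        self._entries = []
        self._clock = clock  # injectable so expiry can be tested

    def put(self, category: str, query: str, response: str) -> None:
        self._entries.append(Entry(category, embed(query), response, self._clock()))

    def get(self, category: str, query: str):
        """Return the most similar unexpired response in this category, or None."""
        threshold, ttl = POLICY.get(category, DEFAULT_POLICY)
        now, vec = self._clock(), embed(query)
        best, best_sim = None, threshold
        for entry in self._entries:
            if entry.category != category or now - entry.stored_at > ttl:
                continue
            sim = cosine(vec, entry.vector)
            if sim >= best_sim:
                best, best_sim = entry, sim
        return best.response if best else None
```

With this layout, the same near-duplicate query can hit in a loosely matched category such as `news` and miss in a strictly matched one such as `code`, and the `news` entry expires far sooner.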


IETF Draft 2025

Semantic Inference Routing Protocol (SIRP)

A protocol proposal for classification-aware routing across AI infrastructure layers.


Research Method

Research, code, and systems practice move together.

We keep papers, open implementation, and protocol design in one loop rather than advancing them in isolation.

Frontier problems

We focus on routing, reasoning, runtime, and control as technical problems with real deployment consequences.

Open implementation

Research is grounded in working systems, from signal extraction and decision logic to DSL compilation and provider-neutral runtime behavior.

Protocol lens

We turn system insights into reusable interfaces and protocol proposals that can push broader infrastructure forward.

Continue

See products and architecture.

See how the research becomes a deployable system.
