vLLM Semantic Router
System-level intelligent router for mixture-of-models.
It's just a good thinking game 🎲
I study intelligent routing and system intelligence for LLMs.
My current work is about moving routing beyond model selection and toward system behavior: semantic signals, coordinated model paths, and decision loops that optimize quality, cost, and reliability together.
Latest log: The Second Half of LLM Routing
Signal-driven orchestration for model, tool, and policy selection.
Pragmatic infrastructure work across gateways, inference stacks, and control planes.
Writing, evaluation, and community work that reframes routing as a systems problem.
System-level intelligent router for mixture-of-models.
Manages Envoy Proxy as a standalone or Kubernetes-based application gateway.
Manages unified access to generative AI services built on Envoy Gateway.
Cost-efficient and pluggable infrastructure components for GenAI inference.
Connects, secures, controls, and observes services.
Observability console for the Istio service mesh.
Leading the community effort to define standards for AI Gateway in the Kubernetes ecosystem.
Representing and promoting Cloud Native Computing Foundation projects and values globally.
Advocating for open source adoption and best practices across the Asia-Pacific region.
Reviewing and selecting talks for one of the largest cloud-native conferences.