AIGatewayRoute with model-based matching fails to route /v1/chat/completions requests despite /v1/models listing the model #1485

@googs1025

Description:

After deploying the InferencePool example by following its README, the /v1/models endpoint correctly returns the list of available models (e.g., meta-llama/Llama-3.1-8B-Instruct, mistral:latest). However, when a standard OpenAI-compatible request is sent to /v1/chat/completions with a model field in the JSON body, Envoy AI Gateway returns:

No matching route found. It is likely because the model specified in your request is not configured in the Gateway.

This suggests that the gateway fails to match the model field from the request body to the routes defined in the AIGatewayRoute, even though the model is listed and the route appears correctly configured.

Expected behavior:

The request should be routed to the appropriate backend (e.g., vllm-llama3-8b-instruct InferencePool) based on the model value in the JSON payload, without requiring custom headers like x-ai-eg-model.
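
One way to narrow this down is to send the same request with and without the x-ai-eg-model header set by hand. The helper below is only a sketch: chat_completion is a hypothetical function name, and whether the gateway honors a client-supplied x-ai-eg-model value is an assumption that this report does not confirm.

```shell
#!/bin/sh
# Hypothetical helper (not part of the example manifests): send the chat
# completion request from this report, optionally adding the x-ai-eg-model
# header explicitly.
# Usage: chat_completion BASE_URL MODEL [with-header]
chat_completion() {
  base_url=$1
  model=$2
  # Build the JSON body; assumes MODEL contains no double quotes or backslashes.
  body=$(printf '{"model":"%s","messages":[{"role":"user","content":"Say this is a test"}]}' "$model")
  if [ "${3:-}" = "with-header" ]; then
    # Assumption: the route matches on this header; set it by hand.
    curl -sS -X POST "$base_url/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -H "x-ai-eg-model: $model" \
      -d "$body"
  else
    # Same request as in the repro steps: model only in the JSON body.
    curl -sS -X POST "$base_url/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -d "$body"
  fi
}
```

For example, with the port-forward from the repro steps running, `chat_completion http://localhost:8081 "meta-llama/Llama-3.1-8B-Instruct" with-header` sends the header variant. If only that variant is routed, the problem is more likely in extracting the model from the body than in the route rules themselves.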

Repro Steps
Deploy the example manifests:

kubectl apply -f base.yaml
kubectl apply -f aigwroute.yaml

Forward the gateway service port:

kubectl port-forward svc/envoy-default-inference-pool-with-aigwroute-d416582c 8081:80 -n envoy-gateway-system

Verify models are listed:

curl http://localhost:8081/v1/models

Send a chat completion request:

curl -X POST http://localhost:8081/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": "Say this is a test"}]
      }'

Observe error:

No matching route found. It is likely because the model specified in your request is not configured in the Gateway.
root@iZ6wed7dv05mxum2042djiZ:~/ai-gateway/examples/inference-pool# kubectl apply -f base.yaml
service/mistral-upstream created
deployment.apps/mistral-upstream created
inferencepool.inference.networking.k8s.io/mistral created
inferenceobjective.inference.networking.x-k8s.io/mistral created
serviceaccount/mistral-epp created
service/mistral-epp created
deployment.apps/mistral-epp created
configmap/plugins-config created
role.rbac.authorization.k8s.io/pod-read created
rolebinding.rbac.authorization.k8s.io/pod-read-binding created
clusterrole.rbac.authorization.k8s.io/auth-reviewer created
clusterrolebinding.rbac.authorization.k8s.io/auth-reviewer-binding created
aiservicebackend.aigateway.envoyproxy.io/envoy-ai-gateway-basic-testupstream created
backend.gateway.envoyproxy.io/envoy-ai-gateway-basic-testupstream created
deployment.apps/envoy-ai-gateway-basic-testupstream created
service/envoy-ai-gateway-basic-testupstream created
root@iZ6wed7dv05mxum2042djiZ:~/ai-gateway/examples/inference-pool# kubectl apply -f aigwroute.yaml
gatewayclass.gateway.networking.k8s.io/inference-pool-with-aigwroute created
gateway.gateway.networking.k8s.io/inference-pool-with-aigwroute created
aigatewayroute.aigateway.envoyproxy.io/inference-pool-with-aigwroute created
root@iZ6wed7dv05mxum2042djiZ:~# kubectl get pods -A
NAMESPACE                 NAME                                                              READY   STATUS    RESTARTS   AGE
default                   envoy-ai-gateway-basic-testupstream-6f75dd4cf6-9cv86              1/1     Running   0          2m47s
default                   mistral-epp-f95446897-xvq4f                                       1/1     Running   0          2m48s
default                   mistral-upstream-9c959d4d4-b9d97                                  1/1     Running   0          2m48s
default                   mistral-upstream-9c959d4d4-dgkkp                                  1/1     Running   0          2m48s
default                   mistral-upstream-9c959d4d4-kq8jq                                  1/1     Running   0          2m48s
envoy-ai-gateway-system   ai-gateway-controller-5558c7cf7c-r8s9t                            1/1     Running   0          35m
envoy-gateway-system      envoy-default-inference-pool-with-aigwroute-d416582c-58ffc9jxwm   3/3     Running   0          2m42s
envoy-gateway-system      envoy-gateway-6dd8f9b8f-dxwgs                                     1/1     Running   0          34m
kube-system               coredns-668d6bf9bc-9n5cn                                          1/1     Running   0          36m
kube-system               coredns-668d6bf9bc-wld9q                                          1/1     Running   0          36m
kube-system               etcd-cluster1-control-plane                                       1/1     Running   0          36m
kube-system               kindnet-6dr8j                                                     1/1     Running   0          35m
kube-system               kindnet-cl8tz                                                     1/1     Running   0          36m
kube-system               kindnet-phj8s                                                     1/1     Running   0          35m
kube-system               kube-apiserver-cluster1-control-plane                             1/1     Running   0          36m
kube-system               kube-controller-manager-cluster1-control-plane                    1/1     Running   0          36m
kube-system               kube-proxy-b959k                                                  1/1     Running   0          35m
kube-system               kube-proxy-nctnq                                                  1/1     Running   0          36m
kube-system               kube-proxy-tq59g                                                  1/1     Running   0          35m
kube-system               kube-scheduler-cluster1-control-plane                             1/1     Running   0          36m
local-path-storage        local-path-provisioner-7dc846544d-4vn9n                           1/1     Running   0          36m
root@iZ6wed7dv05mxum2042djiZ:~# kubectl get svc -A
NAMESPACE                 NAME                                                   TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                                            AGE
default                   envoy-ai-gateway-basic-testupstream                    ClusterIP      10.96.213.150   <none>        80/TCP                                             3m21s
default                   kubernetes                                             ClusterIP      10.96.0.1       <none>        443/TCP                                            36m
default                   mistral-epp                                            ClusterIP      10.96.104.62    <none>        9002/TCP                                           3m22s
default                   mistral-upstream                                       ClusterIP      None            <none>        8080/TCP                                           3m22s
envoy-ai-gateway-system   ai-gateway-controller                                  ClusterIP      10.96.30.73     <none>        9443/TCP,1063/TCP,9090/TCP                         36m
envoy-gateway-system      envoy-default-inference-pool-with-aigwroute-d416582c   LoadBalancer   10.96.136.253   <pending>     80:31523/TCP                                       3m16s
envoy-gateway-system      envoy-gateway                                          ClusterIP      10.96.123.136   <none>        18000/TCP,18001/TCP,18002/TCP,19001/TCP,9443/TCP   35m
kube-system               kube-dns                                               ClusterIP      10.96.0.10      <none>        53/UDP,53/TCP,9153/TCP                             36m
root@iZ6wed7dv05mxum2042djiZ:~/ai-gateway/examples/inference-pool# kubectl port-forward svc/envoy-default-inference-pool-with-aigwroute-d416582c 8081:80 -n envoy-gateway-system
Forwarding from 127.0.0.1:8081 -> 10080
Forwarding from [::1]:8081 -> 10080
Handling connection for 8081
Handling connection for 8081
root@iZ6wed7dv05mxum2042djiZ:~# curl -X POST "http://localhost:8081/v1/models"
{"data":[{"id":"meta-llama/Llama-3.1-8B-Instruct","created":1762256144,"object":"model","owned_by":"Envoy AI Gateway"},{"id":"meta-llama/Llama-3.1-8B-Instruct","created":1762256144,"object":"model","owned_by":"Envoy AI Gateway"},{"id":"mistral:latest","created":1762256144,"object":"model","owned_by":"Envoy AI Gateway"},{"id":"some-cool-self-hosted-model","created":1762256144,"object":"model","owned_by":"Envoy AI Gateway"}],"object":"list"}
root@iZ6wed7dv05mxum2042djiZ:~# curl -X POST "http://localhost:8081/v1/chat/completions"   -H "Content-Type: application/json"   -d '{"messages":[{"role":"user","content":"Say this is a test"}],"model":"meta-llama/Llama-3.1-8B-Instruct"}'
No matching route found. It is likely because the model specified in your request is not configured in the Gateway.
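
In the /v1/models output above, meta-llama/Llama-3.1-8B-Instruct appears twice. Whether that is related to the routing failure is not established here, but it is easy to check for. The sketch below works on a shortened copy of that response (the created and owned_by fields are dropped); the function names are hypothetical, and it relies on grep -o, which GNU, BSD, and BusyBox grep provide.

```shell
#!/bin/sh
# Shortened copy of the /v1/models response from this report.
response='{"data":[{"id":"meta-llama/Llama-3.1-8B-Instruct","object":"model"},{"id":"meta-llama/Llama-3.1-8B-Instruct","object":"model"},{"id":"mistral:latest","object":"model"},{"id":"some-cool-self-hosted-model","object":"model"}],"object":"list"}'

# Print every model ID in the response, one per line.
list_model_ids() {
  printf '%s\n' "$1" | grep -o '"id":"[^"]*"' | sed 's/^"id":"//; s/"$//'
}

# Print only the IDs that occur more than once.
duplicate_model_ids() {
  list_model_ids "$1" | sort | uniq -d
}

# Succeed if the given model ID is listed exactly as written.
has_model() {
  list_model_ids "$1" | grep -Fxq -- "$2"
}

duplicate_model_ids "$response"
# prints: meta-llama/Llama-3.1-8B-Instruct
```

Against a live gateway, the same functions can be fed the real response, for example `duplicate_model_ids "$(curl -s http://localhost:8081/v1/models)"`.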

Environment:
Envoy Gateway Helm Chart: v0.0.0-latest
Kubernetes: vX.XX (e.g., v1.28 via Kind)
OS: Ubuntu 22.04
Steps followed: InferencePool Example README
