Istio: 503 NC cluster_not_found when accessing a service through the gateway

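Background: I was using the default httpbin gateway sample to test some services, but requests through the gateway did not behave as I expected, so I took a quick look at why.
The YAML below is the official sample with small changes: I adjusted the scope of the VirtualService and added a host.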

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "httpbin-arm.com"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "httpbin-arm.com"
  - "mesh"
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        host: httpbin.book-system
        port:
          number: 8000

After applying it, a request through the gateway returned no output:

curl -v -H "Host: httpbin-arm.com" http://192.168.2.240:32445

The gateway logs show the error:

[2022-05-06T07:07:07.890Z] "GET / HTTP/1.1" 503 NC cluster_not_found - "-" 0 0 0 - "192.168.2.240" "curl/7.66.0" "b05d1bcb-f67c-9f17-aa91-587fd8f50e0c" "httpbin-arm.com" "-" - - 10.250.0.193:8080 192.168.2.240:6117 - -
[2022-05-06T07:09:13.744Z] "GET / HTTP/1.1" 503 NC cluster_not_found - "-" 0 0 0 - "192.168.2.240" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15" "2945db3e-928e-96a1-a4cf-2f871410a6db" "httpbin-arm.com:32445" "-" - - 10.250.0.193:8080 192.168.2.240:9720 - -
[2022-05-06T07:09:13.856Z] "GET /favicon.ico HTTP/1.1" 503 NC cluster_not_found - "-" 0 0 0 - "192.168.2.240" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15" "c932aa92-84bb-9de0-ba96-46b974d8edea" "httpbin-arm.com:32445" "-" - - 10.250.0.193:8080 192.168.2.240:48728 - -
[2022-05-06T07:09:21.786Z] "GET / HTTP/1.1" 503 NC cluster_not_found - "-" 0 0 0 - "192.168.2.240" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15" "8582ad7b-6cdc-9f23-9254-ee7a11b46d5e" "httpbin-arm.com:32445" "-" - - 10.250.0.193:8080 192.168.2.240:59931 - -
[2022-05-06T07:09:29.161Z] "GET / HTTP/1.1" 503 NC cluster_not_found - "-" 0 0 0 - "192.168.2.240" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15" "29ce549d-38c1-9aa7-9864-946a2c9b354e" "httpbin-arm.com:32445" "-" - - 10.250.0.193:8080 192.168.2.240:59931 - -
[2022-05-06T07:09:56.839Z] "GET / HTTP/1.1" 503 NC cluster_not_found - "-" 0 0 0 - "192.168.2.240" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15" "29b737ab-0ac1-96fd-aff5-ef3283e8ff68" "httpbin-arm.com:32445" "-" - - 10.250.0.193:8080 192.168.2.240:59931 - -

With an error message in hand, the rest is straightforward: find out why it occurs.

1: Check the gateway's route configuration

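The output below shows that the httpbin-arm.com host is routed to the cluster outbound|8000||httpbin.book-system. That cluster name is derived from the destination host used in the VirtualService; no DestinationRule was used to define the cluster.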

./istioctl pc route istio-ingressgateway-667f585f87-75dxv.istio-system -oyaml
- name: http.80
  validateClusters: false
  virtualHosts:
  - domains:
    - httpbin-arm.com
    - httpbin-arm.com:*
    includeRequestAttemptCount: true
    name: httpbin-arm.com:80
    routes:
    - decorator:
        operation: httpbin.book-system:8000/*
      match:
        prefix: /
      metadata:
        filterMetadata:
          istio:
            config: /apis/networking.istio.io/v1alpha3/namespaces/book-system/virtual-service/httpbin
      route:
        cluster: outbound|8000||httpbin.book-system
        maxGrpcTimeout: 0s
        retryPolicy:
          hostSelectionRetryMaxAttempts: "5"
          numRetries: 2
          retriableStatusCodes:
          - 503
          retryHostPredicate:
          - name: envoy.retry_host_predicates.previous_hosts
          retryOn: connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes
        timeout: 0s

2: Check the gateway's cluster configuration

Below is part of the gateway's cluster configuration.
The output shows that there is only one httpbin cluster, and it is named outbound|8000||httpbin.book-system.svc.cluster.local.

./istioctl pc cluster istio-ingressgateway-667f585f87-75dxv.istio-system -oyaml
  .....
  filters:
  - name: istio.metadata_exchange
    typedConfig:
      '@type': type.googleapis.com/udpa.type.v1.TypedStruct
      typeUrl: type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange
      value:
        protocol: istio-peer-exchange
  metadata:
    filterMetadata:
      istio:
        default_original_port: 8000
        services:
        - host: httpbin.book-system.svc.cluster.local
          name: httpbin
          namespace: book-system
  name: outbound|8000||httpbin.book-system.svc.cluster.local
  transportSocketMatches:
  - match:
      tlsMode: istio
    name: tlsMode-istio
  .... 

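Conclusion: the destination host in the VirtualService is httpbin.book-system, so the gateway looks for a cluster named outbound|8000||httpbin.book-system. The cluster that actually exists for httpbin is named with the FQDN (name.namespace.svc.cluster.local). Istio only expands a destination host to the FQDN when it is a short name without a dot; httpbin.book-system contains a dot, so it is used verbatim. No matching cluster is found, and the request fails with 503 NC cluster_not_found.
There are two ways to fix it: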

  • Update the destination host in the VirtualService to the FQDN.
  • Define the httpbin cluster with a DestinationRule, then reference its subset in the VirtualService.

Either approach resolves the problem.
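
As a minimal sketch of the first fix, the VirtualService from the beginning of this post is repeated below with only destination.host changed to the FQDN. The name httpbin.book-system.svc.cluster.local is taken from the cluster output in step 2; the cluster.local suffix assumes the default cluster domain, so adjust it if your cluster uses a different one.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "httpbin-arm.com"
  - "mesh"
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        # FQDN instead of httpbin.book-system, so the route points at the
        # existing cluster outbound|8000||httpbin.book-system.svc.cluster.local
        host: httpbin.book-system.svc.cluster.local
        port:
          number: 8000
```

After re-applying, the route shown by istioctl pc route should reference outbound|8000||httpbin.book-system.svc.cluster.local, which matches the cluster listed in step 2.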
