
SSL routines:OPENSSL_internal:HTTP_REQUEST:TLS_error_end after some hours from deployment time #22123


Open
Roxyrob opened this issue Feb 7, 2025 · 0 comments


Roxyrob commented Feb 7, 2025

Overview of the Issue

To test the Consul service mesh in "Transparent Proxy" mode, we deployed "static-server" (hashicorp/http-echo) and "static-client" (curlimages/curl) images into the mesh on an AWS EKS cluster (sidecar proxy injection via connectInject / Consul Dataplane), following the sample code in the Consul documentation.

NOTE: we use AWS EC2 Spot instances in our sandbox/testing environments.

Calls from static-client to static-server (via curl) initially work fine through the proxy using these FQDNs:

  • static-server.service.consul (Consul DNS resolves to a static-server Pod IP)
  • static-server.connect.consul (Consul DNS resolves to a static-server Pod IP)
  • static-server.virtual.consul (Consul DNS resolves to 240.x.x.x, a private virtual IP used by the sidecar proxy)
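Note that the .virtual.consul name is the only one of the three that does not resolve to a real Pod IP. A small illustrative Python sketch (an assumption on our part: it hard-codes 240.0.0.0/4, the block Consul uses by default for virtual tagged addresses) to tell the two address families apart:

```python
# Illustrative sketch: classify a resolved Consul address as a virtual
# tagged address (240.0.0.0/4, Consul's default virtual-IP block) or a
# real Pod IP. The range is configurable, so treat this as a heuristic.
import ipaddress

VIRTUAL_RANGE = ipaddress.ip_network("240.0.0.0/4")  # assumed default

def is_virtual(addr: str) -> bool:
    """True if `addr` looks like a Consul virtual tagged address."""
    return ipaddress.ip_address(addr) in VIRTUAL_RANGE
```

For example, is_virtual("240.0.0.1") is true while a Pod IP such as 10.36.10.68 is not, which matches the split in resolution behavior described above.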

After some hours we can no longer get a correct response from static-server using these FQDNs:

  • (KO) static-server.service.consul
  • (KO) static-server.connect.consul

But no issue for the FQDN:

  • (OK) static-server.virtual.consul

Restarting the static-server deployment resolves the issue.


Reproduction Steps

Steps to reproduce this issue:

  1. Deploy consul on AWS EKS cluster/datacenter with ArgoCD
  2. Deploy static-server and static-client in separate dedicated namespaces, enabling connect injection (transparent proxy is enabled by default in the chart)
  3. Set up WAN federation via mesh gateways with the other Consul clusters/datacenters
  4. Run commands like kubectl -n static-client exec deploy/static-client -c static-client -- curl http://static-server.service.consul
  5. Get correct responses from static-server (for all three FQDNs)
  6. View static-client logs for the calls
...
2025-02-03T09:12:05.830Z+00:00 [debug] envoy.rbac(22) checking connection: requestedServerName: , sourceIP: 10.x.y.z:52864, directRemoteIP: 10.x.y.z:52864,remoteIP: 10.x.y.z:52864, localAddress: 10.x.y.z:20000, ssl: uriSanPeerCertificate: spiffe://aa574018-aaa6-0a95-28a6-956aa6e501cd.consul/ns/default/dc/dc1/svc/static-client, dnsSanPeerCertificate: , subjectPeerCertificate: , dynamicMetadata:
2025-02-03T09:12:05.831Z+00:00 [debug] envoy.rbac(22) enforced allowed, matched policy consul-intentions-layer4
2025-02-03T09:12:05.838Z+00:00 [debug] envoy.rbac(22) checking connection: requestedServerName: , sourceIP: 10.x.y.z:52864, directRemoteIP: 10.x.y.z:52864,remoteIP: 10.x.y.z:52864, localAddress: 10.x.y.z:20000, ssl: uriSanPeerCertificate: spiffe://aa574018-aaa6-0a95-28a6-956aa6e501cd.consul/ns/default/dc/dc1/svc/static-client, dnsSanPeerCertificate: , subjectPeerCertificate: , dynamicMetadata:
...
  7. View static-server logs for the calls
...
2025-02-03T10:27:57.667Z+00:00 [debug] envoy.filter(23) [Tags: "ConnectionId":"1787"] new tcp proxy session
2025-02-03T10:27:57.667Z+00:00 [debug] envoy.filter(23) [Tags: "ConnectionId":"1787"] Creating connection to cluster passthrough~static-server.default.dc1.internal.aa574018-aaa6-0a95-28a6-956aa6e501cd.consul
...
  8. Wait for some hours
  9. Run again commands like kubectl -n static-client exec deploy/static-client -c static-client -- curl http://static-server.service.consul
  10. Get correct responses from static-server (for the virtual.consul FQDN only)
  11. Get the error below (for the service.consul and connect.consul FQDNs)
curl: (52) Empty reply from server
command terminated with exit code 52
  12. View static-server logs for the calls returning the error
...
2025-02-03T10:23:07.521Z+00:00 [debug] envoy.connection(23) [Tags: "ConnectionId":"6617"] remote address:10.a.b.c:37846,TLS_error:|268435612:SSL routines:OPENSSL_internal:HTTP_REQUEST:TLS_error_end
...
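The failure pattern in the steps above (exit code 52 on two of the three names, success on the third) can be watched for over time with a small probe. A hypothetical Python sketch: the kubectl/curl invocation mirrors the command from step 4, and the `run` parameter is injectable purely so the classification logic can be exercised without a live cluster:

```python
# Illustrative probe: curl each FQDN from inside the static-client pod
# and report OK/KO, mirroring the repro commands above.
import subprocess

FQDNS = [
    "static-server.service.consul",
    "static-server.connect.consul",
    "static-server.virtual.consul",
]

def curl_in_pod(fqdn: str) -> int:
    """Run curl inside the static-client pod; returns curl's exit code
    (0 on success, 52 for 'Empty reply from server')."""
    return subprocess.run(
        ["kubectl", "-n", "static-client", "exec", "deploy/static-client",
         "-c", "static-client", "--", "curl", f"http://{fqdn}"],
        capture_output=True,
    ).returncode

def probe(run=curl_in_pod) -> dict:
    """Map each FQDN to 'OK' or 'KO (exit N)'. `run` is injectable for
    testing; by default it execs curl via kubectl as in step 4."""
    results = {}
    for fqdn in FQDNS:
        code = run(fqdn)
        results[fqdn] = "OK" if code == 0 else f"KO (exit {code})"
    return results
```

Running this on a schedule would pinpoint how many hours after deployment the service.consul and connect.consul paths start returning exit 52 while virtual.consul keeps working.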

Consul info for both Client and Server

Client info
agent:
        check_monitors = 0
        check_ttls = 0
        checks = 0
        services = 0
build:
        prerelease =
        revision = 920cc7c6
        version = 1.20.1
        version_metadata =
consul:
        acl = enabled
        bootstrap = false
        known_datacenters = 5
        leader = true
        leader_addr = 10.36.10.68:8300
        server = true
raft:
        applied_index = 1146343
        commit_index = 1146343
        fsm_pending = 0
        last_contact = 0
        last_log_index = 1146343
        last_log_term = 64
        last_snapshot_index = 1130696
        last_snapshot_term = 64
        latest_configuration = [{Suffrage:Voter ID:d44171e8-e0e2-6abb-95c3-01f2fc99a918 Address:10.36.30.45:8300} {Suffrage:Voter ID:0f0e40cd-33ab-ea1e-f3f8-6b5f50f1ddfe Address:10.36.10.68:8300} {Suffrage:Voter ID:d7d3f7d8-a3b5-12ac-8f09-ba8413757bcb Address:10.36.42.88:8300}]
        latest_configuration_index = 0
        num_peers = 2
        protocol_version = 3
        protocol_version_max = 3
        protocol_version_min = 0
        snapshot_version_max = 1
        snapshot_version_min = 0
        state = Leader
        term = 64
runtime:
        arch = amd64
        cpu_count = 2
        goroutines = 479
        max_procs = 2
        os = linux
        version = go1.22.7
serf_lan:
        coordinate_resets = 0
        encrypted = true
        event_queue = 0
        event_time = 23
        failed = 0
        health_score = 0
        intent_queue = 0
        left = 0
        member_time = 1038
        members = 3
        query_queue = 0
        query_time = 1
serf_wan:
        coordinate_resets = 0
        encrypted = true
        event_queue = 0
        event_time = 1
        failed = 0
        health_score = 0
        intent_queue = 0
        left = 0
        member_time = 25440
        members = 15
        query_queue = 0
        query_time = 1
Using Kubernetes Consul Dataplane with chart config (see the Helm chart values under Server info below)
Server info
Using Kubernetes Consul Dataplane with the following Helm chart values:
global:
  enabled: true
  enablePodSecurityPolicies: false
  datacenter: dc1
  tls:
    enabled: true
    verify: true
    httpsOnly: true
  federation:
    enabled: true
    createFederationSecret: true
  gossipEncryption:
    autoGenerate: true
  acls:
    manageSystemACLs: true
    createReplicationToken: true
  argocd:
    enabled: true
server:
  enabled: true
  replicas: 3
  storageClass: ebs-csi-gp3-encrypt-retain
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain
    whenScaled: Delete
  resources: |
    requests:
      cpu: "100m"
    limits:
      memory: "500Mi"
      cpu: "500m"
  storage: 10Gi
  disruptionBudget:
    enabled: false
dns:
  enabled: true
  enableRedirection: false
ui:
  enabled: true
  service:
    type: ClusterIP
connectInject:
  enabled: true
  default: false
  logLevel: "debug"
  transparentProxy:
    defaultEnabled: true
    defaultOverwriteProbes: true
  disruptionBudget:
    enabled: false
  cni:
    enabled: true
    logLevel: info
    cniBinDir: "/opt/cni/bin"
    cniNetDir: "/etc/cni/net.d"
meshGateway:
  enabled: true
  replicas: 2
  service:
    type: LoadBalancer
    annotations:
      'service.beta.kubernetes.io/aws-load-balancer-name': "consul-mgw-dc1-pri"
      'service.beta.kubernetes.io/aws-load-balancer-type': "external"
      'service.beta.kubernetes.io/aws-load-balancer-scheme': "internal"
      'service.beta.kubernetes.io/aws-load-balancer-nlb-target-type': "ip"
      'service.beta.kubernetes.io/aws-load-balancer-backend-protocol': "tcp"
      'service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled': "true"

Operating system and Environment details

AWS EKS Cluster
Client Version: v1.29.0-eks-5e0fdde
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.12-eks-2d5f260

Log Fragments
