
Nodes in consul-agent have different IPs in nodes and members #22296


Open
shashankgarg22 opened this issue Apr 26, 2025 · 0 comments

@shashankgarg22

Hi team,

I am using Consul by building the code from GitHub into an image and running my consul-agent containers from that image.

We use Consul for service discovery: a MICM pod's address needs to be discoverable by other pods.

When my Kubernetes cluster performs a rollback, the service health check on the Consul agent still returns the old IP. Checking the nodes view shows the old IP, but the members view shows the new IP.

Here is the evidence:

My role pods:

[root@golden-20210518 ~]# oc get pods -o wide | grep -i scscf
ftncpcfx-scscf-58485cfdd9-hrqq6 2/2 Running 0 7h27m 10.128.14.245 appworker-32.fi-912.vlab.nsn-rdnet.net
ftncpcfx-scscfc-548689679f-hc797 1/2 Running 0 24m 10.129.12.221 appworker-14.fi-912.vlab.nsn-rdnet.net

ftncpcfx-scscfc-548689679f-hc797 is the pod whose Consul agent has the issue.

These are the pods whose service is registered and which the role pods are trying to find:

[root@golden-20210518 ~]# oc get pods -o wide | grep -i micm
ftncpcfx-micm-567c9774db-m64g5 2/2 Running 0 5h19m 10.131.24.226 appworker-10.fi-912.vlab.nsn-rdnet.net
ftncpcfx-micmpartner-6bc9f69877-jndrb 2/2 Running 2 (6m52s ago) 7h25m 10.129.12.185 appworker-14.fi-912.vlab.nsn-rdnet.net

Inside consul@ftncpcfx-scscfc-548689679f-hc797 we see the issue.

The service discovery command used to find the service:

[consul@ftncpcfx-scscfc-548689679f-hc797 bin]$ curl http://localhost:8500/v1/health/service/ftncpcfx

[{"Node":{"ID":"4684b697-ddc2-dac5-4653-8606c7043ec1","Node":"ftncpcfx","Address":"10.131.24.226","Datacenter":"ftncpcfx","TaggedAddresses":{"lan":"10.131.24.226","lan_ipv4":"10.131.24.226","wan":"10.131.24.226","wan_ipv4":"10.131.24.226"},"Meta":{"consul-network-segment":"","consul-version":"1.18.1"},"CreateIndex":15825,"ModifyIndex":15827},"Service":{"ID":"ftncpcfx","Service":"ftncpcfx",

First service:
"Tags":["mode:primary"],"Address":"::ffff:10.131.24.226",

"TaggedAddresses":{"lan_ipv4":{"Address":"::ffff:10.131.24.226","Port":2000},"wan_ipv4":{"Address":"::ffff:10.131.24.226","Port":2000}},"Meta":null,"Port":2000,"Weights":{"Passing":1,"Warning":1},"EnableTagOverride":false,"Proxy":{"Mode":"","MeshGateway":{},"Expose":{}},"Connect":{},"PeerName":"","CreateIndex":15832,"ModifyIndex":15832},"Checks":[{"Node":"ftncpcfx","CheckID":"ftncpcfx-SESSION-HC","Name":"Controller Session Healthcheck","Status":"passing","Notes":"This session is supposed to do a curl internally every 4 seconds","Output":"","ServiceID":"","ServiceName":"","ServiceTags":[],"Type":"ttl","Interval":"0s","Timeout":"0s","ExposedPort":0,"Definition":{},"CreateIndex":15830,"ModifyIndex":15830},{"Node":"ftncpcfx","CheckID":"serfHealth","Name":"Serf Health Status","Status":"passing","Notes":"","Output":"Agent alive and reachable","ServiceID":"","ServiceName":"","ServiceTags":[],"Type":"","Interval":"","Timeout":"","ExposedPort":0,"Definition":{},"CreateIndex":15825,"ModifyIndex":15825}]},{"Node":{"ID":"241ee620-c14c-2325-82b2-d1e665e50871"

The second service should have the IP 10.129.12.185, but it has 10.129.9.113:
,"Node":"ftncpcfx-partner","Address":"10.129.9.113",

"Datacenter":"ftncpcfx","TaggedAddresses":{"lan":"10.129.9.113","lan_ipv4":"10.129.9.113","wan":"10.129.9.113","wan_ipv4":"10.129.9.113"},"Meta":{"consul-network-segment":"","consul-version":"1.18.1"},"CreateIndex":3298,"ModifyIndex":3299},"Service":{"ID":"ftncpcfx-partner","Service":"ftncpcfx","Tags":["mode:backup"],"Address":"::ffff:10.129.9.113","TaggedAddresses":{"lan_ipv4":{"Address":"::ffff:10.129.9.113","Port":2000},"wan_ipv4":{"Address":"::ffff:10.129.9.113","Port":2000}},"Meta":null,"Port":2000,"Weights":{"Passing":1,"Warning":1},"EnableTagOverride":false,"Proxy":{"Mode":"","MeshGateway":{},"Expose":{}},"Connect":{},"PeerName":"","CreateIndex":3302,"ModifyIndex":3302},"Checks":[{"Node":"ftncpcfx-partner","CheckID":"ftncpcfx-SESSION-HC","Name":"Controller Session Healthcheck","Status":"passing","Notes":"This session is supposed to do a curl internally every 4 seconds","Output":"","ServiceID":"","ServiceName":"","ServiceTags":[],"Type":"ttl","Interval":"0s","Timeout":"0s","ExposedPort":0,"Definition":{},"CreateIndex":3301,"ModifyIndex":3301},{"Node":"ftncpcfx-partner","CheckID":"serfHealth","Name":"Serf Health Status","Status":"passing","Notes":"","Output":"Agent alive and reachable","ServiceID":"","ServiceName":"","ServiceTags":[],"Type":"","Interval":"","Timeout":"","ExposedPort":0,"Definition":{},"CreateIndex":3298,"ModifyIndex":3298}]}]

The consul catalog nodes command shows the same old IP:
[consul@ftncpcfx-scscfc-548689679f-hc797 bin]$ ./consul catalog nodes
Node              ID        Address        DC
ftncpcfx          4684b697  10.131.24.226  ftncpcfx
ftncpcfx-partner  241ee620  10.129.9.113   ftncpcfx

But the consul members command gives this:
[consul@ftncpcfx-scscfc-548689679f-hc797 bin]$ ./consul members
Node              Address             Status  Type    Build   Protocol  DC        Partition  Segment
ftncpcfx          10.131.24.226:8301  alive   client  1.18.1  2         ftncpcfx  default
ftncpcfx-partner  10.129.12.185:8301  alive   client  1.18.1  2         ftncpcfx  default

which is the new IP of the micmpartner pod (10.129.12.185)!
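The skew between the two views can also be checked mechanically. A minimal sketch (the helper name is mine; the data literals are copied from the two command outputs above):

```python
# Compare the catalog view (`consul catalog nodes`, /v1/catalog/nodes)
# against the serf/gossip view (`consul members`, /v1/agent/members).

catalog_nodes = {          # node -> Address column of `consul catalog nodes`
    "ftncpcfx": "10.131.24.226",
    "ftncpcfx-partner": "10.129.9.113",
}
serf_members = {           # node -> Address column of `consul members`
    "ftncpcfx": "10.131.24.226:8301",
    "ftncpcfx-partner": "10.129.12.185:8301",
}

def find_skew(catalog, members):
    """Return nodes whose catalog address differs from their serf address."""
    skew = {}
    for node, addr_port in members.items():
        serf_ip = addr_port.rsplit(":", 1)[0]   # strip the serf port
        if catalog.get(node) not in (None, serf_ip):
            skew[node] = (catalog[node], serf_ip)
    return skew

# Only ftncpcfx-partner shows a stale catalog address
print(find_skew(catalog_nodes, serf_members))
```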

[consul@ftncpcfx-scscfc-548689679f-hc797 bin]$ ./consul catalog datacenters
ftncpcfx

These are my consul pods running in a statefulset:
ftncpcfx-sd-0  302d6473  10.131.24.225  ftncpcfx
ftncpcfx-sd-1  312d6473  10.130.13.201  ftncpcfx
ftncpcfx-sd-2  322d6473  10.130.16.23   ftncpcfx

The issue: when ftncpcfx-scscf-58485cfdd9-hrqq6 tries to find the ftncpcfx-partner service, it cannot, because after the rollback the catalog still holds the old IP; the new IP has not been picked up by Consul.
The IP in nodes and the IP in members are different!

Why?

Can you help me with this?

Thanks and regards,
Shashank Garg
