Can't update Cloudflare records after update to 0.15.1 #5035

Open
stefanandres opened this issue Jan 27, 2025 · 19 comments
Labels
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/bug: Categorizes issue or PR as related to a bug.

Comments

@stefanandres

stefanandres commented Jan 27, 2025

What happened:

After updating from 0.15.0 to 0.15.1, updating Cloudflare records fails with "forbidden", even though nothing else changed.
Updating records after downgrading works again.

===== apps/Deployment external-dns/external-dns-cloudflare ======
199c199
<         image: registry.k8s.io/external-dns/external-dns:v0.15.0
---
>         image: registry.k8s.io/external-dns/external-dns:v0.15.1
{"action":"UPDATE","level":"info","msg":"Changing record.","record":"<domain>.org","time":"2025-01-27T10:27:39Z","ttl":1,"type":"CNAME","zone":"<zone>"}
{"action":"UPDATE","level":"error","msg":"failed to update record: forbidden (1002)","record":"<domain>.org","time":"2025-01-27T10:27:39Z","ttl":1,"type":"CNAME","zone":"<zone>"}
{"action":"UPDATE","level":"info","msg":"Changing record.","record":"<domain>.org","time":"2025-01-27T10:27:39Z","ttl":1,"type":"TXT","zone":"<zone>"}
{"action":"UPDATE","level":"error","msg":"failed to update record: forbidden (1002)","record":"<domain>.org","time":"2025-01-27T10:27:40Z","ttl":1,"type":"TXT","zone":"<zone>"}
{"level":"fatal","msg":"Failed to do run once: failed to submit all changes for the following zones: [<zone>]","time":"2025-01-27T10:27:40Z"}

What you expected to happen:

It should update the record.

This might be related to changes in:

But I can't find anything obvious.

@stefanandres stefanandres added the kind/bug Categorizes issue or PR as related to a bug. label Jan 27, 2025
@ivankatliarchuk
Contributor

Are you using API token or CF_API_KEY and CF_API_EMAIL?

It could be quite sensitive, but are there any useful debug logs?

@lcapka

lcapka commented Jan 27, 2025

We are having the same issue. We are using CF_API_TOKEN. In the output there is not much information:

time="2025-01-27T22:17:03Z" level=info msg="Instantiating new Kubernetes client"
time="2025-01-27T22:17:03Z" level=info msg="Using inCluster-config based on serviceaccount-token"
time="2025-01-27T22:17:03Z" level=info msg="Created Kubernetes client https://10.43.0.1:443"
time="2025-01-27T22:17:20Z" level=info msg="Changing record." action=UPDATE record=redacted.com ttl=1 type=CNAME zone=b71c92bd6...
time="2025-01-27T22:17:21Z" level=error msg="failed to update record: forbidden (1002)" action=UPDATE record=redacted.com ttl=1 type=CNAME zone=b71c92bd6...
time="2025-01-27T22:17:21Z" level=info msg="Changing record." action=UPDATE record=redacted.com ttl=1 type=TXT zone=b71c92bd6...
time="2025-01-27T22:17:22Z" level=error msg="failed to update record: forbidden (1002)" action=UPDATE record=redacted.com ttl=1 type=TXT zone=b71c92bd6...
time="2025-01-27T22:17:22Z" level=fatal msg="Failed to do run once: failed to submit all changes for the following zones: [b71c92bd6...]"

@0x77dev

0x77dev commented Jan 27, 2025

+1 here

Also using CF_API_TOKEN, and rolling back to 1.14.5 resolves the issue for me.

@ivankatliarchuk
Contributor

/help

@k8s-ci-robot
Contributor

@ivankatliarchuk:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jan 28, 2025
@ivankatliarchuk
Contributor

Please share the config that is failing. Is it Helm or raw manifests?

For those who know how to run Go, it is worth trying the following:

  1. Clone repo
  2. Provide kube context
  3. Run main.go

export CF_API_TOKEN=xxxxxxxx

Just an example with a fake source:

go run main.go \
    --provider=cloudflare \
    --domain-filter=xxxxxx.org \
    --fqdn-template="xxxxxx.org" \
    --registry=txt \
    --source=fake \
    --log-level=debug

@ivankatliarchuk
Contributor

ivankatliarchuk commented Jan 28, 2025

We also have official staging images available for testing: https://console.cloud.google.com/gcr/images/k8s-staging-external-dns/GLOBAL/external-dns?pli=1&inv=1&invt=AboC3A

Could someone give it a try?

@stefanandres
Author

We also have official staging images available for testing: console.cloud.google.com/gcr/images/k8s-staging-external-dns/GLOBAL/external-dns?pli=1&inv=1&invt=AboC3A

Could someone give it a try?

The image gcr.io/k8s-staging-external-dns/external-dns:v20250128-v0.15.1-142-g9619e6b1 still fails with the same error from above.

It doesn't output anything more related to that change when using log-level=debug.

We are also using the CF_API_TOKEN

env:
- name: CF_API_TOKEN
  valueFrom:
    secretKeyRef:
      key: api-token
      name: cloudflare-api-token

@AndrewCharlesHay
Contributor

@lcapka @0x77dev @stefanandres Can you grant your Cloudflare token Zone Settings: Read access and try again, please?
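
Before changing scopes, it may help to confirm that the token itself is accepted by Cloudflare. The following is a minimal sketch, assuming curl is installed and CF_API_TOKEN is exported in the shell; the function name cf_verify_token is hypothetical. It calls Cloudflare's token verification endpoint, which reports whether the token is valid but does not list the permissions attached to it, so it cannot confirm the Zone Settings: Read scope by itself.

```shell
# Hypothetical helper: ask Cloudflare whether the API token is valid.
# Assumes CF_API_TOKEN is set; aborts with a message if it is not.
cf_verify_token() {
  curl -sS \
    -H "Authorization: Bearer ${CF_API_TOKEN:?CF_API_TOKEN is not set}" \
    "https://api.cloudflare.com/client/v4/user/tokens/verify"
}

# Usage (performs a network request):
#   export CF_API_TOKEN=xxxxxxxx
#   cf_verify_token
```

A valid token that still produces "forbidden (1002)" would point toward a missing permission or a change in the requests external-dns makes, rather than a broken credential.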

@0x77dev

0x77dev commented Jan 29, 2025

@AndrewCharlesHay already tried, did not work for me

@ivankatliarchuk
Contributor

It works for me, but I'm not using a stock helm chart. Could someone share their arguments or a minimal config? I'll try to dig a bit more.

bhuism added a commit to bhuism/home-ops that referenced this issue Feb 1, 2025
@onedr0p
Contributor

onedr0p commented Feb 1, 2025

I think the version upgrade from 0.15.0 to 0.15.1 might be a red herring, instead maybe it's related to the issues described here?
cert-manager/cert-manager#7540

@stefanandres if you downgrade external-dns back to 0.15.0 does the issue remain?

@no1here0003

I think the version upgrade from 0.15.0 to 0.15.1 might be a red herring, instead maybe it's related to the issues described here? cert-manager/cert-manager#7540

@stefanandres if you downgrade external-dns back to 0.15.0 does the issue remain?

Downgrading mine to 1.15.0 fixed it for me.

@onedr0p
Contributor

onedr0p commented Feb 1, 2025

Ignore me then 😆 I haven't had any problems using 0.15.0 or 0.15.1 with an API token with the correct scopes.

@ivankatliarchuk
Contributor

I think the version upgrade from 0.15.0 to 0.15.1 might be a red herring, instead maybe it's related to the issues described here? cert-manager/cert-manager#7540
@stefanandres if you downgrade external-dns back to 0.15.0 does the issue remain?

Downgrading mine to 1.15.0 fixed it for me.

Was the helm chart downgraded or only the image?

@stefanandres
Author

I think the version upgrade from 0.15.0 to 0.15.1 might be a red herring, instead maybe it's related to the issues described here? cert-manager/cert-manager#7540
@stefanandres if you downgrade external-dns back to 0.15.0 does the issue remain?

Downgrading mine to 1.15.0 fixed it for me.

Was the helm chart downgraded or only the image?

I upgraded both, but downgrading only the image fixes the error.
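
As a stopgap, the image alone can be pinned back to v0.15.0. The following is a minimal sketch, assuming the namespace external-dns and the Deployment external-dns-cloudflare shown in the diff in the issue description; the function name pin_external_dns_image is hypothetical, and both names can be overridden through the arguments.

```shell
# Hypothetical helper: roll every container in the Deployment back to v0.15.0.
# $1 = namespace (assumed default: external-dns)
# $2 = deployment name (assumed default: external-dns-cloudflare)
pin_external_dns_image() {
  kubectl -n "${1:-external-dns}" set image \
    "deployment/${2:-external-dns-cloudflare}" \
    "*=registry.k8s.io/external-dns/external-dns:v0.15.0"
}

# Usage (modifies the cluster):
#   pin_external_dns_image
#   pin_external_dns_image my-namespace my-deployment
```

If the Deployment is managed by Helm or a GitOps tool, the next reconcile will likely restore v0.15.1, so the tag would also need to be pinned in that configuration.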

@mikesmitty

It works for me, but I'm not using a stock helm chart. Could someone share their arguments or a minimal config? I'll try to dig a bit more.

Trying with this staging image gives me this output: gcr.io/k8s-staging-external-dns/external-dns:v20250131-v0.15.1-152-g8eb8ea3a

{"level":"info","msg":"config: {APIServerURL: KubeConfig: RequestTimeout:30s DefaultTargets:[my-tunnel-guid-here.cfargotunnel.com] GlooNamespaces:[gloo-system] SkipperRouteGroupVersion:zalando.org/v1 Sources:[service ingress] Namespace: AnnotationFilter: LabelFilter: IngressClassNames:[] FQDNTemplate: CombineFQDNAndAnnotation:false IgnoreHostnameAnnotation:false IgnoreNonHostNetworkPods:false IgnoreIngressTLSSpec:false IgnoreIngressRulesSpec:false GatewayNamespace: GatewayLabelFilter: Compatibility: PodSourceDomain: PublishInternal:false PublishHostIP:false AlwaysPublishNotReadyAddresses:false ConnectorSourceServer:localhost:8080 Provider:cloudflare ProviderCacheTime:0s GoogleProject: GoogleBatchChangeSize:1000 GoogleBatchChangeInterval:1s GoogleZoneVisibility: DomainFilter:[my-domain-here.app] ExcludeDomains:[] RegexDomainFilter: RegexDomainExclusion: ZoneNameFilter:[] ZoneIDFilter:[] TargetNetFilter:[] ExcludeTargetNets:[] AlibabaCloudConfigFile:/etc/kubernetes/alibaba-cloud.json AlibabaCloudZoneType: AWSZoneType: AWSZoneTagFilter:[] AWSAssumeRole: AWSProfiles:[] AWSAssumeRoleExternalID: AWSBatchChangeSize:1000 AWSBatchChangeSizeBytes:32000 AWSBatchChangeSizeValues:1000 AWSBatchChangeInterval:1s AWSEvaluateTargetHealth:true AWSAPIRetries:3 AWSPreferCNAME:false AWSZoneCacheDuration:0s AWSSDServiceCleanup:false AWSSDCreateTag:map[] AWSZoneMatchParent:false AWSDynamoDBRegion: AWSDynamoDBTable:external-dns AzureConfigFile:/etc/kubernetes/azure.json AzureResourceGroup: AzureSubscriptionID: AzureUserAssignedIdentityClientID: AzureActiveDirectoryAuthorityHost: AzureZonesCacheDuration:0s CloudflareProxied:true CloudflareDNSRecordsPerPage:100 CloudflareRegionKey: CoreDNSPrefix:/skydns/ AkamaiServiceConsumerDomain: AkamaiClientToken: AkamaiClientSecret: AkamaiAccessToken: AkamaiEdgercPath: AkamaiEdgercSection: OCIConfigFile:/etc/kubernetes/oci.yaml OCICompartmentOCID: OCIAuthInstancePrincipal:false OCIZoneScope:GLOBAL OCIZoneCacheDuration:0s InMemoryZones:[] 
OVHEndpoint:ovh-eu OVHApiRateLimit:20 PDNSServer:http://localhost:8081 PDNSServerID:localhost PDNSAPIKey: PDNSSkipTLSVerify:false TLSCA: TLSClientCert: TLSClientCertKey: Policy:sync Registry:txt TXTOwnerID:default TXTPrefix: TXTSuffix: TXTEncryptEnabled:false TXTEncryptAESKey: TXTNewFormatOnly:false Interval:1m0s MinEventSyncInterval:5s Once:false DryRun:false UpdateEvents:false LogFormat:json MetricsAddress::7979 LogLevel:info TXTCacheInterval:0s TXTWildcardReplacement: ExoscaleEndpoint: ExoscaleAPIKey: ExoscaleAPISecret: ExoscaleAPIEnvironment:api ExoscaleAPIZone:ch-gva-2 CRDSourceAPIVersion:externaldns.k8s.io/v1alpha1 CRDSourceKind:DNSEndpoint ServiceTypeFilter:[] CFAPIEndpoint: CFUsername: CFPassword: ResolveServiceLoadBalancerHostname:false RFC2136Host:[] RFC2136Port:0 RFC2136Zone:[] RFC2136Insecure:false RFC2136GSSTSIG:false RFC2136CreatePTR:false RFC2136KerberosRealm: RFC2136KerberosUsername: RFC2136KerberosPassword: RFC2136TSIGKeyName: RFC2136TSIGSecret: RFC2136TSIGSecretAlg: RFC2136TAXFR:false RFC2136MinTTL:0s RFC2136LoadBalancingStrategy:disabled RFC2136BatchChangeSize:50 RFC2136UseTLS:false RFC2136SkipTLSVerify:false NS1Endpoint: NS1IgnoreSSL:false NS1MinTTLSeconds:0 TransIPAccountName: TransIPPrivateKeyFile: DigitalOceanAPIPageSize:50 ManagedDNSRecordTypes:[A AAAA CNAME] ExcludeDNSRecordTypes:[] GoDaddyAPIKey: GoDaddySecretKey: GoDaddyTTL:0 GoDaddyOTE:false OCPRouterName: IBMCloudProxied:false IBMCloudConfigFile:/etc/kubernetes/ibmcloud.json TencentCloudConfigFile:/etc/kubernetes/tencent-cloud.json TencentCloudZoneType: PiholeServer: PiholePassword: PiholeTLSInsecureSkipVerify:false PluralCluster: PluralProvider: WebhookProviderURL:http://localhost:8888 WebhookProviderReadTimeout:5s WebhookProviderWriteTimeout:10s WebhookServer:false TraefikDisableLegacy:false TraefikDisableNew:false NAT64Networks:[]}","time":"2025-02-02T16:32:48Z"}
{"level":"info","msg":"Instantiating new Kubernetes client","time":"2025-02-02T16:32:48Z"}
{"level":"info","msg":"Using inCluster-config based on serviceaccount-token","time":"2025-02-02T16:32:48Z"}
{"level":"info","msg":"Created Kubernetes client https://10.96.0.1:443","time":"2025-02-02T16:32:48Z"}
{"action":"UPDATE","level":"info","msg":"Changing record.","record":"my-domain-here.app","time":"2025-02-02T16:32:50Z","ttl":1,"type":"CNAME","zone":"my-zone-id"}
{"action":"UPDATE","level":"error","msg":"failed to update record when editing region: forbidden (1002)","record":"my-domain-here.app","time":"2025-02-02T16:32:50Z","ttl":1,"type":"CNAME","zone":"my-zone-id"}
{"action":"UPDATE","level":"info","msg":"Changing record.","record":"my-domain-here.app","time":"2025-02-02T16:32:50Z","ttl":1,"type":"TXT","zone":"my-zone-id"}
{"action":"UPDATE","level":"error","msg":"failed to update record when editing region: forbidden (1002)","record":"my-domain-here.app","time":"2025-02-02T16:32:51Z","ttl":1,"type":"TXT","zone":"my-zone-id"}
{"level":"fatal","msg":"Failed to do run once: failed to submit all changes for the following zones: [my-zone-id]","time":"2025-02-02T16:32:51Z"}

Looks like it's specifically failing at the apex record which is a (flattened) CNAME for my CF tunnel. Here's the values.yaml I'm using with the v1.15.1 helm chart:

serviceMonitor:
  enabled: true
env:
  - name: CF_API_TOKEN
    valueFrom:
      secretKeyRef:
        name: external-dns-cloudflare
        key: CF_API_TOKEN

logFormat: json

policy: sync

domainFilters:
  - my-domain-here.app

provider: cloudflare

extraArgs:
  - --cloudflare-proxied
  - --default-targets=my-tunnel-guid-here.cfargotunnel.com

And the associated TXT record contents: "heritage=external-dns,external-dns/owner=default,external-dns/resource=service/external-dns/my-domain-here-app"
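
To see at a glance which records fail, the JSON log lines in this comment can be reduced to their error entries. The following is a minimal sketch, assuming external-dns runs with logFormat: json and that the keys appear in the alphabetical order seen above (msg, record, type); the function name failed_records is hypothetical, and a JSON-aware tool would be more robust than sed for anything beyond a quick look.

```shell
# Hypothetical helper: read external-dns JSON log lines on stdin and print
# "<type> <record>: <msg>" for every error-level entry.
failed_records() {
  grep '"level":"error"' |
    sed -E 's/.*"msg":"([^"]*)".*"record":"([^"]*)".*"type":"([^"]*)".*/\3 \2: \1/'
}

# Usage, with the pod logs saved to a file first:
#   failed_records < external-dns.log
```

Applied to the output above, this leaves only the CNAME and TXT updates for the apex record, both carrying the "when editing region" message.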

KyteProject added a commit to KyteProject/gitops-homelab that referenced this issue Feb 2, 2025
@AndrewCharlesHay
Contributor

I think the version upgrade from 0.15.0 to 0.15.1 might be a red herring, instead maybe it's related to the issues described here? cert-manager/cert-manager#7540
@stefanandres if you downgrade external-dns back to 0.15.0 does the issue remain?

Downgrading mine to 1.15.0 fixed it for me.

Was the helm chart downgraded or only the image?

I upgraded both, but downgrading only the image fixes the error.

@stefanandres Are you saying you had to downgrade the image and the Helm chart for it to work?

@shawnhwei

I also encountered this issue on updating the apex record. In my case it is an A record.

Downgrading to 1.15.0 fixed it.


10 participants