Skip to content

[Discussion] - Breaking change resulting from data type changes of mapped attributes #13861

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ShourieG opened this issue May 9, 2025 · 2 comments
Assignees
Labels
discuss integration Label used for meta issues tracking each integration Team:Security-Service Integrations Security Service Integrations team [elastic/security-service-integrations] Team:Sit-Crest Crest developers on the Security Integrations team [elastic/sit-crest-contractors]

Comments

@ShourieG
Copy link
Contributor

ShourieG commented May 9, 2025

In some scenarios we might have to make subsequent updates to existing integrations where the update would involve changing the data type of an existing mapped attribute.

This could be because of reasons such as :-

  1. Existing attribute was initially assigned a wrong type.
  2. Updates in the original data source/ api has changed the original data type due to internal reasons.
  3. Performance reasons (this would be very rare), where indexing the data with it's original data type is becoming expensive for the end-user. etc ...

In such cases, an update made to change the data type would be considered a breaking as it would create conflicts in kibana search queries / dashboards (if they were being used). Till now we have generally followed the philosophy of incrementing the major version of integrations when it comes to breaking changes. However in some cases such breaking changes have been assessed to have minimal impact mainly due to -

  1. The attribute was very recently introduced in the integration and has not seen much use.
  2. The attribute is not being used in Kibana dashboards at all.
  3. The attribute might not have been even indexed in the first place due to data type mis-match.

Such a case was recently occurred in this PR, and in the discussion here it was clarified that even if a field is not explicitly indexed (because of a data type mismatch), it is still explicitly mapped and that would be enough for conflicts to occur when the backing index for the data stream gets created, more details in the discussion.

After discussions across the team it was how ever assessed that the impact of this change would be minimal, and the conflicts would self correct with an index rollover and passage of time, hence it was decided to mark this particular PR as a bugfix and not a breaking-change. It was also kind of decided that we would analyse such cases moving forward on a case by case basis and decide if they are significant to be a breaking-change or a bugfix.

Recently another such PR has come into play, where there is a data type change leading to a breaking-change. However we have only done an increment for the "minor" version here (which is generally reserved for enhancements) and marked it as a breaking-change.

The issue is that there seems to be a bit of confusion on how to handle the version increments and the type of change it should be marked as due to the case-case policy we are going by at the moment.

Issues resulting from a case-by-case approach:

  1. Lack of clarity regarding version increments.
  2. Incrementing major versions often for such small but impactful changes can cause disruption in the community and with end users in general.
  3. Often hard to assess the impact of these changes for end users without team wide consultation, since the data volume and persistence can vary from user to user.
  4. Such changes can often result in SDH's if support is not made aware and such a change is done without a major version increment.
  5. Such a strategy can lead to PR history confusion if each of the PR's don't justify in detail why a certain version increment was made and why it was tagged with a breaking-change/enhancement/bugfix.

Questions:

  1. Should we keep a generic template for such changes if we are to adopt such a case-by-case strategy ?
  2. Should we adopt a single standard for all such changes in future for end-user clarity, as this would standardize all future PR's making such a change?
  3. What kind of community awareness / support awareness should we provide when such changes are published ?
  4. For very popular and widely used integrations should we always do an impact analysis before making such changes or avoid such changes in the first place and take a workaround approach by introducing a separate variable to account for the fixed data type ?

cc: @elastic/security-service-integrations @elastic/sit-crest-contractors @andrewkroh @narph

@ShourieG ShourieG self-assigned this May 9, 2025
@ShourieG ShourieG added discuss Team:Security-Service Integrations Security Service Integrations team [elastic/security-service-integrations] Team:Sit-Crest Crest developers on the Security Integrations team [elastic/sit-crest-contractors] labels May 9, 2025
@ShourieG ShourieG assigned ShourieG and unassigned ShourieG May 9, 2025
@ShourieG ShourieG added the integration Label used for meta issues tracking each integration label May 9, 2025
@kcreddy
Copy link
Contributor

kcreddy commented May 9, 2025

Related: elastic/package-spec#817 and elastic/kibana#217257 (which adds a callout during integration upgrade in ^v9.1.0)

I also think we should have a way to suggest if the breaking-change came from the bug or a feature/enhancement. If changes[].type can take an array of values, this can provide a flexibility to even upgrade the patch version during breaking changes. Example: type: [breaking-change, bugfix]

@leandrojmp
Copy link
Contributor

leandrojmp commented May 14, 2025

Hello, just to add to the discussion, I think that anything that changes mappings should be marked as breaking-change.

I've just noticed on my Fortigate logs in Kibana that some fields were marked as having conflicts, investigating further I found that the mapping for two fields were changed from keyword to long in this PR #13668, which was marked as bugfix.

Personally I'm not affected byt those changes because I'm not using the fields that were changed, but any user using them in Kibana, be it for visualizations or Security rules may now have a conflict and security rules that stopped working.

As a user, having conflicts in Kibana is pretty critical as a lot of things can break and stop working, including custom security rules, so marking mapping change as breaking-change draws more attention to it and prompts the user to look at the PR to see exactly what may break and if they will be affected or not.

The ways to fix these conflicts in Kibana can be expensive in terms of time and money as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss integration Label used for meta issues tracking each integration Team:Security-Service Integrations Security Service Integrations team [elastic/security-service-integrations] Team:Sit-Crest Crest developers on the Security Integrations team [elastic/sit-crest-contractors]
Projects
None yet
Development

No branches or pull requests

3 participants