feat: schema based validation may return iterable of all errors #834

e3krisztian · 2025-06-25T17:48:53Z

This PR addresses two issues (at the commit level they are separately implemented):

- improves validation by enriching `ValidationError` with the `message` and `path` properties, and sensibly limits (shortens) the size of error messages via `ValidationError.get_squeezed_message()` (Validation errors are hard to present safely to the user (missing abstraction) #827)
- extends `SchemabasedValidator` with `iterate_errors()` to allow access to all the validation problems (feat: extend validation interface with new method to iterate over all the problems #828)

Signed-off-by: Krisztian Fekete <1246751+e3krisztian@users.noreply.github.com>

ValidationError had a single untyped property, .data, which holds different objects in the XML and JSON cases, although those objects do have some common properties. The new properties introduced in this commit are:
- .message: the error message text
- .path: pointer to the location of the error in the input

Signed-off-by: Krisztian Fekete <1246751+e3krisztian@users.noreply.github.com>

ValidationError.data and .message are provided by third-party libraries, which can produce messages of any length. E.g. jsonschema inserts its input into all of its messages, which could be arbitrarily big. To be able to show these errors to the user, some pre-processing is needed. The new method squeezes these messages in the least disruptive way, and has special knowledge of how to shorten jsonschema messages. It is definitely still a workaround; ideally the libraries should not yield unbounded messages.

Signed-off-by: Krisztian Fekete <1246751+e3krisztian@users.noreply.github.com>
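To illustrate the idea of squeezing an over-long message, here is a minimal sketch. It is not the code from this PR: the function name `squeeze_message`, its parameters, and the keep-head-and-tail strategy are assumptions made for illustration, and it has no jsonschema-specific knowledge.

```python
def squeeze_message(message: str, max_size: int = 80, marker: str = " ... ") -> str:
    """Shorten `message` to at most `max_size` characters (hypothetical helper).

    The start and the end of the message are kept, and the middle is
    replaced by `marker`, so the reader still sees how the message begins
    and how it ends.
    """
    if max_size < len(marker) + 2:
        raise ValueError("max_size is too small to keep any of the message")
    if len(message) <= max_size:
        return message  # short enough, nothing to do
    keep = max_size - len(marker)   # characters of the original we may keep
    tail = keep // 2                # kept from the end
    head = keep - tail              # kept from the start (gets the odd char)
    return message[:head] + marker + message[len(message) - tail:]
```

For example, `squeeze_message(text, max_size=200)` would return `text` unchanged when it already fits, and a 200-character string otherwise.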

There was only one validation method, which returned a single error: the first one (JSON) or the last one (XML). The new function `SchemabasedValidator.iterate_errors()` allows enumerating all the validation errors.

Signed-off-by: Krisztian Fekete <1246751+e3krisztian@users.noreply.github.com>
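The relationship between a single-error method and an "all errors" iterator can be sketched as follows. This is a toy stand-in, not the library's implementation: `ToyValidator`, its `REQUIRED` keys, and the simplified `ValidationError` are assumptions for illustration only.

```python
import json
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class ValidationError:
    """Simplified stand-in for the library's error class."""
    data: Any


class ToyValidator:
    """Hypothetical validator that only checks for required top-level keys."""
    REQUIRED = ("bomFormat", "specVersion")

    def iterate_errors(self, data: str) -> Iterator[ValidationError]:
        """Lazily yield every problem found in the input."""
        doc = json.loads(data)
        if not isinstance(doc, dict):
            yield ValidationError("document is not a JSON object")
            return
        for key in self.REQUIRED:
            if key not in doc:
                yield ValidationError(f"missing required property: {key}")

    def validate_str(self, data: str) -> Optional[ValidationError]:
        """Return only the first problem, or None if the input is valid."""
        return next(self.iterate_errors(data), None)
```

Because `iterate_errors()` is a generator, `validate_str()` stops after the first problem, while a caller that wants the full report can consume the whole iterator.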

jkowalleck

thanks for working on this feature.

to me, the current implementation looks odd and not very pythonic.

Instead of the current one, I have a design proposal:
add an optional parameter to `validate_str` that controls the return type a bit.
Something like `def validate_str(self, data: str, *, all_errors: bool = False) -> None | ValidationError | Iterable[ValidationError]:`
- The implementation would return `None` if there were no errors at all.
- The implementation would return `ValidationError` if there was at least one error and `all_errors` is `False`.
- The implementation would return `Iterable[ValidationError]` if there was at least one error and `all_errors` is `True`.

after the implementation is done, we would add proper type hints for each of the cases with `@overload`.

what do you think?
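The proposal above could be typed roughly as in the following sketch. It is an illustration under assumptions, not the merged code: `ToyValidator`, its `REQUIRED` keys, and the simplified `ValidationError` are hypothetical stand-ins, and only the `validate_str` signature follows the comment.

```python
import json
from dataclasses import dataclass
from typing import Any, Iterable, Literal, Optional, overload


@dataclass
class ValidationError:
    """Simplified stand-in for the library's error class."""
    data: Any


class ToyValidator:
    """Hypothetical validator that only checks for required top-level keys."""
    REQUIRED = ("bomFormat", "specVersion")

    # Type checkers pick the return type from the literal value of all_errors.
    @overload
    def validate_str(self, data: str, *,
                     all_errors: Literal[False] = ...) -> Optional[ValidationError]: ...

    @overload
    def validate_str(self, data: str, *,
                     all_errors: Literal[True]) -> Optional[Iterable[ValidationError]]: ...

    def validate_str(self, data, *, all_errors=False):
        doc = json.loads(data)
        errors = [
            ValidationError(f"missing required property: {key}")
            for key in self.REQUIRED
            if not isinstance(doc, dict) or key not in doc
        ]
        if not errors:
            return None  # valid input: None in both modes
        return iter(errors) if all_errors else errors[0]
```

With this shape, existing callers that test `validate_str(data) is None` keep working, and callers that pass `all_errors=True` receive an iterable only when something is wrong.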

cyclonedx/validation/__init__.py

cyclonedx/validation/json.py

jkowalleck · 2025-06-26T07:57:08Z


I am working on the proposed implementation design, and will pullrequest it to your branch once i am done.

PS: pullrequested the proposed changes

see feat: SchemabasedValidator.validate_str can return an iterator over all errors e3krisztian/cyclonedx-python-lib#1


jkowalleck · 2025-06-27T08:11:33Z

i see this PR is mixing scopes.

- it makes validation return an iterable - almost done
- it adds additional capabilities to the error class - still in discussion.

i like the iterable feature - it immediately brings value for downstream users.
I see the additional capabilities as debatable.

this means: we have a mix of scopes, and each scope is potentially moving at a different pace - from a developer/maintainer's perspective.

Therefore, @e3krisztian , I'd ask to remove all the additional error capabilities from this PR, and open a different PR for them.
This enables us to release the "iterable" feature soon, while we discuss/refine the feature of additional error class capabilities.

A quick outlook: i could imagine we might find, in our discussion, that the whole error class shall be redesigned - which would be a breaking change I'd welcome.

e3krisztian · 2025-06-27T09:36:14Z

@jkowalleck I appreciate the quick reviews and responses!

You are probably right, that this PR had attempted to change too much.

Unfortunately we clearly have a different view of a library. For me it includes providing guarantees, and being responsible for not leaking the underlying implementation abstractions (which here in the XML case is not even a public one!), and if necessary working around/covering/or making explicit the problems in them.

This PR attempted to somewhat address what was missing in my view around the validation interface, but it is clear, that our views of what needs to be abstracted away from the underlying implementations differ, as well as our view on what is "Pythonic code", and thus I find it very hard, and expensive to continue discussing it at code level.

Since the iterate over errors is your implementation, and above you are asking me to remove all the rest, I ask you to take over this PR, and remove or keep whatever you feel right, or close it and open one for your error iteration implementation.

We can continue the discussions about the error interfaces in the original issues, but I no longer intend to open PR-s implementing them.

jkowalleck · 2025-06-27T16:03:54Z

> @jkowalleck I appreciate the quick reviews and responses!
>
> You are probably right, that this PR had attempted to change too much.
>
> Unfortunately we clearly have a different view of a library. For me it includes providing guarantees, and being responsible for not leaking the underlying implementation abstractions (which here in the XML case is not even a public one!), and if necessary working around/covering/or making explicit the problems in them.

Nope. no different views - I share your opinion.
This is why I ask to split the PR, so the two scopes can be handled in their respective pace.
I do not want to lose your ideas of improving the error class. The error class was nothing people cared about much in the past - that's why it is so sloppy today. I appreciate your ideas.

> This PR attempted to somewhat address what was missing in my view around the validation interface, but it is clear, that our views of what needs to be abstracted away from the underlying implementations differ, as well as our view on what is "Pythonic code", and thus I find it very hard, and expensive to continue discussing it at code level.
>
> Since the iterate over errors is your implementation, and above you are asking me to remove all the rest, I ask you to take over this PR, and remove or keep whatever you feel right, or close it and open one for your error iteration implementation.
>
> We can continue the discussions about the error interfaces in the original issues, but I no longer intend to open PR-s implementing them.

I see. that's why requirements engineering is usually done in tickets (#827), not PRs ;-)

i will remove the error class changes from this very PR, and probably merge/release the rest soon.
This way we have access to all underlying errors sooner than the more complex improvements on the error class.

Afterwards, I will continue the discussion for error class improvements in #827 - which already has this scope of improving the error class to make it valuable and independent of 3rd party implementations. More details in the tickets soon - your opinion is appreciated very much!

PS: removed debatable parts and started/continued discussion in #827 (comment)

Signed-off-by: Jan Kowalleck <jan.kowalleck@gmail.com>

jkowalleck · 2025-06-30T08:26:41Z

this feature was released via https://github.com/CycloneDX/cyclonedx-python-lib/releases/tag/v10.3.0

## v10.3.0 (2025-06-30)

### Documentation

- Instructions for code style ([`160810f`](CycloneDX/cyclonedx-python-lib@160810f))

### Features

- Schema based validation may return iterable of all errors ([#834](CycloneDX/cyclonedx-python-lib#834), [`f95576f`](CycloneDX/cyclonedx-python-lib@f95576f))

e3krisztian added 4 commits June 18, 2025 14:14

fix: typo _validata_data

e23446b

Signed-off-by: Krisztian Fekete <1246751+e3krisztian@users.noreply.github.com>

e3krisztian requested a review from a team as a code owner June 25, 2025 17:48

jkowalleck requested changes Jun 26, 2025


jkowalleck changed the title ~~Validation improvements: ValidationError fields, limiting message sizes and iterating over all errors~~ feat: ValidationError fields, limiting message sizes and iterating over all errors Jun 26, 2025

jkowalleck mentioned this pull request Jun 26, 2025

feat: SchemabasedValidator.validate_str can return an iterator over all errors e3krisztian/cyclonedx-python-lib#1

Merged

e3krisztian force-pushed the validation-error-fields branch 2 times, most recently from 45153a4 to 9890561 Compare June 26, 2025 18:19

jkowalleck added 2 commits June 26, 2025 20:20

feat: SchemabasedValidator.validate_str can return an iterator over a…

6e68224

…ll errors Signed-off-by: Jan Kowalleck <jan.kowalleck@gmail.com>

tests

c2471ed

Signed-off-by: Jan Kowalleck <jan.kowalleck@gmail.com>

jkowalleck added 2 commits June 27, 2025 18:15

reverted debatable improvements on error class

98d9a0b

Signed-off-by: Jan Kowalleck <jan.kowalleck@gmail.com>

tidy

41dd477

Signed-off-by: Jan Kowalleck <jan.kowalleck@gmail.com>

jkowalleck changed the title ~~feat: ValidationError fields, limiting message sizes and iterating over all errors~~ feat: schema based validation may return iterable of all errors Jun 27, 2025

jkowalleck self-requested a review June 27, 2025 16:35

rollbacks

3c3421f

Signed-off-by: Jan Kowalleck <jan.kowalleck@gmail.com>

jkowalleck approved these changes Jun 27, 2025


Merge branch 'main' into validation-error-fields

b7e501a

jkowalleck merged commit f95576f into CycloneDX:main Jun 28, 2025
42 checks passed

jkowalleck mentioned this pull request Jul 22, 2025

feat: extend validation interface with new method to iterate over all the problems #828

Closed
