Skip to content

Custom Documentation #606

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 28 commits into
base: main
Choose a base branch
from
Open

Custom Documentation #606

wants to merge 28 commits into from

Conversation

brian-mckinney
Copy link
Contributor

@brian-mckinney brian-mckinney commented Mar 28, 2025

Description

This PR adds a python module that can be used to generate custom documentation for endpoint fields. It uses existing mapped fields and leverages their descriptions and examples for the custom documentation.

Current Custom Documentation: https://github.com/elastic/endpoint-package/tree/main/custom_documentation/doc/endpoint

Newly formatted Documentation: https://github.com/elastic/endpoint-package/tree/example_custom_documentation/custom_documentation/doc/new_output

See the Readme.md in this PR for more information.

Related Issue: #605

@brian-mckinney brian-mckinney marked this pull request as ready for review March 31, 2025 17:27
@brian-mckinney brian-mckinney requested review from a team as code owners March 31, 2025 17:27
@ferullo
Copy link
Contributor

ferullo commented Apr 1, 2025

@szwarckonrad and @parkiino hold off reviewing this.

@pzl pzl removed request for szwarckonrad and parkiino April 1, 2025 13:41
Copy link
Contributor

@intxgo intxgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. I have objection regarding the overrides, though.

I believe the prerequisite is the package is well formed, not corrupted, etc., which is expected on main, so I didn't pay much attention what documentation output would be produced if the package was malformed.

I know this is a stretch, which we can even implement later, but it could allow to generate output for documentation team, see example doc PR containing documentation rendered as Elastic Endpoint Command reference

The documentation as markdown is referenced from the actual documentation here
https://www.elastic.co/guide/en/security/current/siem-field-reference.html

image

However the link to Git is not easy to consume, and more importantly to update as the documentation is bound to the package with a release tag.

In my opinion Endpoint event documentation should be a sub point of Elastic Security ECS field reference, not requiring visiting any external Github link, and versioned with the rest of the documentation

image

with such utility it seems very feasible to generate output for docs team PR once per release 🙂 Obiviously we will still need the current version here in this repo for our CI validation.

Comment on lines +49 to +50
-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set logging verbosity level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set logging verbosity level
-l, --log-level [DEBUG,INFO,WARNING,ERROR,CRITICAL]
Set logging verbosity level

nitipck

- name: agent.type
overrides:
default:
example: endpoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the overrides is a bad idea. If you'd like to change some description, please change it at the source. Correct me if I'm wrong, it's an endpoint-package, everything here is addressing Endpoint and should be consistent. BTW, I guess the filebeat was some copy-paste from ECS. Changing it LGTM, but you missed Filebeat in the description having this example.

I think we need something like documentation_supplement.yaml which will fill-in the gaps. The package contains only fields (and their descriptions, examples) which need to be index mapped so that they can be used in KQL and other places in Kibana. However custom documentation lists all possible fields per document, some of them may not be mapped. The supplement should error out if a conflict (override attempt) is detected.

I've tried running this utility with --csv missing.csv (thanks for this option!) but it didn't produce anything. Indeed I can't find an example field in the existing documentation which is not mapped. It looks like we eventually mapped everything, but I'm pretty sure it's not intended or required. @ferullo I think you are explained this to me

I'd say the script should indicate a failure if any field with missing description was generated. The description should be provided in the package or in the proposed supplement file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback. I think this makes sense, but I was asked to add the override option. I'm not sure if there is a reason you and I are not thinking of right now.

Additionally, I think that adding this to the official elastic documentation would be great, but probably out of scope for this PR since this documentation is already referenced on the official docs page. I'd be happy to make a follow on issue for that though.

I'd like to hear what @ferullo @pzl and @nfritts think about these two things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have an objection to the ability to override. For mapped fields that we have defined (i.e. they are not ecs), we are creating their description and info in custom_schemas. Overriding those don't make a lot of sense if we can put it in the source we control.. but if it gets us out of any jams to re-describe that field in a per-platform manner, then it seems fine.

For ecs-sourced fields, I'm also not opposed. ECS will be presenting very generalized information applicable to any integration or source. We will likely have more & specialized information about that field's usage in our context. Our example values can be much more informative.

We also haven't updated ECS since release v8.10.0. So the information is stale anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, if we want the override ability, I'd feel more comfortable with it if it was defined as override "what" "with", i.e. explicitly put in the override definition current expected value to be overridden with the new one (and error if it doesn't match). This way reading/reviewing the override will became easier.

also, additional thought about the non-mapped fields. If I'm right about them, then it would be nice to tag those explicitly in the documentation to make it clear you can't search for a document using this field and/or it's lost when the index is stripped to syntetic source.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obviously we can add the asciidoc output later! just wanted to mention that documentation can be updated at any time, even retroactively, so we are able to generate our events documentation for past versions and put it online removing the link to git

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for overrides key? It seems like an unneeded level of nesting.

Standard documentation happens at the level of the index, so it can't be more specific than the index itself. Custom documentation's purpose is to document events based on each event type as a human understands them. To me it makes sense to be able to have more specific documentation for a given field when that documentation is for each event rather than the index level. (Also, can you override documentation for ECS fields in the package? I thought that had to match the ECS documentation that is "scaled" even broader to all indexes).

As a specific case, consider event.action. It's documentation is what's below, which is pretty broad.

The action captured by the event.

This describes the information in the event. It is more specific than event.category. Examples are group-add, process-started, file-created. The value is normally defined by the implementer.

When documenting it for Defend/Endpoint it could be re-written to drop "The value is normally defined by the implementer." or to list all possible actions values or to even list all action values per data stream if the package infrastructure allows it. But without overrides at the level here how would you document that in a Defend/Endpoint Windows Device Mount event that event.action is always "mount". In other words, the best documentation for a Windows Device Mount event's event.action field is "The action captured by the event. This will always be the string 'mount'".

I'm not saying that level of documentation work should be undertaken but I can't see why it would be undesirable to be able to have overrides like that when needed/appropriate.


Additionally, I think that adding this to the official elastic documentation would be great, but probably out of scope for this PR since this documentation is already referenced on the official docs page. I'd be happy to make a follow on issue for that though.

I don't know the original requirements but I agree it seems out of scope.

Copy link
Contributor

@intxgo intxgo Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

however, I'm still chewing on this. I'd lean towards keeping even the original ECS description, maybe tagging it also as ECS for clarity, and if really needed just append Endpoint specific documentation line and/or example. How that sounds?

EDIT, so let's draw an example:

image

Copy link
Contributor

@intxgo intxgo Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @ferullo I see we have similar thoughts, yup I've also just asked about the level of details we want in a follow up review comment just above this thread https://github.com/elastic/endpoint-package/pull/606/files#r2027242535
I like the idea of clarifying the Endpoint content of generic ECS fields you highlighted with examples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I like the idea of keeping the original ECS description but also specify the the endpoint override. It seems pretty natural to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Endpoint specific fields, where there's no ECS counterpart, could be Origin | Endpoint package

we didn't clarify yet the need of ability to add supplement description for non-mapped fields. For this release it's not even needed as everything seems to be mapped, but I'm curious if it's something TODO, or old things are no longer applicable and everything is expected to be mapped

@intxgo
Copy link
Contributor

intxgo commented Apr 3, 2025

and one more thing, this should be integrated with the Makefile because if anyone changes any description in the package and run make it should also update the description in generated custom documentation.

@brian-mckinney
Copy link
Contributor Author

I tried to incorporate some of the feedback and pushed up two more possible formats for the generated documentation. Please let me know what you think, feel free to comment in the PR. Just adding here for visibility. @ferullo @intxgo @nfritts @pzl

Variant 1

  • No tables, works with github outline

Variant 2

  • Uses details/summary tags, does not play well with outline

Variant 3

  • Uses tables, works well with outline

I haven't uploaded any changes to this branch yet, will do so when I hear back about the updated formats

@intxgo
Copy link
Contributor

intxgo commented Apr 4, 2025

I'm not sure what you mean by "github outline"

I wouldn't use any fancy html for markdown, but rather make it readable as a plain text (not rendered)

https://github.com/intxgo/actions_test/blob/main/example1.md

or, eventually adding just a little bit of native markdown formatting

https://github.com/intxgo/actions_test/blob/main/example2.md

Anyway, choose whatever you like, it's a really unimportant detail IMO

Tip: if you really would like to create a beautiful HTML documentation, not restricted by markdown rendering, then generate HTML and ask admin to enable GitPages for it, so it'll appear as https://elastic.github.io/endpoint-package

Here's an example where to enable it
image

@brian-mckinney
Copy link
Contributor Author

brian-mckinney commented Apr 4, 2025

Here is an example of the latest documentation

  • @lesio, I liked the way your examples looked, but the format slightly
  • I updated the overrides to not include the unnecessary nesting
  • I also fixed the missing csv file bug you pointed out

I'm not sure what you mean by "github outline"

On the top right corner of the open file there is an button to open the "outline" which essentially uses the headers for easy point and click navigation. I figured it was useful for something like this considering how many fields there are
image

@intxgo
Copy link
Contributor

intxgo commented Apr 4, 2025

that last example is quite clean, and thanks for the tip about the outline!!

have you noticed that using the back tick formatting no the fields justify them slightly to the right? Without it it's perfectly aligned.

image

I'm just mentioning this, if you intended to make it that way disregard my comment

Copy link
Contributor

@intxgo intxgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, one remaining point is about the missing documentation. Do we really want to tolerate this? I'd say the script should end with non zero return code, indicating incomplete documentation. The overrides file should be used to fill up all the gaps.

Not necessary in this PR, but we should do a follow up PR to fill up all missing documentation pieces.

PS. It's no longer an override, so maybe instead of documentation_overrides.yaml use more suitable name for the purpose, like:

  • documentation_supplement.yaml
  • documentation_appendix.yaml
  • missing-references.yaml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants