
Fix for issue #1074 - add IsDocumentation function #2408


Merged (1 commit) on Feb 14, 2025

Conversation

colin-stubbs
Contributor

@colin-stubbs colin-stubbs commented Feb 14, 2025

Fixes #1074

Adds an IsDocumentation function that checks whether an IP is one of the reserved IPs used for documentation. These should be permitted in sample/test events in integrations.

IsDocumentation() is then used within isAllowedIPValue() to allow documentation IPs, in a similar way to how elastic-package already allows private RFC 1918, loopback, multicast, link-local and unspecified addresses via functions provided by the Go net package.

Unfortunately the Go net package does not currently contain a similar function for documentation ranges, so it needs to be local.

Without this addition, package developers have to waste time substituting their lab/test/already replaced IPs (which are already documentation IPs!) with IPs that conform to the random set of real public Internet IPs provided in allowed_geo_ips.txt, which makes no sense whatsoever.

For example, they get this kind of noise back from elastic-package when running tests:

user@box beelzebub % elastic-package test pipeline       
Run pipeline tests for the package
--- Test results for package: beelzebub - START ---
FAILURE DETAILS:
beelzebub/logs test-beelzebub-logs-ndjson.log:
[0] parsing field value failed: the IP "203.0.113.1" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[1] parsing field value failed: the IP "203.0.113.100" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[2] parsing field value failed: the IP "203.0.113.103" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[3] parsing field value failed: the IP "203.0.113.108" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[4] parsing field value failed: the IP "203.0.113.116" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[5] parsing field value failed: the IP "203.0.113.121" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[6] parsing field value failed: the IP "203.0.113.13" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[7] parsing field value failed: the IP "203.0.113.133" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[8] parsing field value failed: the IP "203.0.113.14" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[9] parsing field value failed: the IP "203.0.113.147" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[10] parsing field value failed: the IP "203.0.113.149" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[11] parsing field value failed: the IP "203.0.113.15" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[12] parsing field value failed: the IP "203.0.113.151" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[13] parsing field value failed: the IP "203.0.113.155" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[14] parsing field value failed: the IP "203.0.113.164" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[15] parsing field value failed: the IP "203.0.113.171" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[16] parsing field value failed: the IP "203.0.113.173" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[17] parsing field value failed: the IP "203.0.113.174" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[18] parsing field value failed: the IP "203.0.113.177" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[19] parsing field value failed: the IP "203.0.113.186" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[20] parsing field value failed: the IP "203.0.113.195" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[21] parsing field value failed: the IP "203.0.113.209" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[22] parsing field value failed: the IP "203.0.113.228" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[23] parsing field value failed: the IP "203.0.113.241" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[24] parsing field value failed: the IP "203.0.113.245" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[25] parsing field value failed: the IP "203.0.113.251" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[26] parsing field value failed: the IP "203.0.113.254" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[27] parsing field value failed: the IP "203.0.113.32" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[28] parsing field value failed: the IP "203.0.113.34" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[29] parsing field value failed: the IP "203.0.113.36" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[30] parsing field value failed: the IP "203.0.113.38" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[31] parsing field value failed: the IP "203.0.113.43" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[32] parsing field value failed: the IP "203.0.113.53" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[33] parsing field value failed: the IP "203.0.113.58" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[34] parsing field value failed: the IP "203.0.113.66" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[35] parsing field value failed: the IP "203.0.113.68" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[36] parsing field value failed: the IP "203.0.113.70" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[37] parsing field value failed: the IP "203.0.113.71" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[38] parsing field value failed: the IP "203.0.113.73" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[39] parsing field value failed: the IP "203.0.113.76" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[40] parsing field value failed: the IP "203.0.113.85" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[41] parsing field value failed: the IP "203.0.113.94" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[42] parsing field value failed: the IP "203.0.113.96" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)
[43] parsing field value failed: the IP "203.0.113.97" is not one of the allowed test IPs (see: https://github.com/elastic/elastic-package/blob/main/internal/fields/_static/allowed_geo_ips.txt)


╭───────────┬─────────────┬───────────┬───────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────────────┬──────────────╮
│ PACKAGE   │ DATA STREAM │ TEST TYPE │ TEST NAME                                                 │ RESULT                                                                      │ TIME ELAPSED │
├───────────┼─────────────┼───────────┼───────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┼──────────────┤
│ beelzebub │ logs        │ pipeline  │ (ingest pipeline warnings test-beelzebub-logs-ndjson.log) │ PASS                                                                        │ 244.512542ms │
│ beelzebub │ logs        │ pipeline  │ test-beelzebub-logs-ndjson.log                            │ FAIL: test case failed: one or more problems with fields found in documents │    3.748082s │
╰───────────┴─────────────┴───────────┴───────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────┴──────────────╯
--- Test results for package: beelzebub - END   ---
Done
Error: one or more test cases failed
user@box beelzebub % 

I've rebuilt elastic-package locally with this diff and tested it, and I no longer get the noise.

user@box beelzebub % elastic-package test pipeline 
2025/02/14 14:08:41  INFO New version is available - v0.109.1. Download from: https://github.com/elastic/elastic-package/releases/tag/v0.109.1
Run pipeline tests for the package
--- Test results for package: beelzebub - START ---
╭───────────┬─────────────┬───────────┬───────────────────────────────────────────────────────────┬────────┬──────────────╮
│ PACKAGE   │ DATA STREAM │ TEST TYPE │ TEST NAME                                                 │ RESULT │ TIME ELAPSED │
├───────────┼─────────────┼───────────┼───────────────────────────────────────────────────────────┼────────┼──────────────┤
│ beelzebub │ logs        │ pipeline  │ (ingest pipeline warnings test-beelzebub-logs-ndjson.log) │ PASS   │ 345.539417ms │
│ beelzebub │ logs        │ pipeline  │ test-beelzebub-logs-ndjson.log                            │ PASS   │     3.38368s │
╰───────────┴─────────────┴───────────┴───────────────────────────────────────────────────────────┴────────┴──────────────╯
--- Test results for package: beelzebub - END   ---
Done
user@box beelzebub % 

Used in isAllowedIPValue()
@jsoriano jsoriano requested a review from a team February 14, 2025 09:59
@jsoriano
Member

jsoriano commented Feb 14, 2025

Hey @colin-stubbs, thanks!

We were actually discussing this issue recently, with the aim of replacing all public IPs in tests with IPs in the documentation ranges. This consists of four parts:

  • Allowing all IPs in the documentation ranges. This is what you do here, so thanks!
  • Finding or forging a new geoip database for testing that provides dummy data for these IPs. Ideas about this are welcome.
  • Eventually migrating test files in the integrations repository to the new IPs, ideally with some tooling to help with that.
  • Finally, removing the current list of allowed ranges.

@jsoriano
Member

/test

Comment on lines +1259 to +1262
// IsDocumentation reports whether ip is a reserved address for documentation,
// according to RFC 5737 (IPv4 Address Blocks Reserved for Documentation) and
// RFC 3849 (IPv6 Address Prefix Reserved for Documentation).
func IsDocumentation(ip net.IP) bool {
Member

Nit.

Suggested change
-// IsDocumentation reports whether ip is a reserved address for documentation,
-// according to RFC 5737 (IPv4 Address Blocks Reserved for Documentation) and
-// RFC 3849 (IPv6 Address Prefix Reserved for Documentation).
-func IsDocumentation(ip net.IP) bool {
+// isDocumentationIP reports whether ip is a reserved address for documentation,
+// according to RFC 5737 (IPv4 Address Blocks Reserved for Documentation) and
+// RFC 3849 (IPv6 Address Prefix Reserved for Documentation).
+func isDocumentationIP(ip net.IP) bool {

Comment on lines +1259 to +1262
// IsDocumentation reports whether ip is a reserved address for documentation,
// according to RFC 5737 (IPv4 Address Blocks Reserved for Documentation) and
// RFC 3849 (IPv6 Address Prefix Reserved for Documentation).
func IsDocumentation(ip net.IP) bool {
Member

Nit. Consider adding some test case in validate_test.go and/or in some test package under test/packages.

@colin-stubbs
Contributor Author

Hey @colin-stubbs, thanks!

We were actually discussing this issue recently, with the aim of replacing all public IPs in tests with IPs in the documentation ranges. This consists of four parts:

  • Allowing all IPs in the documentation ranges. This is what you do here, so thanks!
  • Finding or forging a new geoip database for testing that provides dummy data for these IPs. Ideas about this are welcome.
  • Eventually migrating test files in the integrations repository to the new IPs, ideally with some tooling to help with that.
  • Finally, removing the current list of allowed ranges.

Yep... I do see the value in having a dummy geoip database (or dummy code that always provides geoip data regardless of IP), so that if geoip ingest processors are used they return something that winds up on documents regardless of what the IP is.

This would better ensure validity of ingest processor usage and field definitions for integrations.

I'll take a look and see if I can't work out how to do it.

@colin-stubbs
Contributor Author

colin-stubbs commented Feb 14, 2025

Also, in terms of tooling to replace en masse any IPs currently in test/sample files... something like this?

#!/bin/bash

function usage() {
  echo "Usage: ${0} package_name"
  exit 1
}

PACKAGE=${1}

#IPv4_LEAD="1.128.0."
#IPv6_LEAD="2a02:cf40:"
IPv4_LEAD="203.0.113." # RFC 5737 - TEST-NET-3
IPv6_LEAD="2001:db8:" # RFC 3849

test -z "${1}" && echo "ERROR: package name not provided" && usage
test ! -d "./packages/${PACKAGE}" && echo "ERROR: folder does not exist at ./packages/${PACKAGE}" && usage

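for FILE in "./packages/${PACKAGE}"/data_stream/*/_dev/test/pipeline/test-*.log "./packages/${PACKAGE}"/data_stream/*/_dev/test/pipeline/test-*.json ; do
  # Skip patterns that matched no files (bash leaves an unmatched glob as a literal string).
  test -f "${FILE}" || continue
  echo "### Fixing IP's in ${FILE}"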
  sed -r -i.backup "s/\"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.([0-9]{1,3})/\"${IPv4_LEAD}\1/g" "${FILE}" && rm -f "${FILE}.backup"
  sed -r -i.backup "s/\"(([A-F0-9]{1,4}:){2,2})((:|:[A-F0-9]{1,4}){1,5}|([A-F0-9]{1,4}:){1,5}:|([A-F0-9]{1,4}:){1,4}:([A-F0-9]{1,4})|([A-F0-9]{1,4}:){5,6}([A-F0-9]{1,4}))/\"${IPv6_LEAD}\3/gi" "${FILE}" && rm -f "${FILE}.backup"
done 

# EOF
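
For comparison, the IPv4 substitution from the script above could be sketched in Go as follows. This mirrors only the first sed expression (the IPv6 rewrite is left out), and `toTestNet3` is a hypothetical name, not existing elastic-package tooling.

```go
package main

import (
	"fmt"
	"regexp"
)

// quotedIPv4 mirrors the sed pattern: a double quote followed by a dotted quad,
// capturing only the last octet so it can be preserved.
var quotedIPv4 = regexp.MustCompile(`"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.([0-9]{1,3})`)

// toTestNet3 rewrites every quoted IPv4 address in line into 203.0.113.0/24
// (RFC 5737 TEST-NET-3), keeping the original last octet.
func toTestNet3(line string) string {
	return quotedIPv4.ReplaceAllString(line, `"203.0.113.${1}`)
}

func main() {
	in := `{"source":{"ip":"81.2.69.144","port":60748}}`
	fmt.Println(toTestNet3(in))
}
```

Like the sed version, this rewrites any quoted dotted quad, including private addresses and four-part version strings, so the output still needs a review before committing.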

@jsoriano
Member

I'll take a look and see if I can't work out how to do it.

Great, thanks, in any case this can be handled as a separate change, we can go on with this PR without this.

Also, in terms of tooling to en masse replace any IP's currently in test/sample files... something like this?

Yes, something like this :D

@jsoriano jsoriano merged commit e28d152 into elastic:main Feb 14, 2025
3 checks passed
@colin-stubbs
Contributor Author

colin-stubbs commented Feb 15, 2025

@jsoriano is there an existing issue or somewhere else that I can dump this info?

I've worked out how to generate dummy GeoLite2 .mmdb files that will respond to ANY IP with the same ASN/City/Country info.

This would mean that if a geoip ingest processor is used as part of package testing, it will always get valid GeoIP data back to attach to the document.

For example, with the package I'm currently working on, which uses 203.0.113.0/24 TEST-NET-3 IPs, here are some of the new fields that get attached to test events after replacing the .mmdb files that elastic-package injects into the elasticsearch container. The actual city name, country details, location coordinates etc. could be dummied further so they do not refer to a real place at all.

user@box beelzebub % elastic-package test pipeline
2025/02/15 11:02:30  INFO New version is available - v0.109.1. Download from: https://github.com/elastic/elastic-package/releases/tag/v0.109.1
Run pipeline tests for the package

--- Test results for package: beelzebub - START ---
FAILURE DETAILS:
beelzebub/logs test-beelzebub-logs-ndjson.log:
--- want
+++ got
@@ -206,6 +206,24 @@
                 ]
             },
             "source": {
+                "as": {
+                    "number": 64496,
+                    "organization": {
+                        "name": "Documentation ASN"
+                    }
+                },
+                "geo": {
+                    "city_name": "Greenwich",
+                    "continent_name": "Europe",
+                    "country_iso_code": "GB",
+                    "country_name": "United Kingdom",
+                    "location": {
+                        "lat": 51.5142,
+                        "lon": -0.0931
+                    },
+                    "region_iso_code": "GB-ENG",
+                    "region_name": "England"
+                },
                 "ip": "203.0.113.133",
                 "port": 60748
             },
@@ -257,6 +275,24 @@
%{BREVITY}%
@@ -54798,6 +66498,24 @@
                 ]
             },
             "source": {
+                "as": {
+                    "number": 64496,
+                    "organization": {
+                        "name": "Documentation ASN"
+                    }
+                },
+                "geo": {
+                    "city_name": "Greenwich",
+                    "continent_name": "Europe",
+                    "country_iso_code": "GB",
+                    "country_name": "United Kingdom",
+                    "location": {
+                        "lat": 51.5142,
+                        "lon": -0.0931
+                    },
+                    "region_iso_code": "GB-ENG",
+                    "region_name": "England"
+                },
                 "ip": "203.0.113.53",
                 "port": 40742
             },



╭───────────┬─────────────┬───────────┬───────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────────┬──────────────╮
│ PACKAGE   │ DATA STREAM │ TEST TYPE │ TEST NAME                                                 │ RESULT                                                                  │ TIME ELAPSED │
├───────────┼─────────────┼───────────┼───────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────┼──────────────┤
│ beelzebub │ logs        │ pipeline  │ (ingest pipeline warnings test-beelzebub-logs-ndjson.log) │ PASS                                                                    │ 359.849459ms │
│ beelzebub │ logs        │ pipeline  │ test-beelzebub-logs-ndjson.log                            │ FAIL: test case failed: Expected results are different from actual ones │  42.7460275s │
╰───────────┴─────────────┴───────────┴───────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────┴──────────────╯
--- Test results for package: beelzebub - END   ---
Done
Error: one or more test cases failed
user@box beelzebub %

@jsoriano
Member

@colin-stubbs how are you creating this database? It looks great, do you think we could add some different locations?

is there an existing issue or somewhere else that I can dump this info?

We have an internal meta issue that includes this, I have just created a public one: #2414

Successfully merging this pull request may close these issues.

Allowed IP List from allowed_geo_ips.txt is insane