Description
Expected behavior and actual behavior.
Given a geometry or bounding-box spatial filter, the Arrow interface returns more features than actually intersect the filter for some drivers (GPKG, FlatGeobuf), whereas the regular interface returns the expected features. It appears that the Arrow interface tests the filter against the bounding boxes of the geometries in the data source rather than against the geometries themselves.
Other drivers tested (Shapefile, GeoJSON) produce expected results when using both the regular and Arrow interfaces.
For example, when querying NaturalEarth countries (WGS84 coordinates), a point located in the middle of Canada returns only the record for Canada through the regular interface, but returns Canada, Russia, and the USA through the Arrow interface (the bounding boxes for Russia and the USA wrap around the antimeridian).
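To illustrate the suspected cause without GDAL, here is a minimal pure-Python sketch comparing a bounding-box test with an exact point-in-polygon test. The `split_country` coordinates are hypothetical and heavily simplified (not NaturalEarth data); they only mimic a multipolygon with parts on both sides of the antimeridian, so that its overall bounding box covers the whole longitude range. The helper names are made up for this sketch and are not GDAL APIs.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Ring = List[Point]


def bbox(parts: List[Ring]) -> Tuple[float, float, float, float]:
    """Overall (minx, miny, maxx, maxy) of all rings."""
    xs = [x for ring in parts for x, _ in ring]
    ys = [y for ring in parts for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_bbox(pt: Point, box: Tuple[float, float, float, float]) -> bool:
    x, y = pt
    minx, miny, maxx, maxy = box
    return minx <= x <= maxx and miny <= y <= maxy


def point_in_ring(pt: Point, ring: Ring) -> bool:
    """Even-odd ray casting against a single ring (no holes)."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_in_multipolygon(pt: Point, parts: List[Ring]) -> bool:
    return any(point_in_ring(pt, ring) for ring in parts)


# Hypothetical stand-in for a country split at the antimeridian.
split_country = [
    [(30.0, 45.0), (180.0, 45.0), (180.0, 75.0), (30.0, 75.0)],
    [(-180.0, 62.0), (-170.0, 62.0), (-170.0, 70.0), (-180.0, 70.0)],
]
query = (-105.0, 55.0)  # same point as in the report, located in Canada

box = bbox(split_country)
bbox_hit = point_in_bbox(query, box)
exact_hit = point_in_multipolygon(query, split_country)
print(f"bbox: {box}, bbox test: {bbox_hit}, exact test: {exact_hit}")
```

The query point falls inside the overall bounding box but inside neither part, which is the same kind of false positive seen for Russia and the USA if only bounding boxes are compared.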
First observed in pyogrio #285
Steps to reproduce the problem.
Using a test GPKG file created from NaturalEarth countries (1:110m)
from osgeo import ogr
path = "/tmp/test.gpkg"
driver = ogr.GetDriverByName("GPKG")
dataSource = driver.Open(path, 0)
layer = dataSource.GetLayer()
# point located in Canada
layer.SetSpatialFilter(ogr.CreateGeometryFromWkt("Point (-105 55)"))
# or
# layer.SetSpatialFilterRect(-105, 54, -104, 55)
iso_a3 = []
for feature in layer:
    iso_a3.append(feature.GetField("iso_a3"))
print(f"Using regular interface: {iso_a3}")
stream = layer.GetArrowStreamAsPyArrow()
iso_a3_arrow = []
for batch in stream:
    iso_a3_arrow.extend(batch.field("iso_a3").tolist())
print(f"Using arrow interface: {iso_a3_arrow}")
Outputs:
Using regular interface: ['CAN']
Using arrow interface: ['RUS', 'USA', 'CAN']
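The discrepancy can be made explicit by diffing the two result lists. This is only a small sketch; the values are copied from the output above, and the variable names `regular`, `arrow`, `extra`, and `missing` are chosen here for illustration.

```python
# Results copied from the output above
regular = ["CAN"]
arrow = ["RUS", "USA", "CAN"]

# Features returned only by the Arrow interface (unexpected extras)
extra = sorted(set(arrow) - set(regular))
# Features returned only by the regular interface (none expected)
missing = sorted(set(regular) - set(arrow))

print(f"Only in Arrow results: {extra}")
print(f"Only in regular results: {missing}")
```

The Arrow result is a superset of the regular result, which is consistent with a coarser (bounding-box) test being applied rather than features being dropped.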
Operating system
macOS 12.6.5 (M1)
GDAL version and provenance
Reproduced using both:
- 3.7.1 installed via Homebrew
- gdal Python package (3.7.1.1) installed via pip