Skip to content

Containment with cap cells #13

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
weiji14 opened this issue Dec 16, 2020 · 4 comments
Open

Containment with cap cells #13

weiji14 opened this issue Dec 16, 2020 · 4 comments
Labels
bug Something isn't working
Milestone

Comments

@weiji14
Copy link

weiji14 commented Dec 16, 2020

Kia ora! I realize this library is still young but just had a question on using rHEALPix as a sortable polygon "geohash". I've been looking for a way to consistently index/name some polygon features (whose outlines can change slightly over time), and tried using the rhealpixdggs.conversion.get_finest_containing_cell added in #12, but it doesn't seem to work for some of the polygons I've tried.

import geopandas as gpd
import rhealpixdggs.conversion

gdf: gpd.GeoDataFrame = gpd.read_file(filename="lakeconway.geojson")
polygon: shapely.geometry.Polygon = gdf.loc[0].geometry

rhealpixdggs.conversion.get_finest_containing_cell(polygon=polygon)
# returns None instead of string like S1234567

Sample polygon is lakeconway.geojson.zip (using EPSG:4326). This is a location rather close to the South Pole (about Lat: 84.°S, Lon: 150°W). My question is whether rHEALPix-py is able to handle polar projections with some tweaking, or if there are some inherent limitations with the system.

@alpha-beta-soup
Copy link
Member

This is a key usecase, a clear method to do this will be part of v0.6.0.

@alpha-beta-soup alpha-beta-soup added this to the v0.6.0 milestone Dec 18, 2023
@alpha-beta-soup alpha-beta-soup added documentation Improvements or additions to documentation enhancement New feature or request labels Dec 18, 2023
@ppKrauss
Copy link

ppKrauss commented Feb 11, 2024

Hello, we can collaborate with the "binary Geohash", as described for example in this article
https://mmcloughlin.com/posts/geohash-assembly
PS: is possible to adapt to binary, the neighbours operations are also fast and non-geometrical operations, as demonstrated here.

Ideally, we would also like to use PostgreSQL/PostGIS instead of Python. With PostGIS we can use PROJ's rHEALPix by registered SRID for inverse.

Binary Geohash and human representations

Other bases, in addition to the classic Geohash base32, can be used to express the cell identifier as a geocode, see for example base32nvu or base16h at this interactive illustration:

image

image

So, we can use a system of square grids with ~23 hierarchical levels, very useful for multifunctional applications.

For a complete, bit by bit cell representation... We can duplicate the number of hierarchical levels (~45) using the "degenerated grid" (rectangular), to express base32 and to quadtree applications:

image

Any "integer level", with indexes i, can be transformed into a "half level" with indexes j:

image

PS: the transformation is valid also for Hilbert curves, so is possible to express a "Hilbert Geohash", when application need continuous intervals — and the application is not concerned with aperiodicity in the degenerated grid.


Practical example of the use of the binary Geohash encode, with PostgreSQL in bigint (64 bits integer) or varbit type, and translating it to humans with base16h, for a postal address point in Brasil, national grid:

image

@alpha-beta-soup
Copy link
Member

alpha-beta-soup commented Nov 1, 2024

I think the issue may be that get_finest_containing_cell is generating invalid geometries when generating child cells to test containment.

>>> from rhealpixdggs.cell import Cell
>>> import rhealpixdggs.conversion
>>> from shapely.geometry import Polygon
>>> gdf = gpd.read_file(filename="lakeconway.geojson")
>>> polygon = gdf.loc[0].geometry
>>> parent_cell = Cell(rdggs=WGS84_003, suid=tuple("S"))
>>> children_cells = [cell for cell in parent_cell.subcells()]
>>> children_poly = [Polygon(cell.vertices(plane=False)) for cell in children_cells]
>>> list(map(lambda x: x.is_valid, children_poly))
[False, True, True, False, True, True, True, True, True]

They're invalid because...

>>> from shapely.validation import explain_validity
>>> list(map(lambda x: explain_validity(x), children_poly))
['Ring Self-intersection[-150 -41.9378539101601]', 'Valid Geometry', 'Valid Geometry', 'Self-intersection[117 -45.1864691893437]', 'Valid Geometry', 'Valid Geometry', 'Valid Geometry', 'Valid Geometry', 'Valid Geometry']

That's one issue, and actually a red herring. These are invalid due to issues with the antimeridian, and wrapping coordinates causing self-intersections. But the problematic cell is technically valid:

>>> str(children_cells[4])
'S4'
>>> children_poly[4].wkt
'POLYGON ((-179.99999999999991 -74.42400670199598, -90 -74.42400670199598, -0.0000000000000064 -74.42400670199598, 89.99999999999993 -74.42400670199599, -179.99999999999991 -74.42400670199598))'
>>> children_cells[4].vertices(plane=False)
[(np.float64(-179.99999999999991), np.float64(-74.42400670199598)), (np.float64(-90.0), np.float64(-74.42400670199598)), (np.float64(-6.3611093629270335e-15), np.float64(-74.42400670199598)), (np.float64(89.99999999999993), np.float64(-74.42400670199599))]
>>> children_cells[4].vertices(plane=True)
[(np.float64(-16679257.796284417), np.float64(-8339628.898142208)), (np.float64(-13343406.237027533), np.float64(-8339628.898142208)), (np.float64(-13343406.237027533), np.float64(-11675480.457399093)), (np.float64(-16679257.796284417), np.float64(-11675480.457399093))]

Cell S4 is a cap cell (diagram shows N4 here, but it's the same for S), and this happens to be the cell that actually intersects your sample feature.

Notice that the non-planar values for S4's vertices have the value -74.42400670199598 four times.

As your feature is contained by this cap cell, then the test for containment fails, because it isn't actually contained by this improper representation of the cell.

However, it should work fine for any features that aren't contained by polar cap cells. We can prove this by translating your feature north, and expecting a result that isn't None:

>>> str(rhealpixdggs.conversion.get_finest_containing_cell(polygon=shapely.affinity.translate(polygon, yoff=80)))
'O460'

I'll take the liberty of renaming the issue to express that this is primarily an issue with cap cells.

However I don't think Shapely can actually represent a cap cell in non-planar coordinates. It will have to correctly deal with the antimeridian as well as the pole. But if the input is first projected, then the test can be done with planar cells.

@alpha-beta-soup alpha-beta-soup changed the title Use as a polygon geohash? Containment with cap cells Nov 1, 2024
@alpha-beta-soup alpha-beta-soup added bug Something isn't working and removed documentation Improvements or additions to documentation enhancement New feature or request labels Nov 1, 2024
@alpha-beta-soup
Copy link
Member

alpha-beta-soup commented Nov 1, 2024

Incidentally, if you're after the representation of the feature at a particular fidelity (resolution), you can try:

from rhealpixdggs.conversion import CellZoneFromPoly
# Offset the sample polygon, so it's not in the (broken) cap cell:
polygon = shapely.affinity.translate(polygon, yoff=80)

zone = CellZoneFromPoly([None, polygon], 6, True)
list(map(str, zone.cells_list))
# ['O460014', 'O460015', 'O460022', 'O460023', 'O460024', 'O460025', 'O460028', 'O460100', 'O460101', 'O460102', 'O460103', 'O460104', 'O460105', 'O460106', 'O460107', 'O460110', 'O460111', 'O460112', 'O460113', 'O460114', 'O460115', 'O460120', 'O460121', 'O460122', 'O460123', 'O460200', 'O460201', 'O460202', 'O460210']

zone = CellZoneFromPoly([None, polygon], 7, True)
list(map(str, zone.cells_list))
# ['O4600127', 'O4600128', 'O4600144', 'O4600145', 'O4600147', 'O4600148', 'O4600150', 'O4600151', 'O4600152', 'O4600153', 'O4600154', 'O4600155', 'O4600156', 'O4600157', 'O4600158', 'O4600206', 'O4600207', 'O4600208', 'O4600216', 'O4600217', 'O4600218', 'O4600223', 'O4600224', 'O4600225', 'O4600226', 'O4600227', 'O4600228', 'O460023', 'O460024', 'O460025', 'O4600283', 'O4600284', 'O4600285', 'O4601003', 'O4601004', 'O4601005', 'O4601006', 'O4601007', 'O4601008', 'O4601013', 'O4601014', 'O4601015', 'O4601016', 'O4601017', 'O4601018', 'O4601023', 'O4601024', 'O4601025', 'O4601026', 'O4601027', 'O4601028', 'O460103', 'O460104', 'O4601050', 'O4601051', 'O4601052', 'O4601053', 'O4601054', 'O4601055', 'O4601056', 'O4601057', 'O4601058', 'O4601063', 'O4601064', 'O4601065', 'O4601073', 'O4601074', 'O4601075', 'O4601083', 'O4601103', 'O4601104', 'O4601105', 'O4601106', 'O4601107', 'O4601108', 'O4601113', 'O4601114', 'O4601115', 'O4601116', 'O4601117', 'O4601118', 'O4601123', 'O4601124', 'O4601125', 'O4601126', 'O4601127', 'O4601128', 'O4601130', 'O4601131', 'O4601132', 'O4601133', 'O4601134', 'O4601135', 'O4601136', 'O4601137', 'O4601138', 'O4601140', 'O4601141', 'O4601142', 'O4601143', 'O4601144', 'O4601145', 'O4601146', 'O4601150', 'O4601151', 'O4601152', 'O4601153', 'O4601154', 'O4601155', 'O4601203', 'O4601204', 'O4601205', 'O4601206', 'O4601207', 'O4601208', 'O4601213', 'O4601214', 'O4601215', 'O4601216', 'O4601217', 'O4601218', 'O4601223', 'O4601224', 'O4601225', 'O4601226', 'O4601227', 'O4601228', 'O4601230', 'O4601231', 'O4601232', 'O4601233', 'O4601234', 'O4601235', 'O4601240', 'O4601241', 'O4601242', 'O4601243', 'O4601250', 'O4601251', 'O4601252', 'O4602003', 'O4602004', 'O4602005', 'O4602006', 'O4602007', 'O4602008', 'O4602013', 'O4602014', 'O4602015', 'O4602016', 'O4602017', 'O4602018', 'O4602023', 'O4602024', 'O4602025', 'O4602026', 'O4602027', 'O4602028', 'O4602030', 'O4602031', 'O4602032', 'O4602103', 'O4602104', 'O4602106', 'O4602107', 'O4602108']

I'm not particularly fond of that interface, but there it is at the moment. The compressed representation is available via compress_order_cells, but you have to cast the cell objects to string:

compress_order_cells(map(str, zone.cells_list))
# ['O460015', 'O460023', 'O460024', 'O460025', 'O460103', 'O460104', 'O460105', 'O460113', 'O4600127', 'O4600128', 'O4600144', 'O4600145', 'O4600147', 'O4600148', 'O4600206', 'O4600207', 'O4600208', 'O4600216', 'O4600217', 'O4600218', 'O4600223', 'O4600224', 'O4600225', 'O4600226', 'O4600227', 'O4600228', 'O4600283', 'O4600284', 'O4600285', 'O4601003', 'O4601004', 'O4601005', 'O4601006', 'O4601007', 'O4601008', 'O4601013', 'O4601014', 'O4601015', 'O4601016', 'O4601017', 'O4601018', 'O4601023', 'O4601024', 'O4601025', 'O4601026', 'O4601027', 'O4601028', 'O4601063', 'O4601064', 'O4601065', 'O4601073', 'O4601074', 'O4601075', 'O4601083', 'O4601103', 'O4601104', 'O4601105', 'O4601106', 'O4601107', 'O4601108', 'O4601113', 'O4601114', 'O4601115', 'O4601116', 'O4601117', 'O4601118', 'O4601123', 'O4601124', 'O4601125', 'O4601126', 'O4601127', 'O4601128', 'O4601140', 'O4601141', 'O4601142', 'O4601143', 'O4601144', 'O4601145', 'O4601146', 'O4601150', 'O4601151', 'O4601152', 'O4601153', 'O4601154', 'O4601155', 'O4601203', 'O4601204', 'O4601205', 'O4601206', 'O4601207', 'O4601208', 'O4601213', 'O4601214', 'O4601215', 'O4601216', 'O4601217', 'O4601218', 'O4601223', 'O4601224', 'O4601225', 'O4601226', 'O4601227', 'O4601228', 'O4601230', 'O4601231', 'O4601232', 'O4601233', 'O4601234', 'O4601235', 'O4601240', 'O4601241', 'O4601242', 'O4601243', 'O4601250', 'O4601251', 'O4601252', 'O4602003', 'O4602004', 'O4602005', 'O4602006', 'O4602007', 'O4602008', 'O4602013', 'O4602014', 'O4602015', 'O4602016', 'O4602017', 'O4602018', 'O4602023', 'O4602024', 'O4602025', 'O4602026', 'O4602027', 'O4602028', 'O4602030', 'O4602031', 'O4602032', 'O4602103', 'O4602104', 'O4602106', 'O4602107', 'O4602108']

len(compress_order_cells(map(str, zone.cells_list))) < len(zone.cells_list)
# True

Resolution 7 and 8:

Screenshot from 2024-11-01 23-15-23
Screenshot from 2024-11-01 23-15-30

@ndemaio ndemaio mentioned this issue Apr 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants