Add node-specific tolerances to intersection consolidation #1160

Merged
gboeing merged 18 commits into gboeing:v2 from EwoutH:consolidate_conditional_tolerance
Apr 25, 2024

Conversation

EwoutH
Contributor

@EwoutH EwoutH commented Apr 18, 2024

This PR enhances the consolidate_intersections function in the OSMnx simplification module to support node-specific consolidation tolerances. The tolerance parameter, previously a single float applied to every node, can now be either a float or a dictionary mapping node IDs to tolerance values. This lets users specify a tolerance per node, giving finer control over intersection consolidation.

Motivation

Currently consolidate_intersections applies a uniform tolerance across a network, which may not suit complex urban models where different areas require different levels of detail. For example, a dense city center often needs a lower tolerance due to more intricate street layouts, while suburban areas might warrant higher tolerances.

See #1150.

Key Changes

  • The tolerance parameter now accepts a dictionary mapping node IDs to specific tolerance values, in addition to the existing option of a single float.
  • The internal function _merge_nodes_geometric has been updated to dynamically adjust buffer distances based on node-specific tolerances provided in the dictionary, with a default fallback tolerance if specific values are not provided.
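
The effect of per-node buffer distances can be illustrated without any geospatial library. The sketch below is not the OSMnx implementation: it approximates each buffer as an exact disk and groups nodes whose disks overlap using union-find. The function name `cluster_nodes` and the `fallback` parameter are hypothetical, introduced only for this illustration.

```python
import math
from itertools import combinations


def cluster_nodes(points, tolerance, fallback=0.0):
    """Group nodes whose buffer disks overlap (illustrative sketch only).

    points: dict of node ID -> (x, y) in projected units.
    tolerance: one number for every node, or a dict of node ID -> number.
    fallback: hypothetical radius for nodes missing from the dict.
    """

    def radius(node):
        if isinstance(tolerance, dict):
            return tolerance.get(node, fallback)
        return tolerance

    # Union-find: every node starts as its own cluster.
    parent = {node: node for node in points}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    # Two disks overlap when the centre distance is at most the sum of radii.
    for a, b in combinations(points, 2):
        if math.dist(points[a], points[b]) <= radius(a) + radius(b):
            parent[find(a)] = find(b)

    clusters = {}
    for node in points:
        clusters.setdefault(find(node), set()).add(node)
    return sorted(clusters.values(), key=min)
```

With three nodes on a line at x = 0, 12 and 40, a uniform tolerance of 5 keeps all three separate, while `{1: 5, 2: 5, 3: 25}` merges only nodes 2 and 3. A large tolerance on one node can therefore absorb neighbours that a uniform value would leave alone.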

Usage Example

import osmnx as ox
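G = ox.graph_from_place('Delft, The Netherlands', network_type='drive')
# consolidate_intersections expects a projected graph, so tolerances are in meters rather than degrees
G = ox.project_graph(G)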

# Apply a lower tolerance for high-connectivity nodes
tolerances = {node: 5 if count >= 4 else 10 for node, count in ox.stats.streets_per_node(G).items()}

G_consolidated = ox.consolidate_intersections(G, tolerance=tolerances)

In this example, nodes with higher street connectivity are assigned a smaller tolerance value (5), suitable for areas with denser road networks. Other nodes default to a higher tolerance of 10.


Closes #1150.

This branch can be installed with the following command:

pip install -U -e git+https://github.com/EwoutH/osmnx@consolidate_conditional_tolerance#egg=osmnx

@anastassiavybornova, @martinfleis, @jdmcbr and @jGaboardi please test and review if you have the chance :)

Contributor

@martinfleis martinfleis left a comment

Given we are in the realm of networkx.MultiDiGraph here, a "column" does not really have a meaning. The conversion to gdf is internal only. Since we are fetching the value from a node attribute, I would suggest using something like tolerance_attribute instead.

Do you have any idea how you would define local tolerance for each node? One thing is to enable it in the function, the other is how to actually use it.

osmnx/simplification.py Outdated Show resolved Hide resolved
Contributor

@martinfleis martinfleis left a comment

Since you are touching the function anyway, I would advise using dissolve and explode over unary_union and iteration over parts.

Like this (including my comments from above)

def _merge_nodes_geometric(G, tolerance, tolerance_column):
    gdf_nodes = convert.graph_to_gdfs(G, edges=False)

    if tolerance_column and tolerance_column in gdf_nodes.columns:
        buffer_distances = gdf_nodes[tolerance_column].fillna(tolerance)
        merged = gdf_nodes['geometry'].buffer(distance=buffer_distances).dissolve()
    else:
        merged = gdf_nodes['geometry'].buffer(distance=tolerance).dissolve()
    
    return merged.explode(ignore_index=True)

EwoutH and others added 3 commits April 18, 2024 11:41
- Remove unused numpy import (after refactoring it wasn't necessary anymore).
- Add tolerance_attribute docstring to private functions
@EwoutH
Contributor Author

EwoutH commented Apr 18, 2024

Thanks for the swift review! I implemented the suggested changes.

Do you have any idea how you would define local tolerance for each node?

I think we can leave that up to users. An example is given in the PR description, but you can basically do any kind of custom mask, loop, function etc.

Since you are touching the function anyway, I would advise using dissolve and explode over unary_union and iteration over parts.

Sounds like a good idea, but I would like to keep this PR as small as possible (and atomic), so let's do that in another PR.

codecov bot commented Apr 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.31%. Comparing base (a559a22) to head (e4e972e).

Additional details and impacted files
@@           Coverage Diff           @@
##               v2    #1160   +/-   ##
=======================================
  Coverage   98.30%   98.31%           
=======================================
  Files          24       24           
  Lines        2365     2370    +5     
=======================================
+ Hits         2325     2330    +5     
  Misses         40       40           


@gboeing
Owner

gboeing commented Apr 23, 2024

I'll try to take a look later this week. In the meantime, please base this PR off the v2 branch. The main branch (soon to be v1) is for maintenance only at this point, rather than feature development.

@EwoutH
Contributor Author

EwoutH commented Apr 23, 2024

Right. If that’s the case, could you set v2 as the default branch?

@gboeing
Owner

gboeing commented Apr 24, 2024

Yes, soon the v2 branch will finally be merged into main after #1157 (and potentially this PR) are merged in, ideally later this week (and a permanent v1 maintenance branch will split off of main beforehand). The branch chaos will at long last be at its end.

@EwoutH EwoutH changed the base branch from main to v2 April 24, 2024 06:04
@EwoutH
Contributor Author

EwoutH commented Apr 24, 2024

I think I resolved all the conflicts when targeting the v2 branch correctly.

Owner

@gboeing gboeing left a comment

Thanks @EwoutH. If a simple user-defined per-node tolerance like this is useful to the community, then it certainly makes sense to add it. I've requested a change. Additionally, the tests are currently failing. Consider running the pre-commit hooks locally for ease.

osmnx/simplification.py Outdated Show resolved Hide resolved
This commit updates the consolidate_intersections function to accept the tolerance parameter as either a float or a dictionary mapping node IDs to floats.

It removes the previously suggested tolerance_attribute.
@EwoutH EwoutH force-pushed the consolidate_conditional_tolerance branch from f0b3817 to 933a7e3 Compare April 25, 2024 07:50
@EwoutH
Contributor Author

EwoutH commented Apr 25, 2024

Reimplemented tolerance to accept a dict, removed the tolerance_attribute argument, updated the docstring and updated the PR message.

I tested it on one of my own notebooks, and it works beautifully!

# Create a dict mapping node ids to tolerances: 10 if part of the city network, 50 if part of the surrounding area network
tolerance_dict = {node: 10 if data["network"] == city_name else 50 for node, data in merged_network.nodes(data=True)}

# Consolidate intersections
merged_network = ox.consolidate_intersections(merged_network, tolerance=tolerance_dict, rebuild_graph=True)

CC @anastassiavybornova, @martinfleis, @jdmcbr and @jGaboardi, please test and review if you have the chance :)

Update `_merge_nodes_geometric` to manage absent tolerance values by reverting to original geometries instead of creating POLYGON EMPTY.
Owner

@gboeing gboeing left a comment

Thanks @EwoutH. I left a few final comments.

In addition, please 1) update the changelog, and 2) add simple tests to hit the changed lines to maintain coverage. I will then merge.

osmnx/simplification.py Outdated Show resolved Hide resolved
osmnx/simplification.py Outdated Show resolved Hide resolved
osmnx/simplification.py Outdated Show resolved Hide resolved
osmnx/simplification.py Outdated Show resolved Hide resolved
@gboeing
Owner

gboeing commented Apr 25, 2024

I would advise using dissolve and explode over unary_union and iteration over parts.

@martinfleis I missed this comment earlier, but your efficiency suggestion is well taken. Quick question for you: only GeoDataFrame has dissolve, and not GeoSeries right? The object in question here is a GeoSeries.

Don't cover the actual behaviour, just check if passing a dictionary (with and without all nodes covered) leads to a runtime error or not.
@EwoutH
Contributor Author

EwoutH commented Apr 25, 2024

Thanks for the review. All comments should be addressed.

Note that the tests don't cover the actual behaviour, just check if passing a dictionary (with and without all nodes covered) leads to a runtime error or not. While not ideal, this is in line with the other tests currently.

@EwoutH
Contributor Author

EwoutH commented Apr 25, 2024

CI is stuck on a type hinting issue: it doesn't like the int values in the tolerance dict, while it doesn't find them a problem when they are passed directly just a few lines above.

Maybe we should create a custom "number" type that is int | float and use that throughout.

Preferably in a separate PR though, I'm leaving for a week tonight.

@martinfleis
Contributor

only GeoDataFrame has dissolve, and not GeoSeries right?

Correct, I didn't realise that.

@gboeing
Owner

gboeing commented Apr 25, 2024

CI is stuck on a type hinting issue: it doesn't like the int values in the tolerance dict, while it doesn't find them a problem when they are passed directly just a few lines above.

Maybe we should create a custom "number" type that is int | float and use that throughout.

You had to type hint the dict for typeguard to know what data type comes out of G.nodes. Fixed in e4e972e.
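
The fix itself (e4e972e) is not reproduced in this thread, so the following is only a sketch of what annotating such a dict might look like. It relies on the fact that static type checkers following PEP 484 accept int values where float is annotated. The helper name `tolerances_by_street_count` is hypothetical, and a plain dict of street counts stands in for the graph's node data.

```python
def tolerances_by_street_count(street_counts: dict[int, int]) -> dict[int, float]:
    """Map node IDs to tolerances: 5 for nodes with 4+ streets, else 10."""
    # The explicit annotation tells the type checker the values are floats,
    # even though the literals 5 and 10 are ints.
    tolerances: dict[int, float] = {
        node: 5 if count >= 4 else 10 for node, count in street_counts.items()
    }
    return tolerances
```

Annotating the dict where it is built keeps the declared type in one place, which may be simpler than introducing a project-wide "number" alias.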

@gboeing gboeing merged commit 71f8d7f into gboeing:v2 Apr 25, 2024
7 checks passed
@gboeing
Owner

gboeing commented Apr 25, 2024

Do you have any idea how you would define local tolerance for each node? One thing is to enable it in the function, the other is how to actually use it.

@martinfleis yes, that is along the lines of what I was saying earlier in the issue: #1150 (comment)

It's relatively straightforward to implement a per-node tolerance, but the real challenge is in actually defining it per node. They are only useful in tandem, and the latter is infinitely variable depending on the theoretical and empirical needs of the individual analysis, so I have been reluctant to include this functionality in the past. I believe defining the local tolerance will most likely remain outside of OSMnx's scope accordingly. @EwoutH identified a group of folks who want to implement per-node tolerance, and since his solution hands off the definition of tolerance values to the user, I think this PR satisfies my scope and implementation concerns. The rest is up to the users now.

@gboeing gboeing mentioned this pull request Apr 25, 2024
13 tasks
@EwoutH
Contributor Author

EwoutH commented Apr 26, 2024

Thanks for merging!

I think this PR satisfies my scope and implementation concerns. The rest is up to the users now.

Agreed! I think we landed on quite an elegant solution that still offers a lot of flexibility to users.

@gboeing
Owner

gboeing commented May 3, 2024

The first pre-release OSMnx v2 beta has been released. Testers needed! See #1123 for details.

Development

Successfully merging this pull request may close these issues.

Conditional tolerance for intersection consolidation
3 participants