BeCode team challenge
- A visualisation of the USA with Chipotle locations
- A visualisation of the different clusters
- An intrinsic analysis comparing the clusters of at least 2 methods with varying arguments (using Euclidean distance as the criterion)
- A chosen centroid to live at. Argue why the chosen centroid is superior to the others. Example arguments:
  - highest density
  - greatest uninterrupted link of Chipotle locations with the smallest link-to-link distance
  - ...
- A GitHub page where the results are visualised
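As an illustration of the intrinsic comparison, here is a minimal sketch that scores two clustering methods over several cluster counts with the silhouette coefficient on Euclidean distance. The choice of KMeans and Ward agglomerative clustering, the silhouette score as the intrinsic metric, the helper name `compare_methods`, and the synthetic coordinates are all assumptions made for the example, not the methods or data used in this project.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score


def compare_methods(coords, k_values=(2, 3, 4, 5)):
    """Return {(method_name, k): silhouette score} using Euclidean distance."""
    scores = {}
    for k in k_values:
        models = {
            "kmeans": KMeans(n_clusters=k, n_init=10, random_state=0),
            "ward": AgglomerativeClustering(n_clusters=k, linkage="ward"),
        }
        for name, model in models.items():
            labels = model.fit_predict(coords)
            scores[(name, k)] = silhouette_score(coords, labels, metric="euclidean")
    return scores


# Synthetic stand-in for (longitude, latitude) pairs: three well-separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[-120.0, 37.0], [-97.0, 31.0], [-75.0, 40.0]])
coords = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in centers])

scores = compare_methods(coords)
best = max(scores, key=scores.get)  # (method, k) with the highest silhouette score
print(best, round(scores[best], 3))
```

On the real data, `coords` would be the longitude/latitude columns of the store table. A higher silhouette score means tighter, better separated clusters, so the table of scores gives one argument for picking a method and a resolution.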
- Create the repository
- Install geopandas
- Plot the US map
- Visualize your data on this map
- Plot a dendrogram of your data to help you decide on the appropriate clustering resolution
- Compare and analyse different clustering methods using intrinsic analysis to decide on a method.
- Choose a centroid/address to live at
- Publish your results to a GitHub page with an explanation of your method.
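The dendrogram step can be sketched with SciPy's hierarchical clustering tools. This is only an illustration under stated assumptions: Ward linkage, the "largest gap between merge heights" heuristic for picking the resolution, and the synthetic coordinates are choices made for the example, not necessarily what we used.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Synthetic stand-in for (longitude, latitude) pairs: three well-separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[-120.0, 37.0], [-97.0, 31.0], [-75.0, 40.0]])
coords = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in centers])

# Ward linkage on Euclidean distances; Z has one row per merge (n - 1 rows).
Z = linkage(coords, method="ward")

# no_plot=True only computes the tree layout; remove it to draw with matplotlib.
tree = dendrogram(Z, truncate_mode="lastp", p=12, no_plot=True)

# Heuristic: the largest jump between consecutive merge heights marks the cut.
heights = Z[:, 2]
gaps = np.diff(heights)
n_clusters = len(heights) - int(np.argmax(gaps))

# Cut the tree into that many flat clusters.
labels = fcluster(Z, t=n_clusters, criterion="maxclust")
print("suggested number of clusters:", n_clusters)
```

Reading the plotted dendrogram by eye follows the same idea: long vertical branches with no merges in between indicate a natural place to cut the tree.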
Dataset "chipotle_stores", loaded as "df"
Dataset "states", loaded using geopandas
First, we drew the map of the states using geopandas, so that we could plot our store locations directly on the map.
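A minimal sketch of how the store table can be turned into points and drawn on top of the states layer. The helper names `stores_to_geodataframe` and `plot_stores_on_states`, the column names `longitude` and `latitude`, and the EPSG:4326 coordinate reference system are assumptions for illustration; adjust them to the actual datasets.

```python
import geopandas as gpd
import pandas as pd


def stores_to_geodataframe(df, lon_col="longitude", lat_col="latitude"):
    """Attach a point geometry built from the (assumed) lon/lat columns."""
    geometry = gpd.points_from_xy(df[lon_col], df[lat_col])
    return gpd.GeoDataFrame(df.copy(), geometry=geometry, crs="EPSG:4326")


def plot_stores_on_states(states, stores):
    """Draw the state outlines, then the store points on the same axes."""
    if states.crs is not None:
        # Both layers must share a coordinate reference system to line up.
        stores = stores.to_crs(states.crs)
    ax = states.plot(color="white", edgecolor="grey", figsize=(12, 7))
    stores.plot(ax=ax, markersize=4, color="red")
    return ax


# Toy rows standing in for the real store table.
toy = pd.DataFrame(
    {
        "state": ["Colorado", "Texas"],
        "longitude": [-104.99, -97.74],
        "latitude": [39.74, 30.27],
    }
)
stores = stores_to_geodataframe(toy)
print(stores.geometry.iloc[0])
```

With the states layer loaded as a GeoDataFrame named `states` (a hypothetical variable name), `plot_stores_on_states(states, stores)` returns the matplotlib axes holding both layers.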
Now we have to keep only the relevant states, i.e. those with more than 20 stores.
This is easy with the following code:
df = df.groupby('state').filter(lambda x: len(x) > 20)
Next:
- Check the number of invalid metric entries.
- Adjust the index.
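The two clean-up steps above can be sketched as follows. We assume here that an "invalid" entry means a missing or out-of-range coordinate and that the columns are named `latitude` and `longitude`; both are assumptions for the example, and the toy rows are made up.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the filtered store table; column names are assumptions.
df = pd.DataFrame(
    {
        "state": ["Texas", "Texas", "Ohio", "Ohio", "Ohio"],
        "latitude": [30.27, np.nan, 39.96, 95.0, 41.50],
        "longitude": [-97.74, -96.80, -83.00, -81.69, np.nan],
    },
    index=[0, 3, 7, 8, 12],  # gaps like those left behind by groupby(...).filter(...)
)

# An entry counts as invalid if a coordinate is missing or outside its valid range.
invalid = (
    df[["latitude", "longitude"]].isna().any(axis=1)
    | ~df["latitude"].between(-90, 90)
    | ~df["longitude"].between(-180, 180)
)
print("invalid entries:", int(invalid.sum()))

# Drop the invalid rows and rebuild a contiguous 0..n-1 index.
df = df[~invalid].reset_index(drop=True)
print(df)
```

Resetting the index matters because filtering keeps the original row labels, so positional and label-based access no longer agree until the index is rebuilt.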
Our map looks much clearer like this