Maybe it would be a good addition to allow printing/plotting performance metrics, e.g. silhouette_score or any other that is more applicable in the case of AgglomerativeClustering. Or would you prefer to keep those separate?
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import calinski_harabasz_score, davies_bouldin_score, silhouette_score


def plot_score(data: pd.DataFrame, metric: str, linkage: str, max_clusters: int, score: str = 'silhouette'):
    """Plot clustering performance score for different numbers of clusters.

    Args:
        data (pd.DataFrame): Data to cluster.
        metric (str): Metric to use for clustering.
        linkage (str): Linkage method to use for clustering.
        max_clusters (int): Maximum number of clusters to try.
        score (str, optional): Score to use. Defaults to 'silhouette'.

    Raises:
        ValueError: If the score is unknown.

    Returns:
        None
    """
    if score == 'silhouette':
        score_function = silhouette_score
    elif score == 'calinski_harabasz':
        score_function = calinski_harabasz_score
    elif score == 'davies_bouldin':
        score_function = davies_bouldin_score
    else:
        raise ValueError(f'Unknown score: {score}')

    scores = {}
    for i in range(2, max_clusters + 1):
        # `clusterer` is assumed to be the project's own module, imported elsewhere.
        labels = clusterer.apply_agglomerative_clustering(data, i, metric=metric, linkage=linkage)
        # Only silhouette_score accepts a `metric` argument;
        # the other two scores take just (X, labels).
        if score == 'silhouette':
            scores[i] = score_function(data, labels, metric=metric)
        else:
            scores[i] = score_function(data, labels)

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(list(scores.keys()), list(scores.values()))
    ax.set_xlabel('Number of clusters')
    ax.set_ylabel(f'{score.capitalize()} score')
    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)
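One thing to keep in mind when reading such a plot: silhouette and Calinski-Harabasz scores are better when higher, while Davies-Bouldin is better when lower. A minimal sketch of a helper that picks the best number of clusters from a `{n_clusters: score}` mapping like the `scores` dict above could look like this. The name `best_n_clusters` is hypothetical and not part of the project, and the example values are made up for illustration.

```python
from typing import Dict

# Hypothetical helper, not part of the project.
KNOWN_SCORES = {'silhouette', 'calinski_harabasz', 'davies_bouldin'}
# Davies-Bouldin is the only one of the three where lower means better.
LOWER_IS_BETTER = {'davies_bouldin'}


def best_n_clusters(scores: Dict[int, float], score: str = 'silhouette') -> int:
    """Return the number of clusters with the best value for the given score."""
    if score not in KNOWN_SCORES:
        raise ValueError(f'Unknown score: {score}')
    if not scores:
        raise ValueError('scores is empty')
    pick = min if score in LOWER_IS_BETTER else max
    return pick(scores, key=scores.get)


# Made-up values, only to show the direction of each score.
example = {2: 0.41, 3: 0.58, 4: 0.52}
print(best_n_clusters(example, 'silhouette'))      # 3 (highest value)
print(best_n_clusters(example, 'davies_bouldin'))  # 2 (lowest value)
```

Something like this could be used to mark the selected point on the plot, if that is wanted.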