Grouping skiers into clusters. Source: Marcus Lofvenberg, Unsplash.com

Unsupervised Clustering Models

Building additional models on unlabeled ski resort data

sklearn.cluster

K-Means Cluster Model

# Import the clustering model and the plotting libraries used below
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
import seaborn as sns
# Create a copy of cleaned, standardized data
kmeans_all_customers = all_customers.copy()
# Set clusters = 5 and include random state for reproducibility
kmeans_all_customers_model_5 = KMeans(n_clusters=5, random_state=123)
# Fit the model on the following scaled columns and predict a cluster label for each customer
y = kmeans_all_customers_model_5.fit_predict(
    scaled_all_customers_df[['Number of Trips', 'Total Revenue', 'Mean Order Time',
                             'Adult Tickets', 'Youth/Senior Tickets', 'Miles to Resort']]
)
# Add cluster feature back to data frame copy
kmeans_all_customers['Cluster'] = y
# Create a groupby of the clusters
kmeans_model_clusters = kmeans_all_customers.groupby('Cluster')
# Run my examine_clusters_again function to get a better look at the cluster stats:
examine_clusters_again(kmeans_all_customers)
# Visualize the revenue distribution of each cluster with a box plot.
sns.boxplot(x='Cluster', y='Total Revenue', data=kmeans_all_customers)
plt.title('KMeans Customer Revenue', fontsize = 24)
plt.xlabel('Cluster Number', fontsize = 20)
plt.ylabel('Total Revenue', fontsize = 20)
plt.show()
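
The K-Means steps above rely on data frames that are not defined in this section. As a minimal, self-contained sketch of the same workflow, assuming a small synthetic table in place of the real customer data (the frame `demo_customers`, its values, and the reduced column set are illustrative assumptions, not the resort data):

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the customer table (all values are synthetic)
rng = np.random.default_rng(123)
demo_customers = pd.DataFrame({
    'Number of Trips': rng.integers(1, 20, size=200),
    'Total Revenue': rng.gamma(shape=2.0, scale=150.0, size=200),
    'Miles to Resort': rng.uniform(5, 300, size=200),
})

# Standardize so that no single column dominates the distance calculation
scaled_demo = StandardScaler().fit_transform(demo_customers)

# Fit K-Means on the scaled values, then attach the labels to the unscaled frame
demo_model = KMeans(n_clusters=5, n_init=10, random_state=123)
demo_customers['Cluster'] = demo_model.fit_predict(scaled_demo)

# Summarize each cluster in the original units
cluster_summary = demo_customers.groupby('Cluster').mean()
cluster_sizes = demo_customers['Cluster'].value_counts().sort_index()
print(cluster_summary)
print(cluster_sizes)
```

Fitting on scaled values while summarizing the unscaled copy keeps the cluster statistics readable in dollars, trips, and miles.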

Gaussian Mixture

# GaussianMixture lives in sklearn.mixture, not sklearn.cluster
from sklearn.mixture import GaussianMixture
# Create a new copy of cleaned, standardized data
gausmix_all_customers = all_customers.copy()
# Set components = 5 and include random state for reproducibility
gaussian_all_customers_model_5 = GaussianMixture(n_components=5, random_state=123)
# Fit the model on the following scaled columns and predict a component label for each customer
y = gaussian_all_customers_model_5.fit_predict(
    scaled_all_customers_df[['Number of Trips', 'Total Revenue', 'Mean Order Time',
                             'Adult Tickets', 'Youth/Senior Tickets', 'Miles to Resort']]
)
# Add cluster feature back to data frame copy
gausmix_all_customers['Cluster'] = y
# Create a groupby of the clusters
gausmix_model_clusters = gausmix_all_customers.groupby('Cluster')
# Run my examine_clusters_again function to get a better look at the cluster stats:
examine_clusters_again(gausmix_all_customers)
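
Unlike K-Means, a Gaussian mixture also gives each point a membership probability for every component. A minimal sketch of reading those soft assignments, assuming two synthetic blobs instead of the customer features (the array `demo_features` and the choice of two components are illustrative assumptions):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in: two well-separated groups of 100 points each
rng = np.random.default_rng(123)
demo_features = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 2)),
    rng.normal(loc=6.0, scale=1.0, size=(100, 2)),
])

demo_gmm = GaussianMixture(n_components=2, random_state=123)

# Hard labels: the most likely component for each point
hard_labels = demo_gmm.fit_predict(demo_features)

# Soft labels: one probability per component, each row sums to 1
membership = demo_gmm.predict_proba(demo_features)
print(membership[:3].round(3))

# BIC can be compared across different n_components values (lower is better)
print(demo_gmm.bic(demo_features))
```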

Agglomerative Clustering
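
As a minimal sketch of how agglomerative clustering could be run with scikit-learn, assuming synthetic points in place of the scaled customer features (the data, `n_clusters=3`, and Ward linkage are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Synthetic stand-in: three compact groups of 50 points each
rng = np.random.default_rng(123)
demo_features = np.vstack([
    rng.normal(loc=center, scale=0.5, size=(50, 2))
    for center in (0.0, 5.0, 10.0)
])

# Ward linkage merges the pair of clusters that least increases within-cluster variance
demo_agg = AgglomerativeClustering(n_clusters=3, linkage='ward')
agg_labels = demo_agg.fit_predict(demo_features)

# Number of points assigned to each cluster
print(np.bincount(agg_labels))
```

This estimator has no `random_state` parameter because the merge order is deterministic for a given data set.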

Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
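
DBSCAN does not take a cluster count. It grows clusters from dense regions and labels isolated points as noise with `-1`. A minimal sketch, assuming synthetic points in place of the scaled customer features (the data and the `eps` and `min_samples` values are illustrative assumptions that would need tuning on real data):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in: two dense groups plus two far-away outliers
rng = np.random.default_rng(123)
dense = np.vstack([
    rng.normal(loc=center, scale=0.3, size=(100, 2))
    for center in (0.0, 5.0)
])
outliers = np.array([[20.0, 20.0], [-20.0, 20.0]])
demo_features = np.vstack([dense, outliers])

# eps is the neighborhood radius; min_samples is the density threshold for a core point
demo_dbscan = DBSCAN(eps=0.5, min_samples=5)
db_labels = demo_dbscan.fit_predict(demo_features)

# Count clusters (excluding the noise label -1) and noise points
n_clusters = len(set(db_labels)) - (1 if -1 in db_labels else 0)
n_noise = int((db_labels == -1).sum())
print(n_clusters, n_noise)
```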

Spectral Clustering
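
Spectral clustering builds a similarity graph between points and clusters them in the graph's embedding, which lets it follow shapes that are not compact blobs. A minimal sketch, assuming scikit-learn's two-moons toy data in place of the scaled customer features (the data set and the `n_clusters` and `n_neighbors` values are illustrative assumptions):

```python
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons

# Synthetic stand-in: two interleaved half-circles
demo_features, _ = make_moons(n_samples=200, noise=0.05, random_state=123)

# Build the affinity matrix from a k-nearest-neighbors graph
demo_spectral = SpectralClustering(
    n_clusters=2,
    affinity='nearest_neighbors',
    n_neighbors=10,
    random_state=123,
)
spectral_labels = demo_spectral.fit_predict(demo_features)

# Number of points assigned to each cluster
print({int(label): int((spectral_labels == label).sum()) for label in set(spectral_labels)})
```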
