K-means clustering is a well-known algorithm: you create k cluster centers and fit them to your data, and new data points are assigned to the cluster center closest to them. Here is an example of clusters we trained on random data:
Once we had the clusters, we created sets of input data and predicted labels for them on both our CPU and our GPU. On the CPU we used NumPy and scikit-learn's k-means implementation. For the GPU we wrote a CUDA kernel function that predicts labels for our data. We launched the kernel with 256 threads per block and the number of data points divided by 256 (rounded up) as the number of blocks.
Here is how we called our kmeans kernel function:
kmeanspredict<<<(N+255)/256, 256>>>(N, 3, d_feature1, d_feature2, d_model, d_label);
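The kernel body itself is not shown in the post; here is a minimal sketch of what a kernel matching that launch could look like. This is an assumption on our part: we take the arguments to be the number of points N, the number of clusters k, the two feature arrays, the trained centers, and the output labels, and we assume d_model stores the k centers as interleaved (x, y) pairs.

```cuda
#include <cfloat>  // FLT_MAX

// Hypothetical sketch matching the launch
//   kmeanspredict<<<(N+255)/256, 256>>>(N, 3, d_feature1, d_feature2, d_model, d_label);
// Assumption: model[] holds the k cluster centers as interleaved (x, y) pairs.
// Each thread labels one data point with the index of its nearest center.
__global__ void kmeanspredict(int n, int k,
                              const float *feature1, const float *feature2,
                              const float *model, int *label)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;  // grid is rounded up, so guard against overshoot

    float x = feature1[i];
    float y = feature2[i];

    int best = 0;
    float bestDist = FLT_MAX;
    for (int c = 0; c < k; ++c) {
        float dx = x - model[2 * c];
        float dy = y - model[2 * c + 1];
        float d = dx * dx + dy * dy;  // squared distance suffices for argmin
        if (d < bestDist) {
            bestDist = d;
            best = c;
        }
    }
    label[i] = best;
}
```

The (N+255)/256 in the launch is the usual integer-arithmetic way to compute ⌈N/256⌉, which is why the kernel needs the `i >= n` guard: the last block may contain threads past the end of the data.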
Since the time complexity of the task is O(n) in the number of data points, we expect the runtimes of both implementations to grow at about the same rate. In our tests, the GPU implementation ran about 100 times faster than the CPU code. Here are our runtime measurements for an increasing number of data points:
We used a GeForce GTX 1050 Ti for our tests. The kernel code is on our GitHub.

