Commit 72d8fac

initial release beta version
1 parent dc3414b commit 72d8fac

4 files changed: 28 additions, 4 deletions
.github/workflows/CompatHelper.yml

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ jobs:
     steps:
       - uses: julia-actions/setup-julia@latest
         with:
-          version: 1.3
+          version: 1.4
       - name: Pkg.add("CompatHelper")
         run: julia -e 'using Pkg; Pkg.add("CompatHelper")'
       - name: CompatHelper.main()

.github/workflows/benchmarks.yml

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ jobs:
       - uses: actions/checkout@v2
       - uses: julia-actions/setup-julia@latest
         with:
-          version: 1.3
+          version: 1.4
       - name: Install dependencies
         run: julia -e 'using Pkg; pkg"add PkgBenchmark Distances StatsBase BenchmarkTools BenchmarkCI@0.1"'
       - name: Run benchmarks

docs/src/benchmark_image.png

-476 KB

docs/src/index.md

Lines changed: 26 additions & 2 deletions
@@ -86,6 +86,15 @@ some_results = kmeans([algo], input_matrix, k; kwargs)
 r = kmeans(Lloyd(), X, 3) # same result as the default
 ```

+```julia
+# r contains all the learned artifacts which can be accessed as;
+r.centers # cluster centers (d x k)
+r.assignments # label assignments (n)
+r.totalcost # total cost (i.e. objective)
+r.iterations # number of elapsed iterations
+r.converged # whether the procedure converged
+```
+
 ### Supported KMeans algorithm variations.
 - [Lloyd()](https://cs.nyu.edu/~roweis/csc2515-2006/readings/lloyd57.pdf)
 - [Hamerly()](https://www.researchgate.net/publication/220906984_Making_k-means_Even_Faster)
@@ -124,8 +133,7 @@ scatter(iris.PetalLength, iris.PetalWidth, marker_z=result.assignments,
 using ParallelKMeans

 # Single Thread Implementation of Lloyd's Algorithm
-b = [ParallelKMeans.kmeans(X, i, n_threads=1;
-tol=1e-6, max_iters=300, verbose=false).totalcost for i = 2:10]
+b = [ParallelKMeans.kmeans(X, i, n_threads=1; tol=1e-6, max_iters=300, verbose=false).totalcost for i = 2:10]

 # Multi Thread Implementation of Lloyd's Algorithm by default
 c = [ParallelKMeans.kmeans(X, i; tol=1e-6, max_iters=300, verbose=false).totalcost for i = 2:10]
@@ -142,9 +150,25 @@ Currently, this package is benchmarked against similar implementation in both Py
 Currently, the benchmark speed tests are based on the search for optimal number of clusters using the [Elbow Method](https://en.wikipedia.org/wiki/Elbow_method_(clustering)) since this is a practical use case for most practioners employing the K-Means algorithm.


+### Benchmark Results
+
 ![benchmark_image.png](benchmark_image.png)


+_________________________________________________________________________________________________________
+
+| 1 million (ms) | 100k (ms) | 10k (ms) | 1k (ms) | package | language |
+|:--------------:|:---------:|:--------:|:-------:|:-----------------------:|:--------:|
+| 600184.00 | 31959.00 | 832.25 | 18.19 | Clustering.jl | Julia |
+| 35733.00 | 4473.00 | 255.71 | 8.94 | Lloyd | Julia |
+| 12617.00 | 1655.00 | 122.53 | 7.98 | Hamerly | Julia |
+| 1430000.00 | 146000.00 | 5770.00 | 344.00 | Sklearn Kmeans | Python |
+| 30100.00 | 3750.00 | 613.00 | 201.00 | Sklearn MiniBatchKmeans | Python |
+| 218200.00 | 15510.00 | 733.70 | 19.47 | Knor | R |
+
+_________________________________________________________________________________________________________
+
+
 ## Release History
 - 0.1.0 Initial release
