You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/src/index.md
+5-2Lines changed: 5 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -72,15 +72,16 @@ git checkout experimental
72
72
-[X] Implementation of [Hamerly implementation](https://www.researchgate.net/publication/220906984_Making_k-means_Even_Faster).
73
73
-[X] Interface for inclusion in Alan Turing Institute's [MLJModels](https://github.com/alan-turing-institute/MLJModels.jl#who-is-this-repo-for).
74
74
-[X] Full Implementation of Triangle inequality based on [Elkan - 2003 Using the Triangle Inequality to Accelerate K-Means"](https://www.aaai.org/Papers/ICML/2003/ICML03-022.pdf).
75
-
-[X] Implementation of [Yinyang K-Means: A Drop-In Replacement of the Classic K-Means with Consistent Speedup](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/ding15.pdf)
75
+
-[X] Implementation of [Yinyang K-Means: A Drop-In Replacement of the Classic K-Means with Consistent Speedup](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/ding15.pdf).
76
+
-[X] Implementation of [Coresets](http://proceedings.mlr.press/v51/lucic16-supp.pdf).
76
77
-[ ] Implementation of [Geometric methods to accelerate k-means algorithm](http://cs.baylor.edu/~hamerly/papers/sdm2016_rysavy_hamerly.pdf).
78
+
-[X] Support for weighted K-means.
77
79
-[ ] Support for other distance metrics supported by [Distances.jl](https://github.com/JuliaStats/Distances.jl#supported-distances).
78
80
-[ ] Support of MLJ Random generation hyperparameter.
79
81
-[ ] Native support for tabular data inputs outside of MLJModels' interface.
80
82
-[ ] Refactoring and finalizaiton of API desgin.
81
83
-[ ] GPU support.
82
84
-[ ] Distributed calculations support.
83
-
-[ ] Implementation of other K-Means algorithm variants based on recent literature.
84
85
-[ ] Optimization of code base.
85
86
-[ ] Improved Documentation
86
87
-[ ] More benchmark tests.
@@ -123,6 +124,7 @@ r.converged # whether the procedure converged
123
124
-[Hamerly()](https://www.researchgate.net/publication/220906984_Making_k-means_Even_Faster) - Hamerly is good for moderate number of clusters (< 50?) and moderate dimensions (<100?).
124
125
-[Elkan()](https://www.aaai.org/Papers/ICML/2003/ICML03-022.pdf) - Recommended for high dimensional data.
125
126
-[Yinyang()](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/ding15.pdf) - Recommended for large dimensions and/or large number of clusters.
127
+
-[Coreset()](http://proceedings.mlr.press/v51/lucic16-supp.pdf) - Recommended for very fast clustering of very large datasets, when extreme accuracy is not important.
0 commit comments