Commit 6854197

Merge pull request #64 from xiaodaigh/patch-1
typos and grammar
2 parents a7107da + dee61b4 commit 6854197

1 file changed: +17 -17 lines changed

docs/src/index.md

Lines changed: 17 additions & 17 deletions
@@ -2,21 +2,21 @@

 ## Motivation

-It's actually a funny story led to the development of this package.
-What started off as a personal toy project trying to re-construct the K-Means algorithm in native Julia blew up after a heated discussion on the Julia Discourse forum when I asked for Julia optimizaition tips. Long story short, Julia community is an amazing one! Andrey offered his help and together, we decided to push the speed limits of Julia with a parallel implementation of the most famous clustering algorithm. The initial results were mind blowing so we have decided to tidy up the implementation and share with the world as a maintained Julia pacakge.
+It's actually a funny story that led to the development of this package.
+What started off as a personal toy project trying to re-construct the K-Means algorithm in native Julia blew up after a heated discussion on the Julia Discourse forum when I asked for Julia optimization tips. Long story short, the Julia community is an amazing one! Andrey offered his help and together, we decided to push the speed limits of Julia with a parallel implementation of the most famous clustering algorithm. The initial results were mind blowing so we have decided to tidy up the implementation and share with the world as a maintained Julia package.

 Say hello to `ParallelKMeans`!

-This package aims to utilize the speed of Julia and parallelization (both CPU & GPU) to offer an extremely fast implementation of the K-Means clustering algorithm and its variations via a friendly interface for practioners.
+This package aims to utilize the speed of Julia and parallelization (both CPU & GPU) to offer an extremely fast implementation of the K-Means clustering algorithm and its variants via a friendly interface for practioners.

-In short, we hope this package will eventually mature as the "one stop" shop for everything K-Means on both CPUs and GPUs.
+In short, we hope this package will eventually mature as the "one-stop-shop" for everything K-Means on CPUs and GPUs.

 ## K-Means Algorithm Implementation Notes

-Since Julia is a column major language, the input (design matrix) expected by the package in the following format;
+Since Julia is a column major language, the input (design matrix) expected by the package must be in the following format;

 - Design matrix X of size n×m, the i-th column of X `(X[:, i])` is a single data point in n-dimensional space.
-- Thus, the rows of the design design matrix represents the feature space with the columns representing all the training examples in this feature space.
+- Thus, the rows of the design matrix represent the feature space with the columns representing all the training samples in this feature space.

 One of the pitfalls of K-Means algorithm is that it can fall into a local minima.
 This implementation inherits this problem like every implementation does.
@@ -26,28 +26,28 @@ As a result, it is useful in practice to restart it several times to get the cor

 You can grab the latest stable version of this package from Julia registries by simply running;

-*NB:* Don't forget to Julia's package manager with `]`
+*NB:* Don't forget to invoke Julia's package manager with `]`

 ```julia
 pkg> add ParallelKMeans
 ```

-For the few (and selected) brave ones, one can simply grab the current experimental features by simply adding the experimental branch to your development environment after invoking the package manager with `]`:
+The few (and selected) brave ones can simply grab the current experimental features by simply adding the experimental branch to your development environment after invoking the package manager with `]`:

 ```julia
 dev git@github.com:PyDataBlog/ParallelKMeans.jl.git
 ```

-Don't forget to checkout the experimental branch and you are good to go with bleeding edge features and breaks!
+Don't forget to checkout the experimental branch and you are good to go with bleeding edge features and breakages!

 ```bash
 git checkout experimental
 ```

 ## Features

-- Lightening fast implementation of Kmeans clustering algorithm even on a single thread in native Julia.
-- Support for multi-theading implementation of K-Means clustering algorithm.
+- Lightning fast implementation of Kmeans clustering algorithm even on a single thread in native Julia.
+- Support for multi-threading implementation of K-Means clustering algorithm.
 - 'Kmeans++' initialization for faster and better convergence.
 - Implementation of available classic and contemporary variants of the K-Means algorithm.

@@ -68,7 +68,7 @@ git checkout experimental

 ## How To Use

-Taking advantage of Julia's brilliant multiple dispatch system, the package exposes users to a very easy to use API.
+Taking advantage of Julia's brilliant multiple dispatch system, the package exposes users to a very easy-to-use API.

 ```julia
 using ParallelKMeans
@@ -90,7 +90,7 @@ r = kmeans(Lloyd(), X, 3) # same result as the default
 ```

 ```julia
-# r contains all the learned artifacts which can be accessed as;
+# r contains all the learned artifacts that can be accessed as;
 r.centers # cluster centers (d x k)
 r.assignments # label assignments (n)
 r.totalcost # total cost (i.e. objective)
@@ -121,7 +121,7 @@ iris = dataset("datasets", "iris");
 # features to use for clustering
 features = collect(Matrix(iris[:, 1:4])');

-# various artificats can be accessed from the result ie assigned labels, cost value etc
+# various artifacts can be accessed from the result i.e. assigned labels, cost value etc
 result = kmeans(features, 3);

 # plot with the point color mapped to the assigned cluster index
@@ -140,14 +140,14 @@ using ParallelKMeans
 # Single Thread Implementation of Lloyd's Algorithm
 b = [ParallelKMeans.kmeans(X, i, n_threads=1; tol=1e-6, max_iters=300, verbose=false).totalcost for i = 2:10]

-# Multi Thread Implementation of Lloyd's Algorithm by default
+# Multi-threaded Implementation of Lloyd's Algorithm by default
 c = [ParallelKMeans.kmeans(X, i; tol=1e-6, max_iters=300, verbose=false).totalcost for i = 2:10]

 ```

 ## Benchmarks

-Currently, this package is benchmarked against similar implementation in both Python and Julia. All reproducible benchmarks can be found in [ParallelKMeans/extras](https://github.com/PyDataBlog/ParallelKMeans.jl/tree/master/extras) directory. More tests in various languages are planned beyond the initial release version (`0.1.0`).
+Currently, this package is benchmarked against similar implementations in both Python and Julia. All reproducible benchmarks can be found in [ParallelKMeans/extras](https://github.com/PyDataBlog/ParallelKMeans.jl/tree/master/extras) directory. More tests in various languages are planned beyond the initial release version (`0.1.0`).

 *Note*: All benchmark tests are made on the same computer to help eliminate any bias.

@@ -179,7 +179,7 @@ ________________________________________________________________________________

 ## Contributing

-Ultimately, we see this package as potentially the one stop shop for everything related to KMeans algorithm and its speed up variants. We are open to new implementations and ideas from anyone interested in this project.
+Ultimately, we see this package as potentially the one-stop-shop for everything related to KMeans algorithm and its speed up variants. We are open to new implementations and ideas from anyone interested in this project.

 Detailed contribution guidelines will be added in upcoming releases.
