1 change: 1 addition & 0 deletions .gitignore
@@ -59,3 +59,4 @@ data/
/pyenv
/python/pysol.cpp
/log
!requirements.txt
25 changes: 15 additions & 10 deletions README.md
@@ -39,9 +39,13 @@ To get started, please read the ``Quick Start'' section first.

Table of Contents
=================
- Installation
- Quick Start
- Additional Information
+ [Installation](#installation)
+ [Install from source](#install-from-source)
+ [Known Issues of Python Wrappers](#known-issues-of-python-wrappers)
+ [Quick Start](#quick-start)
+ [Comparison of Online Learning Algorithms](#comparison-of-online-learning-algorithms)
+ [License and Citation](#license-and-citation)
+ [Additional Information](#additional-information)

Installation
======================
@@ -62,7 +66,7 @@ Both the python scripts and C++ executables & Libraries are dependent on the sam
SOL features a very simple installation procedure. The project is managed by `CMake` for C++ and `setuptools` for python.


###Getting the code
### Getting the code

There exists a `CMakeLists.txt` in the root directory.
The latest version of SOL is always available via 'github' by invoking one
@@ -74,7 +78,7 @@ of the following:
## For HTTP-based Git interaction
$ git clone https://github.com/LIBOL/SOL.git

###Build C++ Executables and Dynamic Libraries
### Build C++ Executables and Dynamic Libraries

1. Prerequisites

@@ -130,6 +134,7 @@ We highly recommend users to install python packages in a virtual environment.

+ Build and install the python scripts

$ pip install -r requirements.txt
$ python setup.py build
$ python setup.py install

@@ -244,11 +249,11 @@ and LIBLINEAR. To quickly get a comparison on the small dataset ``a1a`` as
provided in the data folder:

$ cd experiments
$ python experiment.py --shufle 10 a1a ../data/a1a ../data/a1a.t
$ python experiment.py --repeat 10 a1a ../data/a1a ../data/a1a.t

The script will conduct cross validation to select best parameters for each
algorithm. Then the script will shuffle the training 10 times. For each
shuffled data, the script will train and test for each algorithm. The final
algorithm. Then the script will repeat the training 10 times. For each
repeated data, the script will train and test for each algorithm. The final
output is the average of all results. And a final table report will be shown as follows.

algorithm train train test test
@@ -273,7 +278,7 @@ output is the average of all results. And a final table report will be shown as
There will also be three pdf figures displaying the update number, training error rate, and test error rate over model sparsity.
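
The repeat-and-average protocol described above can be sketched roughly as follows. This is only an illustration, not the code in `experiment.py`: the function `repeat_and_average` and the `train_and_test` callback are hypothetical names, and the sketch assumes that each repetition shuffles the training data before training, as the earlier `--shuffle` wording suggests.

```python
import random
import statistics


def repeat_and_average(train_and_test, train_rows, repeats=10, seed=0):
    """Run `train_and_test` on `repeats` shuffled copies of `train_rows`.

    `train_and_test` is a hypothetical callback that trains on the given
    rows and returns a test error rate. Returns (mean, std) of those rates.
    """
    rng = random.Random(seed)
    error_rates = []
    for _ in range(repeats):
        shuffled = list(train_rows)  # copy so the caller's data is untouched
        rng.shuffle(shuffled)        # assumption: one shuffle per repetition
        error_rates.append(train_and_test(shuffled))
    # The population standard deviation is 0.0 for a single repetition.
    return statistics.mean(error_rates), statistics.pstdev(error_rates)
```

With `repeats=1` the standard deviation is zero by construction, which is why a single-run report shows no spread.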

Users can also compare on the multi-class dataset
[``mnist``](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#mnist) with the follow command (Note that we only shuffle the training data once in this example, so the standard deviation is zero):
[``mnist``](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#mnist) with the following command (Note that we only repeat the training data once in this example, so the standard deviation is zero):

$ python experiment.py mnist ../data/mnist.scale ../data/mnist.scale.t

@@ -301,7 +306,7 @@ The output is:
The tables and figures in our paper description are obtained with the following
command:

$ python experiment.py --shuffle 10 rcv1 ../data/rcv1_train ../data/rcv1_test
$ python experiment.py --repeat 10 rcv1 ../data/rcv1_train ../data/rcv1_test


License and Citation
7 changes: 7 additions & 0 deletions python/pysol.pxd
@@ -25,3 +25,10 @@ cdef extern from "sol/c_api.h":
int sol_convert_data(const char* src_path, const char* src_type, const char* dst_path, const char* dst_type, bint binarize, float binarize_thresh)
int sol_shuffle_data(const char* src_path, const char* src_type, const char* dst_path, const char* dst_type)
int sol_split_data(const char* src_path, const char* src_type, int fold, const char* output_prefix, const char* dst_type, bint shuffle)

cdef class SOL:
cdef void* _c_model
cdef void* _c_data_iter
cdef const char* algo
cdef int class_num
cdef bint verbose
6 changes: 0 additions & 6 deletions python/pysol.pyx
@@ -43,12 +43,6 @@ cdef void inspect_iteration(void* user_context,
handler(data_num, iter_num, update_num, err_rate)

cdef class SOL:
cdef void* _c_model
cdef void* _c_data_iter
cdef const char* algo
cdef int class_num
cdef bint verbose

def __cinit__(self, const char* algo = NULL, int class_num = -1, int
batch_size=256, int buf_size = 2, verbose=False, **params):
"""Create a new Handle for SOL C Library
26 changes: 26 additions & 0 deletions requirements.txt
@@ -0,0 +1,26 @@
asn1crypto==0.24.0
backports.functools-lru-cache==1.6.1
cryptography==2.1.4
cycler==0.10.0
Cython==0.29.21
enum34==1.1.6
idna==2.6
ipaddress==1.0.17
keyring==10.6.0
keyrings.alt==3.0
kiwisolver==1.1.0
matplotlib==2.2.5
mercurial==4.5.3
numpy==1.16.6
pycrypto==2.6.1
pygobject==3.26.1
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.4
pyxdg==0.25
PyYAML==3.12
scikit-learn==0.20.4
scipy==1.2.3
SecretStorage==2.3.1
six==1.11.0
subprocess32==3.5.4