90 commits
1860870
implemented ExponentialStatistics in core.py
alanmazankiewicz Jan 22, 2021
24d99e6
added add and mul functionality to ExponentialStatistics in core.py. …
alanmazankiewicz Jan 23, 2021
62880c0
implemented c version of ExponentialStatistics in fast.pyx. Not teste…
alanmazankiewicz Jan 24, 2021
872a2b0
Debugged fast.pyx ExponentialStatistics. Removed except clause from c…
alanmazankiewicz Jan 24, 2021
c1ad27c
added cython ExponentialStatistics to __init__, debugged cython versi…
alanmazankiewicz Jan 24, 2021
139953c
implemented tests for exponential statistics
alanmazankiewicz Jan 31, 2021
5b2ec1e
added decay test in test_exponential_batch
alanmazankiewicz Jan 31, 2021
4dbc0ae
ExpStats: Implemented get_decay, added to unit tests, added to tests/…
alanmazankiewicz Feb 1, 2021
f917979
test commit for docu
alanmazankiewicz Feb 2, 2021
c20efab
revert last commit: test docs
alanmazankiewicz Feb 2, 2021
6677055
Docu: Wrote intro, adding exponential stats, added to api.rst
alanmazankiewicz Feb 2, 2021
0265291
Docu: Added major part for ExponentialStats
alanmazankiewicz Feb 2, 2021
7ddbc45
Docu: Added major part for ExponentialStats
alanmazankiewicz Feb 2, 2021
12d966d
finished updating docu for ExponentialStats in readme
alanmazankiewicz Feb 3, 2021
584674b
Minor corrections to readme wrt changes for exponential statistics
alanmazankiewicz Feb 3, 2021
619732f
added docstring to make_exponential_statistics at core.py
alanmazankiewicz Feb 4, 2021
26c1769
rebased branch with upstream
alanmazankiewicz Feb 7, 2021
d5fc3a4
corrected name of batch_exponential_statistics test
alanmazankiewicz Feb 7, 2021
aabca62
corrected for flake8
alanmazankiewicz Feb 7, 2021
e5a2350
updated finch link to readme, changed clear method in ExponentialStat…
alanmazankiewicz Feb 8, 2021
2a4a1cf
merged upstream to master fork
alanmazankiewicz Feb 8, 2021
4fbee02
merged master in cov_exp
alanmazankiewicz Feb 8, 2021
317721b
debugged ExpCov
alanmazankiewicz Feb 8, 2021
8dc2825
implemented true_exp tests for ExpontialStatistics and ExponentialCov…
alanmazankiewicz Feb 14, 2021
2f9bc44
debugged Readme Test: Changed <class 'runstats.core.Statistics'> to <…
alanmazankiewicz Feb 14, 2021
ca3b8b4
created test_exponential_covariance
alanmazankiewicz Feb 14, 2021
30ffb77
implemented add and mul for ExponentialCovariance
alanmazankiewicz Feb 14, 2021
8615835
refactored tests
alanmazankiewicz Feb 14, 2021
47521f3
finised all tests for ExponentialCovariance
alanmazankiewicz Feb 14, 2021
0f444a4
applied blue
alanmazankiewicz Feb 14, 2021
41e810e
changed name of test_pickle_exponential_statistics(ExponentialCovari…
alanmazankiewicz Feb 14, 2021
6ca1ef1
took care of style CI errors (pylint, isort ...)
alanmazankiewicz Feb 15, 2021
97480df
debugging ci pipeline, error on ubuntu with readme
alanmazankiewicz Feb 16, 2021
066a08d
renamend ExponentialStatistics to ExponentialMovingStatistics
alanmazankiewicz Feb 16, 2021
661a973
debugging ci pipe
alanmazankiewicz Feb 16, 2021
eefbc49
debugging ci pipe
alanmazankiewicz Feb 16, 2021
be2b786
debugging ci pipe
alanmazankiewicz Feb 16, 2021
e4a0197
debugging ci pipe
alanmazankiewicz Feb 16, 2021
c12dbac
debugging pipeline - issue with readme
alanmazankiewicz Feb 18, 2021
16e6f12
reverted readme
alanmazankiewicz Feb 18, 2021
96985ef
Merge branch 'rename' into exp_cov
alanmazankiewicz Feb 18, 2021
b3bc0a8
implemented clear() test for exponential classes
alanmazankiewicz Feb 19, 2021
b59ae50
implemented time based ExpoStats
alanmazankiewicz Feb 19, 2021
4fd9b26
added docstring to time based methods, blue .
alanmazankiewicz Feb 19, 2021
cac8db9
implemented is_time_based for exp_stats
alanmazankiewicz Feb 19, 2021
24c0432
implemented tests for time based exp_stats, not ready yet
alanmazankiewicz Feb 20, 2021
3e4fd84
implemented tests for time based ExpMovingStats
alanmazankiewicz Feb 21, 2021
5d352e3
updated docstring for ExponentialMovingStatistics
alanmazankiewicz Feb 21, 2021
ef6f344
renamed ExponentialCoveriance to ExponentialMovingCovariance
alanmazankiewicz Feb 22, 2021
86a4855
adjused tests/_main_ for ExpCov
alanmazankiewicz Feb 22, 2021
bcc3aaf
adjusted clear() in readme for ExpMovStats
alanmazankiewicz Feb 22, 2021
ae2561d
extended readme for time based ExpMovingStats
alanmazankiewicz Feb 27, 2021
bf26224
added exp_cov/cor to __main__.py
alanmazankiewicz Feb 27, 2021
c436221
resolved integration test failures<
alanmazankiewicz Feb 28, 2021
15eb66c
trying to resolve doc8
alanmazankiewicz Feb 28, 2021
febb2bd
undone change in readme for doc8
alanmazankiewicz Feb 28, 2021
47f097b
debugged ExponentialMovingStatistics delay setter
alanmazankiewicz Feb 28, 2021
8c42617
updated readme: ExponentialMovingStatistics clear method
alanmazankiewicz Feb 28, 2021
d2007df
Merge remote-tracking branch 'upstream/master' into rebase
alanmazankiewicz Jun 28, 2021
f5a7cdd
adjusted ExponentialMovingCovariance to new interface
alanmazankiewicz Jun 28, 2021
a08e7f6
implemented Cython for ExponentialMovingCovarinace
alanmazankiewicz Jun 28, 2021
0adec42
removed invalid test that was wrongfully taken over during merge
alanmazankiewicz Jun 28, 2021
63765e9
debugged test_add_exponential_statistics, mock was not woring with Cy…
alanmazankiewicz Jul 3, 2021
8a483a2
removed _core.pxd and _core.py incorrectly checked in to repo, added …
alanmazankiewicz Jul 3, 2021
a50f0ea
debugged pytest raises testes
alanmazankiewicz Jul 3, 2021
8ecb9bf
added i_add and i_mul test cases
alanmazankiewicz Jul 3, 2021
44861c8
reformatted blue
alanmazankiewicz Jul 3, 2021
1c43f79
fixed doc8, mission link to runstats docu in readme
alanmazankiewicz Jul 3, 2021
36fbdf1
removed TODO in core.pxd
alanmazankiewicz Jul 3, 2021
a42ccd6
improved readme wrt time based exponential statistics usage
alanmazankiewicz Jul 4, 2021
e204718
Merge branch 'master' of github.com:grantjenks/python-runstats into r…
alanmazankiewicz Jul 4, 2021
6c089d7
fixed benchmarking: Renamed ExponentialStatistics to ExponentialMovin…
alanmazankiewicz Jul 4, 2021
a3c3cc9
fixed benchmark: Using ExponentialMovingStatitics at object construct…
alanmazankiewicz Jul 4, 2021
39398b2
fixed benchmark: wrong indentation for ExponentialMovingCovariance
alanmazankiewicz Jul 4, 2021
20d6452
Made member variables of ExponentialMovingStatistics exclusivley floa…
alanmazankiewicz Jul 4, 2021
4c4ce0c
debugged core.pxd: renamed ExponentialMovingCovariance make_regressio…
alanmazankiewicz Jul 4, 2021
7bb14fa
added non_time based serliasiation test for ExponentialMovingStatistics
alanmazankiewicz Jul 6, 2021
b9ab638
resolved flake8: duplicated test_pickle_exponential_statistics_time_b…
alanmazankiewicz Jul 6, 2021
01ca96f
resolved flake8: tests/test_runstats.py:988:15: E271 multiple spaces …
alanmazankiewicz Jul 6, 2021
5248920
ExponentialMovingStatistics: Moved None to NAN conversion to from _se…
alanmazankiewicz Jul 8, 2021
f466e65
adjusted time.sleep in unit test from 0.01 to 0.5 to pass build on wi…
alanmazankiewicz Jul 8, 2021
9861679
Mocked time.time() for test_exponential_statistics_freeze_unfreeze
alanmazankiewicz Jul 12, 2021
99791ef
applied blue and isort
alanmazankiewicz Jul 12, 2021
d30f872
fixed flake8: time imported but unused in runstats.__init__
alanmazankiewicz Jul 12, 2021
aaa8e7d
adjusted time based test to use time_mock, introduced is_freezed() fu…
alanmazankiewicz Jul 13, 2021
3ba42fd
fixed flake8, fixed cython
alanmazankiewicz Jul 13, 2021
46c60e8
blue
alanmazankiewicz Jul 13, 2021
7889d44
fixed pylint: added docstring to is_freezed()
alanmazankiewicz Jul 13, 2021
ef82116
added bint type to cython is_freezed()
alanmazankiewicz Jul 13, 2021
cc0ba05
Merge pull request #1 from alanmazankiewicz/rebase
alanmazankiewicz Jul 13, 2021
4 changes: 4 additions & 0 deletions .gitignore
Expand Up @@ -19,3 +19,7 @@

# macOS metadata
.DS_Store

# compiled files
runstats/_core.pxd
runstats/_core.py
178 changes: 136 additions & 42 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,8 @@ the system based on the recent past. In these cases exponential statistics are
used. Instead of weighting all values uniformly in the statistics computation,
an exponential decay weight is applied to older values. The decay rate is
configurable and provides a mechanism for balancing recent values with past
values.
values. The exponential weighting may be on a 'per data point' or 'per time
step' basis.

The Python `RunStats`_ module was designed for these cases by providing classes
for computing online summary statistics and online linear regression in a
Expand Down Expand Up @@ -70,27 +71,32 @@ function:
>>> help(runstats) # doctest: +SKIP
>>> help(runstats.Statistics) # doctest: +SKIP
>>> help(runstats.Regression) # doctest: +SKIP
>>> help(runstats.ExponentialStatistics) # doctest: +SKIP
>>> help(runstats.ExponentialMovingStatistics) # doctest: +SKIP
>>> help(runstats.ExponentialMovingCovariance) # doctest: +SKIP


Tutorial
--------

The Python `RunStats`_ module provides three types for computing running
statistics: Statistics, ExponentialStatistics and Regression.The Regression
object leverages Statistics internally for its calculations. Each can be
initialized without arguments:
The Python `RunStats`_ module provides four types for computing running
statistics: Statistics, ExponentialMovingStatistics,
ExponentialMovingCovariance and Regression.
The Regression object leverages Statistics internally for its calculations
while ExponentialMovingCovariance uses ExponentialMovingStatistics.
Each can be initialized without arguments:

.. code-block:: python

>>> from runstats import Statistics, Regression, ExponentialStatistics
>>> from runstats import Statistics, Regression, ExponentialMovingStatistics, ExponentialMovingCovariance
>>> stats = Statistics()
>>> regr = Regression()
>>> exp_stats = ExponentialStatistics()
>>> exp_stats = ExponentialMovingStatistics()
>>> exp_cov = ExponentialMovingCovariance()

Statistics objects support four methods for modification. Use `push` to add
values to the summary, `clear` to reset the summary, sum to combine Statistics
summaries and multiply to weight summary Statistics by a scalar.
values to the summary, `clear` to reset the object to its initialization
state, sum to combine Statistics summaries and multiply to weight summary
Statistics by a scalar.

.. code-block:: python

Expand Down Expand Up @@ -200,13 +206,13 @@ Both constructors accept an optional iterable that is consumed and pushed into
the summary. Note that you may pass a generator as an iterable and the
generator will be entirely consumed.

The ExponentialStatistics are constructed by providing a decay rate, initial
mean, and initial variance. The decay rate has default 0.9 and must be between
0 and 1. The initial mean and variance default to zero.
The ExponentialMovingStatistics are constructed by providing a decay rate,
initial mean, and initial variance. The decay rate defaults to 0.9 and must be
between 0 and 1. The initial mean and variance default to zero.

.. code-block:: python

>>> exp_stats = ExponentialStatistics()
>>> exp_stats = ExponentialMovingStatistics()
>>> exp_stats.decay
0.9
>>> exp_stats.mean()
Expand All @@ -215,9 +221,9 @@ mean, and initial variance. The decay rate has default 0.9 and must be between
0.0

The decay rate is the weight by which the current statistics are discounted
by. Consequently, (1 - decay) is the weight of the new value. Like the `Statistics` class,
there are four methods for modification: `push`, `clear`, sum and
multiply.
by. Consequently, (1 - decay) is the weight of the new value. Like the
`Statistics` class, there are four methods for modification: `push`, `clear`,
sum and multiply.

.. code-block:: python

Expand All @@ -230,8 +236,8 @@ multiply.
>>> exp_stats.stddev()
3.4049127627507683
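
The update rule described above can be sketched in a few lines. This is a minimal illustration that follows the incremental formulas in Finch (2009), assuming `alpha = 1 - decay` is the weight of the new value; the helper name `ema_push` is hypothetical and is not part of the runstats API.

```python
def ema_push(mean, variance, value, decay=0.9):
    """Return the updated (mean, variance) after one exponentially weighted push.

    Hypothetical helper for illustration only, not a runstats function.
    """
    alpha = 1.0 - decay  # weight given to the new value
    diff = value - mean  # deviation of the new value from the old mean
    incr = alpha * diff  # how far the mean moves towards the new value
    new_mean = mean + incr
    # The old variance and the new squared deviation are both discounted by decay.
    new_variance = decay * (variance + diff * incr)
    return new_mean, new_variance


# Start from the default initial mean and variance (both zero).
mean, variance = 0.0, 0.0
for num in range(10):
    mean, variance = ema_push(mean, variance, num)
```

Because the state is just two floats, each push costs constant time and memory regardless of how many values have been seen.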

The decay of the exponential statistics can also be changed. The value must be
between 0 and 1.
The decay of the exponential statistics can also be changed during the lifetime
of the object.

.. code-block:: python

Expand All @@ -245,30 +251,18 @@ between 0 and 1.
...
ValueError: decay must be between 0 and 1

The clear method allows to optionally set a new mean, new variance and new
decay. If none are provided mean and variance reset to zero, while the decay is
not changed.
Combining `ExponentialMovingStatistics` is done by adding them together. The
mean and variance are simply added to create a new object. To weight each
`ExponentialMovingStatistics`, multiply them by a constant factor.
Note how this behaviour differs from the two previous classes. When two
`ExponentialMovingStatistics` are added the decay of the left object is used for
the new object. The clear method resets the object to its state at
construction. `len`, minimum and maximum are not supported.

.. code-block:: python

>>> exp_stats.clear()
>>> exp_stats.decay
0.5
>>> exp_stats.mean()
0.0
>>> exp_stats.variance()
0.0

Combining `ExponentialStatistics` is done by adding them together. The mean and
variance are simply added to create a new object. To weight each
`ExponentialStatistics`, multiply them by a constant factor. If two
`ExponentialStatistics` are added then the leftmost decay is used for the new
object. The `len` method is not supported.

.. code-block:: python

>>> alpha_stats = ExponentialStatistics(iterable=range(10))
>>> beta_stats = ExponentialStatistics(decay=0.1)
>>> alpha_stats = ExponentialMovingStatistics(iterable=range(10))
>>> beta_stats = ExponentialMovingStatistics(decay=0.1)
>>> for num in range(10):
... beta_stats.push(num)
>>> exp_stats = beta_stats * 0.5 + alpha_stats * 0.5
Expand All @@ -277,6 +271,100 @@ object. The `len` method is not supported.
>>> exp_stats.mean()
6.187836645
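
The weighted combination of means above can be checked by hand. The sketch below recomputes only the mean and assumes that multiplying by a constant scales the mean linearly and that addition sums the scaled means; `ema_mean` is a hypothetical helper, not a runstats function.

```python
def ema_mean(values, decay, initial_mean=0.0):
    """Exponentially weighted mean of values (hypothetical helper)."""
    mean = initial_mean
    for value in values:
        # (1 - decay) is the weight of the new value.
        mean += (1.0 - decay) * (value - mean)
    return mean


alpha_mean = ema_mean(range(10), decay=0.9)  # slow decay, long memory
beta_mean = ema_mean(range(10), decay=0.1)   # fast decay, tracks recent values

# Weight each summary by 0.5 and add the results.
combined_mean = 0.5 * beta_mean + 0.5 * alpha_mean
```

With a decay of 0.1 the mean stays close to the most recent values, while a decay of 0.9 lags well behind them, so the equally weighted combination lands between the two.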

The `ExponentialMovingCovariance` works equivalently to
`ExponentialMovingStatistics`.

.. code-block:: python

>>> exp_cov = ExponentialMovingCovariance(
... decay=0.9,
... mean_x=0.0,
... variance_x=0.0,
... mean_y=0.0,
... variance_y=0.0,
... covariance=0.0,
... iterable=(),
... )
>>> for num in range(10):
... exp_cov.push(num, num + 5)
>>> round(exp_cov.covariance(), 2)
17.67
>>> round(exp_cov.correlation(), 2)
0.96

`ExponentialMovingStatistics` can also work in a time-based mode, i.e. old
statistics are not simply discounted by the decay rate each time a value is
pushed. Instead, an effective decay rate is calculated from the provided
'nominal' decay rate and the time difference between the last push and the
current push. `ExponentialMovingStatistics` operates in time-based mode when
a `delay > 0` is provided at construction. The delay is the number of seconds
that need to pass for the effective decay rate to equal the provided decay
rate. For example, if a delay of 60 and a decay of 0.9 are provided, then when
60 seconds pass between calls to push() the effective decay rate for
discounting the old statistics equals 0.9; when 120 seconds pass, it equals
0.9 ** 2 = 0.81, and so on. The exact formula for calculating the effective
decay rate at a given call to push is:
`decay ** ((current_timestamp - timestamp_at_last_push) / delay)`. The initial
timestamp is the timestamp when the delay was set.

.. code-block:: python

>>> import time
>>> alpha_stats = ExponentialMovingStatistics(decay=0.9, delay=1)
>>> time.sleep(1)
>>> alpha_stats.push(100)
>>> round(alpha_stats.mean())
10
>>> alpha_stats.clear() # note that clear() resets the timer as well
>>> time.sleep(2)
>>> alpha_stats.push(100)
>>> round(alpha_stats.mean())
19
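
The effective decay formula can be sketched without any timers by passing the elapsed time explicitly. The function names below are hypothetical and only illustrate the arithmetic; they are not part of the runstats API, and the mean update assumes `(1 - effective decay)` is the weight of the new value.

```python
def effective_decay(decay, delay, elapsed):
    """Decay applied to old statistics after `elapsed` seconds (hypothetical)."""
    return decay ** (elapsed / delay)


def push_time_based(mean, value, decay, delay, elapsed):
    """Return the mean after pushing `value`, `elapsed` seconds after the last push."""
    rate = effective_decay(decay, delay, elapsed)
    # The new value is weighted by (1 - effective decay).
    return mean + (1.0 - rate) * (value - mean)


one_delay = effective_decay(0.9, delay=60, elapsed=60)    # one full delay
two_delays = effective_decay(0.9, delay=60, elapsed=120)  # two full delays

# Same setup as the doctest above: decay=0.9, delay=1, initial mean 0.
after_one_second = push_time_based(0.0, 100, decay=0.9, delay=1, elapsed=1)
after_two_seconds = push_time_based(0.0, 100, decay=0.9, delay=1, elapsed=2)
```

The longer the gap since the last push, the smaller the effective decay and the more weight the new value receives.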

There are a few things to note about a time-based
`ExponentialMovingStatistics` object:
- When providing an iterable at construction together with a delay, the iterable
is first processed in non-time-based mode, i.e. as if there were no delay.
- The delay can also be set after object construction. In this case the initial
timestamp is the time when the delay is set. Changing a delay that is not
`None` does not affect the timer. Setting delay to `None` deactivates
time-based mode.
- When two ExponentialMovingStatistics objects are added, the state of the delay
is taken from the left object. If the left object is time-based (delay is not
`None`), the timer is reset for the resulting object during a regular
`__add__` (a + b), but not during an in-place `__iadd__` (a += b).
- The timer can be stopped with a call to `freeze()`. This can
be useful when saving the state of the object (`get_state()`) for later usage
or when serializing the object to pickle.
With a call to `unfreeze()` the timer continues where it left off (e.g. after
loading).
- Pushes onto a frozen object use an effective decay rate based on the time
difference between the last call to push and the moment `freeze()` was called.
- With a call to `clear_timer()` the timer can be reset.
- It is not recommended to use time-based discounting for use cases that
require high precision at sub-second granularity.

.. code-block:: python

>>> alpha_stats = ExponentialMovingStatistics(decay=0.9, delay=1)
>>> time.sleep(1)
>>> alpha_stats.freeze()
>>> saved_state = alpha_stats.get_state()
>>> time.sleep(2)
>>> beta_stats = ExponentialMovingStatistics.fromstate(saved_state)
>>> beta_stats.push(10)
>>> round(beta_stats.mean())
1
>>> beta_stats.unfreeze()
>>> time.sleep(1)
>>> beta_stats.push(10)
>>> round(beta_stats.mean())
3


Sources
-------

All internal calculations of the Statistics and Regression classes are based
entirely on the C++ code by John Cook as posted in a couple of articles:

Expand All @@ -286,9 +374,15 @@ entirely on the C++ code by John Cook as posted in a couple of articles:
.. _`Computing Skewness and Kurtosis in One Pass`: http://www.johndcook.com/blog/skewness_kurtosis/
.. _`Computing Linear Regression in One Pass`: http://www.johndcook.com/blog/running_regression/

The ExponentialStatistics implementation is based on:
The ExponentialMovingStatistics implementation is based on:

* `Finch, 2009, Incremental Calculation of Weighted Mean and Variance`_

.. _`Finch, 2009, Incremental Calculation of Weighted Mean and Variance`: https://fanf2.user.srcf.net/hermes/doc/antiforgery/stats.pdf


* Finch, 2009, Incremental Calculation of Weighted Mean and Variance
Pure Python and Cython
----------------------

The pure-Python version of `RunStats`_ is directly available if preferred.

6 changes: 3 additions & 3 deletions docs/api.rst
Original file line number Diff line number Diff line change
Expand Up @@ -17,9 +17,9 @@ Regression
:special-members:


ExponentialStatistics
.....................
ExponentialMovingStatistics
...........................

.. autoclass:: runstats.ExponentialStatistics
.. autoclass:: runstats.ExponentialMovingStatistics
:members:
:special-members:
5 changes: 3 additions & 2 deletions docs/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,11 @@

import os
import sys
sys.path.insert(0, os.path.abspath('..'))

import runstats

sys.path.insert(0, os.path.abspath('..'))


# -- Project information -----------------------------------------------------

Expand Down Expand Up @@ -89,7 +91,6 @@
]
}


# -- Options for todo extension ----------------------------------------------

# If true, `todo` and `todoList` produce output, else they produce nothing.
Expand Down
21 changes: 18 additions & 3 deletions runstats/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,26 @@
"""

try:
from ._core import ExponentialStatistics, Regression, Statistics
from ._core import (
ExponentialMovingCovariance,
ExponentialMovingStatistics,
Regression,
Statistics,
)
except ImportError: # pragma: no cover
from .core import ExponentialStatistics, Regression, Statistics
from .core import (
ExponentialMovingCovariance,
ExponentialMovingStatistics,
Regression,
Statistics,
)

__all__ = ['Statistics', 'Regression', 'ExponentialStatistics']
__all__ = [
'Statistics',
'Regression',
'ExponentialMovingStatistics',
'ExponentialMovingCovariance',
]
__title__ = 'runstats'
__version__ = '2.0.0'
__author__ = 'Grant Jenks'