PT-2536 - Correct EMA calculation for WeightedAvgRate internal averages #1127

Open
marcelohpf wants to merge 2 commits into percona:3.x from marcelohpf:fix-ewma-calculation

@marcelohpf
Contributor

The average number of rows and the average time taken to process them now converge to the actual average, because each new value is multiplied by (1 - weight) in the calculation.

The previous calculation did not converge to the average and was not a correct implementation of an EMA (Exponential Moving Average).

The result of update() is not affected, because it is derived from the ratio number-rows/time-taken.
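
The ratio argument can be checked with a small stand-alone sketch. It does not use WeightedAvgRate: compare_rates is a hypothetical helper, the sample values are made up, and every accumulator starts at zero, which is an assumption (the module may seed its averages differently). Under that assumption the averages with the (1 - weight) factor are the old ones scaled by a constant, so the two rates should match on every step.

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper, not part of WeightedAvgRate.
# Runs both recurrences over the same samples, starting every accumulator
# at zero (an assumption; the module may seed its averages differently).
sub compare_rates {
    my ( $weight, @samples ) = @_;    # each sample is [ rows, seconds ]
    my ( $old_n, $old_t, $new_n, $new_t ) = ( 0, 0, 0, 0 );
    my @rates;
    for my $s (@samples) {
        my ( $n, $t ) = @$s;

        # Formula without a smoothing factor on the new value.
        $old_n = $old_n * $weight + $n;
        $old_t = $old_t * $weight + $t;

        # Formula with the new value scaled by (1 - weight).
        $new_n = $new_n * $weight + $n * ( 1 - $weight );
        $new_t = $new_t * $weight + $t * ( 1 - $weight );

        push @rates, [ $old_n / $old_t, $new_n / $new_t ];
    }
    return @rates;
}

# Made-up samples: [ rows processed, seconds taken ].
my @rates = compare_rates( 0.8, [ 10, 1 ], [ 5, 2 ], [ 20, 0.5 ], [ 3, 1 ] );

# Left column: rate from the old formula. Right column: rate from the new one.
printf( "%8.4f | %8.4f\n", @$_ ) for @rates;
```

The two columns agree up to floating-point rounding, while the underlying averages differ by the constant factor (1 - weight).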

You can verify the behavior of the current calculation with:

use WeightedAvgRate;

my $weight = 0.8;
my $w = WeightedAvgRate->new( target_t => 10, weight => $weight );
my @fake_n = ( 10, 5, 3, 1, 10, 3, 5, 10, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20 );

my ( $avg, $avg_ewma, $count, $sum ) = ( 0, 0, 0, 0 );
for my $n (@fake_n) {
    $sum   += $n;
    $count += 1;
    $avg = $sum / $count;
    $w->update( $n, 1 );
    $avg_ewma = $avg_ewma * $weight + $n * (1-$weight);
    # Columns: n | running mean | WeightedAvgRate avg_n | reference EMA
    printf( "%2.0f | %5.2f | %5.2f | %5.2f\n",
        $n, $avg, $w->{avg_n}, $avg_ewma );
}

1;

Root cause

The current calculation of the average number of rows (avg_n) and the average time (avg_t) appears to implement a smoothing algorithm: it adds some lag to the average's convergence, to protect the estimate from sudden changes caused by a single data point, while still converging quickly to that value over a long dataset (100+ data points).

However, the current calculation does not converge to the average, because it does not apply a reduction factor to the new data point on each iteration.

As a consequence, avg_n and avg_t accumulate well beyond the observed values and never converge to the "most common representative value".
The current formula is (avg * accumulative-factor) + new-value,
when it should be (avg * accumulative-factor) + (new-value * smooth-factor), where smooth-factor is 1 - accumulative-factor.
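
To make the difference between the two formulas concrete, here is a minimal sketch that feeds a constant value through both of them. run_constant is a hypothetical helper, not code from the module, and both accumulators start at zero as an assumption. For a weight strictly between 0 and 1, the formula without the smoothing factor levels off near x / (1 - weight) instead of x, and with a weight of 1 or more it would grow without bound. The formula with the smoothing factor settles at x.

```perl
use strict;
use warnings;

# Hypothetical helper: feed a constant value $x through both formulas
# for $steps iterations, starting from zero, and return the two results.
sub run_constant {
    my ( $weight, $x, $steps ) = @_;
    my ( $old, $new ) = ( 0, 0 );
    for ( 1 .. $steps ) {
        # (avg * accumulative-factor) + new-value
        $old = $old * $weight + $x;

        # (avg * accumulative-factor) + (new-value * smooth-factor)
        $new = $new * $weight + $x * ( 1 - $weight );
    }
    return ( $old, $new );
}

my ( $old, $new ) = run_constant( 0.8, 20, 200 );
printf( "old: %.2f  new: %.2f\n", $old, $new );
# prints "old: 100.00  new: 20.00"
```

With weight 0.8 the old formula reports five times the real value, which matches the factor 1 / (1 - 0.8).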

  • The contributed code is licensed under GPL v2.0
  • Contributor Licence Agreement (CLA) is signed
  • util/update-modules has been run
  • Documentation updated
  • Test suite updated

@marcelohpf marcelohpf requested a review from svetasmirnova as a code owner May 12, 2026 09:07
@it-percona-cla

it-percona-cla commented May 12, 2026

CLA assistant check
All committers have signed the CLA.

@marcelohpf marcelohpf force-pushed the fix-ewma-calculation branch from 53bc027 to 03fb408 Compare May 12, 2026 09:54
@sleto-it sleto-it changed the title PT-XXXX Correct EMA calculation for WeightedAvgRate internal averages PT-2536 - Correct EMA calculation for WeightedAvgRate internal averages May 12, 2026
@svetasmirnova
Collaborator

Also tracked in Jira at https://perconadev.atlassian.net/browse/PT-2536
