Process only needed steps for large speedup #34
Open
xyzzy42 wants to merge 3 commits into
I measured this as reducing CPU usage from ~78% to ~45%, almost doubling the speed, which is expected as described below.

Tg processes the audio in a series of steps that double in size: 2, 4, 8, and then 16 seconds long. These are the one to four dots shown in the display. Tg starts at step 1 (2 seconds) and goes up, stopping when a step doesn't pass. Data from smaller steps isn't used if a larger step passes. This means when running at a constant step 4 with a good signal, CPU usage is almost double what it needs to be: steps 1-3 are processed and unused, and they add up to 14 seconds, almost as much as step 4's 16 seconds.

Change this to start at the previous iteration's step. With a good signal, only step 4 will be processed and CPU usage is cut almost in half. If the previous step fails, it will try smaller steps. If it passes and is not yet the top step, it will try larger steps. The end result should be the same step as before, just reached sooner. It could be slower if the step drops a lot, e.g. from 4 to 0, but this happens far less often than the step staying the same or nearly the same.

To do this, I moved the step logic out of analyze_pa_data() and entirely into compute_update(). analyze_pa_data() now just processes one step and compute_update() decides which step(s). Previously, analyze_pa_data() did all steps and compute_update() decided which step to actually use.

There is a small change to the algorithm. To be good, a step needs to pass a number of checks done in process(). Then there is one more check, of sigma vs. period, done in compute_update(). Previously a step didn't need to pass the sigma check for it to still count enough to increase last_tic and show up as a signal dot in the display, but the data wasn't actually used unless it passed the sigma check too. I no longer keep track of "partially" passing steps like this. Either a step passes all checks, including the sigma check, or it doesn't. last_tic and the signal level only count fully passing steps.
I see no practical difference in my tests, but I think it could show up with some kind of marginal signal that has a high error in period estimation with the longer steps.
Previously, max signal level (4 dots) with no measured amplitude was counted as signal level 3, but level 3 or lower with no amplitude still counted as the same level. Have compute_update() no longer do this signal level adjustment and just report the level used, which indicates the averaging interval. The dot graphic now indicates "no amplitude" by drawing the final signal level's dot hollow. This way the number of dots always shows the averaging interval and a hollow dot always shows that the signal is too poor to measure amplitude.
Force-pushed from e199e2d to 87e0622
When there is just one period in the processing buffer it is not possible to calculate the sample standard deviation (sigma). In this case, the period value was being used as the sigma estimate, which effectively gives a huge sigma when there is one period in the buffer. This resulted in the processing buffer being rejected as bad, so a larger buffer, which would have multiple periods, was never tried. One period per buffer would happen with a combination of long period and short buffer, e.g. a low BPH in light mode, since light mode uses buffers half as long as normal mode. Use 0 as the sigma value for a single sample. This means the buffer will pass the sigma check and a larger buffer will be tried.
I also changed the dot display to consistently indicate the averaging interval and amplitude measurement; it did both before in an inconsistent way.
See commit messages for more details.