Process only needed steps for large speedup #34
Open
xyzzy42 wants to merge 3 commits into
I measured this as reducing CPU usage from ~78% to ~45%, almost doubling the speed, which is expected as described below.

Tg processes the audio in a series of steps that double in size: 2, 4, 8, and then 16 seconds long. These are the one to four dots shown in the display. Tg starts at step 1 (2 seconds) and goes up, stopping when a step doesn't pass. Data from smaller steps isn't used if a larger step passes. This means when running at a constant step 4 with a good signal, CPU usage is almost double what it needs to be: steps 1-3 are processed and unused, and they add up to 14 seconds, almost as much as step 4's 16 seconds.

Change this to start at the previous iteration's step. With a good signal, only step 4 will be processed and CPU usage is cut almost in half. If the previous step fails, it will try smaller steps. If it passes and is not yet the top step, it will try larger steps. The end result should be the same step as before, just reached sooner. It could be slower if the step drops a lot, e.g. from 4 to 0, but this happens far less often than the step staying the same or nearly the same.

To do this, I moved the step logic out of analyze_pa_data() and entirely into compute_update(). analyze_pa_data() now just processes one step and compute_update() decides which step(s). Previously, analyze_pa_data() did all steps and compute_update() decided which step to actually use.

There is a small change to the algorithm. To be good, a step needs to pass a number of checks done in process(). Then there is one more check, of sigma vs. period, done in compute_update(). Previously a step didn't need to pass the sigma check for it to still count enough to increase last_tic and show up as a signal dot in the display, but the data wasn't actually used unless it passed the sigma check too. I no longer keep track of "partially" passing steps like this. Either a step passes all checks, including the sigma check, or it doesn't. last_tic and the signal level only count fully passing steps.
I see no practical difference in my tests, but I think it could show up with some kind of marginal signal that has a high error in period estimation with the longer steps.
Previously, max signal level (4 dots) with no measured amplitude was counted as signal level 3, but level 3 or lower with no amplitude still counted as the same level. Have compute_update() no longer do this signal level adjustment and just report the level used, which indicates the averaging interval. The dot graphic now indicates "no amplitude" by drawing the final signal level's dot hollow. This way the number of dots always shows the averaging interval and a hollow dot always shows that the signal is too poor to measure amplitude.
Force-pushed from e199e2d to 87e0622
When there is just one period in the processing buffer it is not possible to calculate the sample standard deviation (sigma). In this case, the period value was being used as the sigma estimate, which effectively gives a huge sigma when there is one period in the buffer. This resulted in the processing buffer being rejected as bad, so a larger buffer, which would have multiple periods, was never tried. One period per buffer would happen with a combination of long period and short buffer, e.g. a low BPH in light mode, since light mode uses buffers half as long as normal mode. Use 0 as the sigma value for a single sample. This means the buffer will pass the sigma check and a larger buffer will be tried.
I also changed the dot display to consistently indicate the averaging interval and amplitude measurement; it did both before in an inconsistent way.
See commit messages for more details.