Test updates#10
Conversation
Add __lt__ operator to allow for sorting lists of ECPubKeys. The comparison is done lexicographically.
This function is only needed for generating the test vectors.
Amount is not used for anything, so better to remove it. To ensure determinism with the tests, the addresses are now sorted lexicographically before generating outputs. This is necessary because the outputs will change based on the order in which addresses for the same recipient are created. For the receiver, this doesnt matter. This is only relevant for determinism in the test vectors.
Run ./generate-test-vectors.py with all the new changes: 1. Remove amounts 2. Add new labels test case 3. Different outputs for tests with multiple outputs for the same recipient
f60e49b to
d462ac2
Compare
| # same sender but with different labels. To account for this in testing, set `perumute=True`. This will generate | ||
| # all possible sets of outputs (based on all possible orderings of recipient addresses). | ||
| # | ||
| # NOTE: THIS IS A TEST ONLY FEATURE AND NOT RELEVANT FOR NORMAL SENDING/RECEIVING |
There was a problem hiding this comment.
Given that's a reference implementation that's not optimal to include test only code.
There was a problem hiding this comment.
I don't see an issue so long as its documented. I considered a few alternatives, but this seemed the simplest approach. Do you have another approach in mind?
|
I don't understand the problem behind 55c3df6 Isn't the ordering of recipient addresses deterministic? They are listed in a particular order in the test. Also, with the fix we have more expected outputs now in the test vectors. Do we expect to have all of them or some of them? I'm very confused |
The issue is that different orderings of recipient addresses will produce different output sets when sending to multiple labeled addresses for the same recipient in the same transaction (i.e. For the sender, there is a requirement to group the addresses by For testing sending and receiving this is not an issue since the receiver can always find their outputs for any valid output set the sender produces. However, if someone is implementing only sending, the tests need to provide a way for the sender to check if the outputs they have generated are valid. Previously, the tests were relying on a specific ordering (i.e. sorting by amount) so that the sender would always produce the same output set that could be checked against the test vectors. I ran into this issue while working on the
Yep! This field now represents all possible valid address sets the sender can produce for the receiver. In |
|
Thank you for detailed explanation. I think I fully understood the problem now.
I'd still say it's still a problem because output of sending phase could not match expected output from the test vectors. Let me summarise the problem. The protocol doesn't prescribe deterministic result for a given input. Or to say it differently, for any input there is a number of results that are correct. This makes testing the protocol a bit tricky, because we can't just provide one result in the test vectors.
I think there is a problem here. How do they know if they fully matched a set and can safely ignore the rest? Let's take a simple case when we have two outputs for the same address. There are two valid sets of two outputs. However the test vector has a flat list of four outputs. A wrong implementation can generate only one correct output and ignore the rest three. Not good. One fix would be to provide a Alternatively, we can consider making the protocol deterministic. For example by defining ascending order of |
For every silent payment address provided, the sender must to generate a taproot scriptPubKey. From there, the sender needs to also check that each of their generated scriptPubKeys exists in the set of all possible scriptPubKeys that could have been generated. I don't think there's a problem here (aside from needing to update the reference implementation to explicitly check that
I don't see the benefit here. We can fully test the protocol in the way I've described, so adding a sorting step to the protocol specifically for testing the sender side seems like extra complexity for no benefit? It would require the receiver to do a non-trivial sort as part of an already expensive scanning process. |
Amount is not used for anything, so better to remove it. To ensure determinism with the tests, the addresses are now sorted lexicographically before generating outputs. This is necessary because the outputs will change based on the order in which addresses for the same recipient are created. For the receiver, this doesnt matter. This is only relevant for determinism in the test vectors.
different orderings of the recipient addresses will produce different outputs in some scenarios. for sending and receiving, this does not matter. however, for implementations only testing sending e.g. testing that they can generate the correct outputs from an input set, this is fragile and leads to confusion and wasted time. essentially, the tester needs to fiddle with the order of the recipient addresses until the correct permutation is hit (when in reality any permutation is valid).
Run ./generate-test-vectors.py with all the new changes: 1. Remove amounts 2. All possible permutations in outputs for tests with multiple outputs for the same recipient 3. Make smallest outpoint an excluded input
|
@S3RK came back to this and cleaned some stuff up, mostly getting rid of duplicate code in the generate script. I also separated the generation code vs reference code by creating a copy of In addition, instead of having all output sets as one array, the sender now has an array of arrays of outputs: each possible "candidate set." This is less error prone than what I had before. For the receiver, I just use the first "candidate set" from the sender. |
general refactor, namely: * reduce reused code * make test generation deterministic * only generate multiple output sets for sender * give each candidate set as its own result * add new test case for lexicographic sort * remove test related code from reference.py
S3RK
left a comment
There was a problem hiding this comment.
I did a pass and the changes look good.
It was a little bit hard to review big chunks of changes, so not sure I caught everything.
Need a little bit more time to verify new test vectors on my own secret implementation that I did from the bip without looking at your implementation.
|
|
||
|
|
||
| def decode_silent_payment_address(address: str, hrp: str = "tsp") -> Tuple[ECPubKey, ECPubKey]: | ||
| version, data = decode(hrp, address) |
There was a problem hiding this comment.
in a026b44
I think we should check a) that version is in the expected range b) fail if version is 31
| label_address_two = reference.create_labeled_silent_payment_address(scan1, Spend1, m=l2, hrp=HRP) | ||
| label_address_one2 = reference.create_labeled_silent_payment_address(scan2, Spend2, m=l1, hrp=HRP) | ||
| label_address_two2 = reference.create_labeled_silent_payment_address(scan2, Spend2, m=l2, hrp=HRP) | ||
| addresses1 = [label_address_one, address, address2, label_address_one, label_address_one2, label_address_one, label_address_two2, label_address_two, label_address_two, label_address_two2] |
There was a problem hiding this comment.
Will need to look again, but IIRC, this test wasn't adding any coverage by having 2 recipients, so was complicated in that it needed to two recipients but for no real benefit. Will take a look again to confirm, tho!
| # In[17]: | ||
|
|
||
|
|
||
| def generate_unknown_segwit_ver_test(): |
There was a problem hiding this comment.
in 404fe33
nit: should we still have a test for unknown segwit version?
|
|
||
| # Check if the found output public keys match the expected output public keys | ||
| assert add_to_wallet == expected["outputs"], "Receiving test failed" | ||
| # Note: order doesn't matter for creating/finding the outputs. However, different orderings of the recipient addresses |
There was a problem hiding this comment.
I think this comment here is irrelevant for the recipient part, no?
There was a problem hiding this comment.
Ah, good call. This is a leftover from when I did have all possible output sets for the recipient. But now for the recipient there is just one of the possible output sets generated by the sender, so this comment is indeed irrelevant.
Just fixing a few things that have been annoying me. Main change is to remove amounts.