Test updates by josibake · Pull Request #10 · S3RK/bip-352

josibake · 2024-03-21T13:46:22Z

Just fixing a few things that have been annoying me. Main change is to remove amounts.

Add __lt__ operator to allow for sorting lists of ECPubKeys. The comparison is done lexicographically.

This function is only needed for generating the test vectors.

Amount is not used for anything, so better to remove it. To ensure determinism with the tests, the addresses are now sorted lexicographically before generating outputs. This is necessary because the outputs will change based on the order in which addresses for the same recipient are created. For the receiver, this doesnt matter. This is only relevant for determinism in the test vectors.

Run ./generate-test-vectors.py with all the new changes: 1. Remove amounts 2. Add new labels test case 3. Different outputs for tests with multiple outputs for the same recipient

S3RK · 2024-04-03T07:58:13Z

+        # same sender but with different labels. To account for this in testing, set `perumute=True`. This will generate
+        # all possible sets of outputs (based on all possible orderings of recipient addresses).
+        #
+        # NOTE: THIS IS A TEST ONLY FEATURE AND NOT RELEVANT FOR NORMAL SENDING/RECEIVING


Given that's a reference implementation that's not optimal to include test only code.

I don't see an issue so long as its documented. I considered a few alternatives, but this seemed the simplest approach. Do you have another approach in mind?

S3RK · 2024-04-03T08:01:47Z

I don't understand the problem behind 55c3df6

Isn't the ordering of recipient addresses deterministic? They are listed in a particular order in the test.

Also, with the fix we have more expected outputs now in the test vectors. Do we expect to have all of them or some of them?

I'm very confused

josibake · 2024-04-03T09:01:20Z

I don't understand the problem behind 55c3df6

The issue is that different orderings of recipient addresses will produce different output sets when sending to multiple labeled addresses for the same recipient in the same transaction (i.e. hash(shared_secret || k) + Bm results in different outputs based on the value of k and Bm).

For the sender, there is a requirement to group the addresses by Bspend which means it is very likely the a sending implementation will change the order of the addresses from how they are listed in the test. Furthermore, the sender should be able to change the order of addresses (so long as they are grouped correctly) since there is no order dependence in the protocol by design.

For testing sending and receiving this is not an issue since the receiver can always find their outputs for any valid output set the sender produces. However, if someone is implementing only sending, the tests need to provide a way for the sender to check if the outputs they have generated are valid. Previously, the tests were relying on a specific ordering (i.e. sorting by amount) so that the sender would always produce the same output set that could be checked against the test vectors. I ran into this issue while working on the libsecp256k1 module where the tests were passing until we added a heapsort step to group the addresses. heapsort is an unstable sorting algorithm (i.e. the relative order of equal elements is not preserved) so there was no way to rely on the ordering of the addresses in the test. We could have reordered the addresses in the test to match libsecp256k1s heapsort algorithm but that would have then caused implementations not using a heapsort or not sorting at all and using some other method for grouping to fail.

Also, with the fix we have more expected outputs now in the test vectors. Do we expect to have all of them or some of them?

Yep! This field now represents all possible valid address sets the sender can produce for the receiver. In reference.py, outputs are generated for all possible orderings of the recipient addresses. This means for any sending implementation, if they implement the protocol correctly the will match exactly one permutation (and ignore the rest of the outputs). Likewise, for the receiver, they will match exactly one permutation when scanning and ignore the rest.

S3RK · 2024-04-04T08:03:30Z

Thank you for detailed explanation. I think I fully understood the problem now.

For the sender, there is a requirement to group the addresses by Bspend which means it is very likely the a sending implementation will change the order of the addresses from how they are listed in the test.

I'd still say it's still a problem because output of sending phase could not match expected output from the test vectors.

Let me summarise the problem. The protocol doesn't prescribe deterministic result for a given input. Or to say it differently, for any input there is a number of results that are correct. This makes testing the protocol a bit tricky, because we can't just provide one result in the test vectors.

This means for any sending implementation, if they implement the protocol correctly the will match exactly one permutation (and ignore the rest of the outputs).

I think there is a problem here. How do they know if they fully matched a set and can safely ignore the rest?

Let's take a simple case when we have two outputs for the same address. There are two valid sets of two outputs. However the test vector has a flat list of four outputs. A wrong implementation can generate only one correct output and ignore the rest three. Not good.

One fix would be to provide a list<set<output>> of expected outputs, where each set in the list has to be matched fully. However, this is a bit complicated still as it breaks usual testing conventions. Normally one asserts equality of the results from your implementation agains expected result from the test vectors. It's definitely possible to implement the correct asserting semantics, but I'm not sure it's part of standard library and the implementor might need to understand how equality of containers (i.e. set<output>) is working in target language.

Alternatively, we can consider making the protocol deterministic. For example by defining ascending order of B_m in a grouped set. But that's adjusting protocol for testing purposes. I can see how this can be frowned upon, but testability is an important aspect of good design.

josibake · 2024-04-04T08:48:30Z

I think there is a problem here. How do they know if they fully matched a set and can safely ignore the rest?

For every silent payment address provided, the sender must to generate a taproot scriptPubKey. From there, the sender needs to also check that each of their generated scriptPubKeys exists in the set of all possible scriptPubKeys that could have been generated. I don't think there's a problem here (aside from needing to update the reference implementation to explicitly check that len(generated_outputs) == len(sp_addresses).

Alternatively, we can consider making the protocol deterministic. For example by defining ascending order of B_m in a grouped set. But that's adjusting protocol for testing purposes. I can see how this can be frowned upon, but testability is an important aspect of good design.

I don't see the benefit here. We can fully test the protocol in the way I've described, so adding a sorting step to the protocol specifically for testing the sender side seems like extra complexity for no benefit? It would require the receiver to do a non-trivial sort as part of an already expensive scanning process.

Amount is not used for anything, so better to remove it. To ensure determinism with the tests, the addresses are now sorted lexicographically before generating outputs. This is necessary because the outputs will change based on the order in which addresses for the same recipient are created. For the receiver, this doesnt matter. This is only relevant for determinism in the test vectors.

different orderings of the recipient addresses will produce different outputs in some scenarios. for sending and receiving, this does not matter. however, for implementations only testing sending e.g. testing that they can generate the correct outputs from an input set, this is fragile and leads to confusion and wasted time. essentially, the tester needs to fiddle with the order of the recipient addresses until the correct permutation is hit (when in reality any permutation is valid).

Run ./generate-test-vectors.py with all the new changes: 1. Remove amounts 2. All possible permutations in outputs for tests with multiple outputs for the same recipient 3. Make smallest outpoint an excluded input

josibake · 2024-04-30T17:33:48Z

@S3RK came back to this and cleaned some stuff up, mostly getting rid of duplicate code in the generate script. I also separated the generation code vs reference code by creating a copy of create_outputs which handles creating all possible output sets.

In addition, instead of having all output sets as one array, the sender now has an array of arrays of outputs: each possible "candidate set." This is less error prone than what I had before. For the receiver, I just use the first "candidate set" from the sender.

general refactor, namely: * reduce reused code * make test generation deterministic * only generate multiple output sets for sender * give each candidate set as its own result * add new test case for lexicographic sort * remove test related code from reference.py

S3RK

I did a pass and the changes look good.

It was a little bit hard to review big chunks of changes, so not sure I caught everything.

Need a little bit more time to verify new test vectors on my own secret implementation that I did from the bip without looking at your implementation.

S3RK · 2024-05-14T07:01:43Z



 def decode_silent_payment_address(address: str, hrp: str = "tsp") -> Tuple[ECPubKey, ECPubKey]:
-    version, data = decode(hrp, address)


in a026b44

I think we should check a) that version is in the expected range b) fail if version is 31

S3RK · 2024-05-14T07:22:34Z

-    label_address_two = reference.create_labeled_silent_payment_address(scan1, Spend1, m=l2, hrp=HRP)
-    label_address_one2 = reference.create_labeled_silent_payment_address(scan2, Spend2, m=l1, hrp=HRP)
-    label_address_two2 = reference.create_labeled_silent_payment_address(scan2, Spend2, m=l2, hrp=HRP)
-    addresses1 = [label_address_one, address, address2, label_address_one, label_address_one2, label_address_one, label_address_two2, label_address_two, label_address_two, label_address_two2]


in 404fe33

Is it on purpose you dropped this test?

Will need to look again, but IIRC, this test wasn't adding any coverage by having 2 recipients, so was complicated in that it needed to two recipients but for no real benefit. Will take a look again to confirm, tho!

S3RK · 2024-05-14T07:25:07Z

-# In[17]:
-
-
-def generate_unknown_segwit_ver_test():


in 404fe33

nit: should we still have a test for unknown segwit version?

S3RK · 2024-05-14T07:33:19Z


-            # Check if the found output public keys match the expected output public keys
-            assert add_to_wallet == expected["outputs"], "Receiving test failed"
+            # Note: order doesn't matter for creating/finding the outputs. However, different orderings of the recipient addresses


I think this comment here is irrelevant for the recipient part, no?

Ah, good call. This is a leftover from when I did have all possible output sets for the recipient. But now for the recipient there is just one of the possible output sets generated by the sender, so this comment is indeed irrelevant.

josibake added 9 commits March 21, 2024 09:19

add __lt__ method to ECPubKey

80b5f50

Add __lt__ operator to allow for sorting lists of ECPubKeys. The comparison is done lexicographically.

move derive_keypair func to generate script

fb0bd41

This function is only needed for generating the test vectors.

add test for labels with multiple recipients

5d1f848

fix typing errors in reference

a026b44

update tests

45f72cb

Run ./generate-test-vectors.py with all the new changes: 1. Remove amounts 2. Add new labels test case 3. Different outputs for tests with multiple outputs for the same recipient

clean up whitespace

580715a

add comment for reference

3f978a7

update tests

beceb2b

josibake force-pushed the test-updates branch 2 times, most recently from f60e49b to d462ac2 Compare March 21, 2024 19:22

josibake force-pushed the test-updates branch from d462ac2 to 08c4f5a Compare April 2, 2024 11:49

S3RK reviewed Apr 3, 2024

View reviewed changes

josibake force-pushed the test-updates branch from 9201dd6 to 2f14e49 Compare April 30, 2024 17:15

josibake added 4 commits April 30, 2024 19:22

make smallest outpoint an ignored output

39848c1

update tests

8b999d5

Run ./generate-test-vectors.py with all the new changes: 1. Remove amounts 2. All possible permutations in outputs for tests with multiple outputs for the same recipient 3. Make smallest outpoint an excluded input

josibake force-pushed the test-updates branch from 2f14e49 to d5484f6 Compare April 30, 2024 17:29

josibake force-pushed the test-updates branch from d5484f6 to 404fe33 Compare April 30, 2024 17:36

S3RK reviewed May 14, 2024

View reviewed changes



		def decode_silent_payment_address(address: str, hrp: str = "tsp") -> Tuple[ECPubKey, ECPubKey]:
		version, data = decode(hrp, address)

Conversation

josibake commented Mar 21, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

S3RK commented Apr 3, 2024

Uh oh!

josibake commented Apr 3, 2024

Uh oh!

S3RK commented Apr 4, 2024

Uh oh!

josibake commented Apr 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

josibake commented Apr 30, 2024

Uh oh!

S3RK left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

josibake commented Apr 4, 2024 •

edited

Loading