Improve testing using generated metadata #1444

@jku

Description

This is an umbrella issue for a plan I have:

  • Improve the Metadata API for the case of metadata generation (in other words, creating Metadata, not deserializing it)
  • Make sure all of it can be done in-memory (so writing things to disk is not needed)
  • Improve testing for the Metadata API itself by adding tests with generated data (see e.g. #1436, Metadata API: Implement threshold verification: testing signature validation with various thresholds)
  • Improve testing for ngclient/MetadataBundle by adding tests with generated data

The reason I want to do this is that including metadata files in the test data seems hard to scale: we should aim to test de/serialization with test data included with the test (as in #1391), and then generate the test data for the cases where this is impractical (like the ngclient MetadataBundle tests).
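For the threshold tests mentioned above (#1436), the core property is that only distinct, authorized keys count towards a role's threshold. A stand-alone sketch of that counting rule (hypothetical helper, not the Metadata API itself):

```python
def meets_threshold(valid_keyids, authorized_keyids, threshold):
    """Check whether enough distinct authorized keys produced valid signatures."""
    # Duplicate signatures by the same key, and signatures by keys not
    # authorized for the role, must not count towards the threshold.
    distinct = set(valid_keyids) & set(authorized_keyids)
    return len(distinct) >= threshold

# Two signatures from the same key do not satisfy a threshold of 2:
assert meets_threshold(["a", "a"], {"a", "b"}, 2) is False
assert meets_threshold(["a", "b"], {"a", "b"}, 2) is True
```

Generated test data makes it cheap to exercise many key/threshold combinations of this rule without committing fixture files for each case.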

In the end I believe the improvements will benefit the future repository tools as well, but that's a longer-term goal: the list above is something where we could have results in a few weeks.

For background, below is what I came up with for generating full metadata for a repository and modifying it slightly using the current Metadata API (it currently writes to files, but only for debugging purposes: the to_file() calls can be removed). There's a lot of boilerplate, but also quite a bit of low-hanging fruit that would improve the experience:

from collections import OrderedDict
from datetime import datetime
from typing import Tuple

from securesystemslib.keys import generate_ed25519_key
from securesystemslib.signer import SSlibSigner, Signer
from tuf.api.metadata import DelegatedRole, Delegations, Key, MetaFile, Metadata, Role, Root, Snapshot, TargetFile, Targets, Timestamp

# TODO: should come from metadata
SPEC_VERSION = "1.0.999"

def generate_test_key() -> Tuple[Key, Signer]:
    keydict = generate_ed25519_key()
    # TODO maybe this should be Key.from_securesystemslib_key()
    key = Key(keydict["keyid"], keydict["keytype"], keydict["scheme"], {"public": keydict["keyval"]["public"]})
    signer = SSlibSigner(keydict)

    return (key, signer)

# generate keys
delegate_key, delegate_signer = generate_test_key()
targets_key, targets_signer = generate_test_key()
snapshot_key, snapshot_signer = generate_test_key()
timestamp_key, timestamp_signer = generate_test_key()
root_key, root_signer = generate_test_key()


# Create delegated targets
# TODO Metadata.new_targets_with_defaults(expires: datetime) ?
# TODO Add TargetFile.from_data(data: Union[bytes, BinaryIO]) so this works:
#     delegate = Metadata.new_targets_with_defaults(datetime(3000,1,1))
#     delegate.targets["world"] = TargetFile.from_data(b"hello")
#     delegate.sign(delegate_signer)
targetfile = TargetFile(5, {"sha256": "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"})  # length and sha256 of b"hello"
delegate_signed = Targets(1, SPEC_VERSION, datetime(3000,1,1), {"world": targetfile}, None)
delegate = Metadata(delegate_signed, OrderedDict())
delegate.sign(delegate_signer)
# TODO: Metadata.to_file should be able to optionally handle versioned file names
delegate.to_file("delegate.json")

# Create targets
# TODO: better API to modify delegations? The correct one is not obvious though.
# One option is to add add_key/remove_key() to delegations as well (see Root)
role = DelegatedRole("delegate", [delegate_key.keyid], 1, True)
delegations = Delegations({delegate_key.keyid: delegate_key}, [role])
targets_signed = Targets(1, SPEC_VERSION, datetime(3000,1,1), {}, delegations)
targets = Metadata(targets_signed, OrderedDict())
targets.sign(targets_signer)
targets.to_file("targets.json")

# create snapshot
# TODO maybe MetaFile.from_metadata(targets: Metadata)?
# What MetaFile will contain is unclear... it's also not the same as TargetFile.from_data()
# TODO maybe Metadata.new_snapshot_with_defaults(expires: datetime, target_metadatas: list[Metadata])
# (again the issue here is making a decision on what data to include in MetaFile?)
meta = {"targets.json": MetaFile(targets.signed.version)}
snapshot_signed = Snapshot(1, SPEC_VERSION, datetime(3000,1,1), meta)
snapshot = Metadata(snapshot_signed, OrderedDict())
snapshot.sign(snapshot_signer)
snapshot.to_file("snapshot.json")

# create timestamp
# TODO Metadata.new_timestamp_with_defaults(expires: datetime, snapshot: Metadata)
# (again issue is making a decision on what data to include in MetaFile?)
meta = {"snapshot.json": MetaFile(snapshot.signed.version)}
timestamp_signed = Timestamp(1, SPEC_VERSION, datetime(3000,1,1), meta)
timestamp = Metadata(timestamp_signed, OrderedDict())
timestamp.sign(timestamp_signer)
timestamp.to_file("timestamp.json")

# create root
# TODO Metadata.new_root_with_defaults(expires, root_key, timestamp_key, snapshot_key, targets_key)
# TODO alternative: Metadata.new_root_with_defaults(expires) +
#                    Metadata.set_keys(rolename, keys: list[Key], threshold)
# There's no need to make role management easier, as the roles should never change: only keys and role thresholds will.
keys = {key.keyid:key for key in [root_key, snapshot_key, timestamp_key, targets_key]}
roles = {
    "root": Role([root_key.keyid], 1),
    "targets": Role([targets_key.keyid], 1),
    "snapshot": Role([snapshot_key.keyid], 1),
    "timestamp": Role([timestamp_key.keyid], 1),
}

root_signed = Root(1, SPEC_VERSION, datetime(3000,1,1), keys, roles, True)
root = Metadata(root_signed, OrderedDict())
root.sign(root_signer)
root.to_file("root.json")


# Test MetadataBundle with the in-memory metadata
# TODO: Add Metadata.to_bytes() to avoid dealing with a serializer
# serializer = JSONSerializer()  # tuf.api.serialization.json.JSONSerializer
# UNTESTED: bundle = MetadataBundle(serializer.serialize(root))

# Update root to version 2 with a rotated timestamp key
root.signed.remove_key("timestamp", timestamp_key.keyid)

timestamp_key, timestamp_signer = generate_test_key()
timestamp.signed.version += 1
timestamp.sign(timestamp_signer)
timestamp.to_file("timestamp.json")

# TODO should add_key() remove the old key from keys if it no longer has users,
# like remove_key() does?
root.signed.add_key("timestamp", timestamp_key)
root.signed.version += 1
root.sign(root_signer)
root.to_file("root.json")

# Update bundle to version 2
# UNTESTED: bundle.update_root(serializer.serialize(root))

# create version 3 with higher threshold
root2_key, root2_signer = generate_test_key()
root.signed.add_key("root", root2_key)
root.signed.roles["root"].threshold = 2
root.signed.version += 1

# TODO should signing be a single operation sign(list[signers])?
# alternatively: sign_new_version(list[signers]) that also bumps version?
# alternatively: Metadata can be given a list of signers earlier, always uses them to sign
# when needed? How does repository usually figure out which signers it needs?
root.sign(root_signer)
root.sign(root2_signer, append=True)
root.to_file("root.json")

# Update bundle to version 3
# UNTESTED: bundle.update_root(serializer.serialize(root))

# create version 4 with only one signature
root.signed.version += 1
root.sign(root_signer)
root.to_file("root.json")

# Update bundle to version 4 (this should fail: not enough signatures)
# UNTESTED: bundle.update_root(serializer.serialize(root))

# UNTESTED: bundle.root_update_finished()

(the MetadataBundle testing code is not tested yet, as the branch is not up to date with develop)
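A TargetFile.from_data() helper like the one sketched in the TODOs would presumably just measure and hash the in-memory bytes. A stdlib-only sketch of what it could compute (the function name is hypothetical, not the real API):

```python
import hashlib

def targetfile_args_from_data(data: bytes):
    """Return the (length, hashes) pair a TargetFile would be built from."""
    return len(data), {"sha256": hashlib.sha256(data).hexdigest()}

length, hashes = targetfile_args_from_data(b"hello")
# b"hello" is 5 bytes long; its sha256 hex digest starts with "2cf24dba"
```

This is exactly the kind of boilerplate that makes in-memory test data generation more verbose than it needs to be.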
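The versioned-file-name TODO maps to the consistent-snapshot naming in the TUF specification, where metadata is written as VERSION.ROLENAME.json. The helper to_file() would need is tiny (a sketch, name hypothetical):

```python
def versioned_filename(rolename: str, version: int) -> str:
    """Build a consistent-snapshot style file name, e.g. "2.root.json"."""
    return f"{version}.{rolename}.json"

assert versioned_filename("root", 2) == "2.root.json"
```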
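On the add_key()/remove_key() symmetry question in the comments above: the bookkeeping in either case amounts to dropping keys that no role references any more. A dict-level sketch of that pruning (illustrative only, not the real API):

```python
def prune_unused_keys(keys: dict, roles: dict) -> dict:
    """Keep only keys that at least one role still references."""
    used = {keyid for keyids in roles.values() for keyid in keyids}
    return {keyid: key for keyid, key in keys.items() if keyid in used}

# After rotating the timestamp key, "old-ts" has no users left:
keys = {"old-ts": "key1", "rootkey": "key2"}
roles = {"root": ["rootkey"], "timestamp": ["rootkey"]}
assert prune_unused_keys(keys, roles) == {"rootkey": "key2"}
```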

Milestone: Metadata

Labels: backlog, testing
