-
Notifications
You must be signed in to change notification settings - Fork 183
osbuild: use bootc install to deploy the container #4224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
The pull request introduces changes to use bootc install to deploy the container, which simplifies the image build process. There are a few critical issues in the YAML manifest related to copy-paste errors that lead to incorrect configurations for the 4k image builds and missing options for loopback
devices. These issues need to be addressed.
I switched the CI on this to run against |
A few diffs picked up by We should probably profile each diff (maybe in coreos/fedora-coreos-tracker#1827) and evaluate whether it's a change we want to make or not. |
I can't get a built qemu image to boot. I suspect probably the root= and boot= UUIDs added on the kernel command line? |
do you mind sharing more logs ? What I am getting locally is ignition failing on coreos/fedora-coreos-tracker#1250 |
Ahh. I see that too now:
|
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
looks like removing those make the boot process go further (ignition completes), and out of the initramfs but fail to mount the boot partition. |
Blocked on bootc-dev/bootc#1441 |
ok this works with the following PRs :
for the bootc PR, it can be built then added into the image through |
59f1061
to
254f877
Compare
follow-up : either find a way to get the boot components inside cosa, or change the bootc code to call bootupd from the deployed root . I think the latter is preferable. |
Made bootc-dev/bootc#1460 |
bb4270f
to
310bd60
Compare
Alright, marking this as ready for review as all the bits are in place. This will need a release of bootc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments.
I think there are a few things we need to iron out before we can really move forward with this:
- supporting both old and new paths at the same time
Do we need to? Usually when we make a change this large we roll it out slowly, which means we have to support both ways for some time.
This PR is ignoring that fact, but TBH looking at OSBuild configs that support both would be pretty intimidating, so I'm not excited about trying to do that either. I'd be interested in @jlebon or @travier's thoughts.
- We need to make sure any/every diff that exists between images generated this way and the old way are considered and acknowleged as acceptable before we'd make this change.
Rolling this out in FCOS first would be preferable indeed. Instead of adding mpp conditionals, could we just have a separate set of manifests to use temporarily? Not ideal either, I know. If we're sufficiently convinced by the diff, we could do a hard cut over. The problem is in ensuring the diff captures everything well. I think it helps a lot though that the main changes here are happening at the filesystem level. |
48a30b4
to
77121d0
Compare
I went up this route : duplicated the manifest and added a switch to the osbuild wrapper to use it when releasever is >44. We can drop this when we are confident it can go to all streams. I cleaned up the commits a bit and incorporated @dustymabe reviews suggestions. |
Ok I found another quirk : bootc-dev/bootc#1559 A workaround would be to use the container as the osbuild buildroot (but that would only work with rawhide as we need python). This is what I was doing until now so that's why I dind't ran into it before. But since we are only using this manifest for rawhide for now, why not.. |
requires coreos/bootupd#990 |
77121d0
to
9037fa7
Compare
The log disk usage message comming every 10 seconds is quite noisy, hide it by default. We can always uncomment it to investigate. I aslo added a couple of helpful tips in comments given by @dustymabe to work with osbuild.
Bootc is looking for the prepare-root config file in the buildroot environnement because the main assumption is that it's run from the target container. However, in osbuild, it's run from te buildroot, because podman inside bwrap (inside supermin in our case) causes issues. It's fine for RHCOS and SCOS where we use the target container as the buildroot but we cannot do that for FCOS because we require python in the buildroot. For now, insert a prepare-root file in the supermin VM (use as the buildroot for osbuild) until either : - bootc learn to look into the container for it [1] - we ship python in our images and can use them as buildroot. Another approach would be to layer python and the osbuild dependencies on top of our image and use that as the buildroot, but that would create room for packages drift (what was in the repos at build time?). At least using COSA it's easier to keep track of versions. [1] bootc-dev/bootc#1410
Instead of deploying the container to the tree then copy all the contents to the disk image, use bootc to directly manage the installation to the target filesystems. Right now this requires to use the image as the buildroot so this requires python (for osbuild). This is tracked in [1]. As we have python in rawhide now I duplicated the manifest and added a switch in the osbuild wrapper script. We can keep the manifest duplicated until we are confident to roll this to all streams. [1] bootc-dev/bootc#1410 Requires: bootc-dev/bootc#1460 bootc-dev/bootc#1451 osbuild/osbuild#2149 osbuild/osbuild#2152 All of which have landed in osbuild-159 and bootc 1.6
9037fa7
to
d54d764
Compare
Rebased and removed osbuild patches as they have landed in an upstream release now (requires a COSA rebuild though) |
@jbtrystram: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
set +x | ||
|
||
log_disk_usage | ||
#log_disk_usage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm against this change. This is useful because running out of disk space in the cache qcow2 could be the reason why a failure occurs. Having these breadcrumbs in the logs are how we know.
Now, what I will admit is that it's really not helpful when you have a RUNVM_SHELL
environment and you are trying to type on a bash prompt. Maybe we could find some way to auto disable it if that is the case? Perhaps we could change this line to:
(cd ${workdir}; RUNVM_SHELL=${RUNVM_SHELL} bash)
and then conditionalize if on $RUNVM_SHELL
here in the code.
# have a prepare-root config in the build environnement for bootc to read. | ||
# This workaround can be removed when https://github.com/bootc-dev/bootc/issues/1410 | ||
# is fixed or we have python in all streams which allows us to use the OCI image as the buildroot. | ||
# Note that RHCOS and SCOS use the OCI as buildroot so they should not be affected by this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still kinda wish we could just get python in here too to get away from the shenanigans
# Use the bootc install to-filesystem manifest for rawhide | ||
# See discussion in https://github.com/coreos/coreos-assembler/pull/4224#pullrequestreview-3076134370 | ||
# Remove when we switch to use bootc install to-filesystem to all streams | ||
releasever=${build%%.*} | ||
with_bootc_install="" | ||
if [ "$releasever" -ge 44 ]; then | ||
with_bootc_install=".bootc" | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is effective, but brittle. It would probably give us the most flexibility if we made it configurable via an environment variable and also possibly a manifest setting like was done over in
coreos/fedora-coreos-config#3754 and 532225d
i.e. I could see us wanting to roll this out in next
and also RHCOS 10/SCOS 10 first and having those knobs would help that.
images: | ||
type: org.osbuild.containers | ||
origin: org.osbuild.pipeline | ||
references: | ||
name:oci-archive: | ||
name: coreos.ociarchive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are losing some compat here. In the existing code we have the ability to execute either via an ociarchive OR a container repo:tag
coreos-assembler/src/osbuild-manifests/coreos.osbuild.x86_64.mpp.yaml
Lines 191 to 192 in e9034ee
# Deploy via ociarchive or container | |
- mpp-if: ociarchive != '' |
I don't actually know if this is important. I think in practice we have just been executing the ociarchive flow. But custom-coreos-disk-images at one point I think allowed the repo:tag flow.
That being said, I do have a local patch set that might be a reason to use the repo:tag flow again.
Thank you for your patience here JB. I don't see anything too alarming in the patch set here. Mostly small stuff. I do want to spend some time with this change testing it out locally and poking around the built images. I'll try to do so tomorrow. |
A couple of things:
|
Thank you so much for looking into this Dusty ! another set of eyes is really helpful. |
Instead of deploying the container to the tree then copy all the contents to the disk image, use bootc to directly manage the installation to the target filesystems.
Right now this requires to use the image as the buildroot so this requires python (for osbuild). This is tracked in [1].
[1] bootc-dev/bootc#1410 Requires osbuild/osbuild#2149