buildkitd, util exec format error on ARM
# 🌱|help-and-getting-started
w
garden version: 0.13.30
kubernetes version: v1.28.8+k3s1
OS: macOS 14.4 (23E214)
Arch: Apple Silicon
I'm running `multipass` with `k3s` on a MacBook M3, and it appears that Garden is not compatible with Kubernetes clusters on ARM processors. I'm encountering an `exec format error` with the images for buildkitd, util, and default-backend. Upon checking Docker Hub, I noticed there are no `linux/arm64` versions available. To support ARM-based Kubernetes clusters, it might be as straightforward as adjusting the pipeline for these repositories to also push arm64 images. Can I compile these images myself and configure them within Garden? Is that part of the setup configurable?
https://cdn.discordapp.com/attachments/1239306293307637932/1239306293752365117/image.png?ex=6642716d&is=66411fed&hm=3620fd7c79f745b637a927aca1b6c47f5fcc0441cfcd56b143e4d1330e620458&
https://cdn.discordapp.com/attachments/1239306293307637932/1239306294092107816/image.png?ex=6642716d&is=66411fed&hm=10f671367b9ef4e60a3fd3bbf187afc7ae838d7cbf57a384c23ff6ba80b28ebf&
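The Docker Hub check described above can also be done from a terminal. This is a minimal sketch, assuming the `docker` CLI is installed; `has_arch` is a hypothetical helper name, and it only greps the JSON printed by `docker manifest inspect`, so a single-platform image that lists no architecture in that output will simply not match.

```shell
# Reads `docker manifest inspect` JSON on stdin and succeeds if any
# manifest entry declares the given architecture (e.g. arm64, amd64).
has_arch() {
  grep -Eq "\"architecture\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Example (IMAGE is a placeholder for the image reference to check):
#   docker manifest inspect "$IMAGE" | has_arch arm64 && echo "arm64 build available"
```

The closing quote in the pattern keeps `arm` from matching `arm64`.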
f
Hi @witty-pilot-66879, you are correct, we currently don't have these images as ARM builds. We will discuss if that is something we want to add right away. In the meantime you could create an amd64 VM with multipass to run k3s on, or use orbstack or rancher-desktop to run k3s locally. Do you have a hard dependency on running k3s on arm64?
w
Hi @freezing-pharmacist-34446, thanks for looking into the issue. I'm already using multipass to run k3s, but all our computing power is ARM-based (e.g. Apple Silicon and Supermicro NVIDIA GH200 Grace Hopper Superchip), so I can easily say we have a hard dependency on arm64.
Please let me know if you don't have capacity to triage this issue. I can open PRs and enhance the CircleCI pipeline. Thanks!
f
Hi Hassan, we talked about it today. We think it makes sense, but can't promise a timeline. Happy to receive contributions! Let me know if you want some pointers!
w
Yes some pointers would help! I'm aware of the following repo https://github.com/garden-io/garden/tree/main/images
f
Hi Hassan, sorry for the delay. This is the correct place where these images are built. They are built with Garden, and we haven't properly implemented multi-platform support yet (it is possible but not quite where it should be). Implementing this is actually on our short-term roadmap. In either case, at second glance contributing here is not quite straightforward. I will be looking into this myself after all, and will keep you posted.
w
That's very cool! I didn't know that I can use Garden for CI/CD build pipelines! Thanks for your efforts @freezing-pharmacist-34446
I just gave it a try! Unfortunately, `linux/arm64` only covers generic `arm` silicon, but for Apple Silicon `linux/arm/v6` and `linux/arm/v7` are needed.
f
Gotcha, let me take another look
Turns out that some of the base images we are using are only available as amd64 or arm64.
b
@witty-pilot-66879 We're working on publishing `linux/arm64` versions of all our images at the moment. On Apple Silicon, the correct platform is `linux/arm64`. `linux/arm/v6` and `linux/arm/v7` (where `linux/arm` is an alias for `v7`) refer to older 32-bit ARM architectures. Are you sure v6 and v7 are needed for you on Apple Silicon?
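To make the platform naming concrete, here is a small sketch that maps the machine name reported by `uname -m` to the corresponding container platform string. `arch_to_platform` is a hypothetical helper, and the mapping covers only the common machine names; it is not taken from Garden.

```shell
# Map a `uname -m` machine name to the matching container platform string.
arch_to_platform() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;   # 64-bit ARM, including Apple Silicon
    armv7l)        echo "linux/arm/v7" ;;  # 32-bit ARM
    armv6l)        echo "linux/arm/v6" ;;  # older 32-bit ARM
    *)             echo "unknown machine: $1" >&2; return 1 ;;
  esac
}
```

Running `arch_to_platform "$(uname -m)"` inside the multipass VM (or on the cluster node) shows which platform the images need to provide.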
w
Hello @big-spring-14945, I gave it a try this morning! `moby/buildkit` supports `linux/arm64` and `linux/arm/v7`. I was able to run `moby/buildkit:v0.12.5-rootless` on a VM running on my MacBook M3. So, it seems like `linux/arm/v6` is not required.
It worked fine on a second run. It seems like it was a race condition between my manual changes to the `deployment` and the `garden` command that was still running.
Do you know when the `garden` CLI will use `gardendev/buildkit:v0.13.2` as a default image?
f
Happy to hear it worked. The next garden release will use all the multi arch images, we are probably going to release some time next week.
w
Thanks everyone!
s
Note: We published a new release yesterday, so we should be good to go here.
b
@witty-pilot-66879 Does it work for you with the latest version of Garden?
f
Hi @witty-pilot-66879, I took a look at this. There is a new multi-arch image for the buildkit sidecar that includes Mutagen (your command is failing while syncing the build context with Mutagen), but this new image is not yet used. That is because we introduced some changes to Mutagen a while ago and feature-flagged them, which means that the new ARM image is basically behind a feature flag. Can you set this environment variable and retry your command? Please make sure to delete the environment beforehand.
GARDEN_ENABLE_NEW_SYNC=true garden deploy
c
@witty-pilot-66879 can you please try the following commands from the project root directory before running Garden?
GARDEN_ENABLE_NEW_SYNC=false garden util mutagen daemon stop
GARDEN_ENABLE_NEW_SYNC=true garden util mutagen daemon stop
This will stop the sync daemons. After that, you can try `garden deploy --sync`.
w
Hey @curved-intern-91221, sorry, I had to shift gears to something else, but now I'm back on track! I ran the commands as you suggested. However, I got a handshake error from `mutagen`:
https://cdn.discordapp.com/attachments/1239306293307637932/1254206667713675264/image.png?ex=6678a67b&is=667754fb&hm=052a046a3dcbc9e25cd60e2a31982e142c0a8587fba6d68a2379a16b9f4c73ac&
I can confirm that this issue happens only on the ARM-based cluster.
c
Thanks for the update @witty-pilot-66879! We have released Garden 0.13.32, which updates Mutagen to version 0.17.6. Could you please try that out? Please stop all running Mutagen processes before testing. You can run `ps -ef | grep mutagen` to get the list of syncs and daemons.
w
I managed to upgrade Mutagen by running `GARDEN_ENABLE_NEW_SYNC=true garden util mutagen --version`, then killed all processes. I tried again but got the same `handshake` error.
https://cdn.discordapp.com/attachments/1239306293307637932/1255414323526893588/image.png?ex=667d0b33&is=667bb9b3&hm=9101b17dca028ccfabe9295cfcf2d2a0cce1a452487a1bfa0caf86e140e7aa92&
c
Did you set `GARDEN_ENABLE_NEW_SYNC=true` when you retried the deploy command? That flag must always be passed to use the new Mutagen (0.17.6); otherwise, the old version 0.15.0 will be used. In 0.13.34 we'll switch the default Mutagen to 0.17.6. Can you please share a minimal reproducible example with us?
w
Hello @curved-intern-91221, I have recorded my steps with debug mode enabled. I hope that gives you some insight into how to reproduce it: https://cdn.discordapp.com/attachments/1239306293307637932/1263492661931806730/output.mp4?ex=669a6ebd&is=66991d3d&hm=322e7a703da6bb64ce5aa38305a736871d31945b4a8655a748e7f8409c1c17e2&
c
@witty-pilot-66879 did you stop the old Mutagen daemons before running this? What is the output of `ps -ef | grep mutagen`? There should be no running Mutagen processes before you try to switch to the new sync mode.
Could you please try these steps?
1. Stop all daemons by running the following from the Garden project root:
    GARDEN_ENABLE_NEW_SYNC=false garden util mutagen daemon stop
    GARDEN_ENABLE_NEW_SYNC=true garden util mutagen daemon stop
2. Make sure there are no running Mutagen processes with `ps -ef | grep mutagen`. If anything is found, please stop those.
3. Run `deploy --sync` with the new daemon:
    GARDEN_ENABLE_NEW_SYNC=true garden deploy --sync
Please share the log file from the deploy command execution if the error persists.
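The steps above can be sketched as a small shell helper. This is only an illustration, assuming `garden` is on the `PATH` and the functions are called from the Garden project root; `list_mutagen_procs` and `stop_and_verify` are hypothetical names, not Garden commands.

```shell
# Filter a `ps -ef` listing (read from stdin) down to Mutagen processes,
# dropping the grep process itself.
list_mutagen_procs() {
  grep -i 'mutagen' | grep -v 'grep' || true
}

# Stop both the old and the new sync daemons, then check for leftovers.
stop_and_verify() {
  GARDEN_ENABLE_NEW_SYNC=false garden util mutagen daemon stop
  GARDEN_ENABLE_NEW_SYNC=true garden util mutagen daemon stop
  leftovers="$(ps -ef | list_mutagen_procs)"
  if [ -n "$leftovers" ]; then
    echo "Mutagen processes still running:" >&2
    echo "$leftovers" >&2
    return 1
  fi
}
```

With these defined, `stop_and_verify && GARDEN_ENABLE_NEW_SYNC=true garden deploy --sync` only starts the deploy once no Mutagen process is left; any leftovers are printed so they can be stopped by hand.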
c
Thank you! Could you please share the debug-level log file?
w
Where can I find the log file? Can I share the log file privately with you?
c
We have found something in the recorded logs, and we're preparing a fix now. Let's try that fix once the CI build is ready. No need to share the log file now 🙂
b
You can copy and paste the logs from the terminal. There will also be a `.garden` directory in the root directory of your Garden project, which contains the debug log file as well. It shouldn't differ from the output on the terminal, though.
c
Could you please try the Garden binary from this build? https://app.circleci.com/pipelines/github/garden-io/garden/26107/workflows/abc031d0-6e2c-4a42-a387-1452bdf20482/jobs/542908/artifacts Please redo the daemon stop before trying.
c
Thanks for the update! The error is different now, I'm looking into this issue.
@witty-pilot-66879 can you please get the details of the target pod that you are trying to sync to? I'm mostly interested in the `initContainerStatuses` section:
initContainerStatuses:
  - containerID: docker://51bb94dfb088b6773908e81caa77ffa5080d1c69794eba1950077d8292ef79d6
    image: sha256:3e941b4fb299b810e5981c08d5977115d25d0bbb4e7b4a733aab29e8112d7429
    imageID: docker-pullable://gardendev/k8s-sync@sha256:90a583672c63e61031a036900753cb6a8a6b0b7dc20909e2abcc079a1120127b
    lastState: {}
    name: garden-dev-init
    ready: true
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: docker://51bb94dfb088b6773908e81caa77ffa5080d1c69794eba1950077d8292ef79d6
        exitCode: 0
        finishedAt: "2024-07-23T06:05:14Z"
        reason: Completed
        startedAt: "2024-07-23T06:05:14Z"
You can get it with this command:
kubectl get pod <pod-name> -o yaml
I'm going to compare the outputs and the Docker image IDs with my local run. There might be some trouble starting the Mutagen agent on the Pod side for a k3s cluster on a Mac M1 chip. That's just a hypothesis that I need to verify. The fix we made before had the expected effect: the faux ssh command now seems to be called, so the error message is different, and it looks like some data cannot be retrieved from the Pod side.
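For the image ID comparison, a minimal sketch that extracts just the digest from an `imageID` value, so that two runs can be compared regardless of the runtime prefix (such as `docker-pullable://`). `image_digest` and `same_image` are hypothetical helper names.

```shell
# Strip everything up to and including the last "@" from an imageID,
# leaving only the digest (e.g. "sha256:...").
image_digest() {
  printf '%s\n' "${1##*@}"
}

# Succeeds when two imageID values point at the same digest.
same_image() {
  [ "$(image_digest "$1")" = "$(image_digest "$2")" ]
}
```

The input values can be read with `kubectl get pod <pod-name> -o jsonpath='{.status.initContainerStatuses[*].imageID}'`. Note that with multi-arch images the reported digest may differ between architectures, so a mismatch is a hint to look closer rather than proof of a problem.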