How do you debug buildkit secret configuration?
# 🌱|help-and-getting-started
c
After following the documentation for enabling buildkit, Garden seems unable to locate the docker registry secret when present. This error is displayed when running build actions:
Could not find secret 'ecr-config' in namespace 'default'. Have you correctly configured your secrets?
But querying directly shows it is present:
λ k get secret/ecr-config -n default
NAME         TYPE                             DATA   AGE
ecr-config   kubernetes.io/dockerconfigjson   1      19m
Is there some RBAC related issue here? Or is this a Garden bug?
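One way to separate an RBAC problem from a Garden problem is to ask the API server directly whether the current identity may read the secret. The following is a minimal sketch, not a Garden feature: `check_secret_access` is a hypothetical helper name, and it assumes `kubectl` is pointed at the same context that the Garden provider uses.

```shell
# Hypothetical helper: reports whether the current kubectl identity can read a secret.
# Usage: check_secret_access <secret-name> <namespace>
check_secret_access() {
  csa_secret="$1"
  csa_namespace="$2"

  # Print the active context so it can be compared with the provider's `context` field.
  echo "context: $(kubectl config current-context)"

  # `kubectl auth can-i` prints "yes" or "no" for the requested verb and resource.
  csa_answer="$(kubectl auth can-i get "secret/${csa_secret}" --namespace "${csa_namespace}" 2>/dev/null || true)"

  if [ "${csa_answer}" = "yes" ]; then
    echo "rbac: allowed to get secret/${csa_secret} in ${csa_namespace}"
  else
    echo "rbac: NOT allowed to get secret/${csa_secret} in ${csa_namespace}"
    return 1
  fi
}

# Example: check_secret_access ecr-config default
```

If this reports that access is allowed but Garden still cannot find the secret, the printed context is worth comparing against the `context` set on the Garden provider, since a secret visible in one cluster says nothing about another.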
q
Hi @clever-policeman-58407, I spun up an EKS cluster with `eksctl` to test and can't reproduce. Are you able to create a new test cluster, create your registry credential secret with `kubectl create secret docker-registry regcred --docker-server=$ECR_REGISTRY_SERVER --docker-username=AWS --docker-password=$ECR_PASSWORD`, and ensure your `project.garden.yml` file looks similar to the below?
providers:
  - name: kubernetes
    environments: [remote]
    context: tao@garden.io@fargate-cluster.eu-north-1.eksctl.io
    ingressClass: "nginx"
    buildMode: cluster-buildkit
    clusterBuildkit:
      rootless: true
    imagePullSecrets:
      - name: regcred
        namespace: default
    deploymentRegistry:
      hostname: 000000000.dkr.ecr.eu-central-1.amazonaws.com
      namespace: aws-repository-231954e9 
    namespace: ${environment.namespace}
c
Your example uses the username and password directly, and I actually had a previously functioning version of this that did the same. So perhaps that's the issue here. The version I'm having issues with now follows the section of documentation that instructs you to use the ECR credential helper: https://docs.garden.io/kubernetes-plugins/remote-k8s/configure-registry/aws#enabling-in-cluster-building
I was already using the ECR credential helper so I can verify that it is functioning outside of garden, just not with garden.
I've spent all day experimenting and it seems like the example I had before is no longer working either. So neither your example (direct docker credentials) nor the ecr-helper example works remotely against ECR for me. Is it possible that there is some RBAC policy that needs to be added, or that Garden is deploying buildkit to the wrong namespace? The only cause I can think of is that the accessor does not have the permissions necessary to see the secret, e.g. buildkit being deployed to the wrong namespace (since secrets are only visible to other resources in the same namespace) or an unknown RBAC policy added by a CRD somewhere.
f
Hey @clever-policeman-58407, Garden will copy the registry secrets from the namespace that you specified in your project.garden.yaml to the namespace where buildkit runs. So if you have access to the ecr-config secret in the default namespace, Garden should be able to copy it. Can you post your project.garden.yaml configuration, especially the part that configures the kubernetes provider and buildkit?
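Since the secret is read from the namespace named under `imagePullSecrets`, it can help to list every namespace that actually holds a secret with that name and compare the result with the configured value. A rough sketch, with hypothetical helper names (`find_secret_namespaces`, `secret_in_namespace`), assuming the current identity is allowed to list secrets across namespaces:

```shell
# Hypothetical helper: prints each namespace in the current context that
# contains a secret with the given name, one namespace per line.
# Usage: find_secret_namespaces <secret-name>
find_secret_namespaces() {
  kubectl get secrets --all-namespaces \
    --field-selector "metadata.name=$1" \
    --output jsonpath='{range .items[*]}{.metadata.namespace}{"\n"}{end}'
}

# Hypothetical helper: succeeds only if the secret exists in exactly the
# namespace that the Garden provider config points at.
# Usage: secret_in_namespace <secret-name> <namespace>
secret_in_namespace() {
  find_secret_namespaces "$1" | grep -qxF "$2"
}

# Example:
#   find_secret_namespaces ecr-config
#   secret_in_namespace ecr-config default && echo "found"
```

An empty list from `find_secret_namespaces` would suggest that kubectl and Garden are looking at different clusters or contexts rather than at a permissions problem.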
c
Certainly, here's a semi-censored version of our `project.garden.yaml` config:
apiVersion: garden.io/v1
kind: Project
name: foobaz-cloud
defaultEnvironment: local
dotIgnoreFile: .gitignore

variables:
  namespace: ${local.env.CLUSTER_NAMESPACE || kebabCase(local.username)}

environments:
  - name: local
    defaultNamespace: ${var.namespace}
    variables:
      base-hostname: localhost
      deploy-target: local
  - name: remote
    defaultNamespace: ${var.namespace}
    variables:
      base-hostname: "${var.namespace}.foobaz.dev"
      deploy-target: remote
  
providers:
  - name: local-kubernetes
    environments: [local]
    namespace: ${environment.namespace}
    defaultHostname: ${var.base-hostname}
    context: k3d-foobaz

    setupIngressController: nginx

    deploymentRegistry:
      hostname: k3d-foobazregistry
      port: 12345
      insecure: true
      namespace: ${kebabCase(local.username)}

    sync:
      defaults:
        exclude:
          - "**/node_modules"
          - "**/.next"

  - name: kubernetes
    environments: [remote]
    ingressClass: nginx
    buildMode: cluster-buildkit
    clusterBuildkit:
      rootless: true
    imagePullSecrets:
      - name: ecr-config
        namespace: ${environment.namespace}
    deploymentRegistry:
      hostname: # < ... >.dkr.ecr.us-west-2.amazonaws.com
      namespace: garden
    namespace: ${environment.namespace}
    defaultHostname: ${var.base-hostname}
    context: # < ... >

  - name: terraform
    initRoot: ./terraform/environments/dev
    autoApply: false
    allowDestroy: false
The ecr-config in question is generated from:
{
  "credHelpers": {
    "< ... >.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
  }
}
And injected with:
kubectl create secret generic ecr-config --from-file=.dockerconfigjson=registry-config.json --type=kubernetes.io/dockerconfigjson --namespace foobaz
Assume that the namespace the secret is created in and the `deploymentRegistry.namespace` are the same, as verifiable by kubectl.
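For anyone repeating that verification, a small sketch of how the stored payload could be inspected with kubectl. `show_docker_config` is a hypothetical helper name; it assumes the secret uses the standard `.dockerconfigjson` key, as in the `kubectl create secret` command above.

```shell
# Hypothetical helper: prints the decoded Docker config stored in a
# kubernetes.io/dockerconfigjson secret.
# Usage: show_docker_config <secret-name> <namespace>
show_docker_config() {
  # The dot in the key name must be escaped inside the jsonpath expression.
  kubectl get secret "$1" --namespace "$2" \
    --output jsonpath='{.data.\.dockerconfigjson}' | base64 --decode
}

# Example: show_docker_config ecr-config foobaz
```

The decoded output should match the contents of `registry-config.json`, including the `credHelpers` entry for the ECR hostname.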
The high level error message:
✖ build.ui                → Failed resolving status for Build type=container name=ui (took 0.47 sec). This is what happened:

────────────────────────────────────────────────────────────────────────────────────────────────────
Could not find secret 'ecr-config' in namespace 'foobaz'. Have you correctly configured your secrets?
The full error message, from `-l5`:
Error: Could not find secret 'ecr-config' in namespace 'foobaz'. Have you correctly configured your secrets?
    at readSecret (file:///Users/foobaz/Library/Application%20Support/io.garden.garden/1701117661-33XyWur.r/rollup/garden.mjs:747039:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async file:///Users/foobaz/Library/Application%20Support/io.garden.garden/1701117661-33XyWur.r/rollup/kind-I-cxA2pC.mjs:1950:24
    at async Promise.all (index 0)
    at async buildDockerAuthConfig (file:///Users/foobaz/Library/Application%20Support/io.garden.garden/1701117661-33XyWur.r/rollup/kind-I-cxA2pC.mjs:1949:28)
    at async prepareDockerAuth (file:///Users/foobaz/Library/Application%20Support/io.garden.garden/1701117661-33XyWur.r/rollup/kind-I-cxA2pC.mjs:2004:20)
    at async ensureBuilderSecret (file:///Users/foobaz/Library/Application%20Support/io.garden.garden/1701117661-33XyWur.r/rollup/kubernetes-exec-pWB0gJIk.mjs:749:24)
    at async file:///Users/foobaz/Library/Application%20Support/io.garden.garden/1701117661-33XyWur.r/rollup/kubernetes-exec-pWB0gJIk.mjs:1380:56
Error type: configuration
I have no idea what changed, but this seems to have fixed itself
With no code changes beyond debugging lines, I'm inclined to say that this is some kind of ephemeral issue in the same way that Docker networking will occasionally just break inexplicably and irreparably after a system sleep until you restart your host machine. Who knows.
The fact that it was related to the Kubernetes API not being able to see a secret that could be pulled in a very obvious fashion makes this particularly baffling, but if it's fixed I've got nothing to go off of