Graph takes 30+ seconds to run
# 🌱|help-and-getting-started
Hey all, trying to understand what's happening under the hood and whether we can do something to improve the situation. We have about 240 Garden modules (the actual number of distinct modules is closer to 100, but due to different templates we end up with that many modules needing to be resolved by Garden on startup). We noticed graph resolution is taking very long, around 30+ seconds:
```
[2023-06-26T17:24:42.519Z] Resolving 240 modules...
[2023-06-26T17:25:14.026Z] Resolving 240 modules... → Done
```
This is using the Kubernetes provider. I've looked at the logs with debug logging on and noticed it's doing some FS scans, which take about a second; I assume building the DAG is also very quick. I wonder if most of this time is spent making Kubernetes API calls? Are those necessary, or is there an option to introduce a flag to skip them? Is there anything else I can do to improve this situation? It would be fantastic if this was under 5 seconds. Thanks!
Hi @future-coat-9017, I'm escalating this internally.
Our CEO writes: > This is basically a known issue, especially in 0.12.x. We are doing several things to improve this, and some of them are already in 0.13.x. One quite important thing to note is that using … files is broadly more efficient than using … in module configs. For repos with many files, and where possible to use, this is strongly encouraged.
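As a concrete illustration of tightening file inclusion, here's a minimal sketch of a Garden module config using an explicit `include` list (the module name and paths are hypothetical, not from this thread):

```yaml
# garden.yml — hypothetical module config.
# A narrow `include` list means Garden only has to scan/hash the files
# named here, instead of walking the whole module directory and then
# filtering with excludes.
kind: Module
type: kubernetes
name: my-service
include:
  - manifests/*.yaml
```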
We have a PR open that's likely to help once merged and released
@quaint-dress-831 That's awesome! Do you think we can also backport the fix to 0.12? We cannot upgrade to 0.13 yet due to the breaking changes, but being able to get this improvement with our current setup would be fantastic.
@brief-restaurant-63679 I'll keep you updated! We do recommend users migrate to Bonsai, eventually 😉
But I can promise to advocate for it if it's technically feasible. It may be that these improvements are technically locked to Bonsai
@future-coat-9017 just to note, we've heard graph resolution is up to 6x faster just from switching to Bonsai. This is due to changes in Bonsai that we can't backport.
Thanks, I also see the PR has been merged, so I'll definitely look into moving to Bonsai.
@quaint-dress-831 just tried out Bonsai with … and unfortunately I'm seeing almost 2x slower performance.
@future-coat-9017 Slow resolution times are mostly due to unfavourable file inclusion settings. Tinkering with your … files and … filters on the modules will probably increase the speed greatly.
@future-coat-9017 excludes are not working properly with the new git scan mode; see … . The workaround is to use `.gardenignore` files instead.
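For anyone following along, `.gardenignore` files use the same pattern syntax as `.gitignore`; a sketch of that workaround with purely hypothetical patterns:

```
# .gardenignore at the project root (gitignore-style syntax).
# Hypothetical example: keep Garden's scanner away from bulky
# directories instead of relying on module-level excludes.
node_modules/
dist/
*.log
```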
Hey @future-coat-9017, we just released a new version that once again includes some significant performance improvements for graph resolution. Maybe you could give it a try and report back whether it made things faster for you.
Hey there. I compared running `garden validate` with 0.12.48 vs. 0.13.13, and the results show about a 5-second improvement: from 31s in 0.12 to 25s in 0.13.13. We were hoping to reduce the time to under 10s; should we be seeing a greater improvement than that? We ran the command with … as well.
By the way, I've also removed all exclusions, and it seems to have improved things by about the same amount, ~5 seconds.
My assumption is that most of the time is simply due to the large number of modules it's trying to resolve in our case (~260), not git scanning specifically. It seems include/exclude/Bonsai have minimal effect in our case (around a ~5-second improvement).
… should be the fastest repo scan mode now for large repos, so I'd keep using that. What OS is this on? If you're on macOS, did you make sure you have the ARM build running? Also, could you share the debug logs? It would be interesting to see where it's actually spending its time. Previously, graph resolution was the most expensive part with large projects, so we focused on that first.
Good idea. Making sure I was using the ARM build reduced the time to 22 seconds, which is slightly better. Let me get the logs masked and I'll post them.
@polite-fountain-28010 can I close this, or is it still unresolved?
We have resolved some of the worst performance problems, but I think it's worth continuing to look into this. What I noticed are these log lines:
```
{"msg":"Resolving actions and modules...","section":"graph                →","timestamp":"2023-09-11T16:25:13.171Z","level":"info"}
{"msg":"Scanning repository at /garden","section":"graph                →","timestamp":"2023-09-11T16:25:20.627Z","level":"info"}
```
It seems there is quite a wait between resolving the modules and scanning the repo, which might be worth checking out. Maybe we could turn this into a GitHub issue so it's easier to track?
There's also the wait between these lines
```
{"msg":"Found 1 files in module path after glob matching","section":"graph [debug]        →","timestamp":"2023-09-11T16:25:21.883Z","level":"debug"}
{"msg":"Converting kubernetes module ***** to actions","section":"graph [debug]        →","timestamp":"2023-09-11T16:25:28.930Z","level":"debug"}
```
which seems rather long. Looking at outputs from my "large garden repo" project, I cannot reproduce that, and I don't think there should be anything that takes 7 seconds between those steps.
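To put a number on those gaps without eyeballing timestamps, one can diff consecutive log lines with a small script. This is just a sketch assuming the JSON-per-line log format shown above (the `msg` values here are abbreviated):

```python
import json
from datetime import datetime


def gap_seconds(line_a: str, line_b: str) -> float:
    """Return seconds elapsed between the timestamps of two JSON log lines."""
    def parse(line: str) -> datetime:
        # Timestamps look like "2023-09-11T16:25:21.883Z"
        return datetime.strptime(json.loads(line)["timestamp"],
                                 "%Y-%m-%dT%H:%M:%S.%fZ")
    return (parse(line_b) - parse(line_a)).total_seconds()


a = '{"msg":"Found 1 files after glob matching","timestamp":"2023-09-11T16:25:21.883Z","level":"debug"}'
b = '{"msg":"Converting kubernetes module to actions","timestamp":"2023-09-11T16:25:28.930Z","level":"debug"}'
print(gap_seconds(a, b))  # → 7.047
```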
We just released version 0.13.16, which includes a fix for a memory leak. I'd be curious to see whether that also has a positive impact on your performance, since it should now cache some calls that weren't being cached properly before.
I can check, let me confirm soon. Thanks!
Wow, down to 10 seconds, that's fantastic! Interestingly, AMD64 is still ~23 seconds, but ARM64 is down to 10 seconds.
@polite-fountain-28010 ☝️ Thanks!
That's great to hear!
Hey @future-coat-9017, we just made another release, which should once again give a performance boost for larger projects. I'm wondering if you could try that one out and give us feedback on the impact on your project.