Graph takes 30+ seconds to run
# 🌱|help-and-getting-started
Hey all, trying to understand what's happening under the hood and whether we can do something to improve the situation. We have about 240 Garden modules (the actual number of distinct modules is closer to 100, but due to different templates we end up with that many modules needing to be resolved by Garden on startup). We noticed graph resolution is taking very long, around 30+ seconds:
```
[2023-06-26T17:24:42.519Z] Resolving 240 modules...
[2023-06-26T17:25:14.026Z] Resolving 240 modules... → Done
```
This is using the Kubernetes provider. I've looked at the logs with debug logging on and noticed it's doing some FS scans, which take about a second; I assume building the DAG is also very quick. I wonder if most of this time is spent making Kubernetes API calls? Are those necessary, or is there an option to introduce a flag to skip them? Is there anything else I can do to improve this situation? It would be fantastic if this was under 5 seconds. Thanks!
Hi @future-coat-9017, I'm escalating this internally.
Our CEO writes: > This is basically a known issue, especially in 0.12.x. We are doing several things to improve this, and some of them are already in 0.13.x. One quite important thing to note is that using … files is broadly more efficient than using … in module configs. For repos with many files, and where possible to use, this is strongly encouraged.
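As a concrete illustration of tightening file inclusion, here's a minimal sketch of a Garden module config using an explicit `include` list (the module name and paths are hypothetical, not from this thread):

```yaml
# garden.yml — hypothetical module config.
# A narrow `include` list means Garden only has to scan/hash the files
# named here, instead of walking the whole module directory and then
# filtering with excludes.
kind: Module
type: kubernetes
name: my-service
include:
  - manifests/*.yaml
```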
We have a PR open that's likely to help once merged and released
@quaint-dress-831 That's awesome! Do you think we can also backport the fix to 0.12? We cannot upgrade to 0.13 yet due to the breaking changes, but being able to get this improvement with our current setup would be fantastic.
@brief-restaurant-63679 I'll keep you updated! We do recommend users migrate to Bonsai, eventually 😉
But I can promise to advocate for it if it's technically feasible. It may be that these improvements are technically locked to Bonsai
@future-coat-9017 just to note, we've heard graph resolution is up to 6x faster just from switching to Bonsai. This is due to changes in Bonsai that we can't backport.
Thanks, I also see the PR has been merged, so I'll definitely look into moving to Bonsai.
@quaint-dress-831 just tried out Bonsai with … and unfortunately I'm seeing almost 2x slower performance.
@future-coat-9017 Slow resolution times are mostly due to unfavourable file inclusion settings. Tinkering with your … files and … filters on the modules will probably increase the speed greatly.
@future-coat-9017 excludes are not working properly with the new git scan mode; see … . The workaround is to use `.gardenignore` files instead.
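For anyone following along, `.gardenignore` files use the same pattern syntax as `.gitignore`; a sketch of that workaround with purely hypothetical patterns:

```
# .gardenignore at the project root (gitignore-style syntax).
# Hypothetical example: keep Garden's scanner away from bulky
# directories instead of relying on module-level excludes.
node_modules/
dist/
*.log
```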
Hey @future-coat-9017, we just released a new version that once again includes some significant performance improvements for graph resolution. Maybe you could give it a try and report back whether it made things faster for you.
Hey there. I compared running `garden validate` with 0.12.48 vs. 0.13.13, and the results show about a 5-second improvement: from 31s in 0.12 to 25s in 0.13.13. We were hoping to reduce the time to under 10s; should we be seeing a greater improvement than that? We ran the command with … as well.
By the way, I've also removed all exclusions, and it seems to have improved things by about the same amount, ~5 seconds.
My assumption is that most of the time is simply due to the large number of modules it's trying to resolve in our case (~260), not git scanning specifically. It seems include/exclude/Bonsai have minimal effect in our case (around a ~5-second improvement).
… should be the fastest repo scan mode now for large repos, so I'd keep using that. What OS is this on? If you're on macOS, did you make sure you have the ARM build running? Also, could you share the debug logs? It would be interesting to see where it's actually spending its time. Previously, graph resolution was the most expensive part with large projects, so we focused on that first.
Good idea. Making sure I was using the ARM build reduced the time to 22 seconds, which is slightly better. Let me get the logs masked and I'll post them.
@polite-fountain-28010 can I close this, or is it still unresolved?
We have resolved some of the worst performance problems, but I think it's worth continuing to look into this. What I noticed are these log lines:
```
{"msg":"Resolving actions and modules...","section":"graph                →","timestamp":"2023-09-11T16:25:13.171Z","level":"info"}
{"msg":"Scanning repository at /garden","section":"graph                →","timestamp":"2023-09-11T16:25:20.627Z","level":"info"}
```
It seems there is quite a wait between resolving the modules and scanning the repo, which might be worth checking out. Maybe we could turn this into a GitHub issue so it's easier to track?
There's also the wait between these lines
```
{"msg":"Found 1 files in module path after glob matching","section":"graph [debug]        →","timestamp":"2023-09-11T16:25:21.883Z","level":"debug"}
{"msg":"Converting kubernetes module ***** to actions","section":"graph [debug]        →","timestamp":"2023-09-11T16:25:28.930Z","level":"debug"}
```
which seems rather long. Looking at outputs from my "large garden repo" project, I cannot reproduce that, and I don't think there should be anything that takes 7 seconds between those steps.
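To put a number on those gaps without eyeballing timestamps, one can diff consecutive log lines with a small script. This is just a sketch assuming the JSON-per-line log format shown above (the `msg` values here are abbreviated):

```python
import json
from datetime import datetime


def gap_seconds(line_a: str, line_b: str) -> float:
    """Return seconds elapsed between the timestamps of two JSON log lines."""
    def parse(line: str) -> datetime:
        # Timestamps look like "2023-09-11T16:25:21.883Z"
        return datetime.strptime(json.loads(line)["timestamp"],
                                 "%Y-%m-%dT%H:%M:%S.%fZ")
    return (parse(line_b) - parse(line_a)).total_seconds()


a = '{"msg":"Found 1 files after glob matching","timestamp":"2023-09-11T16:25:21.883Z","level":"debug"}'
b = '{"msg":"Converting kubernetes module to actions","timestamp":"2023-09-11T16:25:28.930Z","level":"debug"}'
print(gap_seconds(a, b))  # → 7.047
```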
We just released version 0.13.16, which includes a fix for a memory leak. I'd be curious to see whether that also has a positive impact on your performance, since it should now cache some calls that weren't being cached properly before.
I can check, let me confirm soon. Thanks!
Wow, down to 10 seconds, that's fantastic! Interestingly, AMD64 is still ~23 seconds, but ARM64 is down to 10 seconds.
@polite-fountain-28010 ☝️ Thanks!
That's great to hear!
Hey @future-coat-9017, we just made another release, which should once again give a performance boost for larger projects. I'm wondering if you could try that one out and give us feedback on the impact on your project.