Hello. I tried semble v0.4.0 on a large C++ codebase (~135k files, ~2.5 GB of source code), but indexing crashed (I suspect memory exhaustion) and no index cache was written at all. The cache does not appear to be built incrementally, which makes it impossible to use semble on a codebase of this size.
Command run to warm up the index:
semble search "test" /workspace --top-k 1
After 90 minutes it crashed with just a "Killed" message (from what I found searching around, this is probably the Python process being killed for running out of memory).
During the run, the warning "Recursion depth exceeded in chunk." appeared a few times (but judging from the source code, this seems to be a handled case).
The critical problem is that no partial index cache was written. There was also no output about indexing progress, so it's impossible to tell how much RAM was used or how much would be needed for the task to complete.
Ideally, building the cache incrementally would solve this problem, since indexing could then resume after a crash. But I think this also requires another improvement for everything to work properly: a search request should not load the entire index into memory (otherwise the OOM error would come back).
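
To illustrate what I mean by an incremental build, here is a minimal sketch in Python. It is not semble's actual code: `build_index_incrementally`, the `chunk_file` callback, and the SQLite schema are all hypothetical names I made up for the example, and I am assuming the chunks of each file can be written independently of the others. The idea is simply to commit after every batch of files, so that a crash loses at most one batch and the next run skips what is already done.

```python
import sqlite3
from typing import Callable, Iterable


def build_index_incrementally(
    files: Iterable[str],
    db_path: str,
    chunk_file: Callable[[str], Iterable[str]],  # hypothetical chunker callback
    batch_size: int = 500,
) -> int:
    """Index files in batches and return how many files this run processed.

    Progress is committed to disk after every batch, so a crash loses at most
    one unfinished batch and a later run resumes where this one stopped.
    """
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical on-disk layout: the chunks plus a list of finished files.
        conn.execute("CREATE TABLE IF NOT EXISTS chunks (path TEXT, chunk TEXT)")
        conn.execute("CREATE TABLE IF NOT EXISTS done (path TEXT PRIMARY KEY)")
        conn.commit()

        # Files completed by a previous (possibly crashed) run are skipped.
        done = {row[0] for row in conn.execute("SELECT path FROM done")}

        processed = 0
        pending = 0
        for path in files:
            if path in done:
                continue
            for chunk in chunk_file(path):
                conn.execute("INSERT INTO chunks VALUES (?, ?)", (path, chunk))
            conn.execute("INSERT INTO done VALUES (?)", (path,))
            processed += 1
            pending += 1
            if pending >= batch_size:
                conn.commit()  # checkpoint: this batch now survives a crash
                pending = 0

        conn.commit()  # flush the final partial batch
        return processed
    finally:
        # Closing without a commit discards the unfinished batch, so the
        # "chunks" and "done" tables always stay consistent with each other.
        conn.close()
```

With something along these lines, re-running the same command after a crash would only process the files that are not yet marked as done. For the second point, a search could then query the on-disk store instead of loading the whole index into memory, but that depends on how semble ranks results, so I have left it out of the sketch.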