perf(redis-lua): pool *lua.LState to cut GC pressure#580
perf(redis-lua): pool *lua.LState to cut GC pressure#580
Conversation
EVAL / EVALSHA previously minted a fresh *lua.LState per call.
Production heap profiles showed ~34% of in-use heap in
gopher-lua allocs (newFuncContext 11.8%, newRegistry 11.3%,
newFunctionProto 10.5%), driven by BullMQ-style ~10 scripts/s
workloads. Each Eval allocating a VM is pure waste.
This change introduces a sync.Pool-backed *lua.LState pool.
Pooled states are pre-initialised once with base libs + nil-ed
dangerous loaders + redis/cjson/cmsgpack modules. The redis
module's call/pcall closures resolve the per-eval
*luaScriptContext out of a global binding map (set on acquire,
cleared on release) instead of capturing it, so the state does
not need to be rewired with fresh closures every eval -- which
is what makes pooling both safe and cheap.
Reset logic (security invariant): on release the pool
1) walks the state's global table and deletes any key not
present in the snapshot captured at pool-fill time -- this
removes user-introduced globals such as KEYS, ARGV, or a
leaking GLOBAL_LEAK,
2 restores every snapshot key to its original value, so a
script that rebinds allowed globals (redis = nil,
string = {upper = evil}) cannot poison the next user, and
3) truncates the value stack and unbinds the script context
so stale redis.call invocations cannot fire.
No script from evaluator A can observe or influence state in
evaluator B's run. Tests:
- TestLua_VMReuseDoesNotLeakGlobals covers the
load-bearing invariant for GLOBAL_LEAK-style cases.
- TestLua_VMReuseRestoresRebindsWhitelistedGlobals covers
sabotage of whitelisted globals.
- TestLua_PoolRecordsReuseVsAllocation proves the pool is
actually being used via a hit counter.
- TestRedis_LuaPoolNoGlobalLeakEndToEnd drives the full
EVAL path on a live server.
BenchmarkLuaState_NewVsPooled (darwin/arm64, Apple M1 Max):
new_state_per_call 59255 ns/op 194417 B/op 543 allocs/op
pooled_state 13306 ns/op 39700 B/op 102 allocs/op
-> ~5x fewer allocs, ~5x less B/op, ~4.5x faster per eval.
EOF
)
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 53 minutes and 0 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a luaStatePool to reuse *lua.LState instances, significantly reducing heap and GC pressure for high-rate Lua script workloads. The implementation includes a mechanism to snapshot and reset global variables to ensure isolation between script executions. Feedback highlights several critical areas: the current reset logic only handles string-keyed globals and fails to revert mutations within standard library tables, potentially leading to state leakage or environment poisoning. Additionally, there are potential nil pointer dereferences in redis.call and redis.pcall when the script context is missing, and the use of a global mutex for context lookup may become a performance bottleneck under high concurrency.
| // not attempt to deep-clone them; if a script mutates | ||
| // `table.insert = nil`, the restore re-binds the original function | ||
| // value which is still alive. | ||
| func snapshotGlobals(state *lua.LState) map[string]lua.LValue { |
There was a problem hiding this comment.
snapshotGlobals only captures string-keyed globals. Lua allows any value type (except nil and NaN) as a key in the global table _G. A script could intentionally or accidentally leak state by setting a non-string key (e.g., _G[42] = "secret"), which would bypass the current reset logic and persist across pooled state reuses.
References
- Avoid silently dropping entries during normalization or state capture to ensure state consistency and prevent data leaks.
| // the whitelisted ones, then truncates the value stack. It is the | ||
| // heart of the security invariant: anything the script did to globals | ||
| // must not be observable by the next user. | ||
| func (p *pooledLuaState) reset() { |
There was a problem hiding this comment.
The reset logic is vulnerable to table poisoning. While it restores global variable references (e.g., the string table itself), it does not undo mutations to the fields within those tables (e.g., string.upper = function() return "pwned" end).
Since standard library tables are mutable in gopher-lua and the snapshot only holds a reference to the table object, any mutation to the table's contents will persist and poison subsequent script executions. The test TestLua_VMReuseRestoresRebindsWhitelistedGlobals only verifies that rebinding the global name string is handled, but it does not check for mutation of the original table's fields.
| if !ok { | ||
| scriptState.RaiseError("redis.call invoked without an active script context") | ||
| return 0 | ||
| } |
There was a problem hiding this comment.
If luaBindContext is called with a nil context (as seen in the benchmarks), luaLookupContext will return nil, true. The current check if !ok will pass, leading to a nil pointer dereference when luaRedisCommand attempts to use the context. The check should explicitly verify that the context is non-nil.
| if !ok { | |
| scriptState.RaiseError("redis.call invoked without an active script context") | |
| return 0 | |
| } | |
| if !ok || ctx == nil { | |
| scriptState.RaiseError("redis.call invoked without an active script context") | |
| return 0 | |
| } |
References
- When using context.Context for managing operations, ensure it is correctly propagated and validated to prevent runtime errors.
| if !ok { | ||
| scriptState.Push(luaErrorTable(scriptState, "redis.pcall invoked without an active script context")) | ||
| return 1 | ||
| } |
There was a problem hiding this comment.
Similar to redis.call, redis.pcall does not check if the resolved context is nil. If luaLookupContext returns nil, true, this will result in a nil pointer dereference in luaRedisCommand.
| if !ok { | |
| scriptState.Push(luaErrorTable(scriptState, "redis.pcall invoked without an active script context")) | |
| return 1 | |
| } | |
| if !ok || ctx == nil { | |
| scriptState.Push(luaErrorTable(scriptState, "redis.pcall invoked without an active script context")) | |
| return 1 | |
| } |
References
- When using context.Context for managing operations, ensure it is correctly propagated and validated to prevent runtime errors.
| var luaStateBindings = struct { | ||
| sync.RWMutex | ||
| m map[*lua.LState]*luaScriptContext | ||
| }{m: map[*lua.LState]*luaScriptContext{}} |
There was a problem hiding this comment.
Using a global map protected by a single sync.RWMutex for luaStateBindings will likely become a performance bottleneck under high concurrency. Every redis.call and redis.pcall across all concurrent scripts must acquire an RLock on this global mutex to resolve the context.
Consider storing the *luaScriptContext directly within the *lua.LState's registry (using lua.RegistryIndex) when the state is acquired. This localizes the context lookup to the specific Lua state and eliminates the need for global locking.
References
- For frequently accessed fields, prefer mechanisms that avoid global mutex contention to improve performance on hot paths.
|
|
||
| // Collect all current string keys first; mutating the table in | ||
| // ForEach is unsafe. | ||
| currentKeys := make([]string, 0, len(p.globalsSnapshot)+luaResetKeySlack) |
There was a problem hiding this comment.
The currentKeys slice is allocated on every call to reset. To further reduce GC pressure (the primary goal of this PR), consider storing this slice within the pooledLuaState struct and reusing it (after clearing it) across resets.
References
- Pre-allocating or reusing buffers is acceptable to manage memory bounds and reduce allocation overhead.
Addresses three findings from the gemini-code-assist review on #580. 1. Non-string-keyed globals survived reset. The old snapshot only captured string keys, so a script doing `_G[42] = "secret"` or `_G[true] = "bad"` leaked across pool reuse. Snapshot now uses an LValue-keyed map and reset iterates every key type. Switched to RawSet (not RawSetH) because gopher-lua stores integer keys in an internal array slice that RawSetH does not touch. 2. Table poisoning survived reset. Restoring the `string` / `math` / `redis` / etc. GLOBAL references did not undo in-place mutations of those tables' fields (e.g. `string.upper = function() return "pwned" end`). We now snapshot every whitelisted nested table's shallow field set at pool init and restore it on reset (Option A / B from the review), keyed by *LTable pointer so the fix still applies even if a script rebinds the outer global name. 3. `redis.call` / `redis.pcall` nil-deref. A nil *luaScriptContext stored via luaBindContext(nil) -- which the bench path does -- produced a (nil, true) lookup, then panicked inside luaRedisCommand. Guards now explicitly check `!ok || ctx == nil`. New regression tests (all pass under -race -count=5): - TestLua_VMReuseNonStringGlobalKeysAreWiped - TestLua_VMReuseDoesNotPoisonStringLib - TestLua_VMReuseDoesNotPoisonRedisModule - TestLua_PoolNilContextProducesErrorNotPanic
|
Addressed the three flagged findings (2x SECURITY-HIGH + 1x HIGH) in c56a7f0. Picked Option B for the table-poisoning fix (snapshot every whitelisted nested table's shallow field set at init, restore on reset) over Option C because the fix is tractable, the reset cost is O(fields) per table -- tiny in practice -- and shipping the pool off by default would forfeit the 4-5x heap win the PR is about. 1. Non-string-keyed globals leaking (SECURITY-HIGH, line 171)
2. Table poisoning surviving reset (SECURITY-HIGH, line 186)
3. Nil context nil-pointer deref (HIGH, lines 266 / 274)
Verification
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a Lua state pool to optimize Redis EVAL operations by reusing Lua virtual machines, significantly reducing heap allocations and GC pressure. The implementation includes a security mechanism to snapshot and restore the Lua environment, preventing state leakage between executions. Reviewer feedback identifies a potential concurrency bottleneck in the global context mapping and recommends using the Lua registry for thread-local storage, alongside suggestions to use reusable slices during the environment reset to further minimize memory overhead.
| var luaStateBindings = struct { | ||
| sync.RWMutex | ||
| m map[*lua.LState]*luaScriptContext | ||
| }{m: map[*lua.LState]*luaScriptContext{}} | ||
|
|
||
| func luaBindContext(state *lua.LState, ctx *luaScriptContext) { | ||
| luaStateBindings.Lock() | ||
| luaStateBindings.m[state] = ctx | ||
| luaStateBindings.Unlock() | ||
| } | ||
|
|
||
| func luaUnbindContext(state *lua.LState) { | ||
| luaStateBindings.Lock() | ||
| delete(luaStateBindings.m, state) | ||
| luaStateBindings.Unlock() | ||
| } | ||
|
|
||
| func luaLookupContext(state *lua.LState) (*luaScriptContext, bool) { | ||
| luaStateBindings.RLock() | ||
| ctx, ok := luaStateBindings.m[state] | ||
| luaStateBindings.RUnlock() | ||
| return ctx, ok | ||
| } |
There was a problem hiding this comment.
The use of a global map with a sync.RWMutex for mapping *lua.LState to *luaScriptContext introduces a global synchronization point that will become a bottleneck under high concurrency. Every redis.call or redis.pcall invocation inside a Lua script performs an RLock on this global mutex, which can lead to significant contention.
A more efficient and scalable approach is to store the *luaScriptContext directly within the *lua.LState using the Lua registry (lua.RegistryIndex). This is local to the state and requires no synchronization.
Suggested implementation pattern:
- In newPooledLuaState, pre-allocate an *lua.LUserData in the registry: state.RawSet(lua.RegistryIndex, lua.LString("elastickv_ctx"), state.NewUserData(nil)).
- In luaBindContext, simply update the Value of that pre-allocated LUserData.
- In luaLookupContext, read the LUserData from the registry and return its Value.
This also improves memory safety by ensuring the context reference is cleaned up when the LState is destroyed, avoiding potential leaks in a global map.
References
- For frequently accessed fields or hot paths, prefer mechanisms that avoid global mutex contention (like local storage or atomic operations) to improve performance.
| type pooledLuaState struct { | ||
| state *lua.LState | ||
| // globalsSnapshot is a copy of every entry reachable via the | ||
| // state's global table at init, keyed by LValue (not just string) | ||
| // so scripts cannot smuggle state across evals via non-string | ||
| // keys such as _G[42] = "secret". | ||
| globalsSnapshot map[lua.LValue]lua.LValue | ||
| // tableSnapshots holds the shallow field sets of well-known | ||
| // whitelisted tables (string, math, table, redis, cjson, | ||
| // cmsgpack) captured at init. On reset we restore each to its | ||
| // original contents so a script doing e.g. | ||
| // `string.upper = function() return "pwned" end` cannot poison | ||
| // subsequent pooled reuses. | ||
| // | ||
| // The outer map is keyed by the *LTable pointer of the parent | ||
| // (e.g. the `string` table) so tableSnapshots survives even if a | ||
| // script rebinds the global name (`string = nil`) -- the reset | ||
| // restores the global name first, then restores the table's | ||
| // internal contents from this snapshot. | ||
| tableSnapshots map[*lua.LTable]map[lua.LValue]lua.LValue | ||
| } |
There was a problem hiding this comment.
To further reduce GC pressure and achieve zero-allocation cleanup, consider adding a scratchKeys []lua.LValue field to the pooledLuaState struct. This slice can be reused in reset and resetTableContents to collect keys before mutation, avoiding the current per-execution allocations. To prevent unbounded memory growth, ensure this collection remains within reasonable bounds.
type pooledLuaState struct {
state *lua.LState
// globalsSnapshot is a copy of every entry reachable via the
// state's global table at init, keyed by LValue (not just string)
// so scripts cannot smuggle state across evals via non-string
// keys such as _G[42] = "secret".
globalsSnapshot map[lua.LValue]lua.LValue
// tableSnapshots holds the shallow field sets of well-known
// whitelisted tables (string, math, table, redis, cjson,
// cmsgpack) captured at init. On reset we restore each to its
// original contents so a script doing e.g.
// string.upper = function() return "pwned" end cannot poison
// subsequent pooled reuses.
//
// The outer map is keyed by the *LTable pointer of the parent
// (e.g. the string table) so tableSnapshots survives even if a
// script rebinds the global name (string = nil) -- the reset
// restores the global name first, then restores the table's
// internal contents from this snapshot.
tableSnapshots map[*lua.LTable]map[lua.LValue]lua.LValue
// scratchKeys is a reusable slice for collecting table keys during
// reset to avoid allocations.
scratchKeys []lua.LValue
}References
- To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests.
| currentKeys := make([]lua.LValue, 0, len(p.globalsSnapshot)+luaResetKeySlack) | ||
| globals.ForEach(func(k, _ lua.LValue) { | ||
| currentKeys = append(currentKeys, k) | ||
| }) |
There was a problem hiding this comment.
Reuse the scratchKeys slice from pooledLuaState here to avoid allocating a new slice on every reset. The same pattern should be applied to resetTableContents at line 318.
| currentKeys := make([]lua.LValue, 0, len(p.globalsSnapshot)+luaResetKeySlack) | |
| globals.ForEach(func(k, _ lua.LValue) { | |
| currentKeys = append(currentKeys, k) | |
| }) | |
| currentKeys := p.scratchKeys[:0] | |
| globals.ForEach(func(k, _ lua.LValue) { | |
| currentKeys = append(currentKeys, k) | |
| }) | |
| p.scratchKeys = currentKeys |
Move the per-eval *luaScriptContext binding off the global luaStateBindings map (guarded by sync.RWMutex) and into a pre-allocated *lua.LUserData stashed in each pooled state's own Lua registry under a fixed key. redis.call / redis.pcall now resolve the context via a lock-free, state-local read, eliminating the global RLock contention point that would otherwise be hit on every redis.call (e.g. ~50 lookups/s/script under BullMQ fan-out). Also reuse pooledLuaState.scratchKeys across resets so resetPooledLuaState / resetTableContents no longer allocate a fresh []lua.LValue per call. The retained backing array is dropped if it grows past luaScratchKeysMaxCap (1024) to guard against pathological scripts pinning unbounded memory on a pooled state. Adds regression tests: - TestLua_PoolConcurrentContextIsolation: 64 goroutines x 100 lookups must each observe their own *luaScriptContext (never another goroutine's, proved under -race -count=5. - TestLua_PoolContextIsRegistryBacked: pins the contract that the binding lives in the state registry, not a package-level map. - TestLua_PoolScratchKeysReused: asserts reuse + max-cap bound. - BenchmarkLuaLookupContext_Concurrent: measures concurrent lookup cost (lock-free after this change). EOF )
Addressing Gemini's 2026-04-21 review — commit
|
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
There was a problem hiding this comment.
Pull request overview
Introduces a pooled *lua.LState implementation for Redis EVAL/EVALSHA to reduce allocation rate and GC pressure by reusing initialized Lua VMs while enforcing a “no cross-script state leakage” reset invariant.
Changes:
- Add
luaStatePool/pooledLuaStatewith global/table snapshotting, per-state registry-backed context binding, and reset logic. - Wire Redis script execution to acquire/release pooled Lua states instead of creating/closing a fresh VM per script.
- Add unit/integration tests and benchmarks validating reuse, isolation, and concurrency behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| adapter/redis_lua_pool.go | Implements the pooled Lua VM, snapshotting, context binding, and reset logic. |
| adapter/redis_lua_pool_test.go | Adds safety/concurrency regression tests and benchmarks for pooling behavior. |
| adapter/redis_lua.go | Switches EVAL execution to use the pool and removes the per-invocation Lua VM init path. |
| adapter/redis.go | Adds RedisServer fields for the Lua pool and initializes it in the constructor. |
| func (p *pooledLuaState) reset() { | ||
| globals := p.state.G.Global | ||
|
|
||
| // (1) Restore inner contents of every snapshotted whitelisted | ||
| // table. This defeats poisoning attacks like | ||
| // `string.upper = function() return "pwned" end`. | ||
| // | ||
| // resetTableContents borrows p.scratchKeys as a working slice. | ||
| // We pass it in and receive the (possibly grown) backing array | ||
| // back so successive calls within the same reset share one | ||
| // allocation. | ||
| scratch := p.scratchKeys[:0] | ||
| for tbl, originalFields := range p.tableSnapshots { | ||
| scratch = resetTableContents(tbl, originalFields, scratch[:0]) | ||
| } | ||
|
|
||
| // (2) Collect all current global keys (of any type). Mutating | ||
| // the table inside ForEach is unsafe, so snapshot keys first. | ||
| scratch = scratch[:0] | ||
| globals.ForEach(func(k, _ lua.LValue) { | ||
| scratch = append(scratch, k) | ||
| }) | ||
|
|
||
| // Delete any key not in the init-time snapshot: these are | ||
| // user-introduced globals (KEYS, ARGV, GLOBAL_LEAK, _G[42], | ||
| // _G[true], ...). | ||
| // | ||
| // We use RawSet (not RawSetH) because gopher-lua stores integer | ||
| // keys in an internal `array` slice rather than `dict`; RawSetH | ||
| // only touches `dict`, so a call like RawSetH(LNumber(42), LNil) | ||
| // leaves the array entry intact. RawSet dispatches to the right | ||
| // storage by key type. | ||
| for _, k := range scratch { | ||
| if _, keep := p.globalsSnapshot[k]; !keep { | ||
| globals.RawSet(k, lua.LNil) | ||
| } | ||
| } | ||
|
|
||
| // (3) Restore every whitelisted global to its original value. | ||
| // This covers the case where a script rebinds an allowed global | ||
| // (e.g. `redis = something`) -- we simply put the original back. | ||
| for k, v := range p.globalsSnapshot { | ||
| globals.RawSet(k, v) | ||
| } | ||
|
|
There was a problem hiding this comment.
The reset logic restores global/table entries but does not restore table metatables. A script can call setmetatable(_G, ...) or setmetatable(string, ...) (base lib exposes setmetatable) and leak/poison behavior across pooled evals without changing any table fields, breaking the stated isolation invariant. Consider snapshotting metatables at pool-fill time (for _G and each table in tableSnapshots) and restoring them during reset (or explicitly clearing to the init-time metatable) in addition to restoring table fields.
| // than string. Lua permits any non-nil, non-NaN value as a table key, | ||
| // so a script doing `_G[42] = "leak"` or `_G[true] = "bad"` bypasses a | ||
| // naive string-only snapshot/wipe. The LValue-keyed snapshot + the | ||
| // RawSetH-based reset in pool.reset must catch these. |
There was a problem hiding this comment.
This test comment says the pool reset is "RawSetH-based", but the implementation uses RawSet to correctly clear integer keys in the array part (see redis_lua_pool.go reset()). Updating the comment will avoid confusion when diagnosing regressions.
| // RawSetH-based reset in pool.reset must catch these. | |
| // RawSet-based reset in pool.reset must catch these. |
The Lua VM pool's reset logic snapshotted and restored _G entries plus the fields of whitelisted tables, but it did not touch metatables. Because setmetatable is in the base library, a script could install a poisoned __index on _G (or on string/math/table/redis/cjson/cmsgpack) and leak attacker-controlled behaviour into every subsequent pooled eval that dereferenced an undefined key. Capture each table's raw metatable at pool-fill time (LTable.Metatable, bypassing __metatable dispatch) and restore it on reset. When the original had no metatable, restore lua.LNil so any metatable the script installed is stripped. Also fixes the misleading "RawSetH-based" comment in TestLua_VMReuseNonStringGlobalKeysAreWiped -- the implementation uses RawSet so integer keys in gopher-lua's array part are actually cleared. Covered tables: _G, string, table, math, redis, cjson, cmsgpack. Regression tests TestLua_VMReuseDoesNotPoisonGlobalsMetatable and TestLua_VMReuseDoesNotPoisonStringMetatable verify that both attack vectors no longer leak across pool reuse, and both reproduce the vulnerability when the metatable-restore loop is stubbed out.
|
/gemini review Addressed Copilot review items (commit eacc395)1. Metatable snapshot/restore (security-relevant)
Blocks:
Regression tests:
Both were verified to fail without the fix (by stubbing out the 2. Test comment fixThe comment in |
There was a problem hiding this comment.
Code Review
This pull request introduces a Lua state pool to the Redis adapter to reduce heap and GC pressure by reusing *lua.LState instances during script execution. The implementation includes a robust reset mechanism that snapshots and restores globals, nested tables, and metatables to ensure security and isolation between scripts. Additionally, it replaces a mutex-guarded global map with a lock-free context lookup using the Lua registry to improve performance under high concurrency. One issue was identified in the hit/miss tracking logic, where the use of sync.Pool.New causes all acquisitions to be recorded as hits, even when a new state is allocated.
| pls, ok := p.pool.Get().(*pooledLuaState) | ||
| if !ok || pls == nil { | ||
| // New func never returns nil, but defend against misuse. | ||
| pls = newPooledLuaState() | ||
| p.misses.Add(1) | ||
| } else { | ||
| p.hits.Add(1) | ||
| } |
There was a problem hiding this comment.
The hit/miss tracking logic is currently inaccurate because sync.Pool.New is set in newLuaStatePool. When p.pool.Get() is called, if the pool is empty, it automatically invokes the New function and returns a fresh state. Consequently, ok will always be true and pls will never be nil, causing every acquisition to be recorded as a 'hit' even when a new allocation occurs. To accurately track reuse, you should either remove the New function from the pool and handle allocation manually in get, or use a flag within pooledLuaState to distinguish fresh instances.
Summary
*lua.LStateacross EVAL / EVALSHA invocations, reusing the expensive base-lib plus redis/cjson/cmsgpack initialisation instead of minting a fresh VM every call.newFuncContext11.8%,newRegistry11.3%,newFunctionProto10.5%); a BullMQ-style workload (~10 scripts/s) is exactly the shape that this wasted.Security invariant
No script state must leak from one evaluator to the next. On release, the pool:
KEYS,ARGV, or a leakingGLOBAL_LEAK.redis = nil,string = {upper = evil}) cannot poison the next user.*luaScriptContextso staleredis.call/redis.pcallinvocations cannot fire.The pre-registered
redis.call/redis.pcallclosures dispatch the per-eval context via a*lua.LStateto*luaScriptContextbinding map (set on acquire, cleared on release), which is what makes pooling safe without re-registering closures every eval.Benchmark
BenchmarkLuaState_NewVsPooled(darwin/arm64, Apple M1 Max, go 1.26):Rough extrapolation to the 34% heap slice cited in the profile: the pooled path allocates roughly 1/5 the bytes per script, so that slice should compress to ~7-8% of in-use heap in steady state (actual number depends on script mix and sync.Pool eviction rate under GC).
Tests added
TestLua_VMReuseDoesNotLeakGlobals-- the load-bearing leak test. Script A setsGLOBAL_LEAK = 42plus aLEAKY_TABLE, script B asserts both arenilon a recycled VM.TestLua_VMReuseRestoresRebindsWhitelistedGlobals-- sabotagesredisandstring.upperon the pooled state and confirms they are restored.TestLua_PoolRecordsReuseVsAllocation-- proves via the hit counter that the pool actually hands back existing VMs.TestRedis_LuaPoolNoGlobalLeakEndToEnd-- drives the full EVAL path on a realRedisServerand confirms cross-script leakage cannot happen through the user-facing protocol.Test plan
go test -race -short ./adapter/...(green locally, ~60s)golangci-lint --config=.golangci.yaml run ./adapter/...-> 0 issuesgo test -run='^$' -bench=BenchmarkLuaState_NewVsPooled -benchmem ./adapter/shows ~5x reduction in B/op and allocs/op