251 memory reins by cubap · Pull Request #262 · CenterForDigitalHumanities/rerum_server_nodejs

cubap · 2026-04-02T20:53:03Z

fixes #251

This pull request refactors several controller files to improve code maintainability and consistency. The main changes involve replacing manual object cloning with a utility function and standardizing pagination logic across multiple endpoints. These updates help reduce code duplication, make the codebase easier to maintain, and ensure more robust handling of request parameters.

Key changes include:

Object Cloning Consistency:

Replaced all instances of JSON.parse(JSON.stringify(...)) used for deep cloning objects with the centralized utils.cloneObject function in CRUD, patch, and update controller files. This change improves code clarity and avoids potential issues with JSON serialization. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18]

Pagination Handling Standardization:

Introduced and utilized a new getPagination utility function to extract limit and skip parameters with default values, replacing repeated manual parsing logic in CRUD, search, history, and Gallery of Glosses controller endpoints. This ensures consistent pagination behavior and reduces code duplication. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12]

Let me know if you have questions about how to use the new utility functions or why these refactors were made!

Introduce cloneObject in utils.js that uses globalThis.structuredClone when available and falls back to JSON.parse(JSON.stringify(...)). Replace direct JSON cloning in configureRerumOptions for configuredObject and received_options with cloneObject, and export cloneObject. This centralizes cloning logic and allows better deep-clone behavior when structuredClone is supported.

Introduce MAX_QUERY_LIMIT and MAX_QUERY_SKIP and add clampNonNegativeInt and getPagination to safely parse and clamp limit/skip query params (with sensible defaults and env overrides). Replace ad-hoc deep copies (JSON.parse(JSON.stringify(...))) with utils.cloneObject in idNegotiation and getAllVersions to avoid issues with prototype loss and improve clarity. Export getPagination for reuse.

Replace repeated JSON.parse(JSON.stringify(...)) deep-clones with utils.cloneObject across controllers (crud, patchSet, patchUnset, patchUpdate, putUpdate, gog) and add/get getPagination to standardize parsing of limit/skip in search, gog and crud. Adjust imports accordingly to centralize cloning and pagination logic, improving readability and consistency.

thehabes

A bug or two and some cleanup to do.

Static Review Comments

Branch: 251-memory-reins
Review Date: 2026-04-06
Reviewer: Pair Static Review - Claude & @thehabes

Claude and Bryan make mistakes. Verify all issues and suggestions. Avoid unnecessary scope creep.

Category	Issues Found
🔴 Critical	1
🟠 Major	1
🟡 Minor	4
🔵 Suggestions	1

Critical Issues 🔴

Issue 1: `limit=0` bypasses pagination protection entirely

File: controllers/utils.js:17
Category: Logic error

Problem:
clampNonNegativeInt(0, 100, 500) returns 0 because the guard parsed < 0 does not catch zero. When MongoDB receives .limit(0), it interprets it as "no limit" and returns all matching documents — completely defeating the memory protection this PR is designed to add.

Confirmed by testing: POST /api/query?limit=0 returned all 1388 matching objects instead of the default 100.

This is introduced by this PR — the old inline parseInt(req.query.limit ?? 100) also had this issue, but the new getPagination function was specifically designed to add protection and should close this gap.

Current Code:

function clampNonNegativeInt(value, fallback, max) {
    const parsed = Number.parseInt(value, 10)
    if (!Number.isFinite(parsed) || parsed < 0) return fallback
    return parsed > max ? max : parsed
}

Suggested Fix:

function clampNonNegativeInt(value, fallback, max) {
    const parsed = Number.parseInt(value, 10)
    if (!Number.isFinite(parsed) || parsed <= 0) return fallback
    return parsed > max ? max : parsed
}

How to Verify:

curl -X POST http://localhost:3001/v1/api/query?limit=0 \
  -H "Content-Type: application/json" \
  -d '{"__rerum.APIversion":"1.1.0"}' | jq 'length'

Should return 100 (the default), not all matching documents.

Major Issues 🟠

Issue 2: Unmigrated `JSON.parse(JSON.stringify(...))` spots

Category: Completeness

The following files still use the old cloning pattern and were not migrated in this PR:

controllers/delete.js — 6 occurrences (lines 27, 40, 64, 128, 161, 200)
controllers/overwrite.js — 2 occurrences (lines 22, 96)
controllers/release.js — 2 occurrences (lines 47, 111)
controllers/gog.js — 1 occurrence in expand() (line 390)
controllers/utils.js — 4 occurrences in tree healing functions (lines 320, 338, 380, 408)
controllers/bulk.js — 2 occurrences (lines 36, 123)

Minor Issues 🟡

Issue 3: `cloneObject` JSDoc belongs to the wrong function

File: utils.js:28
Category: Documentation / misleading code

Problem:
The cloneObject function was inserted between the JSDoc block for configureRerumOptions and the function itself. The JSDoc at line 11-27 describes configureRerumOptions (parameters generator, received, update, extUpdate; returns configuredObject) but now visually appears to document cloneObject. This will mislead developers and IDE tooltips.

Current Code:

 * @param update A trigger for special handling from update actions
 * @return configuredObject The same object that was recieved but with the proper __rerum options.
 */
const cloneObject = function(value){
    if(typeof globalThis.structuredClone === "function"){
        return globalThis.structuredClone(value)
    }
    return JSON.parse(JSON.stringify(value))
}

const configureRerumOptions = function(generator, received, update, extUpdate){

Suggested Fix:
Move cloneObject above the configureRerumOptions JSDoc block, and give it its own JSDoc:

/**
 * Deep clone a value using structuredClone when available,
 * falling back to JSON round-trip serialization.
 *
 * @param {*} value - The value to clone
 * @returns {*} A deep clone of the input value
 */
const cloneObject = function(value){
    if(typeof globalThis.structuredClone === "function"){
        return globalThis.structuredClone(value)
    }
    return JSON.parse(JSON.stringify(value))
}

/**
 * Add the __rerum properties object to a given JSONObject...
 *
 * @param received A potentially optionless JSONObject...
 * @param update A trigger for special handling from update actions
 * @return configuredObject The same object that was recieved but with the proper __rerum options.
 */
const configureRerumOptions = function(generator, received, update, extUpdate){

How to Verify:
Read the file — ensure each JSDoc block sits directly above the function it documents.

Issue 4: `getPagination` and `clampNonNegativeInt` lack JSDoc

File: controllers/utils.js:15-28
Category: Code hygiene

Problem:
Per project conventions (CLAUDE.md: "Use JDoc style for code documentation. Cleanup, fix, or generate documentation for the code you work on as you encounter it"), new functions should have JSDoc.

Suggested Fix:

/**
 * Parse an integer value, returning a fallback for non-finite or non-positive
 * values and clamping to a maximum.
 *
 * @param {*} value - The value to parse
 * @param {number} fallback - Default if value is not a valid positive integer
 * @param {number} max - Upper bound to clamp to
 * @returns {number} Clamped non-negative integer
 */
function clampNonNegativeInt(value, fallback, max) { ... }

/**
 * Extract and validate pagination parameters from a query object.
 * Values are clamped between 1 and the configured maximums (env vars
 * RERUM_MAX_QUERY_LIMIT and RERUM_MAX_QUERY_SKIP).
 *
 * @param {Object} [query={}] - Express req.query object
 * @param {number} [defaultLimit=100] - Default limit when none provided
 * @returns {{limit: number, skip: number}} Validated pagination parameters
 */
function getPagination(query = {}, defaultLimit = 100) { ... }

Issue 5: Pre-existing HEAD `Content-Length` bug exposed by new pagination code

File: controllers/history.js:119-121
Category: Pre-existing bug (not introduced by this PR)

Problem:
queryHeadRequest computes Content-Length from the serialized JSON, then calls res.sendStatus(200) which sends the body "OK" (length 2), overwriting the previously set header. Every HEAD response returns Content-Length: 2 regardless of actual payload size.

This bug exists on main and is not introduced by this PR, but the new pagination makes this endpoint more usable, so the incorrect header is more likely to be relied upon by consumers.

Current Code:

const size = Buffer.byteLength(JSON.stringify(matches))
res.set("Content-Length", size)
res.sendStatus(200)

Suggested Fix (for a follow-up PR):

const size = Buffer.byteLength(JSON.stringify(matches))
res.set("Content-Length", size)
res.status(200).end()

Issue 6: `structuredClone` fallback is dead code

File: utils.js:29-30
Category: Dead code

Problem:
The project requires Node >= 24.14.0 (package.json engines). structuredClone has been a global since Node 17.0. The typeof globalThis.structuredClone === "function" check and the JSON.parse(JSON.stringify(value)) fallback branch will never execute.

Not harmful, but adds unnecessary complexity. Consider simplifying in a follow-up if you prefer clean code over defensive coding.

Suggestions 🔵

Suggestion 1: Consider `countDocuments` for HEAD requests instead of `toArray`

File: controllers/history.js:117
Category: Performance

The queryHeadRequest still calls .toArray(), materializing all matching documents in memory just to compute a Content-Length. Since the response body is never sent (it's a HEAD request), consider using db.countDocuments(props) or db.find(props).limit(limit).skip(skip).count() to avoid loading documents entirely. This would further reduce memory pressure for this endpoint.

This is beyond the scope of this PR but would complement the memory improvements nicely.

If there are significant code changes in response to this review please test those changes. Run the application manually and test or perform internal application tests when applicable.

cubap added 4 commits April 2, 2026 15:24

Update history.js

f777bcd

cubap requested a review from thehabes as a code owner April 2, 2026 20:53

Merge main into this branch to deploy to dev for testing.

b6db442

thehabes requested changes Apr 6, 2026

View reviewed changes

thehabes assigned cubap Apr 6, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

251 memory reins#262

251 memory reins#262
cubap wants to merge 5 commits intomainfrom
251-memory-reins

cubap commented Apr 2, 2026

Uh oh!

thehabes left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

cubap commented Apr 2, 2026

Uh oh!

thehabes left a comment

Choose a reason for hiding this comment

Static Review Comments

Critical Issues 🔴

Issue 1: limit=0 bypasses pagination protection entirely

Major Issues 🟠

Issue 2: Unmigrated JSON.parse(JSON.stringify(...)) spots

Minor Issues 🟡

Issue 3: cloneObject JSDoc belongs to the wrong function

Issue 4: getPagination and clampNonNegativeInt lack JSDoc

Issue 5: Pre-existing HEAD Content-Length bug exposed by new pagination code

Issue 6: structuredClone fallback is dead code

Suggestions 🔵

Suggestion 1: Consider countDocuments for HEAD requests instead of toArray

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Issue 1: `limit=0` bypasses pagination protection entirely

Issue 2: Unmigrated `JSON.parse(JSON.stringify(...))` spots

Issue 3: `cloneObject` JSDoc belongs to the wrong function

Issue 4: `getPagination` and `clampNonNegativeInt` lack JSDoc

Issue 5: Pre-existing HEAD `Content-Length` bug exposed by new pagination code

Issue 6: `structuredClone` fallback is dead code

Suggestion 1: Consider `countDocuments` for HEAD requests instead of `toArray`