feat(core): add VECTOR datatype support with Oracle vector integration#2
feat(core): add VECTOR datatype support with Oracle vector integration#2prajalg wants to merge 52 commits into
Conversation
… vector functions
…or distance function
…etc. validate using to only allow hnsw or ivf allowlist distance metrics before emitting WITH DISTANCE require numeric, finite, positive values for accuracy and vector index parameters quote dropConstraintQuery constraint names
…e-dataaccess-sequelize/sequelize-oracle into vector_support_v7
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR adds VECTOR data type support: core types and abstract base, Sequelize vector helper methods, Oracle dialect VECTOR implementation and index DDL/formatting, conditional index metadata selection, and extensive unit and integration tests. ChangesVector Data Type and Oracle Integration
Sequence Diagram(s)sequenceDiagram
participant User
participant Sequelize
participant Oracle
participant DB
User->>Sequelize: sequelize.vectorDistance(column, value)
Sequelize->>Sequelize: check dialect.supports.dataTypes.VECTOR
Sequelize->>Oracle: formatFn(VECTOR_DISTANCE(column, VECTOR(...)))
Oracle->>Oracle: validate args, format column, build VECTOR literal
Oracle->>DB: execute SELECT ... WHERE VECTOR_DISTANCE(...)
DB-->>Sequelize: rows
Sequelize-->>User: result set
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/core/src/abstract-dialect/data-types.ts`:
- Around line 2916-2929: The validate method currently only checks container
shape via _getVectorIterable; update validate (and the parallel block at
2934-2942) so that after obtaining the iterable from _getVectorIterable(value)
you iterate over its elements and call _validateVectorElement for each item, and
if any element fails throw via ValidationErrorItem.throwDataTypeValidationError
(keeping the existing util.format('%O is not a valid vector', value) or similar
contextual message); ensure this works for both plain arrays and typed-array
iterables returned by _getVectorIterable and that _validateVectorElement
enforces numeric finite values.
In `@packages/core/test/integration/dialects/oracle/vector.test.js`:
- Line 141: The test currently hardcodes the table name "Items" in the raw SQL
(`SELECT "id" FROM "Items" WHERE VECTOR_DISTANCE("embeddings", $queryVector) <
$threshold ORDER BY "id"`), which is brittle; replace that literal with the
model-derived table name (e.g., call Model.getTableName() or the test's
modelInstance.getTableName()) when building the SQL string so the test uses the
resolved table metadata (keep the rest of the VECTOR_DISTANCE clause and
parameter placeholders unchanged).
In `@packages/oracle/src/_internal/data-types-overrides.ts`:
- Around line 593-608: The sparse vector type guard is too permissive: update
isOracleSparseVectorInput (and the analogous guard at 616-637) to validate that
indices are either a Uint32Array or an Array of integers, reject wider typed
arrays, and that each index is an integer, >= 0 and < numDimensions; also
validate values elements are finite numbers (use Number.isFinite) and keep the
length equality check. Concretely, inside isOracleSparseVectorInput iterate over
indices and values, enforce Number.isInteger(idx) && idx >= 0 && idx <
numDimensions for each index, ensure indices is instance of Uint32Array or plain
Array (not Float64Array/Int32Array/etc.), and enforce Number.isFinite(val) for
each value; keep existing numDimensions integer/positive check and values.length
=== indices.length as before.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: ffee8fef-2d0f-4f14-a917-1e2dcc34e84e
📒 Files selected for processing (19)
packages/core/src/abstract-dialect/data-types.tspackages/core/src/abstract-dialect/dialect.tspackages/core/src/abstract-dialect/query-interface.d.tspackages/core/src/data-types.tspackages/core/src/sequelize.d.tspackages/core/src/sequelize.jspackages/core/test/integration/dialects/oracle/vector.test.jspackages/core/test/unit/data-types/string-types.test.tspackages/core/test/unit/dialects/oracle/query-generator.test.jspackages/core/test/unit/dialects/oracle/vector.test.jspackages/core/test/unit/query-generator/show-indexes-query.test.tspackages/core/test/unit/sequelize.test.tspackages/oracle/src/_internal/data-types-overrides.tspackages/oracle/src/dialect.tspackages/oracle/src/query-generator-typescript.internal.tspackages/oracle/src/query-generator.internal.tspackages/oracle/src/query-generator.jspackages/oracle/src/query.jstest/esm-named-exports.test.js
| function isOracleSparseVectorInput(value: unknown): value is OracleSparseVectorInput { | ||
| if (typeof value !== 'object' || value === null) { | ||
| return false; | ||
| } | ||
|
|
||
| const sparseVector = value as Partial<OracleSparseVectorInput>; | ||
| const { indices, numDimensions, values } = sparseVector; | ||
|
|
||
| return ( | ||
| isVectorComponent(values) && | ||
| isVectorComponent(indices) && | ||
| typeof numDimensions === 'number' && | ||
| Number.isInteger(numDimensions) && | ||
| numDimensions > 0 && | ||
| values.length === indices.length | ||
| ); |
There was a problem hiding this comment.
Harden sparse vector validation for indices and numeric values.
The sparse guard currently checks shape but not semantics: indices are not constrained to integers/range, and values are not checked for finite numbers. It also accepts index typed arrays wider than the declared Uint32Array contract.
Suggested fix
function isOracleSparseVectorInput(value: unknown): value is OracleSparseVectorInput {
if (typeof value !== 'object' || value === null) {
return false;
}
const sparseVector = value as Partial<OracleSparseVectorInput>;
const { indices, numDimensions, values } = sparseVector;
+ if (!(typeof numDimensions === 'number' && Number.isInteger(numDimensions) && numDimensions > 0)) {
+ return false;
+ }
+
+ const hasValidValues =
+ isVectorComponent(values) &&
+ Array.from(values).every(item => typeof item === 'number' && Number.isFinite(item));
+
+ const hasValidIndices =
+ (Array.isArray(indices) || indices instanceof Uint32Array) &&
+ Array.from(indices).every(index => Number.isInteger(index) && index >= 0 && index < numDimensions);
return (
- isVectorComponent(values) &&
- isVectorComponent(indices) &&
- typeof numDimensions === 'number' &&
- Number.isInteger(numDimensions) &&
- numDimensions > 0 &&
+ hasValidValues &&
+ hasValidIndices &&
values.length === indices.length
);
}Also applies to: 616-637
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/oracle/src/_internal/data-types-overrides.ts` around lines 593 -
608, The sparse vector type guard is too permissive: update
isOracleSparseVectorInput (and the analogous guard at 616-637) to validate that
indices are either a Uint32Array or an Array of integers, reject wider typed
arrays, and that each index is an integer, >= 0 and < numDimensions; also
validate values elements are finite numbers (use Number.isFinite) and keep the
length equality check. Concretely, inside isOracleSparseVectorInput iterate over
indices and values, enforce Number.isInteger(idx) && idx >= 0 && idx <
numDimensions for each index, ensure indices is instance of Uint32Array or plain
Array (not Float64Array/Int32Array/etc.), and enforce Number.isFinite(val) for
each value; keep existing numDimensions integer/positive check and values.length
=== indices.length as before.
Pull Request Checklist
Description of Changes
This PR adds
VECTORsupport with Oracle-first implementation, shared core abstractions, and full validation/test coverage for vector SQL generation paths.Core datatype & API groundwork
AbstractVECTORBase(shared behavior/validation)DataTypes.VECTORconstructor overloads:DataTypes.VECTORDataTypes.VECTOR(dimension)DataTypes.VECTOR(dimension, format)DataTypes.VECTOR({ dimension, format })cosineDistanceinnerProductl1Distancel2DistancevectorDistanceOracle integration
supports.dataTypes.VECTOR = true.SPARSE) and bind marshalling.usingdistanceaccuracyparametersorderTests
Guidance for other dialects
For future dialect integration, this PR establishes the recommended path:
supports.dataTypes.VECTOR.VECTORwhere generic option shape is sufficient, or extendAbstractVECTORBasefor dialect-specific option models.List of Breaking Changes
Summary by CodeRabbit
New Features
Tests