feat(document-api): implement doc.extract() for RAG content extraction (SD-2525) #2774
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7e4827d3ab
packages/super-editor/src/editors/v1/document-api-adapters/extract-adapter.ts
7e4827d to 5cb7735
…n (SD-2525) Single API method that returns all document content with stable IDs — blocks with full text, comments with anchored block references, and tracked changes with excerpts. Every ID works directly with scrollToElement() for citation navigation.
- Use canonical getHeadingLevel() instead of divergent local regex
- Reuse collectTopLevelBlocks() instead of duplicating block traversal
- Add required fields to extract output JSON schema
- Remove fixture-only unit tests that don't call executeExtract
- Add behavior tests: headings, comments, tracked changes, scrollToElement round-trip
beb1257 to 565e4a3
332999b to 10b1403
…-for-rag-pipelines
🎉 This PR is included in @superdoc-dev/react v1.0.0-next.38
🎉 This PR is included in esign v2.2.0-next.42
🎉 This PR is included in vscode-ext v1.1.0-next.84
🎉 This PR is included in template-builder v1.3.0-next.44
🎉 This PR is included in superdoc v1.24.0-next.81
🎉 This PR is included in superdoc-cli v0.5.0-next.82
🎉 This PR is included in superdoc-sdk v1.3.0-next.83
…n (SD-2525) (#2774)
* feat(document-api): implement doc.extract() for RAG content extraction (SD-2525)
  Single API method that returns all document content with stable IDs: blocks with full text, comments with anchored block references, and tracked changes with excerpts. Every ID works directly with scrollToElement() for citation navigation.
* fix(document-api): review fixes - heading regex, schema required, tests
  - Use canonical getHeadingLevel() instead of divergent local regex
  - Reuse collectTopLevelBlocks() instead of duplicating block traversal
  - Add required fields to extract output JSON schema
  - Remove fixture-only unit tests that don't call executeExtract
  - Add behavior tests: headings, comments, tracked changes, scrollToElement round-trip
* fix(tests): remove superdoc.click() - fixture uses type() for focus
* fix(cli): add extract operation hints for CLI/SDK wiring
🎉 This PR is included in vscode-ext v2.3.0-next.1
🎉 This PR is included in template-builder v1.5.0-next.1
🎉 This PR is included in esign v2.3.0-next.1
🎉 This PR is included in superdoc v1.26.0-next.1
🎉 This PR is included in superdoc-cli v0.7.0-next.1
Single API method that extracts all document content with stable IDs for RAG pipelines.
- editor.doc.extract() returns blocks with full text, comments with anchored block references, and tracked changes with excerpts
- Every ID works directly with scrollToElement() for citation navigation
- Blocks carry full text (not the textPreview from blocks.list)

Usage:
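A minimal sketch of how an extract result might be flattened into citable chunks for a RAG index. The result shape here (`blocks`, `comments`, `trackedChanges`, and the field names `id`, `text`, `blockId`, `excerpt`) is an assumption for illustration, not the schema shipped in this PR, and `toRagChunks` is a hypothetical helper. A hard-coded sample stands in for the value `editor.doc.extract()` would return, so the snippet runs without an editor instance.

```typescript
// Hypothetical result shape: field names are illustrative assumptions,
// not the published doc.extract() output schema.
interface ExtractedBlock {
  id: string;
  type: string;
  text: string;
}

interface ExtractedComment {
  id: string;
  text: string;
  blockId: string; // assumed name for the anchored block reference
}

interface ExtractedTrackedChange {
  id: string;
  type: string;
  excerpt: string;
}

interface ExtractResult {
  blocks: ExtractedBlock[];
  comments: ExtractedComment[];
  trackedChanges: ExtractedTrackedChange[];
}

// One retrievable unit; `id` is kept so a citation can point back at the document.
interface RagChunk {
  id: string;
  kind: "block" | "comment" | "trackedChange";
  text: string;
}

// Hypothetical helper: flatten the three collections into one chunk list,
// skipping blocks with no text because they add nothing to an index.
function toRagChunks(result: ExtractResult): RagChunk[] {
  const chunks: RagChunk[] = [];
  for (const block of result.blocks) {
    if (block.text.trim() !== "") {
      chunks.push({ id: block.id, kind: "block", text: block.text });
    }
  }
  for (const comment of result.comments) {
    chunks.push({ id: comment.id, kind: "comment", text: comment.text });
  }
  for (const change of result.trackedChanges) {
    chunks.push({ id: change.id, kind: "trackedChange", text: change.excerpt });
  }
  return chunks;
}

// Stand-in for the value returned by editor.doc.extract().
const sample: ExtractResult = {
  blocks: [
    { id: "block-1", type: "heading", text: "Payment terms" },
    { id: "block-2", type: "paragraph", text: "" },
    { id: "block-3", type: "paragraph", text: "Payment is due in 30 days." },
  ],
  comments: [{ id: "comment-1", text: "Confirm with legal.", blockId: "block-3" }],
  trackedChanges: [{ id: "tc-1", type: "insert", excerpt: "in 30 days" }],
};

const chunks = toRagChunks(sample);
console.log(chunks.map((chunk) => `${chunk.kind}:${chunk.id}`).join(", "));
```

In an application, `sample` would be replaced by the result of `editor.doc.extract()`, and the `id` of a retrieved chunk would be handed to `scrollToElement()` to navigate to the cited content. Whether `extract()` is synchronous and which object exposes `scrollToElement()` are not stated here, so check both against the Document API reference.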
Closes SD-2525