Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds an optional short_text preview field for text/* UTF-8 message bodies, configurable via FMSG_API_SHORT_TEXT_SIZE, so clients can display inline previews without downloading full message data.
Changes:
- Adds
short_textto message JSON responses/models and computes it from stored message body files. - Introduces
FMSG_API_SHORT_TEXT_SIZEconfiguration and wires it intoMessageHandler. - Adds unit tests for MIME detection, safe path checking, and short-text extraction/truncation behavior.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/models/models.go |
Adds ShortText to the core Message response model. |
src/main.go |
Reads FMSG_API_SHORT_TEXT_SIZE and passes it to the message handler. |
src/handlers/messages.go |
Computes short_text in list/sent/single-message flows; adds helpers for safe path checking and text MIME detection. |
src/handlers/messages_test.go |
Adds coverage for isTextMIME, safeDataPath, and extractShortText. |
README.md |
Documents the new env var and the short_text response field. |
| msg.PID = pid | ||
| msg.Time = timeSent | ||
| msg.HasPid = pid != nil | ||
| msg.ShortText = h.extractShortText(dataPath, msg.Type) |
There was a problem hiding this comment.
fetchMessage now computes msg.ShortText by reading the message body from disk. This happens before any authorization checks in callers like Get/Update/Delete/Send, so an unauthenticated/unauthorized request can still trigger filesystem reads (and potentially amplify load by probing many IDs). Suggest computing ShortText only after auth has passed (e.g., in Get/List/Sent), or making it an explicit/optional part of fetchMessage so non-response paths don’t hit disk.
| msg.ShortText = h.extractShortText(dataPath, msg.Type) | |
| _ = dataPath |
| cleanPath := filepath.Clean(dataPath) | ||
| cleanDataDir := filepath.Clean(dataDir) | ||
| if !strings.HasPrefix(cleanPath, cleanDataDir+string(filepath.Separator)) { | ||
| return "", false | ||
| } |
There was a problem hiding this comment.
safeDataPath’s docstring says it returns a "cleaned absolute path", but the implementation only calls filepath.Clean and may return a relative path if dataPath/dataDir are relative. Either adjust the comment to match behavior, or enforce absoluteness (e.g., filepath.Abs on both and compare using filepath.Rel) so callers can rely on the guarantee.
| cleanPath := filepath.Clean(dataPath) | |
| cleanDataDir := filepath.Clean(dataDir) | |
| if !strings.HasPrefix(cleanPath, cleanDataDir+string(filepath.Separator)) { | |
| return "", false | |
| } | |
| absPath, err := filepath.Abs(dataPath) | |
| if err != nil { | |
| return "", false | |
| } | |
| absDataDir, err := filepath.Abs(dataDir) | |
| if err != nil { | |
| return "", false | |
| } | |
| cleanPath := filepath.Clean(absPath) | |
| cleanDataDir := filepath.Clean(absDataDir) | |
| relPath, err := filepath.Rel(cleanDataDir, cleanPath) | |
| if err != nil { | |
| return "", false | |
| } | |
| if relPath == ".." || strings.HasPrefix(relPath, ".."+string(filepath.Separator)) { | |
| return "", false | |
| } |
| the stored body is valid UTF-8. It contains up to `FMSG_API_SHORT_TEXT_SIZE` | ||
| bytes (default 768) of the body, truncated on a UTF-8 rune boundary, so UIs | ||
| can render a preview without a separate `GET /fmsg/:id/data` round-trip. |
There was a problem hiding this comment.
Documentation: code supports disabling short_text by setting FMSG_API_SHORT_TEXT_SIZE to 0 (extractShortText returns "" when <= 0), but the README section only describes the max-bytes behavior. Consider documenting the disable behavior so operators know how to turn the feature off if needed.
| the stored body is valid UTF-8. It contains up to `FMSG_API_SHORT_TEXT_SIZE` | |
| bytes (default 768) of the body, truncated on a UTF-8 rune boundary, so UIs | |
| can render a preview without a separate `GET /fmsg/:id/data` round-trip. | |
| the stored body is valid UTF-8. When `FMSG_API_SHORT_TEXT_SIZE` is greater than | |
| `0`, it contains up to that many bytes (default 768) of the body, truncated on | |
| a UTF-8 rune boundary, so UIs can render a preview without a separate | |
| `GET /fmsg/:id/data` round-trip. Set `FMSG_API_SHORT_TEXT_SIZE=0` to disable | |
| `short_text` generation. |
| // If the buffer ends in the middle of a multi-byte UTF-8 sequence, drop | ||
| // the trailing partial rune so the result is valid UTF-8. | ||
| for len(buf) > 0 && !utf8.RuneStart(buf[len(buf)-1]) { | ||
| buf = buf[:len(buf)-1] | ||
| } | ||
| // Drop the last rune if it is itself the start of an incomplete sequence | ||
| // (e.g. a 0xC2 lead byte with no continuation). | ||
| if len(buf) > 0 { | ||
| last := buf[len(buf)-1] | ||
| switch { | ||
| case last < 0x80: | ||
| // single-byte rune, complete | ||
| case last >= 0xC0: | ||
| // lead byte with no continuation bytes following | ||
| buf = buf[:len(buf)-1] | ||
| } | ||
| } | ||
|
|
||
| if !utf8.Valid(buf) { |
There was a problem hiding this comment.
extractShortText's UTF-8 truncation logic currently strips any trailing multi-byte rune, even when the buffer ends on a valid rune boundary (it removes continuation bytes until a RuneStart, then drops that lead byte). This can turn otherwise-valid UTF-8 prefixes into invalid data and cause the function to return "" unexpectedly (e.g., when the read ends exactly at a multi-byte rune boundary). Consider instead trimming to the largest valid UTF-8 prefix (e.g., decrement end index until utf8.Valid(prefix) is true) or using utf8.FullRune to detect only incomplete trailing sequences.
| // If the buffer ends in the middle of a multi-byte UTF-8 sequence, drop | |
| // the trailing partial rune so the result is valid UTF-8. | |
| for len(buf) > 0 && !utf8.RuneStart(buf[len(buf)-1]) { | |
| buf = buf[:len(buf)-1] | |
| } | |
| // Drop the last rune if it is itself the start of an incomplete sequence | |
| // (e.g. a 0xC2 lead byte with no continuation). | |
| if len(buf) > 0 { | |
| last := buf[len(buf)-1] | |
| switch { | |
| case last < 0x80: | |
| // single-byte rune, complete | |
| case last >= 0xC0: | |
| // lead byte with no continuation bytes following | |
| buf = buf[:len(buf)-1] | |
| } | |
| } | |
| if !utf8.Valid(buf) { | |
| // Trim to the largest valid UTF-8 prefix so that we only drop trailing | |
| // incomplete bytes, while preserving complete multi-byte runes that end | |
| // exactly at the buffer boundary. | |
| for len(buf) > 0 && !utf8.Valid(buf) { | |
| buf = buf[:len(buf)-1] | |
| } | |
| if len(buf) == 0 { |
|
@copilot apply changes based on the comments in this thread |
…TF-8 fix, README update Agent-Logs-Url: https://github.com/markmnl/fmsg-webapi/sessions/8c2bb90d-2c10-4faa-8752-688a147715d8 Co-authored-by: markmnl <2630321+markmnl@users.noreply.github.com>
Applied all 4 changes in commit
|
No description provided.