Improve BORIS/CowLog compatibility: flexible schemas, richer parsing, timecode/frame-rate handling, and exports#37
Conversation
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughSummary by CodeRabbit
WalkthroughThis pull request implements comprehensive BORIS and CowLog compatibility enhancements, including improved timestamp parsing with frame rate support, enhanced CowLog plain-text parsing with metadata extraction, expanded tabular import handling, refactored session import logic with multi-observation merging, and extended CowLog export capabilities. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 4 |
| Duplication | 4 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Code Review
This pull request significantly enhances compatibility with BORIS and CowLog formats by introducing support for multi-observation merging, mapping-based event/annotation structures, and advanced timecode parsing (including ISO8601, SMPTE, and frame-based formats). Key improvements include metadata extraction from CowLog headers, automatic delimiter detection for tabular files, and expanded export capabilities. The review feedback highlights an opportunity to improve the note-appending logic to avoid potential data loss or unwanted reformatting in the session notes field, and suggests adding explicit user warnings when encountering invalid timestamps in CowLog files to aid in data diagnostics.
| def _append_note_line(existing: str | None, line: str, *, max_length: int = 2000) -> str: | ||
| """Append one note line while avoiding duplicate entries.""" | ||
| current = (existing or '').strip() | ||
| candidate = (line or '').strip() | ||
| if not candidate: | ||
| return current[:max_length] | ||
| lines = [item.strip() for item in current.splitlines() if item.strip()] | ||
| if candidate in lines: | ||
| return '\n'.join(lines)[:max_length] | ||
| lines.append(candidate) | ||
| return '\n'.join(lines)[:max_length] |
There was a problem hiding this comment.
The _append_note_line function collapses all existing empty lines in the notes and hardcodes a max_length of 2000. Since ObservationSession.notes is a TextField that can store much larger amounts of data, this implementation could lead to accidental data loss or unwanted reformatting of existing user notes. It is recommended to avoid global reformatting and truncation of the entire field.
| def _append_note_line(existing: str | None, line: str, *, max_length: int = 2000) -> str: | |
| """Append one note line while avoiding duplicate entries.""" | |
| current = (existing or '').strip() | |
| candidate = (line or '').strip() | |
| if not candidate: | |
| return current[:max_length] | |
| lines = [item.strip() for item in current.splitlines() if item.strip()] | |
| if candidate in lines: | |
| return '\n'.join(lines)[:max_length] | |
| lines.append(candidate) | |
| return '\n'.join(lines)[:max_length] | |
| def _append_note_line(existing: str | None, line: str) -> str: | |
| """Append one note line while avoiding duplicate entries.""" | |
| current = (existing or '').strip() | |
| candidate = (line or '').strip() | |
| if not candidate: | |
| return current | |
| lines = [item.strip() for item in current.splitlines()] | |
| if candidate in lines: | |
| return current | |
| return (current + '\n' + candidate).strip() if current else candidate |
| timestamp_decimal = _decimal(parts[0], default='NaN', frame_rate=detected_frame_rate) | ||
| if timestamp_decimal.is_nan(): | ||
| continue |
There was a problem hiding this comment.
Invalid timestamps in CowLog text files are silently skipped. It would be better to add a warning to the warnings list, including the line number and the invalid value, to help users diagnose issues with their data.
| timestamp_decimal = _decimal(parts[0], default='NaN', frame_rate=detected_frame_rate) | |
| if timestamp_decimal.is_nan(): | |
| continue | |
| timestamp_decimal = _decimal(parts[0], default='NaN', frame_rate=detected_frame_rate) | |
| if timestamp_decimal.is_nan(): | |
| warnings.append( | |
| _('Line %(line)s: invalid time value “%(value)s”.') | |
| % {'line': line_number, 'value': parts[0]} | |
| ) | |
| continue |
|



Motivation
Description
_schema_matches,_is_supported_*) and coerce functions (_coerce_object_rows) to accept both mapping and list representations forevents,annotationsandsegments._decimaland normalize frame-rate handling with_normalize_frame_rate_tokento parse ISO8601 durations, HH:MM:SS:FF (with fps), SMPTE semicolon formats and comma decimals.parse_cowlog_results_textandparse_tabular_session_rowsto detect metadata lines, parse CowLog annotation metadata, accept semicolon-delimited rows, useshlex-based tokenization, and emitmetadata,annotationsandframe_rateinformation in the import report.session.notesusing_append_note_line.ObservationVariableValuewhen present, and include observer/fps/annotations insession_export_cowlog_txtoutput.plan.mdoutlining further BORIS/CowLog compatibility work.Testing
tracker/tests/*covering CowLog text parsing (timecodes, ISO durations, frame rates, metadata annotations), tabular BORIS timecodes and durations, mapping-style rows and multi-observation merges, and export checks for CowLog metadata.python manage.py test tracker) which executed the new tests and existing compatibility/roundtrip tests, and they passed.Codex Task