Merged
2 changes: 2 additions & 0 deletions README.md
@@ -137,6 +137,8 @@ Formula-like CSV text fields are neutralized with a leading single quote so spre
When an input spans multiple hostnames, both reports add compact host-level summaries without changing detector thresholds or introducing cross-host correlation logic.
Markdown table fields escape table separators, line breaks, and HTML-sensitive characters so unusual log tokens cannot break report layout.

For the report artifact contract and golden fixture map, see [`docs/report-artifacts.md`](./docs/report-artifacts.md).

## Sample Output

For sanitized sample input, see [`assets/sample_auth.log`](./assets/sample_auth.log) and [`assets/sample_journalctl_short_full.log`](./assets/sample_journalctl_short_full.log).
72 changes: 72 additions & 0 deletions docs/report-artifacts.md
@@ -0,0 +1,72 @@
# Report Artifacts

LogLens writes deterministic offline artifacts for reviewer inspection and downstream tooling.

## Artifact Set

| Artifact | When written | Review purpose |
| --- | --- | --- |
| `report.md` | Every successful run | Human-readable triage report with summary, findings, event counts, parser quality, and parser warnings |
| `report.json` | Every successful run | Machine-readable report with the same core evidence and parser telemetry |
| `findings.csv` | Only when `--csv` is set | Spreadsheet-friendly finding rows |
| `warnings.csv` | Only when `--csv` is set | Spreadsheet-friendly parser warning rows |

Without `--csv`, LogLens does not create CSV files, and it does not overwrite or delete CSV files that already exist in the output directory.

## JSON Contract

The JSON report keeps parser observability visible next to findings:

- `tool`
- `input`
- `input_mode`
- `assume_year` for syslog-style input when a year is supplied
- `timezone_present`
- `parser_quality.total_input_lines`
- `parser_quality.total_lines`
- `parser_quality.skipped_blank_lines`
- `parser_quality.parsed_lines`
- `parser_quality.unparsed_lines`
- `parser_quality.parse_success_rate`
- `parser_quality.top_unknown_patterns`
- `parsed_event_count`
- `warning_count`
- `finding_count`
- `event_counts`
- `host_summaries` when more than one hostname is represented
- `findings`
- `warnings`

Finding objects contain `rule`, `subject_kind`, `subject`, `event_count`, `window_start`, `window_end`, `usernames`, and `summary`.

Warning objects contain the original `line_number` and the parser `reason`.
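
The golden fixtures in this change report 14 parsed lines out of 16 as `0.8750` and 12 out of 15 as `0.8000`, which suggests that `parse_success_rate` is `parsed_lines` divided by `total_lines`, printed with four decimals. The sketch below illustrates that reading. It is an assumption based on the fixture values, not the LogLens implementation, and the helper names `parse_success_rate` and `format_rate` are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not from LogLens): ratio of parsed lines to counted lines.
// Returns 0.0 for empty input so the division is always defined.
double parse_success_rate(std::size_t parsed_lines, std::size_t total_lines) {
    if (total_lines == 0) {
        return 0.0;
    }
    return static_cast<double>(parsed_lines) / static_cast<double>(total_lines);
}

// Hypothetical helper: formats the rate with four decimals, the style the
// JSON fixtures use (for example 0.8750).
std::string format_rate(double rate) {
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%.4f", rate);
    return std::string(buffer);
}
```

Under this assumption, `format_rate(parse_success_rate(14, 16))` yields `0.8750`, the value in the `syslog_legacy` fixture.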

## CSV Contract

The optional CSV exports intentionally stay small:

- `findings.csv`: `rule`, `subject_kind`, `subject`, `event_count`, `window_start`, `window_end`, `usernames`, `summary`
- `warnings.csv`: `kind`, `line_number`, `message`

Formula-like CSV text fields are neutralized with a leading single quote so spreadsheet tools treat them as text.
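
A minimal sketch of the neutralization step, assuming that "formula-like" means a field whose first character is `=`, `+`, `-`, `@`, a tab, or a carriage return. That trigger set follows common CSV-injection guidance and is an assumption; the exact set LogLens uses is not stated here. The function name `neutralize_csv_field` is hypothetical, and regular CSV quoting of commas and double quotes is out of scope for this sketch.

```cpp
#include <string>

// Hypothetical helper (not from LogLens): prefix formula-like fields with a
// single quote so spreadsheet tools treat them as text.
// Assumed trigger characters: '=', '+', '-', '@', tab, carriage return.
std::string neutralize_csv_field(const std::string& field) {
    if (field.empty()) {
        return field;
    }
    const char first = field.front();
    const bool formula_like = first == '=' || first == '+' || first == '-'
        || first == '@' || first == '\t' || first == '\r';
    if (formula_like) {
        return "'" + field;
    }
    return field;
}
```

With this sketch, `=SUM(A1)` becomes `'=SUM(A1)`, while an ordinary value such as `root` passes through unchanged.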

## Markdown Safety

Markdown table fields escape table separators, line breaks, HTML-sensitive characters, and control characters. Unusual log tokens should not be able to break report layout.
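
The escaping described above could look roughly like the following. The concrete replacements are assumptions chosen for illustration: a backslash before `|`, a space for line breaks, HTML entities for `<`, `>`, and `&`, and `?` for other control characters. The function name `escape_markdown_cell` is hypothetical and does not come from the LogLens sources.

```cpp
#include <string>

// Hypothetical helper (not from LogLens): make a value safe for a Markdown
// table cell. Every replacement below is an illustrative assumption.
std::string escape_markdown_cell(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (const char ch : text) {
        const unsigned char byte = static_cast<unsigned char>(ch);
        switch (ch) {
            case '|':  out += "\\|";   break;  // table separator
            case '\n':
            case '\r': out += ' ';     break;  // keep the row on one line
            case '<':  out += "&lt;";  break;  // HTML-sensitive characters
            case '>':  out += "&gt;";  break;
            case '&':  out += "&amp;"; break;
            default:
                if (byte < 0x20 || byte == 0x7f) {
                    out += '?';                // remaining control characters
                } else {
                    out += ch;
                }
        }
    }
    return out;
}
```

A token such as `a|b` would then render as `a\|b` inside a cell instead of splitting the row into an extra column.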

## Golden Fixtures

The report contracts are backed by generated fixture artifacts:

| Fixture case | Golden artifacts |
| --- | --- |
| [`syslog_legacy`](../tests/fixtures/report_contracts/syslog_legacy) | `report.md`, `report.json`, `findings.csv`, `warnings.csv` |
| [`journalctl_short_full`](../tests/fixtures/report_contracts/journalctl_short_full) | `report.md`, `report.json` |
| [`multi_host_syslog_legacy`](../tests/fixtures/report_contracts/multi_host_syslog_legacy) | `report.md`, `report.json`, `findings.csv`, `warnings.csv` |
| [`multi_host_journalctl_short_full`](../tests/fixtures/report_contracts/multi_host_journalctl_short_full) | `report.md`, `report.json` |

The enforcement lives in [`tests/test_report_contracts.cpp`](../tests/test_report_contracts.cpp). The focused report writer tests live in [`tests/test_report.cpp`](../tests/test_report.cpp).
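
The contract tests select lines by prefix, such as `- Total lines: ` in Markdown and `"total_lines": ` in JSON. The sketch below shows one way such a prefix filter can be written. It is an illustration only: the bodies of the real helpers in `tests/test_report_contracts.cpp` are not reproduced here, and `extract_lines_with_prefixes`, along with this version of `starts_with`, is a hypothetical stand-in.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in: true when `text` begins with `prefix`.
bool starts_with(const std::string& text, const std::string& prefix) {
    return text.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical stand-in: keep only the lines that, after leading whitespace
// is trimmed, begin with one of the given contract prefixes.
std::vector<std::string> extract_lines_with_prefixes(
    const std::string& document,
    const std::vector<std::string>& prefixes) {
    std::vector<std::string> kept;
    std::istringstream stream(document);
    std::string line;
    while (std::getline(stream, line)) {
        const std::size_t first = line.find_first_not_of(" \t");
        if (first == std::string::npos) {
            continue;  // blank or whitespace-only line
        }
        const std::string trimmed = line.substr(first);
        for (const auto& prefix : prefixes) {
            if (starts_with(trimmed, prefix)) {
                kept.push_back(trimmed);
                break;
            }
        }
    }
    return kept;
}
```

Comparing the filtered lines of a freshly generated report against those of the golden file keeps the check focused on contract fields rather than on the whole document.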

## Boundaries

Reports are triage aids. They are not SIEM evidence, incident verdicts, attribution claims, or cross-host correlation output. Host summaries are compact per-host rollups; they do not change detector thresholds.
1 change: 1 addition & 0 deletions docs/reviewer-path.md
@@ -28,6 +28,7 @@ Inspect:
- [`assets/sample_journalctl_short_full.log`](../assets/sample_journalctl_short_full.log)
- [`tests/fixtures/report_contracts/syslog_legacy/report.md`](../tests/fixtures/report_contracts/syslog_legacy/report.md)
- [`tests/fixtures/report_contracts/syslog_legacy/report.json`](../tests/fixtures/report_contracts/syslog_legacy/report.json)
- [`docs/report-artifacts.md`](./report-artifacts.md)
- [`docs/parser-contract.md`](./parser-contract.md)

Look for parser coverage fields:
@@ -4,7 +4,9 @@
"input_mode": "journalctl_short_full",
"timezone_present": true,
"parser_quality": {
"total_input_lines": 16,
"total_lines": 16,
"skipped_blank_lines": 0,
"parsed_lines": 14,
"unparsed_lines": 2,
"parse_success_rate": 0.8750,
@@ -5,7 +5,9 @@
- Input: `tests/fixtures/report_contracts/journalctl_short_full/input.log`
- Input mode: journalctl_short_full
- Timezone present: true
- Total input lines: 16
- Total lines: 16
- Skipped blank lines: 0
- Parsed lines: 14
- Unparsed lines: 2
- Parse success rate: 87.50%
@@ -4,7 +4,9 @@
"input_mode": "journalctl_short_full",
"timezone_present": true,
"parser_quality": {
"total_input_lines": 15,
"total_lines": 15,
"skipped_blank_lines": 0,
"parsed_lines": 12,
"unparsed_lines": 3,
"parse_success_rate": 0.8000,
@@ -5,7 +5,9 @@
- Input: `tests/fixtures/report_contracts/multi_host_journalctl_short_full/input.log`
- Input mode: journalctl_short_full
- Timezone present: true
- Total input lines: 15
- Total lines: 15
- Skipped blank lines: 0
- Parsed lines: 12
- Unparsed lines: 3
- Parse success rate: 80.00%
@@ -5,7 +5,9 @@
"assume_year": 2026,
"timezone_present": false,
"parser_quality": {
"total_input_lines": 15,
"total_lines": 15,
"skipped_blank_lines": 0,
"parsed_lines": 12,
"unparsed_lines": 3,
"parse_success_rate": 0.8000,
@@ -6,7 +6,9 @@
- Input mode: syslog_legacy
- Assume year: 2026
- Timezone present: false
- Total input lines: 15
- Total lines: 15
- Skipped blank lines: 0
- Parsed lines: 12
- Unparsed lines: 3
- Parse success rate: 80.00%
2 changes: 2 additions & 0 deletions tests/fixtures/report_contracts/syslog_legacy/report.json
@@ -5,7 +5,9 @@
"assume_year": 2026,
"timezone_present": false,
"parser_quality": {
"total_input_lines": 16,
"total_lines": 16,
"skipped_blank_lines": 0,
"parsed_lines": 14,
"unparsed_lines": 2,
"parse_success_rate": 0.8750,
2 changes: 2 additions & 0 deletions tests/fixtures/report_contracts/syslog_legacy/report.md
@@ -6,7 +6,9 @@
- Input mode: syslog_legacy
- Assume year: 2026
- Timezone present: false
- Total input lines: 16
- Total lines: 16
- Skipped blank lines: 0
- Parsed lines: 14
- Unparsed lines: 2
- Parse success rate: 87.50%
4 changes: 4 additions & 0 deletions tests/test_report_contracts.cpp
@@ -108,7 +108,9 @@ std::vector<std::string> extract_markdown_contract_lines(const std::string& mark
|| starts_with(line, "- Input mode: ")
|| starts_with(line, "- Assume year: ")
|| starts_with(line, "- Timezone present: ")
|| starts_with(line, "- Total input lines: ")
|| starts_with(line, "- Total lines: ")
|| starts_with(line, "- Skipped blank lines: ")
|| starts_with(line, "- Parsed lines: ")
|| starts_with(line, "- Unparsed lines: ")
|| starts_with(line, "- Parse success rate: ")
@@ -140,7 +142,9 @@ std::vector<std::string> extract_json_contract_lines(const std::string& json) {
|| starts_with(line, "\"input_mode\": ")
|| starts_with(line, "\"assume_year\": ")
|| starts_with(line, "\"timezone_present\": ")
|| starts_with(line, "\"total_input_lines\": ")
|| starts_with(line, "\"total_lines\": ")
|| starts_with(line, "\"skipped_blank_lines\": ")
|| starts_with(line, "\"parsed_lines\": ")
|| starts_with(line, "\"unparsed_lines\": ")
|| starts_with(line, "\"parse_success_rate\": ")