Releases · EntityProcess/agentv

Release list

v5.3.1-next.1 Pre-release

github-actions released this 06 Jul 08:08

v5.3.1-next.1

9f08f81

What's Changed

docs(migration): align skill and trajectory assertion docs by @christso in #1687
fix(core): reject authored use_target targets by @christso in #1688
docs: keep result publishing in global config by @christso in #1684
feat(config): add v5-native env_path/env_from environment injection by @christso in #1689
feat(eval): load explicit TypeScript eval configs by @christso in #1690
fix(config): reject removed eval case aliases by @christso in #1691
feat(config): lower scenarios into tests by @christso in #1693
docs(grading): align reference answers and artifacts by @christso in #1694
docs(eval): finish promptfoo authoring guidance by @christso in #1695

Full Changelog: v5.3.0-next.1...v5.3.1-next.1

Contributors

christso

v5.3.0-next.1 Pre-release

github-actions released this 06 Jul 02:01

v5.3.0-next.1

9fbdab5

What's Changed

feat(assertions): add promptfoo-compatible skill-used by @christso in #1682
feat(assertions): add promptfoo trajectory graders by @christso in #1683
fix(artifacts): flatten metrics sidecar by @christso in #1685
fix(schema): reject stale authored skill and trajectory graders by @christso in #1686

Full Changelog: v5.2.0-next.1...v5.3.0-next.1

Contributors

christso

v5.2.0-next.1 Pre-release

github-actions released this 05 Jul 23:55

v5.2.0-next.1

4aba60b

What's Changed

fix(sdk): align script grader results by @christso in #1659
fix(codex): speak app-server JSON-RPC protocol by @christso in #1665
fix(runtime): install sdk child runner deps by @christso in #1663
feat(core): parse Promptfoo llm-rubric results by @christso in #1664
feat(dashboard): link raw transcript artifacts by @christso in #1666
docs: capture Copilot runtime research by @christso in #1667
docs(adr): codify environment recipes for coding-agent evals by @christso in #1668
fix(runtime): unblock SDK provider live matrix for av-t2o5.6 by @christso in #1669
feat(eval): add environment recipe schema loader (av-noh3.2.2) by @christso in #1670
docs(environment): preserve reference audit guardrails by @christso in #1672
feat(runtime): execute host environment recipes by @christso in #1671
Implement recursive grading result artifacts by @christso in #1673
feat(dashboard): render component grading artifacts by @christso in #1674
feat(runtime): execute docker environment recipes by @christso in #1676
Snapshot environment recipe provenance in run artifacts (av-noh3.2.5) by @christso in #1675
docs(agent): clarify matrix authoring boundaries by @christso in #1677
fix(config): reject top-level eval imports by @christso in #1678
Require argv-only environment setup commands by @christso in #1679
docs(environment): teach environment recipe vocabulary by @christso in #1681
fix(contract): remove workspace testbed public surfaces by @christso in #1680

Full Changelog: v5.1.0-next.1...v5.2.0-next.1

Contributors

christso

v5.1.0-next.1 Pre-release

github-actions released this 05 Jul 04:04

v5.1.0-next.1

6f7b550

What's Changed

docs(adr): define transcript artifact contract by @christso in #1520
feat(results): write normalized transcript artifacts by @christso in #1521
docs: finalize inline experiment ADR by @christso in #1522
cleanup(results): remove legacy manifest artifact aliases by @christso in #1525
refactor(results): remove public trace artifact surface by @christso in #1526
feat(evaluation): inline experiment runtime in eval files by @christso in #1524
feat(dashboard): polish transcript timeline by @christso in #1527
fix(results): avoid duplicate raw provider logs by @christso in #1528
docs(verification): document codex local proxy dogfood by @christso in #1529
chore(entire): use manual commit strategy by @christso in #1530
Fix workspace baseline branch safety by @christso in #1531
docs: document composite AND/OR patterns by @christso in #1532
docs: update composite strict-or example to bun run by @christso in #1533
fix(targets): use flat Copilot SDK provider config by @christso in #1534
test(eval): guard workspace path override compatibility by @christso in #1535
chore: remove accidental root artifacts by @christso in #1536
fix(skills): default agentv-bench to CLI mode by @christso in #1537
feat(results): rename result directory row field by @christso in #1540
docs(results): codify eval result identity contract by @christso in #1539
feat(eval): compose imported experiment defaults by @christso in #1541
feat(dashboard): identify result rows by eval path by @christso in #1543
fix(results): keep dashboard status read-only for repo paths by @christso in #1542
docs(verification): clarify contract dogfood evidence by @christso in #1544
docs(eval): clarify eval authoring contracts by @christso in #1545
docs(schema): record benchmark primitive decision by @christso in #1546
ci: publish commit-addressed build artifacts by @christso in #1548
fix(eval): tighten workspace composition contracts by @christso in #1547
fix(eval): remove dry-run mock execution by @christso in #1550
fix(eval): move local workspace binding out of eval YAML by @christso in #1549
fix(eval): reject per-case target selection by @christso in #1551
fix(cli): remove promptfoo import support by @christso in #1552
fix(results): remove persistent require push config by @christso in #1553
feat(dashboard): cache git-backed result lists by @christso in #1554
fix(results): ignore missing WIP branch cleanup by @christso in #1555
fix(results): isolate target bundles under run timestamps by @christso in #1558
[codex] fix(config): remove flat results remote alias by @christso in #1562
docs(adr): clarify result row sidecar identity by @christso in #1556
[codex] Rename result row manifest to run_manifest by @christso in #1563
feat(results): simplify project + results config to one flat symmetric schema by @christso in #1565
refactor(projects): drop redundant project name field, display id by @christso in #1566
Support flatter eval imports by @christso in #1567
docs(results): document result artifact contract by @christso in #1568
fix(results): restore index jsonl artifact by @christso in #1569
feat(core)!: require canonical assertions, remove assert authoring alias by @christso in #1571
[codex] Support eval policy config by @christso in #1576
fix(artifacts): rename generated task bundle to test bundle by @christso in #1574
docs(readme): modernize quick start eval and fix agentv compare example by @christso in #1572
refactor(core): centralize case conversion boundary by @christso in #1573
docs(readme): document experiment eval contract by @christso in #1577
fix(schema): reject authored eval YAML workers by @christso in #1578
fix(artifacts): keep eval runs at timestamp root by @christso in #1580
docs(readme): define singular target eval contract by @christso in #1579
Write file_changes diff sidecar by @christso in #1581
feat(eval): canonicalize repeat policy schema by @christso in #1582
fix(artifacts): keep run status in agentv format by @christso in #1584
feat(artifacts): write eval runs at results root by @christso in #1583
docs(readme): align results layout with run root by @christso in #1585
feat(evals): support default_test threshold by @christso in #1588
docs(adr): stabilize eval authoring contract by @christso in #1587
fix(eval): default repo workspaces to fresh materialization by @christso in #1586
feat(eval): author experiment as promptfoo-style tags.experiment (av-4hzh) by @christso in #1589
docs(experiment): label runs via tags.experiment, drop top-level experiment by @christso in #1590
feat(dashboard): Tags tab — group runs by any promptfoo tag key; remove legacy manual tags (av-qsxw) by @christso in #1591
docs(concepts): define Tags concept; note manual tags removed (av-qsxw) by @christso in #1593
docs(agents): make .agents/*.md guide reads a required, trigger-keyed gate by @christso in #1595
docs(plan): eval authoring restructure — promptfoo superset (DRAFT) by @christso in #1594
docs(plans): add promptfoo-compatible extensions plan by @christso in #1592
docs(agents): clarify worker guidance and cc-mirror docs by @christso in #1596
feat(core): adopt nunjucks eval templating by @christso in #1597
feat(eval): add lifecycle extensions and agent rules by @christso in #1607
Canonicalize targets around labels by @christso in #1598
feat(eval): normalize transcript artifacts by @christso in #1600
feat(core): load file-backed datasets by @christso in #1601
docs: plan CLI command surface alignment by @christso in #1606
[codex] Add promptfoo assertion grader surface by @christso in #1599
feat(evals): add prompt instance expansion by @christso in #1602
feat: align repeat config with attempt artifacts by @christso in #1608
Grading contract: assertion_results rows by @christso in #1603
feat(eval): add rerun-failed runner pooling by @christso in #1609
feat(web): add versioned docs snapshot by @christso in #1611
refactor(targets): rename request batching field by @christso in #1612
feat(eval): finalize promptfoo-aligned restructure by @christso in #1610
feat(web): add Next docs version snapshot by @christso in #1613
docs(eval): align llm-rubric quickstart by @christso in https://github.com/EntityProcess/agentv/...

Contributors

christso

v5.0.0-next.1 Pre-release

github-actions released this 26 Jun 02:30

v5.0.0-next.1

cc49db6

What's Changed

fix(docker): drop COPY of removed phoenix-adapter package by @christso in #1499
fix(dashboard): show file contents inline in run Files tab by @christso in #1501
fix(results): push same-repo results to project origin, not a synthesized URL by @christso in #1502
fix(dashboard): overlay git artifacts into the existing run file tree by @christso in #1504
Separate experiments from eval definitions by @christso in #1500
docs: compound experiment separation learning by @christso in #1505
docs: design conflict-free results sync without force push by @christso in #1503
feat(results): no-force-push results sync (auto-merge loop) by @christso in #1506
feat(results): temp-branch fallback + OK-to-resync (no force push) by @christso in #1507
feat(dashboard): pending-merge card + OK-to-resync (av-raf.4) by @christso in #1508
docs: no-force-push results sync behavior (av-raf) by @christso in #1509
feat(results): signal auto-merged remote pushes by @christso in #1510
feat(experiments): support suite test selectors by @christso in #1514
fix(cli): avoid duplicate createRequire binding by @christso in #1515
feat(results): restructure AgentV run artifacts by @christso in #1513
docs(eval): clarify expected_output assertion contract by @christso in #1516
docs: clarify eval and experiment boundaries by @christso in #1517
docs: migrate ADRs to adrs format by @christso in #1518
feat(config): support local config overlays by @christso in #1519

Full Changelog: v4.44.0-next.1...v5.0.0-next.1

Contributors

christso

v4.44.0-next.1 Pre-release

github-actions released this 23 Jun 09:15

v4.44.0-next.1

cbee02b

What's Changed

fix(pi-cli): route endpoint overrides through azure provider by @christso in #1494
docs(providers): capture SDK runtime isolation guidance by @christso in #1498
feat(workspace): read git_cache.mirrors from project-local .agentv/config.yaml by @christso in #1497

Full Changelog: v4.43.0-next.1...v4.44.0-next.1

Contributors

christso

v4.43.0-next.1 Pre-release

github-actions released this 23 Jun 06:06

v4.43.0-next.1

18b5ba4

What's Changed

feat(dashboard): preserve context for result row details by @christso in #1449
merge final reviewed results stack by @christso in #1462
docs(agents): require PR-based merges by @christso in #1463
test(evals): add PR workflow guard self-eval by @christso in #1464
fix(results): make sidecar artifacts canonical by @christso in #1465
feat(dashboard): define trace session read model by @christso in #1468
docs(results): specify storage retention oplog plan by @christso in #1471
feat(evals): support suite input shorthand by @christso in #1469
docs(phoenix): document read-only AgentV boundary by @christso in #1470
eval/dashboard: read linked Phoenix sessions by @christso in #1472
feat(dashboard): serve trace session artifacts by @christso in #1473
chore(package): remove deprecated @agentv/eval package by @christso in #1475
style(brand): dim AgentV wordmark middle letters by @christso in #1476
feat(core): normalize trace artifacts for dashboard by @christso in #1474
fix(beads): harden worktree identity checks by @christso in #1460
feat(results): emit metrics sidecar by @christso in #1478
feat: add Vitest workspace verifier adapter by @christso in #1480
fix(beads): guard worktree tracker state by @christso in #1482
refactor(dashboard): remove Phoenix read-through UI by @christso in #1481
feat(cli): infer Vitest verifier eval commands by @christso in #1483
fix(results): preserve configured git identity by @christso in #1484
feat(sdk): expose evaluate api by @christso in #1485
fix(providers): support Copilot SDK OpenAI endpoint parity runs by @christso in #1479
refactor: remove private Phoenix adapter package by @christso in #1486
[codex] Support dashboard remote result branches by @christso in #1477
docs: standardize contract design references by @christso in #1487
docs: design repeat-run reliability by @christso in #1488
feat(results): move artifacts under results experiments by @christso in #1489
feat(dashboard): set run experiment and tags by @christso in #1490
docs: design remote metadata conflict resolution by @christso in #1491
fix(results): handle push conflicts with backup policy by @christso in #1492
fix(agent-providers): stabilize Copilot and Pi SDK runs by @christso in #1493

Full Changelog: v4.42.4...v4.43.0-next.1

Contributors

christso

v4.42.4 Latest

github-actions released this 19 Jun 12:51

v4.42.4

7195137

What's Changed

fix(evals): require self-eval guidance reads by @christso in #1447
fix(results): root results branch at a stable deterministic genesis by @christso in #1448

Full Changelog: v4.42.3...v4.42.4

Contributors

christso

v4.42.4-next.1 Pre-release

github-actions released this 19 Jun 12:47

v4.42.4-next.1

451dc68

What's Changed

fix(evals): require self-eval guidance reads by @christso in #1447
fix(results): root results branch at a stable deterministic genesis by @christso in #1448

Full Changelog: v4.42.3...v4.42.4-next.1

Contributors

christso

v4.42.3

github-actions released this 19 Jun 11:09

v4.42.3

f9ab3af

What's Changed

fix(artifacts): write trace sidecars to canonical path by @christso in #1446

Full Changelog: v4.42.2...v4.42.3

Contributors

christso
