mimic iv #2008
netanelcyber asked this question in MIMIC-IV
Hi MIT-LCP team,
I’m reaching out because of MIMIC’s role as foundational infrastructure for reproducible clinical research.
While developing a reproducible ICU pathogen prediction pipeline using MIMIC-IV (https://github.com/netanelcyber/PenuX), we encountered what appears to be a systematic evaluation gap that may be relevant to the broader MIMIC ecosystem.
Core observation
Across multiple standard modeling approaches built on MIMIC-IV-derived cohorts, we consistently observed that:
Most importantly, these effects appear to depend on the evaluation design rather than on the specific model.
Why this is relevant to MIMIC’s role
Given that MIMIC is widely used as a benchmarking substrate for clinical ML, this suggests a potential gap between:
In practice, this means that two studies using identical MIMIC Code pipelines may still report substantially different “robustness” depending on evaluation protocol choices that are currently not standardized.
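As one illustration of how an evaluation protocol choice can change what a study actually measures, the sketch below compares a row-level random split with a patient-level split on synthetic data. This is an assumption about the kind of choice meant here, not the PenuX pipeline or anything shipped in MIMIC Code: `make_records`, `row_level_split`, `patient_level_split`, and `shared_patients` are hypothetical helpers, and only the `subject_id` column name is borrowed from MIMIC-IV.

```python
import random


def make_records(n_patients=50, rows_per_patient=3):
    # Synthetic stand-in for a cohort table: several rows per patient.
    return [{"subject_id": pid, "stay": s}
            for pid in range(n_patients)
            for s in range(rows_per_patient)]


def row_level_split(records, test_fraction=0.25, seed=0):
    # Shuffle individual rows; one patient can land on both sides.
    rows = list(records)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def patient_level_split(records, test_fraction=0.25, seed=0):
    # Shuffle patient IDs, then keep all rows of a patient on one side.
    ids = sorted({r["subject_id"] for r in records})
    random.Random(seed).shuffle(ids)
    n_test = int(len(ids) * test_fraction)
    test_ids = set(ids[:n_test])
    train = [r for r in records if r["subject_id"] not in test_ids]
    test = [r for r in records if r["subject_id"] in test_ids]
    return train, test


def shared_patients(train, test):
    # Patients whose rows appear in both the training and the test set.
    return {r["subject_id"] for r in train} & {r["subject_id"] for r in test}


records = make_records()
row_train, row_test = row_level_split(records)
pat_train, pat_test = patient_level_split(records)

print("row-level split, shared patients:",
      len(shared_patients(row_train, row_test)))
print("patient-level split, shared patients:",
      len(shared_patients(pat_train, pat_test)))
```

Under the row-level split, some patients contribute rows to both sides, so a model is scored partly on patients it has already seen. The patient-level split has no such overlap, which means the two protocols can yield different scores even when everything upstream is identical.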
Proposal (infrastructure-level, not project-level)
Rather than addressing this at the level of individual models, it may be useful to consider whether the MIMIC ecosystem could benefit from an optional evaluation extension layer, for example: