ReconBench: benchmarking LLMs on cardiac signaling network reconstruction

An Inspect-native replication and extension of Tewari et al. (2025). The reported recall reproduces, precision and F1 are new, and the main failure traces to an extraction-format mismatch rather than a reasoning gap.

April 27, 2026 · 8 min · Robert Amanfu