Fact Extraction Steering Matrix
Source: fact-extraction-steering-matrix.md | Rendered: 2026-04-07 23:55:28 UTC
Tracks local LM Studio runs for buried-fact extraction quality.
Each run compares models and prompt sets on one synthetic corpus.
Run 2026-04-07 21:48:50 UTC

| model | prompt set | docs | precision | recall | f1 | uncertainty acc | uncertain recall | evidence support | invalid predicate | invalid arity | parse failures | overconfident | notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| qwen/qwen3.5-9b | strict_v1 | 2 | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 0 | 0 | 0 | ok |

Run 2026-04-07 21:51:04 UTC

| model | prompt set | docs | precision | recall | f1 | uncertainty acc | uncertain recall | evidence support | invalid predicate | invalid arity | parse failures | overconfident | notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| qwen/qwen3.5-9b | strict_v1 | 2 | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 0 | 2 | 0 | parse_failures_present |

Run 2026-04-07 22:12:24 UTC

| model | prompt set | docs | precision | recall | f1 | uncertainty acc | uncertain recall | evidence support | invalid predicate | invalid arity | parse failures | overconfident | notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| qwen/qwen3.5-9b | strict_v1 | 2 | 100.0% | 50.0% | 66.7% | 50.0% | 50.0% | 50.0% | 0 | 0 | 1 | 0 | parse_failures_present |

Run 2026-04-07 22:21:46 UTC

| model | prompt set | docs | precision | recall | f1 | uncertainty acc | uncertain recall | evidence support | invalid predicate | invalid arity | parse failures | overconfident | notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| qwen3.5-4b | baseline_v1 | 8 | 100.0% | 44.4% | 61.5% | 50.0% | 37.5% | 50.0% | 0 | 0 | 4 | 0 | parse_failures_present |
| qwen3.5-4b | strict_v1 | 8 | 100.0% | 44.4% | 61.5% | 37.5% | 25.0% | 37.5% | 0 | 0 | 5 | 0 | parse_failures_present |
| qwen/qwen3.5-9b | baseline_v1 | 8 | 100.0% | 55.6% | 71.4% | 62.5% | 50.0% | 62.5% | 0 | 0 | 3 | 0 | parse_failures_present |
| qwen/qwen3.5-9b | strict_v1 | 8 | 100.0% | 66.7% | 80.0% | 62.5% | 37.5% | 62.5% | 0 | 0 | 3 | 0 | parse_failures_present |
| qwen3.5-27b@q4_k_m | baseline_v1 | 8 | 100.0% | 66.7% | 80.0% | 54.2% | 12.5% | 62.5% | 0 | 0 | 3 | 2 | parse_failures_present |
| qwen3.5-27b@q4_k_m | strict_v1 | 8 | 100.0% | 77.8% | 87.5% | 75.0% | 50.0% | 75.0% | 0 | 0 | 2 | 0 | parse_failures_present |

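
The f1 values reported for these runs are consistent with the standard harmonic mean of precision and recall. The tracker's own scoring code is not shown here, so the sketch below is an assumption about how the column could be derived, and the helper name `f1_score` is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1]).

    Assumption: this is the standard F1 definition, not necessarily the
    tracker's implementation. Returns 0.0 when both inputs are zero, which
    matches the all-zero rows reported for the earliest runs.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Example: the 22:12:24 run reports precision 100.0% and recall 50.0%.
print(f"{f1_score(1.0, 0.5):.1%}")  # prints 66.7%
```

Feeding the rounded percentages back into this function can differ from the reported f1 by 0.1 point in the last digit, because the report rounds recall before display.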