Metrics
The ontoaligner.utils.metrics module provides evaluation functions for ontology alignment tasks.
Given a list of predicted alignments and a list of reference (ground-truth) alignments,
it computes set-based information-retrieval metrics (precision, recall, Fβ) as well as ranking-based measures (Hit@K, MRR).
Standard Metrics
These metrics treat alignments as sets of (source, target) pairs.
| Function | Description |
|---|---|
| | Count of unique |
| | Fraction of predictions that are correct. |
| | Fraction of reference alignments that were retrieved. |
| | Weighted harmonic mean of precision and recall (Fβ). |
| | Returns all of the above in a single summary dictionary (values scaled to 0–100). |
Precision
Returns 0 when the prediction list is empty.
Recall
Returns 0 when the reference list is empty.
Fβ-Score
Set beta=1 (default) for the standard F1 measure, beta=2 to weight recall more heavily,
or beta=0.5 to favour precision.
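The three set-based metrics above can be sketched in a few lines of plain Python. This is a minimal illustration, not the module's actual source: the names `to_pairs`, `precision_sketch`, `recall_sketch`, and `f_beta_sketch` are hypothetical, and the sketch assumes that duplicates are collapsed into unique (source, target) pairs before counting and that Fβ is computed as (1 + β²)·P·R / (β²·P + R), returning 0 when the denominator is 0.

```python
def to_pairs(alignments):
    """Reduce alignment dicts to a set of unique (source, target) pairs."""
    return {(a["source"], a["target"]) for a in alignments}


def precision_sketch(predicts, references):
    pred, ref = to_pairs(predicts), to_pairs(references)
    # No predictions: return 0 instead of dividing by zero.
    return len(pred & ref) / len(pred) if pred else 0.0


def recall_sketch(predicts, references):
    pred, ref = to_pairs(predicts), to_pairs(references)
    # No references: return 0 instead of dividing by zero.
    return len(pred & ref) / len(ref) if ref else 0.0


def f_beta_sketch(predicts, references, beta=1.0):
    p = precision_sketch(predicts, references)
    r = recall_sketch(predicts, references)
    denom = beta**2 * p + r
    # Assumption: F-beta is 0 when both precision and recall are 0.
    return (1 + beta**2) * p * r / denom if denom else 0.0


predicts = [
    {"source": "A", "target": "1"},
    {"source": "A", "target": "2"},
    {"source": "B", "target": "3"},
]
references = [
    {"source": "A", "target": "1"},
    {"source": "B", "target": "3"},
]

# Two of the three predictions are correct; both references are retrieved.
print(precision_sketch(predicts, references))
print(recall_sketch(predicts, references))
for beta in (0.5, 1, 2):
    print(beta, round(f_beta_sketch(predicts, references, beta), 3))
```

Because recall is higher than precision in this toy data, the score rises as beta grows, which matches the guidance above on choosing beta.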
evaluation_report — output keys
| Key | Type | Description |
|---|---|---|
| | int | Number of matched alignment pairs |
| | float | Precision × 100 |
| | float | Recall × 100 |
| | float | Fβ × 100 |
| | int | Total number of predictions supplied |
| | int | Total number of reference alignments |
Ranking Metrics
These metrics require a score field on each prediction and evaluate the ranking quality
of the predictions for each source concept.
Hit@K
The fraction of reference alignments whose correct target appears in the top-K predictions
for the same source (ranked by descending score). Returns 0 if k ≤ 0 or no references exist.
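The definition above can be illustrated with a small stand-alone function. This is a sketch under stated assumptions rather than the library's implementation: `hit_at_k_sketch` is a hypothetical name, each reference is evaluated independently, and ties in score keep their input order.

```python
from collections import defaultdict


def hit_at_k_sketch(predicts, references, k=1):
    # Mirrors the documented edge cases: k <= 0 or no references gives 0.
    if k <= 0 or not references:
        return 0.0

    # Group predictions by source concept.
    by_source = defaultdict(list)
    for p in predicts:
        by_source[p["source"]].append(p)

    hits = 0
    for ref in references:
        # Rank this source's candidates by descending score, keep the top k.
        ranked = sorted(
            by_source.get(ref["source"], []),
            key=lambda p: p["score"],
            reverse=True,
        )
        top_targets = [p["target"] for p in ranked[:k]]
        if ref["target"] in top_targets:
            hits += 1
    return hits / len(references)


predicts = [
    {"source": "A", "target": "2", "score": 0.9},
    {"source": "A", "target": "1", "score": 0.5},
    {"source": "B", "target": "3", "score": 0.8},
]
references = [
    {"source": "A", "target": "1"},
    {"source": "B", "target": "3"},
]

print(hit_at_k_sketch(predicts, references, k=1))  # 0.5: only B's top candidate is correct
print(hit_at_k_sketch(predicts, references, k=2))  # 1.0: A's correct target is ranked second
```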
Mean Reciprocal Rank (MRR)
The average reciprocal rank (1/rank) of the correct target in the score-ranked prediction list for each source.
A reference whose correct target is missing from the predictions contributes 0 to the sum.
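A minimal sketch of this computation follows. It is illustrative only: `mrr_sketch` is a hypothetical name, and the sketch assumes that the average is taken over the reference alignments, that predictions are ranked per source by descending score, and that an empty reference list yields 0.

```python
from collections import defaultdict


def mrr_sketch(predicts, references):
    # Assumption: with no references there is nothing to average, so return 0.
    if not references:
        return 0.0

    # Group predictions by source concept.
    by_source = defaultdict(list)
    for p in predicts:
        by_source[p["source"]].append(p)

    total = 0.0
    for ref in references:
        ranked = sorted(
            by_source.get(ref["source"], []),
            key=lambda p: p["score"],
            reverse=True,
        )
        # Add 1/rank of the correct target; add nothing if it is absent.
        for rank, p in enumerate(ranked, start=1):
            if p["target"] == ref["target"]:
                total += 1.0 / rank
                break
    return total / len(references)


predicts = [
    {"source": "A", "target": "2", "score": 0.9},
    {"source": "A", "target": "1", "score": 0.5},
    {"source": "B", "target": "3", "score": 0.8},
]
references = [
    {"source": "A", "target": "1"},  # found at rank 2, contributes 1/2
    {"source": "B", "target": "3"},  # found at rank 1, contributes 1
    {"source": "C", "target": "9"},  # no prediction, contributes 0
]

print(mrr_sketch(predicts, references))  # (0.5 + 1 + 0) / 3 = 0.5
```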
Example Usage
from ontoaligner.utils.metrics import evaluation_report, hit_at_k, mrr

# Predicted alignments; the "score" field is needed by the ranking metrics.
predicts = [
    {"source": "A", "target": "1", "score": 0.9},
    {"source": "A", "target": "2", "score": 0.5},
    {"source": "B", "target": "3", "score": 0.8},
]
# Reference (ground-truth) alignments.
references = [
    {"source": "A", "target": "1"},
    {"source": "B", "target": "3"},
]

# Set-based summary dictionary (values scaled to 0–100).
report = evaluation_report(predicts, references)
print(report)

# Ranking metrics.
print(f"Hit@1 : {hit_at_k(predicts, references, k=1):.2f}")
print(f"Hit@2 : {hit_at_k(predicts, references, k=2):.2f}")
print(f"MRR : {mrr(predicts, references):.2f}")
Tip
Use evaluation_report for a quick benchmark summary; add hit_at_k and mrr
when your matcher produces ranked candidate lists (e.g., retrieval-based or LLM-based aligners).