Metrics

The ontoaligner.utils.metrics module provides evaluation functions for ontology alignment tasks. Given a list of predicted alignments and a list of reference (ground-truth) alignments, it computes standard IR-style metrics as well as ranking-based measures.

Standard Metrics

These metrics treat alignments as sets of (source, target) pairs.

Function

Description

calculate_intersection

Count of unique (source, target) pairs present in both predictions and references.

precision_score

Fraction of predictions that are correct.

recall_score

Fraction of reference alignments that were retrieved.

f1_measurement

Weighted harmonic mean of precision and recall (Fβ).

evaluation_report

Returns all of the above in a single summary dictionary (values scaled to 0–100).


Precision

\[P = \frac{|\,\text{predicts} \cap \text{references}\,|}{|\,\text{predicts}\,|}\]

Returns 0 when the prediction list is empty.


Recall

\[R = \frac{|\,\text{predicts} \cap \text{references}\,|}{|\,\text{references}\,|}\]

Returns 0 when the reference list is empty.


Fβ-Score

\[F_\beta = \frac{(1 + \beta^2) \cdot P \cdot R}{\beta^2 \cdot P + R}\]

Set beta=1 (default) for the standard F1-measurement, beta=2 to weight recall more heavily, or beta=0.5 to favour precision.


evaluation_report — output keys

Key

Type

Description

intersection

int

Number of matched alignment pairs

precision

float

Precision × 100

recall

float

Recall × 100

f-score

float

Fβ × 100

predictions-len

int

Total number of predictions supplied

reference-len

int

Total number of reference alignments

Ranking Metrics

These metrics require a score field on each prediction and evaluate the ranking quality of the predictions for each source concept.


Hit@K

\[\text{Hit@K} = \frac{1}{|\text{references}|} \sum_{\text{ref}} \mathbf{1}\!\left[\text{ref.target} \in \text{top-}K(\text{ref.source})\right]\]

The fraction of reference alignments whose correct target appears in the top-K predictions for the same source (ranked by descending score). Returns 0 if k 0 or no references exist.


Mean Reciprocal Rank (MRR)

\[\text{MRR} = \frac{1}{|\text{references}|} \sum_{\text{ref}} \frac{1}{\text{rank}(\text{ref.target})}\]

The average reciprocal rank of the correct target in the prediction list for each source. Missing predictions contribute 0 to the sum.

Example Usage

from ontoaligner.utils.metrics import evaluation_report, hit_at_k, mrr

predicts = [
    {"source": "A", "target": "1", "score": 0.9},
    {"source": "A", "target": "2", "score": 0.5},
    {"source": "B", "target": "3", "score": 0.8},
]
references = [
    {"source": "A", "target": "1"},
    {"source": "B", "target": "3"},
]

report = evaluation_report(predicts, references)

print(f"Hit@1 : {hit_at_k(predicts, references, k=1):.2f}")
print(f"Hit@2 : {hit_at_k(predicts, references, k=2):.2f}")
print(f"MRR   : {mrr(predicts, references):.2f}")

Tip

Use evaluation_report for a quick benchmark summary; add hit_at_k and mrr when your matcher produces ranked candidate lists (e.g., retrieval-based or LLM-based aligners).