Utils

Metrics

This module defines functions for counting the overlap between predicted and reference data and for calculating evaluation metrics such as precision, recall, and F-score.

Its two main functions, alongside the individual metric functions documented in this section, are:

- calculate_intersection: Computes the number of matching items between the predicted and reference data.
- evaluation_report: Calculates precision, recall, and F-score based on the intersection of predicted and reference data.

ontoaligner.utils.metrics.calculate_intersection(predicts: List, references: List) → int

Calculate Matching Items Between Predicted and Reference Data:

This function compares the predicted data with the reference data and determines the number of matching items. A match is identified when both the source and target fields are identical in a predicted-reference pair.

Duplicate predictions are counted once: only unique matches contribute to the result.

Parameters:

`predicts` (list of dict):
A list of predicted entries, where each entry is a dictionary containing:
- source (any): The source element of the prediction.
- target (any): The target element of the prediction.
- score (float): An optional confidence or relevance score for the prediction.
`references` (list of dict):
A list of reference entries, where each entry is a dictionary containing:
- source (any): The source element of the reference.
- target (any): The target element of the reference.
- relation (any): The relationship between the source and target in the reference.

Returns:

`intersection` (int):
The count of unique (source, target) pairs that appear in both predictions and references.
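The matching rule described above can be sketched with plain set operations. This is a minimal illustration rather than the library's implementation: the helper name `count_unique_matches` is hypothetical, and the sketch assumes that source and target values are hashable.

```python
from typing import Dict, List


def count_unique_matches(predicts: List[Dict], references: List[Dict]) -> int:
    """Count unique (source, target) pairs present in both lists.

    Hypothetical helper for illustration; assumes hashable source/target values.
    """
    predicted_pairs = {(p["source"], p["target"]) for p in predicts}
    reference_pairs = {(r["source"], r["target"]) for r in references}
    return len(predicted_pairs & reference_pairs)


predicts = [
    {"source": "a1", "target": "b1", "score": 0.9},
    {"source": "a1", "target": "b1", "score": 0.7},  # duplicate prediction
    {"source": "a2", "target": "b9", "score": 0.6},  # wrong target
]
references = [
    {"source": "a1", "target": "b1", "relation": "="},
    {"source": "a2", "target": "b2", "relation": "="},
]

matches = count_unique_matches(predicts, references)
print(matches)  # 1: the duplicated (a1, b1) pair is counted once
```

Building sets first is what makes the duplicate prediction harmless: both copies collapse into one pair before the comparison.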

ontoaligner.utils.metrics.evaluation_report(predicts: List, references: List, beta: int = 1) → Dict

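Calculates precision, recall, and F-score based on the intersection of predicted and reference data.

Parameters:

`predicts` (list of dict): A list of predicted entries.
`references` (list of dict): A list of reference entries.
`beta` (int): The weight of recall in the F-score (default is 1 for F1-score).

Returns:

A dictionary (Dict) containing the calculated evaluation metrics.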

ontoaligner.utils.metrics.f1_measurement(predicts: List, references: List, beta: int = 1) → float

Calculate F-Score:

The F-score is a weighted harmonic mean of precision and recall, where β determines the balance between them. It is calculated as: $$F_\beta = \frac{(1 + \beta^2) \cdot P \cdot R}{(\beta^2 \cdot P) + R}$$

Parameters:

`predicts` (list of dict): A list of predicted entries.
`references` (list of dict): A list of reference entries.
`beta` (int): The weight of recall in the combined score (default is 1 for F1-score).

Returns:

`f_score` (float): The calculated F-score with the specified β.
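To make the effect of β concrete, the formula can be evaluated directly from a precision and recall value. The sketch below is illustrative only: `f_beta` is a hypothetical helper that takes P and R as numbers, unlike `f1_measurement`, which takes the prediction and reference lists. Returning 0.0 when the denominator is zero is an assumption of this sketch.

```python
def f_beta(precision: float, recall: float, beta: int = 1) -> float:
    """Weighted harmonic mean of precision and recall (hypothetical helper)."""
    denominator = (beta ** 2) * precision + recall
    if denominator == 0:
        return 0.0  # assumption: define the score as 0 when P and R are both 0
    return (1 + beta ** 2) * precision * recall / denominator


f1 = f_beta(0.5, 1.0)          # beta = 1 weighs precision and recall equally
f2 = f_beta(0.5, 1.0, beta=2)  # beta = 2 gives recall more weight

print(round(f1, 3))  # 0.667
print(round(f2, 3))  # 0.833
```

With recall higher than precision, raising β pulls the score toward recall, which is why the second value is larger.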

ontoaligner.utils.metrics.hit_at_k(predicts: List, references: List, k: int = 1) → float

Compute Hit@K: fraction of references where the correct target appears in the top-K predicted targets for the same source.

Assumes:
- predicts: [{"source": str, "target": str, "score": float (optional)}]
- references: [{"source": str, "target": str, …}]
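One way to read the Hit@K definition is sketched below. It is not the library's code: `hit_at_k_sketch` is a hypothetical name, and the sketch assumes candidates are ranked by descending score, that a missing score is treated as 0.0, and that an empty reference list yields 0.0.

```python
from collections import defaultdict
from typing import Dict, List


def hit_at_k_sketch(predicts: List[Dict], references: List[Dict], k: int = 1) -> float:
    """Fraction of references whose target is among the top-k candidates."""
    if not references:
        return 0.0  # assumption: no references means no hits
    # Group candidate predictions by their source element.
    by_source = defaultdict(list)
    for p in predicts:
        by_source[p["source"]].append(p)
    hits = 0
    for ref in references:
        candidates = by_source.get(ref["source"], [])
        # Assumption: rank by score, highest first; a missing score counts as 0.0.
        ranked = sorted(candidates, key=lambda p: p.get("score", 0.0), reverse=True)
        if ref["target"] in [p["target"] for p in ranked[:k]]:
            hits += 1
    return hits / len(references)


predicts = [
    {"source": "a1", "target": "b2", "score": 0.9},
    {"source": "a1", "target": "b1", "score": 0.8},
    {"source": "a2", "target": "b2", "score": 0.7},
]
references = [
    {"source": "a1", "target": "b1"},
    {"source": "a2", "target": "b2"},
]

hit1 = hit_at_k_sketch(predicts, references, k=1)
hit2 = hit_at_k_sketch(predicts, references, k=2)
print(hit1, hit2)  # 0.5 1.0
```

For source a1 the correct target b1 is ranked second, so it counts as a hit only once k reaches 2.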

ontoaligner.utils.metrics.mrr(predicts: List, references: List) → float

Mean Reciprocal Rank (MRR): average of reciprocal ranks for correct targets per source.

For each reference (source, target), sort predicted candidates for that source by score descending, find the rank of the correct target (1-based). If missing, contributes 0.
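The ranking procedure above can be sketched as follows. This is an illustrative reading, not the library's implementation: `mrr_sketch` is a hypothetical name, and treating a missing score as 0.0 and an empty reference list as 0.0 are assumptions of the sketch.

```python
from collections import defaultdict
from typing import Dict, List


def mrr_sketch(predicts: List[Dict], references: List[Dict]) -> float:
    """Average of 1/rank of the correct target; 0 when the target is missing."""
    if not references:
        return 0.0  # assumption: empty references give 0
    by_source = defaultdict(list)
    for p in predicts:
        by_source[p["source"]].append(p)
    total = 0.0
    for ref in references:
        candidates = by_source.get(ref["source"], [])
        # Sort candidates for this source by score, highest first.
        ranked = sorted(candidates, key=lambda p: p.get("score", 0.0), reverse=True)
        for rank, p in enumerate(ranked, start=1):  # ranks are 1-based
            if p["target"] == ref["target"]:
                total += 1.0 / rank
                break
    return total / len(references)


predicts = [
    {"source": "a1", "target": "b2", "score": 0.9},
    {"source": "a1", "target": "b1", "score": 0.8},
    {"source": "a2", "target": "b2", "score": 0.7},
]
references = [
    {"source": "a1", "target": "b1"},  # found at rank 2, contributes 0.5
    {"source": "a2", "target": "b2"},  # found at rank 1, contributes 1.0
    {"source": "a3", "target": "b3"},  # never predicted, contributes 0
]

score = mrr_sketch(predicts, references)
print(score)  # 0.5
```

The three contributions sum to 1.5, and dividing by the three references gives 0.5.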

ontoaligner.utils.metrics.precision_score(predicts: List, references: List) → float

Calculate Precision:

Precision is the proportion of predicted items that are correct. It is calculated as: $$P = \frac{|\text{intersection of predicts and references}|}{|\text{predicts}|}$$

Parameters:

`predicts` (list of dict): A list of predicted entries.
`references` (list of dict): A list of reference entries.

Returns:

`precision` (float): The calculated precision score.
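As a small worked example of the precision formula, the sketch below counts matching (source, target) pairs and divides by the number of predictions. `precision_sketch` is a hypothetical helper, not the library function, and returning 0.0 for an empty prediction list is an assumption.

```python
from typing import Dict, List


def precision_sketch(predicts: List[Dict], references: List[Dict]) -> float:
    """Share of predictions whose (source, target) pair is in the references."""
    if not predicts:
        return 0.0  # assumption: avoid division by zero for empty predictions
    predicted_pairs = {(p["source"], p["target"]) for p in predicts}
    reference_pairs = {(r["source"], r["target"]) for r in references}
    return len(predicted_pairs & reference_pairs) / len(predicts)


predicts = [
    {"source": "a1", "target": "b1"},
    {"source": "a2", "target": "b2"},
    {"source": "a3", "target": "b9"},  # incorrect target
    {"source": "a4", "target": "b4"},
]
references = [
    {"source": "a1", "target": "b1"},
    {"source": "a2", "target": "b2"},
    {"source": "a3", "target": "b3"},
    {"source": "a4", "target": "b4"},
    {"source": "a5", "target": "b5"},
]

precision = precision_sketch(predicts, references)
print(precision)  # 0.75: three of the four predictions are correct
```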

ontoaligner.utils.metrics.recall_score(predicts: List, references: List) → float

Calculate Recall:

Recall is the proportion of reference items that are successfully predicted. It is calculated as: $$R = \frac{|\text{intersection of predicts and references}|}{|\text{references}|}$$

Parameters:

`predicts` (list of dict): A list of predicted entries.
`references` (list of dict): A list of reference entries.

Returns:

`recall` (float): The calculated recall score.
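Recall uses the same numerator as precision but divides by the number of references, so a cautious matcher can be fully correct and still score low. The sketch below illustrates this; `recall_sketch` is a hypothetical helper, and returning 0.0 for an empty reference list is an assumption.

```python
from typing import Dict, List


def recall_sketch(predicts: List[Dict], references: List[Dict]) -> float:
    """Share of reference pairs that were recovered by the predictions."""
    if not references:
        return 0.0  # assumption: avoid division by zero for empty references
    predicted_pairs = {(p["source"], p["target"]) for p in predicts}
    reference_pairs = {(r["source"], r["target"]) for r in references}
    return len(predicted_pairs & reference_pairs) / len(references)


# Only two predictions, both of them correct.
predicts = [
    {"source": "a1", "target": "b1"},
    {"source": "a2", "target": "b2"},
]
references = [
    {"source": "a1", "target": "b1"},
    {"source": "a2", "target": "b2"},
    {"source": "a3", "target": "b3"},
    {"source": "a4", "target": "b4"},
]

recall = recall_sketch(predicts, references)
print(recall)  # 0.5: two of the four reference pairs were found
```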

XMLify

This module provides functionality to generate XML alignment files compliant with the Alignment API. It is useful for representing ontology matching results in a standardized XML format.

ontoaligner.utils.xmlify.xml_alignment_generator(matchings: List[Dict], return_rdf: bool = False, relation: str = '=', digits: int = -2) → Any

Generates XML representing ontology matching results in RDF format.

Parameters:

matchings (List[dict]) – A list of dictionaries representing matching pairs, where each dictionary contains:
- 'source' (str): URI of the source entity.
- 'target' (str): URI of the target entity.
- 'score' (float): Confidence score of the mapping.
return_rdf (bool) – If True, the RDF element is returned instead of a prettified XML string (default is False).
relation (str) – The default relation to be used between source and target if not provided in the input (default is "=").
digits (int) – Controls the rounding of the confidence score. The default value of -2 rounds to two decimal places.

Returns:

A prettified XML string representing the ontology matchings, or an RDF element if return_rdf is True.

Return type:

Any
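For orientation, the sketch below builds a comparable alignment document with the Python standard library. It is not the code behind `xml_alignment_generator`: the function name `alignment_xml_sketch` is hypothetical, the element layout (`Alignment`, `map`, `Cell`, `entity1`, `entity2`, `relation`, `measure`) is a minimal subset of what Alignment API files commonly contain with header elements omitted, the namespace URI is the one commonly seen in such files, and `digits` here is a plain number of decimal places, which differs from the -2 convention documented above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List
from xml.dom import minidom

# Assumed namespaces, as commonly seen in Alignment API files.
ALIGN_NS = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def alignment_xml_sketch(matchings: List[Dict], relation: str = "=", digits: int = 2) -> str:
    """Return a prettified RDF/XML string with one Cell per matching."""
    rdf = ET.Element("rdf:RDF", {"xmlns": ALIGN_NS, "xmlns:rdf": RDF_NS})
    alignment = ET.SubElement(rdf, "Alignment")
    for m in matchings:
        cell = ET.SubElement(ET.SubElement(alignment, "map"), "Cell")
        ET.SubElement(cell, "entity1", {"rdf:resource": m["source"]})
        ET.SubElement(cell, "entity2", {"rdf:resource": m["target"]})
        # Fall back to the default relation when the matching has none.
        ET.SubElement(cell, "relation").text = m.get("relation", relation)
        # Hypothetical convention: digits is the number of decimal places.
        ET.SubElement(cell, "measure").text = str(round(float(m["score"]), digits))
    raw = ET.tostring(rdf, encoding="unicode")
    return minidom.parseString(raw).toprettyxml(indent="  ")


matchings = [
    {
        "source": "http://example.org/onto1#Heart",
        "target": "http://example.org/onto2#Heart",
        "score": 0.8765,
    }
]
xml_text = alignment_xml_sketch(matchings)
print(xml_text)
```

The example URIs are placeholders. Parsing the result back with `ET.fromstring` is a quick way to confirm that the output is well-formed.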