Base

Base Dataset Classes

The script is responsible for loading and collecting data related to source and target ontologies, along with reference alignments. It provides methods for collecting data, loading from JSON, and handling file paths.

Classes:
  • OMDataset: A base class for handling ontology matching datasets, including parsing ontologies and alignments and collecting dataset-related information.

class ontoaligner.base.dataset.OMDataset(**kwargs)[source]

Bases: ABC

A base class for managing ontology matching datasets, including the source and target ontologies, and the reference alignments.

This class is responsible for collecting ontology data, parsing ontologies, and handling file paths associated with the dataset. It provides methods for data collection, loading data from JSON files, and retrieving directory paths.

track

The dataset track name.

Type:

str

ontology_name

The name of the ontology being processed.

Type:

str

source_ontology

The source ontology object.

Type:

Any

target_ontology

The target ontology object.

Type:

Any

alignments

The alignments parser object, using BaseAlignmentsParser.

Type:

Any

alignments: Any = <ontoaligner.base.ontology.BaseAlignmentsParser object>
collect(source_ontology_path: str, target_ontology_path: str, reference_matching_path: str = '') Dict

Collects data from the source ontology, target ontology, and reference alignments.

This method takes paths to the source ontology, target ontology, and reference alignments files, parses them, and returns a dictionary containing the dataset information, source, target, and reference alignments data.

Parameters:
  • source_ontology_path (str) – The file path to the source ontology.

  • target_ontology_path (str) – The file path to the target ontology.

  • reference_matching_path (str) – The file path to the reference matching alignments.

Returns:

A dictionary containing the dataset information, parsed source and target ontologies, and parsed reference alignments.

Return type:

Dict
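The shape of the returned dictionary can be illustrated with a minimal, self-contained sketch. It does not import the library: StubParser and ToyDataset are hypothetical stand-ins, and the key names ("dataset-info", "source", "target", "reference") are assumptions chosen for the example rather than a confirmed contract.

```python
from typing import Any, Dict, List


class StubParser:
    """Hypothetical parser: returns a canned result instead of reading a file."""

    def __init__(self, result: List[Dict[str, Any]]):
        self.result = result

    def parse(self, input_file_path: str = "") -> List[Dict[str, Any]]:
        return self.result


class ToyDataset:
    """Stand-in carrying the documented attributes; not the library class."""

    track = "toy-track"
    ontology_name = "left-right"
    source_ontology = StubParser([{"iri": "http://example.org/a#Heart"}])
    target_ontology = StubParser([{"iri": "http://example.org/b#heart"}])
    alignments = StubParser([])

    def collect(
        self,
        source_ontology_path: str,
        target_ontology_path: str,
        reference_matching_path: str = "",
    ) -> Dict[str, Any]:
        # Key names are assumptions made for this example.
        return {
            "dataset-info": {"track": self.track, "ontology-name": self.ontology_name},
            "source": self.source_ontology.parse(input_file_path=source_ontology_path),
            "target": self.target_ontology.parse(input_file_path=target_ontology_path),
            "reference": self.alignments.parse(input_file_path=reference_matching_path),
        }


data = ToyDataset().collect("source.owl", "target.owl")
```

Delegating each path to its own parser keeps the dataset class free of format-specific logic, which is presumably why the parsers are exposed as attributes.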

load_from_json(json_file_path: str) Dict

Loads dataset information from a JSON file.

This method loads the dataset’s information from the JSON file at the given path.

Parameters:

json_file_path (str) – The path to the dataset’s JSON file.

Returns:

The JSON data loaded from the specified file.

Return type:

Dict
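As a rough illustration, loading such a file needs little more than the standard json module. The function below is a standalone sketch under that assumption, not the library's implementation, and the payload keys are made up for the round trip.

```python
import json
import os
import tempfile
from typing import Any, Dict


def load_from_json(json_file_path: str) -> Dict[str, Any]:
    """Read a JSON file and return its content as a dictionary."""
    with open(json_file_path, encoding="utf-8") as handle:
        return json.load(handle)


# Round trip through a temporary file so the example runs anywhere.
payload = {"dataset-info": {"track": "toy-track"}, "source": [], "target": [], "reference": []}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    loaded = load_from_json(path)
```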

ontology_name: str = ''
source_ontology: Any = None
target_ontology: Any = None
track: str = ''

Base Encoder Classes

This script provides a foundation for flexible text encoding, including text preprocessing, customizable prompt templates, and structured methods for encoding and retrieving encoder-specific details. It ensures a consistent interface and behavior for text encoding tasks.

Classes:
  • BaseEncoder: An abstract base class for encoders, providing text preprocessing, a template for prompts, and methods for encoding data and obtaining encoder information.

class ontoaligner.base.encoder.BaseEncoder[source]

Bases: ABC

An abstract base class for encoders. It implements text preprocessing and serves as a blueprint for concrete encoders, which supply the encoding logic and the retrieval of encoder-related information.

prompt_template

A string template used in prompting for encoding tasks.

Type:

str

items_in_owl

A string that defines the items in the ontology used by the encoder.

Type:

str

abstract get_encoder_info() str

An abstract method for retrieving encoder-specific information. Subclasses must implement this method.

This method is intended to be overridden by subclasses to return relevant information about the encoder.

Returns:

Information about the encoder (e.g., type, configuration).

Return type:

str

items_in_owl: str = ''
abstract parse(**kwargs) Any

An abstract method for parsing input data. Subclasses must implement this method.

This method is intended to be overridden by subclasses to define how to parse input data for encoding.

Parameters:

**kwargs – The keyword arguments passed to the method for parsing.

Returns:

The parsed data in a format defined by the subclass.

Return type:

Any
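The two abstract methods can be illustrated with a self-contained sketch. EncoderBase below is a stand-in that mirrors the documented interface rather than the library class, and LabelEncoder, its "source" keyword, and the "(label)" value are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class EncoderBase(ABC):
    """Stand-in mirroring the documented interface; not the library class."""

    prompt_template: str = ""
    items_in_owl: str = ""

    @abstractmethod
    def parse(self, **kwargs) -> Any: ...

    @abstractmethod
    def get_encoder_info(self) -> str: ...


class LabelEncoder(EncoderBase):
    """Hypothetical encoder that keeps only the label of each class."""

    items_in_owl = "(label)"

    def parse(self, **kwargs) -> List[str]:
        return [item["label"] for item in kwargs.get("source", [])]

    def get_encoder_info(self) -> str:
        return "LabelEncoder uses " + self.items_in_owl


encoder = LabelEncoder()
labels = encoder.parse(source=[{"label": "heart"}, {"label": "lung"}])

try:
    EncoderBase()  # abstract methods are missing, so this raises TypeError
    base_is_instantiable = True
except TypeError:
    base_is_instantiable = False
```

Because both methods are abstract, a subclass that forgets either one fails at instantiation time rather than later during encoding.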

preprocess(text: str) str

Preprocesses input text by replacing underscores with spaces and converting the text to lowercase.

This method is used to standardize the format of input text before processing it further for encoding.

Parameters:

text (str) – The input text that needs preprocessing.

Returns:

The preprocessed text with underscores replaced by spaces and all characters in lowercase.

Return type:

str
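The described behavior amounts to two string operations. A standalone sketch, written as a free function rather than a method and not taken from the library source:

```python
def preprocess(text: str) -> str:
    """Replace underscores with spaces, then lowercase the result."""
    return text.replace("_", " ").lower()


cleaned = preprocess("Left_Ventricle_Wall")
```

Here cleaned is "left ventricle wall". Identifiers in OWL files often use underscores or mixed case, so normalizing them this way makes labels from different ontologies easier to compare.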

prompt_template: str = ''

Base Model Classes

Defines a blueprint for ontology matching models, specifying methods for string representation and data generation that must be implemented by subclasses. The script ensures consistency and structure for building specialized models in the ontology matching domain.

Classes:
  • BaseOMModel: An abstract base class for ontology matching models, which defines methods for string representation and data generation.

class ontoaligner.base.model.BaseOMModel(**kwargs)[source]

Bases: ABC

An abstract base class for ontology matching models. This class defines methods for string representation and output generation, which must be implemented by subclasses.

Initializes the ontology matching model with optional keyword arguments.

Parameters:

**kwargs – Additional keyword arguments that may be used for model configuration or parameters.

abstract generate(input_data: List) List

Generates output based on the input data. This method must be implemented by subclasses.

Parameters:

input_data (List) – A list of data that will be processed by the model to generate output.

Returns:

A list containing the generated output based on the input data. The specific content of the output will depend on the model’s functionality.

Return type:

List
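A subclass might look like the following sketch. ModelBase is a stand-in for the documented base class, and ExactLabelMatcher, the two-element input layout, and the output keys are all hypothetical choices made for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class ModelBase(ABC):
    """Stand-in mirroring the documented interface; not the library class."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs  # kept for model configuration

    @abstractmethod
    def __str__(self) -> str: ...

    @abstractmethod
    def generate(self, input_data: List) -> List: ...


class ExactLabelMatcher(ModelBase):
    """Hypothetical matcher: pairs entities whose labels are identical."""

    def __str__(self) -> str:
        return "ExactLabelMatcher"

    def generate(self, input_data: List) -> List:
        # Assumed layout: input_data = [source_entities, target_entities].
        source, target = input_data
        by_label = {entity["label"]: entity["iri"] for entity in target}
        return [
            {"source": entity["iri"], "target": by_label[entity["label"]]}
            for entity in source
            if entity["label"] in by_label
        ]


source = [{"iri": "a#Heart", "label": "heart"}, {"iri": "a#Lung", "label": "lung"}]
target = [{"iri": "b#heart", "label": "heart"}]
matches = ExactLabelMatcher(threshold=1.0).generate([source, target])
```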

Base Ontology Classes

This script provides functionality for parsing ontologies and alignment files. It includes methods for extracting data from OWL ontologies, such as names, labels, and relationships, as well as parsing alignment data in RDF format to extract relationships between entities and their corresponding data.

Classes:
  • BaseOntologyParser: A base class for parsing OWL ontologies, extracting information such as names, labels, parents, children, synonyms, and comments.

  • BaseAlignmentsParser: A base class for parsing alignment data, extracting relationships between entities and their corresponding RDF data.

class ontoaligner.base.ontology.BaseAlignmentsParser[source]

Bases: ABC

An abstract base class for parsing RDF alignment data. This class provides methods for extracting the aligned entity pairs and the relation that holds between them.

entity_1: URIRef = rdflib.term.URIRef('http://knowledgeweb.semanticweb.org/heterogeneity/alignmententity1')
entity_2: URIRef = rdflib.term.URIRef('http://knowledgeweb.semanticweb.org/heterogeneity/alignmententity2')
extract_data(reference: Any) List[Dict]

Extracts alignment data from an RDF graph, processing relationships between entities.

Parameters:

reference (Any) – RDF reference containing the alignment data.

Returns:

A list of dictionaries representing entity relationships in the alignment data.

Return type:

List

load_ontology(input_file_path: str) Any

Loads an RDF alignment file from the specified file path.

Parameters:

input_file_path (str) – The file path of the RDF alignment file.

Returns:

The loaded RDF alignment data.

Return type:

Any

namespace: Namespace = Namespace('http://knowledgeweb.semanticweb.org/heterogeneity/alignment')
parse(input_file_path: str = '') List

Loads and processes the RDF alignment file, extracting relevant data.

Parameters:

input_file_path (str) – The file path of the RDF alignment file.

Returns:

A list of extracted alignment data.

Return type:

List
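To make the load-then-extract flow concrete, here is a rough sketch that reads entity pairs from an alignment document. The attributes above indicate that the class works with rdflib; this sketch swaps in the standard-library xml.etree.ElementTree so that it runs without extra dependencies. It assumes the file uses the common RDF/XML layout with Cell elements, and the sample data, the function name extract_cells, and the output keys are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

ALIGN_NS = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Made-up sample in the assumed alignment layout.
SAMPLE = """<rdf:RDF xmlns="http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Alignment>
    <map>
      <Cell>
        <entity1 rdf:resource="http://example.org/source#Heart"/>
        <entity2 rdf:resource="http://example.org/target#heart"/>
        <relation>=</relation>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>
"""


def extract_cells(xml_text: str) -> List[Dict[str, str]]:
    """Return one dictionary per Cell: source IRI, target IRI, relation."""
    root = ET.fromstring(xml_text)
    resource = f"{{{RDF_NS}}}resource"  # ElementTree spells names as {namespace}local
    rows = []
    for cell in root.iter(f"{{{ALIGN_NS}}}Cell"):
        entity_1 = cell.find(f"{{{ALIGN_NS}}}entity1")
        entity_2 = cell.find(f"{{{ALIGN_NS}}}entity2")
        relation = cell.find(f"{{{ALIGN_NS}}}relation")
        rows.append(
            {
                "source": entity_1.get(resource),
                "target": entity_2.get(resource),
                "relation": relation.text,
            }
        )
    return rows


rows = extract_cells(SAMPLE)
```

The namespace declared in the sample has no trailing separator, which is consistent with the entity_1, entity_2, and relation attribute values being the namespace string followed directly by the local name.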

relation: URIRef = rdflib.term.URIRef('http://knowledgeweb.semanticweb.org/heterogeneity/alignmentrelation')
class ontoaligner.base.ontology.BaseOntologyParser[source]

Bases: ABC

An abstract base class for parsing OWL ontologies. This class defines methods to extract data such as names, labels, IRIs, children, parents, synonyms, and comments for ontology classes.

duplicate_removals(owl_class_info: Dict) Dict

Removes duplicate ontology class information based on IRI.

Parameters:

owl_class_info (Dict) – A dictionary containing information about an ontology class.

Returns:

A dictionary with duplicates removed from the class information.

Return type:

Dict
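One plausible reading of "based on IRI" is sketched below: within each list of related classes, drop entries whose IRI repeats or equals the class's own IRI. This interpretation, the function name, and the record layout are assumptions, not the library's confirmed behavior.

```python
from typing import Any, Dict, List


def drop_duplicates_by_iri(owl_class_info: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy whose lists of {'iri': ...} entries contain each IRI once."""
    own_iri = owl_class_info["iri"]
    cleaned = dict(owl_class_info)
    for key, value in owl_class_info.items():
        # Only touch lists of dictionaries; plain strings and lists of strings stay as-is.
        if not (isinstance(value, list) and all(isinstance(v, dict) for v in value)):
            continue
        seen = {own_iri}  # a class should not list itself as a relative
        kept: List[Dict[str, Any]] = []
        for item in value:
            iri = item.get("iri")
            if iri in seen:
                continue
            seen.add(iri)
            kept.append(item)
        cleaned[key] = kept
    return cleaned


info = {
    "iri": "ex#A",
    "label": "a",
    "parents": [{"iri": "ex#B"}, {"iri": "ex#B"}, {"iri": "ex#A"}],
    "comments": ["x", "x"],
}
deduplicated = drop_duplicates_by_iri(info)
```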

extract_data(ontology: Any) List[Dict]

Extracts and processes data from the given ontology, including children, parents, synonyms, and comments.

Parameters:

ontology (Any) – An ontology.

Returns:

A list of dictionaries containing extracted ontology class data.

Return type:

List

get_childrens(owl_class: Any) List

Retrieves the subclasses (children) of the given ontology class.

Parameters:

owl_class (Any) – An ontology class.

Returns:

A list of subclasses (children) of the ontology class.

Return type:

List

get_comments(owl_class: Any) List

Abstract method to retrieve comments for the given ontology class.

Parameters:

owl_class (Any) – An ontology class.

Returns:

A list of comments associated with the ontology class.

Return type:

List

get_iri(owl_class: Any) str

Retrieves the IRI of the given ontology class.

Parameters:

owl_class (Any) – An ontology class.

Returns:

The IRI of the ontology class.

Return type:

str

get_label(owl_class: Any) str

Retrieves the label of the given ontology class.

Parameters:

owl_class (Any) – An ontology class.

Returns:

The label of the ontology class.

Return type:

str

get_name(owl_class: Any) str

Retrieves the name of the given ontology class.

Parameters:

owl_class (Any) – An ontology class.

Returns:

The name of the ontology class.

Return type:

str

get_owl_classes(ontology: Any) Any

Retrieves all classes from the given ontology.

Parameters:

ontology (Any) – An ontology.

Returns:

A collection of all classes in the ontology.

Return type:

Any

get_owl_items(owl_class: Any) List

Extracts relevant items from the given ontology class, including IRI, name, and label.

Parameters:

owl_class (Any) – An ontology class.

Returns:

A list of dictionaries containing IRI, name, and label of relevant ontology items.

Return type:

List

get_parents(owl_class: Any) List

Retrieves the superclasses (parents) of the given ontology class.

Parameters:

owl_class (Any) – An ontology class.

Returns:

A list of superclasses (parents) of the ontology class.

Return type:

List

get_synonyms(owl_class: Any) List

Retrieves the synonyms of the given ontology class.

Parameters:

owl_class (Any) – An ontology class.

Returns:

A list of synonyms of the ontology class.

Return type:

List

is_contain_label(owl_class: Any) bool

Checks if the given ontology class contains a label.

Parameters:

owl_class (Any) – An ontology class.

Returns:

True if the class contains a label, False otherwise.

Return type:

bool

load_ontology(input_file_path: str) Any

Loads an ontology from the specified file path.

Parameters:

input_file_path (str) – The file path of the ontology.

Returns:

The loaded ontology.

Return type:

Any

parse(input_file_path: str) List[Dict[str, Any]]

Loads and processes the ontology, extracting relevant data.

Parameters:

input_file_path (str) – The file path of the ontology.

Returns:

A list of extracted ontology data.

Return type:

List
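The way these methods fit together can be sketched with a toy parser over an in-memory "ontology". Everything here is illustrative: ToyOntologyParser does not subclass the library class, the ontology is a plain dictionary instead of an OWL file, the record keys are assumptions, and skipping unlabeled classes is a guess at how is_contain_label is used.

```python
from typing import Any, Dict, List, Tuple

# Made-up ontology: IRI -> label and parent IRIs.
TOY_ONTOLOGY = {
    "http://example.org/onto#Organ": {"label": "Organ", "parents": []},
    "http://example.org/onto#Heart": {
        "label": "Heart",
        "parents": ["http://example.org/onto#Organ"],
    },
    "http://example.org/onto#Unnamed": {"label": "", "parents": []},
}

OwlClass = Tuple[str, Dict[str, Any]]


class ToyOntologyParser:
    def load_ontology(self, input_file_path: str) -> Any:
        return TOY_ONTOLOGY  # a real parser would read input_file_path

    def get_owl_classes(self, ontology: Any) -> List[OwlClass]:
        return list(ontology.items())

    def get_iri(self, owl_class: OwlClass) -> str:
        return owl_class[0]

    def get_name(self, owl_class: OwlClass) -> str:
        return owl_class[0].rsplit("#", 1)[-1]

    def get_label(self, owl_class: OwlClass) -> str:
        return owl_class[1]["label"]

    def is_contain_label(self, owl_class: OwlClass) -> bool:
        return bool(owl_class[1]["label"])

    def get_parents(self, owl_class: OwlClass) -> List[Dict[str, str]]:
        return [{"iri": parent} for parent in owl_class[1]["parents"]]

    def extract_data(self, ontology: Any) -> List[Dict[str, Any]]:
        return [
            {
                "iri": self.get_iri(owl_class),
                "name": self.get_name(owl_class),
                "label": self.get_label(owl_class),
                "parents": self.get_parents(owl_class),
            }
            for owl_class in self.get_owl_classes(ontology)
            if self.is_contain_label(owl_class)  # assumption: unlabeled classes are skipped
        ]

    def parse(self, input_file_path: str) -> List[Dict[str, Any]]:
        return self.extract_data(self.load_ontology(input_file_path))


records = ToyOntologyParser().parse("toy.owl")
```

Splitting the work into small getters means a concrete parser for a differently structured ontology only needs to override the accessors that differ, while parse and extract_data stay unchanged.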