Base
Base Dataset Classes
The script is responsible for loading and collecting data related to source and target ontologies, along with reference alignments. It provides methods for collecting data, loading from JSON, and handling file paths.
- Classes:
- OMDataset: A base class for handling ontology matching datasets, including parsing ontologies and alignments
and collecting dataset-related information.
- class ontoaligner.base.dataset.OMDataset(**kwargs)
Bases: ABC
A base class for managing ontology matching datasets, including the source and target ontologies and the reference alignments.
This class is responsible for collecting ontology data, parsing ontologies, and handling file paths associated with the dataset. It provides methods for data collection, loading data from JSON files, and retrieving directory paths.
- track
The dataset track name.
- Type:
str
- ontology_name
The name of the ontology being processed.
- Type:
str
- source_ontology
The source ontology object.
- Type:
Any
- target_ontology
The target ontology object.
- Type:
Any
- alignments
The alignments parser object, using BaseAlignmentsParser.
- Type:
Any
- alignments: Any = <ontoaligner.base.ontology.BaseAlignmentsParser object>
- collect(source_ontology_path: str, target_ontology_path: str, reference_matching_path: str = '') -> Dict
Collects data from the source ontology, target ontology, and reference alignments.
This method takes paths to the source ontology, target ontology, and reference alignments files, parses them, and returns a dictionary containing the dataset information, source, target, and reference alignments data.
- Parameters:
source_ontology_path (str) – The file path to the source ontology.
target_ontology_path (str) – The file path to the target ontology.
reference_matching_path (str) – The file path to the reference matching alignments.
- Returns:
- A dictionary containing the dataset information, parsed source and target ontologies,
and parsed reference alignments.
- Return type:
Dict
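The flow described for collect can be sketched with stand-in classes rather than the library's own: each parser's parse method is called on its path and the results are gathered into one dictionary. StubParser, SketchDataset, the attribute values, and the dictionary keys are all assumptions made for illustration; the keys that the real collect returns are not stated in this reference.

```python
from typing import Any, Dict, List


class StubParser:
    """Stand-in parser: returns one record naming the file it was given."""

    def parse(self, input_file_path: str = "") -> List[Dict[str, Any]]:
        # A real parser would read the file; this stub only echoes the path.
        return [{"path": input_file_path}] if input_file_path else []


class SketchDataset:
    """Stand-in that mirrors the documented attributes of a dataset class."""

    track: str = "demo-track"        # hypothetical track name
    ontology_name: str = "src-tgt"   # hypothetical ontology pair name
    source_ontology: Any = StubParser()
    target_ontology: Any = StubParser()
    alignments: Any = StubParser()

    def collect(
        self,
        source_ontology_path: str,
        target_ontology_path: str,
        reference_matching_path: str = "",
    ) -> Dict[str, Any]:
        # Hypothetical key names; only the four parts come from the description.
        return {
            "dataset-info": {"track": self.track, "ontology-name": self.ontology_name},
            "source": self.source_ontology.parse(source_ontology_path),
            "target": self.target_ontology.parse(target_ontology_path),
            "reference": self.alignments.parse(reference_matching_path),
        }


data = SketchDataset().collect("source.owl", "target.owl")
print(sorted(data))
```

Because reference_matching_path defaults to an empty string, the sketch returns an empty reference list when no alignment file is supplied.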
- load_from_json(json_file_path: str) -> Dict
Loads dataset information from a JSON file.
This method loads the dataset’s information from the JSON file at the given path.
- Parameters:
json_file_path (str) – The path to the dataset’s JSON file.
- Returns:
The JSON data loaded from the specified file.
- Return type:
Dict
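A minimal sketch of what loading from a JSON path could look like, assuming the file holds a single JSON object. This is a stand-alone function written for illustration, not the library's implementation, and the payload below is made up.

```python
import json
import os
import tempfile
from typing import Any, Dict


def load_from_json(json_file_path: str) -> Dict[str, Any]:
    """Read a JSON file and return its contents as a dictionary."""
    with open(json_file_path, encoding="utf-8") as handle:
        return json.load(handle)


# Round-trip a small, made-up payload through a temporary file.
payload = {"dataset-info": {"track": "demo-track"}, "source": [], "target": []}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    loaded = load_from_json(path)
```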
- ontology_name: str = ''
- source_ontology: Any = None
- target_ontology: Any = None
- track: str = ''
Base Encoder Classes
This script provides a foundation for flexible text encoding, including text preprocessing, customizable prompt templates, and structured methods for encoding and retrieving encoder-specific details. It ensures a consistent interface and behavior for text encoding tasks.
- Classes:
- BaseEncoder: An abstract base class for encoders, providing text preprocessing, a template for prompts,
and methods for encoding data and obtaining encoder information.
- class ontoaligner.base.encoder.BaseEncoder
Bases: ABC
An abstract base class for encoders. It provides text preprocessing and serves as a blueprint for subclasses that implement the specific encoding logic and report encoder-related information.
- prompt_template
A string template used in prompting for encoding tasks.
- Type:
str
- items_in_owl
A string that defines the items in the ontology used by the encoder.
- Type:
str
- abstract get_encoder_info() -> str
An abstract method for retrieving encoder-specific information. Subclasses must override it to return relevant information about the encoder.
- Returns:
Information about the encoder (e.g., type, configuration).
- Return type:
str
- items_in_owl: str = ''
- abstract parse(**kwargs) -> Any
An abstract method for parsing input data. Subclasses must override it to define how input data is parsed for encoding.
- Parameters:
**kwargs – The keyword arguments passed to the method for parsing.
- Returns:
The parsed data in a format defined by the subclass.
- Return type:
Any
- preprocess(text: str) -> str
Preprocesses input text by replacing underscores with spaces and converting the text to lowercase.
This method is used to standardize the format of input text before processing it further for encoding.
- Parameters:
text (str) – The input text that needs preprocessing.
- Returns:
The preprocessed text with underscores replaced by spaces and all characters in lowercase.
- Return type:
str
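The behavior described for preprocess fits in one line. The function below is a stand-alone sketch based only on that description; whether the library also trims whitespace or handles other characters is not stated here.

```python
def preprocess(text: str) -> str:
    """Replace underscores with spaces, then convert to lowercase."""
    return text.replace("_", " ").lower()


label = preprocess("Heart_Valve_Disease")
print(label)
```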
- prompt_template: str = ''
Base Model Classes
Defines a blueprint for ontology matching models, specifying methods for string representation and data generation that must be implemented by subclasses. The script ensures consistency and structure for building specialized models in the ontology matching domain.
- Classes:
- BaseOMModel: An abstract base class for ontology matching models, which defines methods for
string representation and data generation.
- class ontoaligner.base.model.BaseOMModel(**kwargs)
Bases: ABC
An abstract base class for ontology matching models. This class defines methods for string representation and output generation, which must be implemented by subclasses.
Initializes the ontology matching model with optional keyword arguments.
- Parameters:
**kwargs – Additional keyword arguments that may be used for model configuration or parameters.
- abstract generate(input_data: List) -> List
Generates output based on the input data. This method must be implemented by subclasses.
- Parameters:
input_data (List) – A list of data that will be processed by the model to generate output.
- Returns:
- A list containing the generated output based on the input data. The specific content of
the output will depend on the model’s functionality.
- Return type:
List
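To illustrate the contract, here is a sketch of an abstract model and one concrete subclass. SketchOMModel is a stand-in written with Python's abc module, not the library class, and ExactLabelMatcher, its input layout (a two-element list of source and target items), and its output keys are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import List


class SketchOMModel(ABC):
    """Stand-in for an abstract ontology matching model."""

    def __init__(self, **kwargs):
        # Keep any configuration passed as keyword arguments.
        self.kwargs = kwargs

    @abstractmethod
    def __str__(self) -> str:
        ...

    @abstractmethod
    def generate(self, input_data: List) -> List:
        ...


class ExactLabelMatcher(SketchOMModel):
    """Hypothetical subclass: pairs items whose labels match ignoring case."""

    def __str__(self) -> str:
        return "ExactLabelMatcher"

    def generate(self, input_data: List) -> List:
        source, target = input_data
        # Index target IRIs by lowercased label for constant-time lookup.
        by_label = {item["label"].lower(): item["iri"] for item in target}
        return [
            {"source": item["iri"], "target": by_label[item["label"].lower()]}
            for item in source
            if item["label"].lower() in by_label
        ]


source = [{"iri": "s#1", "label": "Heart"}, {"iri": "s#2", "label": "Lung"}]
target = [{"iri": "t#9", "label": "heart"}]
matches = ExactLabelMatcher().generate([source, target])
```

Instantiating SketchOMModel directly raises TypeError, because both abstract methods must be implemented first.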
Base Ontology Classes
This script provides functionality for parsing ontologies and alignment files. It includes methods for extracting data from OWL ontologies, such as names, labels, and relationships, as well as parsing alignment data in RDF format to extract relationships between entities and their corresponding data.
- Classes:
- BaseOntologyParser: A base class for parsing OWL ontologies, extracting information such as
names, labels, parents, children, synonyms, and comments.
- BaseAlignmentsParser: A base class for parsing alignment data, extracting relationships between
entities and their corresponding RDF data.
- class ontoaligner.base.ontology.BaseAlignmentsParser
Bases: ABC
An abstract base class for parsing RDF alignment data. It provides methods for extracting the aligned entity pairs and their relations from the alignment data.
- entity_1: URIRef = rdflib.term.URIRef('http://knowledgeweb.semanticweb.org/heterogeneity/alignmententity1')
- entity_2: URIRef = rdflib.term.URIRef('http://knowledgeweb.semanticweb.org/heterogeneity/alignmententity2')
- extract_data(reference: Any) -> List[Dict]
Extracts alignment data from an RDF graph, processing relationships between entities.
- Parameters:
reference (Any) – RDF reference containing the alignment data.
- Returns:
A list of dictionaries representing entity relationships in the alignment data.
- Return type:
List[Dict]
- load_ontology(input_file_path: str) -> Any
Loads an RDF alignment file from the specified file path.
- Parameters:
input_file_path (str) – The file path of the RDF alignment file.
- Returns:
The loaded RDF alignment data.
- Return type:
Any
- namespace: Namespace = Namespace('http://knowledgeweb.semanticweb.org/heterogeneity/alignment')
- parse(input_file_path: str = '') -> List
Loads and processes the RDF alignment file, extracting relevant data.
- Parameters:
input_file_path (str) – The file path of the RDF alignment file.
- Returns:
A list of extracted alignment data.
- Return type:
List
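The entity_1, entity_2, and relation URIs are the namespace string with the local names entity1, entity2, and relation appended directly. The sketch below extracts the same three fields from a small inline alignment document. It swaps in the standard library's xml.etree.ElementTree in place of rdflib so that it runs without extra dependencies. The sample XML, the function name extract_cells, and the output keys are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

ALIGN_NS = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Made-up alignment with a single cell, for demonstration only.
SAMPLE = f"""<rdf:RDF xmlns="{ALIGN_NS}" xmlns:rdf="{RDF_NS}">
  <Alignment>
    <map>
      <Cell>
        <entity1 rdf:resource="http://example.org/source#Heart"/>
        <entity2 rdf:resource="http://example.org/target#heart"/>
        <relation>=</relation>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>
"""


def extract_cells(xml_text: str) -> List[Dict[str, str]]:
    """Return one dictionary per alignment cell (hypothetical key names)."""
    root = ET.fromstring(xml_text)
    resource_attr = f"{{{RDF_NS}}}resource"
    rows = []
    for cell in root.iter(f"{{{ALIGN_NS}}}Cell"):
        entity1 = cell.find(f"{{{ALIGN_NS}}}entity1")
        entity2 = cell.find(f"{{{ALIGN_NS}}}entity2")
        relation = cell.find(f"{{{ALIGN_NS}}}relation")
        if entity1 is None or entity2 is None:
            continue  # skip incomplete cells
        rows.append(
            {
                "source": entity1.get(resource_attr),
                "target": entity2.get(resource_attr),
                "relation": relation.text if relation is not None else "",
            }
        )
    return rows


rows = extract_cells(SAMPLE)
```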
- relation: URIRef = rdflib.term.URIRef('http://knowledgeweb.semanticweb.org/heterogeneity/alignmentrelation')
- class ontoaligner.base.ontology.BaseOntologyParser
Bases: ABC
An abstract base class for parsing OWL ontologies. This class defines methods to extract data such as names, labels, IRIs, children, parents, synonyms, and comments for ontology classes.
- duplicate_removals(owl_class_info: Dict) -> Dict
Removes duplicate ontology class information based on IRI.
- Parameters:
owl_class_info (Dict) – A dictionary containing information about an ontology class.
- Returns:
A dictionary with duplicates removed from the class information.
- Return type:
Dict
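One way to remove duplicates by IRI is to keep the first entry seen for each IRI. The sketch below assumes the class information holds lists of related items under the keys "parents" and "childrens", each item carrying an "iri" field. That layout is an assumption for illustration, since the dictionary structure is not specified here.

```python
from typing import Any, Dict, List


def dedupe_by_iri(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first entry seen for each IRI, preserving order."""
    seen = set()
    unique = []
    for item in items:
        if item["iri"] not in seen:
            seen.add(item["iri"])
            unique.append(item)
    return unique


def duplicate_removals(owl_class_info: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with duplicate related items removed (assumed keys)."""
    cleaned = dict(owl_class_info)  # shallow copy; the input is left untouched
    for key in ("childrens", "parents"):  # hypothetical field names
        cleaned[key] = dedupe_by_iri(cleaned.get(key, []))
    return cleaned


info = {
    "iri": "ex#Heart",
    "parents": [{"iri": "ex#Organ"}, {"iri": "ex#Organ"}],
    "childrens": [],
}
cleaned = duplicate_removals(info)
```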
- extract_data(ontology: Any) -> List[Dict]
Extracts and processes data from the given ontology, including children, parents, synonyms, and comments.
- Parameters:
ontology (Any) – An ontology.
- Returns:
A list of dictionaries containing extracted ontology class data.
- Return type:
List[Dict]
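The per-class getters can be combined into one record per ontology class. The sketch below works on a toy in-memory ontology, not a loaded OWL file. The record keys, the helper names, and the choice to skip classes without a label (mirroring a check like is_contain_label) are assumptions for illustration.

```python
from typing import Any, Dict, List

# Toy ontology: IRI -> class description. A real ontology would be loaded from OWL.
ONTOLOGY = {
    "ex#Organ": {"name": "Organ", "label": "organ", "parents": []},
    "ex#Heart": {"name": "Heart", "label": "heart", "parents": ["ex#Organ"]},
    "ex#Anon": {"name": "Anon", "label": "", "parents": ["ex#Organ"]},
}


def owl_items(ontology: Dict[str, Any], iri: str) -> Dict[str, str]:
    """Return the IRI, name, and label of one class."""
    entry = ontology[iri]
    return {"iri": iri, "name": entry["name"], "label": entry["label"]}


def extract_data(ontology: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build one record per labeled class, with its parents and children."""
    records = []
    for iri, entry in ontology.items():
        if not entry["label"]:  # assumed behavior: skip unlabeled classes
            continue
        record = owl_items(ontology, iri)
        record["parents"] = [owl_items(ontology, p) for p in entry["parents"]]
        # Children are the classes that list this IRI among their parents.
        record["childrens"] = [
            owl_items(ontology, other)
            for other, other_entry in ontology.items()
            if iri in other_entry["parents"]
        ]
        records.append(record)
    return records


records = extract_data(ONTOLOGY)
```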
- get_childrens(owl_class: Any) -> List
Retrieves the subclasses (children) of the given ontology class.
- Parameters:
owl_class (Any) – An ontology class.
- Returns:
A list of subclasses (children) of the ontology class.
- Return type:
List
- get_comments(owl_class: Any) -> List
Abstract method to retrieve comments for the given ontology class.
- Parameters:
owl_class (Any) – An ontology class.
- Returns:
A list of comments associated with the ontology class.
- Return type:
List
- get_iri(owl_class: Any) -> str
Retrieves the IRI of the given ontology class.
- Parameters:
owl_class (Any) – An ontology class.
- Returns:
The IRI of the ontology class.
- Return type:
str
- get_label(owl_class: Any) -> str
Retrieves the label of the given ontology class.
- Parameters:
owl_class (Any) – An ontology class.
- Returns:
The label of the ontology class.
- Return type:
str
- get_name(owl_class: Any) -> str
Retrieves the name of the given ontology class.
- Parameters:
owl_class (Any) – An ontology class.
- Returns:
The name of the ontology class.
- Return type:
str
- get_owl_classes(ontology: Any) -> Any
Retrieves all classes from the given ontology.
- Parameters:
ontology (Any) – An ontology.
- Returns:
A collection of all classes in the ontology.
- Return type:
Any
- get_owl_items(owl_class: Any) -> List
Extracts relevant items from the given ontology class, including IRI, name, and label.
- Parameters:
owl_class (Any) – An ontology class.
- Returns:
A list of dictionaries containing IRI, name, and label of relevant ontology items.
- Return type:
List
- get_parents(owl_class: Any) -> List
Retrieves the superclasses (parents) of the given ontology class.
- Parameters:
owl_class (Any) – An ontology class.
- Returns:
A list of superclasses (parents) of the ontology class.
- Return type:
List
- get_synonyms(owl_class: Any) -> List
Retrieves the synonyms of the given ontology class.
- Parameters:
owl_class (Any) – An ontology class.
- Returns:
A list of synonyms of the ontology class.
- Return type:
List
- is_contain_label(owl_class: Any) -> bool
Checks if the given ontology class contains a label.
- Parameters:
owl_class (Any) – An ontology class.
- Returns:
True if the class contains a label, False otherwise.
- Return type:
bool
- load_ontology(input_file_path: str) -> Any
Loads an ontology from the specified file path.
- Parameters:
input_file_path (str) – The file path of the ontology.
- Returns:
The loaded ontology.
- Return type:
Any
- parse(input_file_path: str) -> List[Dict[str, Any]]
Loads and processes the ontology, extracting relevant data.
- Parameters:
input_file_path (str) – The file path of the ontology.
- Returns:
A list of extracted ontology data.
- Return type:
List[Dict[str, Any]]