- The paper introduces an ontology reshaping approach that integrates user feedback to tailor KG schemata for efficient industrial data integration.
- Using heuristic mapping and user inputs, the method achieved a 7-8x reduction in KG generation time while ensuring comprehensive data coverage.
- The study demonstrates that reshaped ontologies enhance computational efficiency and usability in automated Bosch welding quality monitoring.
Towards Ontology Reshaping for KG Generation with User-in-the-Loop: Applied to Bosch Welding
Introduction
The paper "Towards Ontology Reshaping for KG Generation with User-in-the-Loop: Applied to Bosch Welding" (2209.11067) addresses knowledge graph generation from large industrial datasets, focusing on Bosch's automated welding process. Knowledge graphs (KGs) represent data as nodes and edges that capture entities and the relations between them. The core contribution is a method termed "ontology reshaping," which aims to streamline the construction of KG schemata from domain ontologies. This is particularly relevant in industrial contexts like automated welding, where data complexity and variety pose significant challenges to machine learning-based quality monitoring [kagermann2015change], [zhou2020cikmdemo].
Figure 1: The need for ontology reshaping illustrated with domain ontology and knowledge graph schemata for Bosch Welding.
Challenges in KG Generation
Industrial applications of KGs have gained momentum because KGs structure information flexibly in ways that support data integration and machine learning-driven analytics. However, a persistent obstacle is automating KG generation from raw industrial data, which is high in volume and heterogeneous. Classical domain ontologies are typically knowledge-oriented: they target the representation of general domain knowledge. They can therefore be misaligned with the data-oriented priorities of KGs, such as fitting the specifics of the raw data and being usable by end users.
Figure 1: Why do we need ontology reshaping? Domain ontology (partially shown in a) reflects the knowledge; the KG schema (partially shown in b) needs to reflect raw data specificities and usability. Blue boxes: classes that can be mapped to attributes in the raw data; black boxes: classes that cannot be found in the raw data.
Ontology Reshaping Approach
Ontology reshaping addresses the gap between the knowledge orientation of existing domain ontologies and the data orientation required of KG schemata. The framework combines a user-in-the-loop methodology with heuristic-based ontology manipulation, in contrast to fully automated solutions, which may produce incomplete or user-unfriendly KGs.
The proposed ontology reshaping algorithm comprises five steps:
- Initialization: Begins with the main class (MC) from the domain ontology O and defines potential classes and properties using mapped relationships in the raw data.
- Class Addition: Integrates potential classes to the KG schema if they are mapped within the raw data tables.
- Entity Identification: Applies keyword-based mapping from raw data attributes to potential properties, supplemented by optional user input for identifying significant entities.
- Classes Connection: Establishes connections between classes and their properties as per the reshaped KG schema, optimizing for data containments.
- User-guided Connection: Finalizes schema creation by connecting remaining classes using main class positioning and optional user input.
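The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Ontology` and `Schema` structures, the substring-based keyword match, the `relatedTo` fallback property, and all class and attribute names in the toy data are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy stand-in for the domain ontology O (hypothetical structure)."""
    classes: set[str]
    object_props: set[tuple[str, str, str]]  # (source class, property, target class)
    datatype_props: dict[str, str]           # property keyword -> owning class


@dataclass
class Schema:
    """Reshaped KG schema: classes, class-to-class edges, attribute placement."""
    classes: set[str] = field(default_factory=set)
    edges: set[tuple[str, str, str]] = field(default_factory=set)
    attributes: dict[str, str] = field(default_factory=dict)  # raw attribute -> class


def reachable_from(start: str, edges: set[tuple[str, str, str]]) -> set[str]:
    """Classes connected to `start`, ignoring edge direction."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for src, _, dst in edges:
            if node in (src, dst):
                for nxt in (src, dst):
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return seen


def reshape(onto, raw_attributes, class_map, main_class,
            user_links=None, default_link="relatedTo"):
    user_links = user_links or {}

    # Step 1, initialization: start from the main class and collect candidates.
    schema = Schema(classes={main_class})
    candidates = {class_map[a] for a in raw_attributes if a in class_map}

    # Step 2, class addition: keep only candidates that exist in the ontology.
    schema.classes |= candidates & onto.classes

    # Step 3, entity identification: naive keyword match (assumed heuristic).
    for attr in raw_attributes:
        for keyword, owner in onto.datatype_props.items():
            if keyword.lower() in attr.lower() and owner in schema.classes:
                schema.attributes[attr] = owner
                break

    # Step 4, classes connection: reuse ontology edges between kept classes.
    schema.edges |= {(s, p, t) for s, p, t in onto.object_props
                     if s in schema.classes and t in schema.classes}

    # Step 5, user-guided connection: attach leftover classes to the main class,
    # using the user's property name if given, else the assumed default.
    for cls in sorted(schema.classes):
        if cls not in reachable_from(main_class, schema.edges):
            schema.edges.add((main_class, user_links.get(cls, default_link), cls))
    return schema


# Toy data (all names hypothetical).
onto = Ontology(
    classes={"WeldingOperation", "Machine", "Electrode", "Sensor"},
    object_props={("WeldingOperation", "performedBy", "Machine"),
                  ("Machine", "hasSensor", "Sensor")},
    datatype_props={"current": "WeldingOperation", "wear": "Electrode"},
)
raw = ["operation_id", "machine_id", "electrode_id", "electrode_wear", "current_mean"]
class_map = {"operation_id": "WeldingOperation",
             "machine_id": "Machine",
             "electrode_id": "Electrode"}
schema = reshape(onto, raw, class_map, "WeldingOperation",
                 user_links={"Electrode": "usesElectrode"})
print(sorted(schema.classes))
print(sorted(schema.edges))
```

In this toy run, `Sensor` has no counterpart in the raw attributes and is left out of the schema, while `Electrode`, which the ontology does not link to the other kept classes, is attached to the main class through the user-supplied property.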
Evaluation and Results
To evaluate the efficiency and validity of the proposed ontology reshaping framework, the study used a proof-of-concept experimental setup with Bosch's industrial welding data. The method was assessed against the main challenges of KG generation: efficiency, simplicity, and data coverage.
The evaluation randomly sub-sampled the dataset to form subsets of increasing complexity. Using the reshaped ontology reduced KG generation time by a factor of 7 to 8 compared to the baseline. Moreover, the generated KGs showed full data coverage while minimizing dummy nodes, which improves data utility and reduces redundancy in the graph representation (Table 1).
Table 1: Evaluation on subsets with different numbers of attributes shows that KG generation with ontology reshaping (Onto-Reshape) significantly outperforms the baseline approach in terms of efficiency and simplicity metrics. Completeness, user-friendliness, and automatic scaling capacities are successfully maintained, aligning with the stated requirements of KG construction from industrial data.
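The evaluation protocol described above can be sketched as follows. This is an illustrative outline only: the metric definitions (coverage as the share of raw attributes present in the KG, dummy-node share as the fraction of nodes carrying no raw data, speed-up as a ratio of generation times) are assumptions made for this sketch rather than the paper's formal definitions, and all function and attribute names are hypothetical.

```python
import random


def subsample_attributes(attributes, sizes, seed=0):
    """Draw one random attribute subset per requested size, smallest first."""
    rng = random.Random(seed)
    return [rng.sample(attributes, k) for k in sorted(sizes)]


def coverage(subset, kg_attributes):
    """Share of raw attributes in `subset` that are represented in the KG."""
    if not subset:
        raise ValueError("subset must not be empty")
    return len(set(subset) & set(kg_attributes)) / len(set(subset))


def dummy_node_share(node_has_data):
    """Share of KG nodes that carry no raw data (assumed meaning of 'dummy')."""
    if not node_has_data:
        raise ValueError("KG must contain at least one node")
    return sum(1 for has_data in node_has_data if not has_data) / len(node_has_data)


def speedup(baseline_seconds, reshaped_seconds):
    """Ratio of baseline to reshaped KG generation time."""
    if reshaped_seconds <= 0:
        raise ValueError("reshaped_seconds must be positive")
    return baseline_seconds / reshaped_seconds


# Toy attribute names (hypothetical); each subset is larger than the previous one.
attributes = [f"attr_{i}" for i in range(10)]
for subset in subsample_attributes(attributes, sizes=[2, 4, 8]):
    print(len(subset), coverage(subset, kg_attributes=attributes))
```

Fixing the random seed keeps the subsets reproducible across the baseline and the reshaped run, so that both approaches are timed on the same attributes.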
Practical Implications and Future Directions
The proposed ontology reshaping method caters to the nuances of industrial data by producing smaller, refined KG schemata that match the specificities of the datasets, reducing dummy nodes, information loss, and complexity within the KGs. This improves both storage and computational efficiency and spares users the blank nodes and user-unfriendly output that a knowledge-oriented schema would otherwise produce.
In a practical deployment within Bosch for Electric Resistance Welding analytics, the reshaped KGs facilitated seamless integration of heterogeneous data, thereby enhancing quality monitoring processes. The user-in-the-loop feature enabled customization to address user-specific preferences and improved overall workflow efficiency, leading to more accessible and informative KGs.
In future work, a more formal theory of ontology reshaping may be developed, emphasizing uniform interpolation and the optimization of machine learning pipelines through advanced ontology structures. Further research could also explore extending the reshaping method's application across various industries, thereby enhancing the generalizability and impact of KGs in diverse data-driven domains.
Conclusion
The study sets forth an ontology reshaping framework that generates high-quality KG schemata adapted to the data specificities of industrial applications. By managing the trade-off between knowledge-oriented and data-specific ontology construction, the approach supports scalable, user-friendly, and efficient KG generation. Evaluations demonstrate substantial improvements in performance metrics over the baseline. Future work on a formal theory, interpolation techniques, and tighter machine learning integration is expected to further improve the stepwise ontology reshaping algorithm across diverse industrial contexts.