Articles

OPAL - Ontology-based Pattern Analysis with Logic

Web form understanding is fundamental to automated collection, extraction, integration, or analysis of web data.

In this demonstration, we illustrate how OPAL, a novel, model-driven tool for web form understanding, achieves near-perfect form understanding by considering a form at four scopes. At the narrow field scope, OPAL associates labels with individual form fields by following the local structure of the DOM tree. At segment scope, OPAL zooms out to hierarchically organized groups of similar fields and distributes labels within or to these groups. At page scope, OPAL searches anywhere on the page for labels that lie to the left of or above a field and are not overshadowed by another field. At domain scope, OPAL classifies fields based on their labels and repairs the constructed model, where necessary, to match our domain model. The OPAL demo visualizes each scope of this understanding process and highlights the contribution of each scope to the overall accuracy of the system.
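
To make the page scope more concrete, the following Python sketch shows one possible reading of "to the left or top of a field, not overshadowed by another field". It is not OPAL's implementation. It assumes that labels and fields are given as axis-aligned bounding boxes, that a label is overshadowed when another field lies between the label and the field on the same axis, and that the nearest remaining label wins. The names Box, is_left_of, is_above, overshadowed, and page_scope_label are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a rendered label or field (assumed input)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float


def _overlap(a_lo: float, a_hi: float, b_lo: float, b_hi: float) -> bool:
    """True if the open intervals (a_lo, a_hi) and (b_lo, b_hi) intersect."""
    return a_lo < b_hi and b_lo < a_hi


def is_left_of(a: Box, b: Box) -> bool:
    """a ends before b starts horizontally and shares some vertical extent."""
    return a.right <= b.left and _overlap(a.top, a.bottom, b.top, b.bottom)


def is_above(a: Box, b: Box) -> bool:
    """a ends before b starts vertically and shares some horizontal extent."""
    return a.bottom <= b.top and _overlap(a.left, a.right, b.left, b.right)


def overshadowed(label: Box, field: Box, other: Box) -> bool:
    """Assumed meaning: `other` sits between `label` and `field` on one axis."""
    if is_left_of(label, field):
        if is_left_of(label, other) and is_left_of(other, field):
            return True
    if is_above(label, field):
        if is_above(label, other) and is_above(other, field):
            return True
    return False


def page_scope_label(field: Box, labels: List[Box], fields: List[Box]) -> Optional[Box]:
    """Return the nearest label left of or above `field` that no other field overshadows."""
    candidates = []
    for label in labels:
        left, above = is_left_of(label, field), is_above(label, field)
        if not (left or above):
            continue
        if any(overshadowed(label, field, o) for o in fields if o is not field):
            continue
        # Gap between label and field along the axis on which they are aligned.
        gap = field.left - label.right if left else field.top - label.bottom
        candidates.append((gap, label))
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[0])[1]


# Made-up layout: two labelled rows, plus a third field to the right of the first.
labels = [Box("From", 0, 0, 40, 20), Box("To", 0, 30, 40, 50)]
f1 = Box("f1", 50, 0, 150, 20)
f2 = Box("f2", 50, 30, 150, 50)
f3 = Box("f3", 160, 0, 260, 20)
fields = [f1, f2, f3]

for f in fields:
    match = page_scope_label(f, labels, fields)
    print(f.name, "->", match.name if match else None)
```

In this made-up layout, f3 receives no label at page scope, because f1 lies between it and the only label on its row. Under the assumptions above, such a field would be left to another scope to resolve.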


Presentation


Screencast


References


Contacts

(name dot surname at cs dot ox dot ac dot uk)
  • Christian Schallhart
  • Xiaonan Guo