Digital Building Permit (DBP) workflows benefit from structured, machine-interpretable data, yet real submissions are typically delivered as heterogeneous, document-centric dossiers. This thesis presents an automated upstream pipeline that transforms raw building permit submission archives into ISO 21597–compliant Information Containers for Linked Document Delivery (ICDD), coupling the original documents with RDF-based metadata, linksets, and an ontology-aligned semantic payload. The method applies a hybrid retrieval-augmented approach that combines dense similarity retrieval with structure-aware traversal to assemble bounded evidence contexts for large language model (LLM) extraction. In contrast to unconstrained extraction, the pipeline operationalizes an evidence-bounded protocol: the LLM acts as a schema-constrained proposal mechanism, and values are populated into an OntoBPR/OBPA-aligned case representation only when supported by traceable evidence from the submission; otherwise, fields remain unset. Normative deliverables are separated from audit traces, and conformance is assessed through staged RDF coherence checks and SHACL-based validation reports. A proof of concept on a real submission demonstrates the feasibility of producing structurally conformant, inspectable case packages while prioritizing traceability and defensible semantic enrichment over maximum extraction coverage.