Summarizing XML Documents: Contributions, Empirical Studies, and Challenges

  • We address the problem of obtaining statistics on the content and structure of XML documents by means of summaries that can provide cardinality estimates for XML query expressions. Our focus is a data-centric processing scenario in which a query engine processes such expressions. We propose three new summary structures, LESS (Leaf-Element-in-Subtree), LWES (Level-Wide Element Summarization), and EXsum (Element-centered XML Summarization), which are intended as the basis of the estimation process in an XML query optimizer. Each of them collects structural statistics on XML documents; EXsum additionally gathers statistics on document content. For each proposed approach, we develop estimation procedures and/or heuristics for specific types of query expressions. We have implemented our proposals in XTC, a native XML database management system (XDBMS). On this common implementation base, we present an empirical, comparative study in which our proposals are evaluated against others published in the literature, which have also been incorporated into XTC. Furthermore, we analyze the approaches against criteria pertinent to the query optimization process.
  • Zusammenfassung des XML Dokumenten: Beiträge, empirische Untersuchungen und Herausforderungen

Metadata
Author: José de Aguiar Moraes Filho
URN: urn:nbn:de:hbz:386-kluedo-24832
Advisor: Theo Härder
Document Type: Doctoral Thesis
Language of publication: English
Year of Completion: 2010
Year of first Publication: 2010
Publishing Institution: Technische Universität Kaiserslautern
Granting Institution: Technische Universität Kaiserslautern
Acceptance Date of the Thesis: 2010/03/30
Date of the Publication (Server): 2010/04/01
Tag: XML query estimation; XML summary; content-and-structure summary; statistics; structural summary
Faculties / Organisational entities: Kaiserslautern - Fachbereich Informatik
DDC-Classification: 0 General works, computer science, information science / 004 Computer science
Licence (German): Standard according to the KLUEDO guidelines before 27.05.2011