Automatic identification of the most important elements in an XML collection
An important problem in XML retrieval is determining the most useful element types to retrieve, for example book, chapter, section, paragraph or caption. An automated system for doing this could be based on features of element types related to size, depth, frequency of occurrence, etc. We consider a large number of such features and assess their usefulness in predicting the types of elements judged relevant in INEX evaluations for the IEEE and Wikipedia 2006 corpora. For each feature we automatically...
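As a rough illustration of the kind of element-type features the abstract mentions (size, depth, frequency of occurrence), the following minimal sketch aggregates them per tag name over a small collection. It is not the authors' system: the function name `element_type_features`, the choice of character count as the size measure, and the use of simple means are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def element_type_features(xml_documents):
    """Aggregate simple per-tag statistics over an iterable of XML strings.

    Hypothetical helper: the feature definitions below are illustrative
    assumptions, not the definitions used in the paper.
    """
    stats = defaultdict(lambda: {"count": 0, "total_depth": 0, "total_chars": 0})

    def visit(node, depth):
        entry = stats[node.tag]
        entry["count"] += 1
        entry["total_depth"] += depth
        # Size is measured here as the length of all text beneath the element.
        entry["total_chars"] += len("".join(node.itertext()))
        for child in node:
            visit(child, depth + 1)

    for doc in xml_documents:
        visit(ET.fromstring(doc), 0)  # the root element has depth 0

    return {
        tag: {
            "frequency": s["count"],
            "mean_depth": s["total_depth"] / s["count"],
            "mean_size": s["total_chars"] / s["count"],
        }
        for tag, s in stats.items()
    }
```

Features like these could then be compared against relevance judgements to see which element types tend to be judged relevant; how that comparison is done is not specified in the visible part of the abstract, so it is left out here.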
|Collections||ANU Research Publications|
|Source:||Proceedings of the Sixteenth Australasian Document Computing Symposium|
|01_Krumpholz_Automatic_identification_of_2011.pdf||259.63 kB||Adobe PDF|