Text segmentation and Chinese site search
Automatic segmentation and overlapping bigrams are the most common methods for overcoming the lack of explicit word boundaries in Chinese text. Past studies have compared their effectiveness, but findings have been equivocal and site search has been little studied. We compare representatives of the two approaches using a 465,000-page crawl and test queries applicable to the university context. In total, 503 pairs of result sets were judged by 56 Chinese students. Although there are differences on certain...
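To make the overlapping-bigram approach concrete, the following is a minimal sketch in Python. It is not the indexing code used in the study. The function names `overlapping_bigrams` and `_bigrams` are hypothetical, and the sketch assumes that only characters in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) are indexed and that all other characters act as separators.

```python
def _bigrams(run: list[str]) -> list[str]:
    """Turn one run of Chinese characters into overlapping bigrams."""
    if len(run) == 1:
        # A lone character has no neighbour, so keep it as a unigram.
        return [run[0]]
    # An empty run yields an empty list; otherwise pair each character
    # with the one that follows it.
    return [run[i] + run[i + 1] for i in range(len(run) - 1)]


def overlapping_bigrams(text: str) -> list[str]:
    """Tokenise text into overlapping bigrams of Chinese characters.

    Assumption for illustration: only the basic CJK Unified Ideographs
    block is treated as Chinese; every other character ends the current
    run and is itself discarded.
    """
    tokens: list[str] = []
    run: list[str] = []
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":
            run.append(ch)
        else:
            tokens.extend(_bigrams(run))
            run = []
    tokens.extend(_bigrams(run))  # flush the final run
    return tokens


if __name__ == "__main__":
    print(overlapping_bigrams("北京大学图书馆"))
```

For the string 北京大学图书馆 ("Peking University Library") this emits 北京, 京大, 大学, 学图, 图书, 书馆. Some of these tokens straddle word boundaries (京大, 学图), whereas an automatic segmenter might instead emit word-level tokens such as 北京大学 and 图书馆. That difference is the trade-off between the two approaches compared here.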
Collections: ANU Research Publications
Source: Text segmentation and Chinese site search