Author: Ishihara, Shunichi
Dates: 2024-04-22; 2024-04-22
ISSN: 2055-7671
URI: http://hdl.handle.net/1885/316957

Abstract: The rotated delta, which is argued to be a theoretically better-grounded distance measure, has so far received no empirical support for its superiority. This study revisits the rotated delta (more commonly known as the Mahalanobis distance in other areas) with two different covariance matrices estimated from training data. The first covariance matrix represents the between-author variability, and the second the within-author variability. A series of likelihood ratio-based authorship verification experiments was carried out with several distance measures. The experiments used documents compiled from a large database of text messages, which allowed for a total of 2,160 same-author and 4,663,440 different-author comparisons. The Mahalanobis distance with the between-author covariance matrix performed far worse than the other distance measures, whereas the Mahalanobis distance with the within-author covariance matrix performed better than the other measures. However, its superiority over the cosine distance depends on the word length of the documents and/or the order of the feature vector. Follow-up experiments further showed that the covariance matrix representing the within-author variability must be trained on a sufficient amount of data to outperform the cosine distance: the higher the order of the vector, the more training data are required. The quantitative results also imply that the two sources of variability (namely, within- and between-author variability) are independent of each other to the extent that the latter cannot accurately approximate the former.

Format: application/pdf
Language: en-AU
Rights: © 2022 The authors
Title: Mahalanobis distance with an adapted within-author covariance matrix: An authorship verification experiment
Year: 2022
DOI: 10.1093/llc/fqac008
Date: 2022-12-25
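The distinction the abstract draws between the two covariance matrices can be made concrete with a minimal sketch. This is an illustration under stated assumptions, not the study's implementation: the abstract does not specify how the matrices were estimated, so the pooled within-author estimator, the covariance of author means used as the between-author matrix, the function names, and the small `ridge` term added for numerical stability are all hypothetical choices made here.

```python
import numpy as np


def within_author_cov(X, authors):
    """Pooled within-author covariance (assumed estimator).

    Each document vector is centred on its own author's mean, so only the
    variation among documents by the same author remains.
    """
    X = np.asarray(X, dtype=float)
    authors = np.asarray(authors)
    labels = np.unique(authors)
    centred = np.empty_like(X)
    for a in labels:
        mask = authors == a
        centred[mask] = X[mask] - X[mask].mean(axis=0)
    dof = X.shape[0] - len(labels)  # documents minus number of authors
    return centred.T @ centred / dof


def between_author_cov(X, authors):
    """Covariance of the per-author mean vectors (assumed estimator)."""
    X = np.asarray(X, dtype=float)
    authors = np.asarray(authors)
    means = np.stack([X[authors == a].mean(axis=0) for a in np.unique(authors)])
    return np.cov(means, rowvar=False)


def mahalanobis(x, y, cov, ridge=1e-6):
    """Mahalanobis distance between x and y under the covariance `cov`.

    `ridge` is a hypothetical regulariser added to the diagonal so that the
    system stays solvable when `cov` is estimated from little data.
    """
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    cov = np.asarray(cov, dtype=float) + ridge * np.eye(len(d))
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))


def cosine_distance(x, y):
    """Cosine distance, the baseline measure named in the abstract."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))
```

With feature vectors `X` (one row per document) and an array of author labels, a questioned and a known document would be compared as `mahalanobis(q, k, within_author_cov(X, authors))`. A higher-order feature vector means a larger covariance matrix with more entries to estimate, which is consistent with the abstract's observation that more training data are needed as the order of the vector grows.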