Journal of Systems Engineering and Electronics ›› 2010, Vol. 32 ›› Issue (5): 1088-1093. doi: 10.3969/j.issn.1001-506X.2010.05.044

Hierarchical text classification and evaluation

SONG Sheng-li, BAO Liang, CHEN Ping   

  1. (Software Engineering Inst., Xidian Univ., Xi’an 710071, China)
  • Online: 2010-05-24 Published: 2010-01-03

Abstract:

Conventional flat classification measures are ill-suited to evaluating hierarchical classification. To address this limitation, the hierarchical classification method based on a concept tree is studied, and a set of extended measures is proposed that describes its performance more accurately by exploiting the level of, and the “affinity” among, the categories in the hierarchical structure. In addition, an error classification concentration ratio (ECCR) is defined based on the distribution of misclassified samples. Besides evaluating the classification result, ECCR can guide the selection of training samples so that the training set becomes more representative. Experiments on Chinese news corpus classification show that the extended measures evaluate hierarchical classification results more accurately and that ECCR helps to train a more accurate classification model.
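
The abstract does not give the formulas for the extended measures or for ECCR, so neither is reproduced here. To illustrate why flat measures fall short on a concept tree, the following is a minimal sketch of one widely used hierarchy-aware formulation (precision and recall computed over ancestor sets), which is not necessarily the measure proposed in the paper. The toy tree `PARENT`, the category names, and the function names are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy concept tree: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "news": None,
    "sports": "news",
    "finance": "news",
    "football": "sports",
    "basketball": "sports",
    "stocks": "finance",
}


def ancestors_and_self(label: str) -> Set[str]:
    """Return the label together with its ancestors, excluding the root."""
    nodes: Set[str] = set()
    node: Optional[str] = label
    # Stop before the root, which every category shares and so carries no information.
    while node is not None and PARENT[node] is not None:
        nodes.add(node)
        node = PARENT[node]
    return nodes


def hierarchical_prf(pairs: List[Tuple[str, str]]) -> Tuple[float, float, float]:
    """Micro-averaged hierarchical precision, recall and F1.

    Each pair is (true_label, predicted_label). A prediction earns partial
    credit for every ancestor it shares with the true category, so a
    sibling error costs less than an error in a distant branch.
    """
    overlap = pred_total = true_total = 0
    for true_label, pred_label in pairs:
        true_set = ancestors_and_self(true_label)
        pred_set = ancestors_and_self(pred_label)
        overlap += len(true_set & pred_set)
        pred_total += len(pred_set)
        true_total += len(true_set)
    precision = overlap / pred_total if pred_total else 0.0
    recall = overlap / true_total if true_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    samples = [
        ("football", "football"),    # exact match
        ("football", "basketball"),  # sibling error: shares "sports"
        ("stocks", "football"),      # distant error: shares nothing below the root
    ]
    print(hierarchical_prf(samples))
```

On these three samples a flat accuracy would count one correct prediction out of three and treat both errors alike, whereas the ancestor-set measure gives the sibling error partial credit and the distant error none.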
