A large collection of reproductions of calligraphy on paper was scanned into images to enable web access for both the academic community and the public. Calligraphic paper digitization technology is mature, but technology for segmentation, character coding, style classification, and identification of calligraphy are lacking. Therefore, computational tools for classification and quantification of calligraphic style are proposed and demonstrated on a statistically characterized corpus. A subset of 259 historical page images is segmented into 8719 individual character images. Calligraphic style is revealed and quantified by visual attributes (i.e., appearance features) of character images sampled from historical works. A style space is defined with the features of five main classical styles as basis vectors. Cross-validated error rates of 10% to 40% are reported on conventional and conservative sampling into training/test sets and on same-work voting with a range of voter participation. Beyond its immediate applicability to education and scholarship, this research lays the foundation for style-based calligraphic forgery detection and for discovery of latent calligraphic groups induced by mentor-student relationships.
|