Using the program coma and the Malaga library, the wordlist of a corpus can be morphologically analysed. The analyses are written to the following files:
After the analysis is complete, a statistical profile is written to standard output, for example:
    Malaga morphological analysis for corpus limas:
    Consumed time: 1154.78 sec
    Number of types: 121650
    Number of recognized types: 100051 (82.245%)
    Number of unknown types: 21599
    Number of tokens: 1236774
    Number of recognized tokens: 1176201 (95.1023%)
    Number of unknown tokens: 60573
    Result files :
    limas.matest.mat
    limas.matest.man
    limas.matest.sym_c
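The figures in such a profile can be cross-checked with a few lines of code. The following Python sketch is not part of coma or Malaga; the helper names parse_profile and coverage are hypothetical, and it assumes the profile uses the "Number of ...: N" wording shown in the example. It extracts the counts, checks that recognized and unknown counts add up to the totals, and recomputes the recognition percentages.

```python
import re

# Profile text copied from the example output (timing and file list omitted
# from the checks; only the "Number of ..." entries are parsed).
PROFILE = """\
Consumed time: 1154.78 sec
Number of types: 121650
Number of recognized types: 100051 (82.245%)
Number of unknown types: 21599
Number of tokens: 1236774
Number of recognized tokens: 1176201 (95.1023%)
Number of unknown tokens: 60573
"""


def parse_profile(text):
    """Return a dict mapping each 'Number of ...' label to its integer count."""
    counts = {}
    for label, value in re.findall(r"Number of ([a-z ]+): (\d+)", text):
        counts[label] = int(value)
    return counts


def coverage(counts, unit):
    """Percentage of recognized items, where unit is 'types' or 'tokens'."""
    return 100.0 * counts[f"recognized {unit}"] / counts[unit]


counts = parse_profile(PROFILE)
for unit in ("types", "tokens"):
    # Recognized and unknown counts should add up to the total.
    assert counts[f"recognized {unit}"] + counts[f"unknown {unit}"] == counts[unit]
    print(f"{unit}: {coverage(counts, unit):.3f}% recognized")
```

In the example, token coverage is noticeably higher than type coverage, which suggests that the unrecognized word forms are mostly low-frequency items.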
A copy of the Malaga symbol file associated with the analysis is also made, so that the stored Malaga analyses can be used independently of the rule files.
coma supports the following options: