DKPro Statistics is a collection of open-licensed statistical tools written in Java. The software library is divided into the following modules:
dkpro-statistics-agreement is a module for computing multiple inter-rater agreement measures using a shared interface and data model. Based on this model, the software supports the analysis of coding setups (i.e., assigning categories to fixed items) and unitizing setups (i.e., segmenting the data into codable units). The software has been described in our COLING 2014 demo paper.
dkpro-statistics-correlation is a module for computing correlation and association measures. It is currently under construction.
dkpro-statistics-significance is a module for assessing statistical significance. It is currently under construction.
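To make the coding setup mentioned for the agreement module more concrete, the following is a minimal, self-contained sketch of one common inter-rater agreement measure, Cohen's kappa, for two raters who each assign one category to the same fixed items. It deliberately does not use the DKPro Statistics API: the class name `KappaSketch` and the method `cohenKappa` are hypothetical names chosen for this illustration only, and the library's own interfaces may look quite different.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; NOT part of the DKPro Statistics API.
class KappaSketch {

    /**
     * Cohen's kappa for two raters who each assigned exactly one
     * category to the same items (a simple "coding" setup).
     * Returns NaN if chance agreement is 1 (kappa is undefined).
     */
    static double cohenKappa(String[] rater1, String[] rater2) {
        if (rater1.length != rater2.length || rater1.length == 0) {
            throw new IllegalArgumentException(
                "Both raters must label the same non-empty set of items");
        }
        int n = rater1.length;
        int agreements = 0;
        Map<String, Integer> counts1 = new HashMap<>();
        Map<String, Integer> counts2 = new HashMap<>();
        for (int i = 0; i < n; i++) {
            if (rater1[i].equals(rater2[i])) {
                agreements++;
            }
            // Count how often each rater used each category.
            counts1.merge(rater1[i], 1, Integer::sum);
            counts2.merge(rater2[i], 1, Integer::sum);
        }

        // Observed agreement: share of items with identical categories.
        double observed = (double) agreements / n;

        // Expected (chance) agreement from the raters' category distributions.
        double expected = 0.0;
        for (Map.Entry<String, Integer> e : counts1.entrySet()) {
            int c2 = counts2.getOrDefault(e.getKey(), 0);
            expected += ((double) e.getValue() / n) * ((double) c2 / n);
        }

        if (expected == 1.0) {
            return Double.NaN;
        }
        return (observed - expected) / (1.0 - expected);
    }

    public static void main(String[] args) {
        // Four items, two categories; the raters disagree on the second item.
        String[] r1 = {"A", "A", "B", "B"};
        String[] r2 = {"A", "B", "B", "B"};
        System.out.println("kappa = " + cohenKappa(r1, r2));
    }
}
```

The sketch corrects the raw percentage of matching labels for the agreement expected by chance, which is why such measures are usually preferred over simple percentage agreement. Unitizing setups need different measures, since the raters there also decide where the unit boundaries lie.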
A more detailed description of DKPro Statistics is available in our scientific articles:
Christian M. Meyer, Margot Mieskes, Christian Stab, and Iryna Gurevych: DKPro Agreement: An Open-Source Java Library for Measuring Inter-Rater Agreement, in: Proceedings of the 25th International Conference on Computational Linguistics (COLING), pp. 105–109, Dublin, Ireland, August 2014.
Please cite our COLING 2014 paper if you use the software in your scientific work.
The latest version of DKPro Statistics is available via Maven Central. If you use Maven as your build tool, then you can add DKPro Statistics as a dependency in your pom.xml file:
In addition to that, you can add each of the modules described above separately (e.g., artifactId
DKPro Statistics is available as open-source software under the Apache License 2.0 (ASL). The software thus comes “as is” without any warranty (see license text for more details).
Prior to being available as open-source software, DKPro Statistics was a research project at the Ubiquitous Knowledge Processing (UKP) Lab of the Technische Universität Darmstadt, Germany. The main contributors to this project are (in alphabetical order):