DKPro - Welcome

DKPro is a community of projects focusing on reusable Natural Language Processing software.

Ready-to-use software components for natural language processing, based on the Apache UIMA framework.
Pure-Python implementation of the Common Analysis System (CAS) as defined by the UIMA framework, including the ability to load and save UIMA CAS XMI files.
UIMA-based text classification framework built on top of DKPro Core, DKPro Lab and the Weka Machine Learning Toolkit. It is intended to facilitate supervised machine learning experiments with any kind of textual data.
Collection of open-licensed statistical tools, currently including correlation and inter-rater agreement methods.
Framework for developing text similarity algorithms.
Modular, extensible Java framework for word sense disambiguation.

Facilitates the use of DKPro UIMA components with Hadoop.
Tools for processing the CommonCrawl corpus, including Creative Commons license detection, boilerplate removal, language detection, and near-duplicate removal.
Framework for keyphrase extraction.
Lightweight framework for parameter sweeping experiments. It allows you to set up experiments consisting of multiple interdependent tasks in a declarative manner with minimal overhead.
Unified API for several lexical-semantic resources.
An easy-to-use interface to the DKPro Core libraries, mainly for teaching purposes. Inspired by NLTK.
Search-based annotation tool to help distributed annotation teams find infrequent linguistic phenomena in large corpora.
Framework for creating and accessing sense-linked lexical resources in accordance with the UBY-LMF lexicon model, an instantiation of the ISO standard Lexicon Markup Framework (LMF).
Java OpenThesaurus Library provides access to all information contained in OpenThesaurus, such as glosses, usage examples, translations, and much more.
Efficient access to Web1T n-gram data.
Java OmegaWiki Library provides access to all information contained in OmegaWiki, such as glosses, usage examples, translations, and much more.
Java Wiktionary Library provides access to the information contained in Wiktionary.
Java Wikipedia Library provides access to all information contained in Wikipedia.

The following projects are not part of DKPro proper, but they are closely related to it, compatible with DKPro products, and build on them.

A semantic annotation platform offering intelligent assistance and knowledge management.
General-purpose web-based annotation tool for a wide range of linguistic annotations.