DKPro Core - Welcome

A collection of software components for natural language processing (NLP) based on the Apache UIMA framework.

Many NLP tools are already freely available in the NLP research community. DKPro Core provides Apache UIMA components that wrap these tools (along with some original tools) so that they can be used interchangeably in UIMA processing pipelines. DKPro Core builds heavily on uimaFIT, which allows for rapid and easy development of NLP processing pipelines, for wrapping existing tools, and for creating original UIMA components.

Components

Find out more about our bundled components.

Models/Languages

Various models covering different languages accompany the components.

Formats

Reading and writing various formats is just one line of code away.

Typesystem

Our typesystem is comprehensive, yet simple.

DKPro with Java

The original flavour. Use DKPro in your Java projects.

DKPro with Groovy

Create self-contained scripts using DKPro and Groovy!

DKPro with Jython

Easily integrate DKPro into your Python projects!

How to cite

Many of the wrapped third-party components and the models they use should be cited individually. We currently do not provide a comprehensive overview of citable publications, so we encourage you to track down the relevant publications for these dependencies yourself. You may find pointers to some of them in the Model overview of the DKPro Core release you are using or in the JavaDoc of individual components.

Please cite DKPro Core itself as:

Eckart de Castilho, R. and Gurevych, I. (2014). A broad-coverage collection of portable NLP components for building shareable analysis pipelines. In Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT (OIAF4HLT) at COLING 2014, pp. 1-11, Dublin, Ireland.

A comprehensive (but probably not complete) list of scientific publications citing DKPro Core can be found on Google Scholar.

License

All components in DKPro Core are licensed under the Apache Software License (ASL) version 2 or, where necessary, under the GPL. Their dependencies, however, may be licensed differently:

IMPORTANT LICENSE NOTE - While a component’s own source code may be licensed under the ASL, individual components might make use of third-party libraries or products that are not licensed under the ASL, such as LGPL libraries or libraries that are free for research but may not be used in commercial scenarios. Please be aware of the third-party licenses and respect them. Note also that parts of DKPro Core are distributed under the GPL because they depend on GPL-licensed third-party libraries.

ASL and GPL modules can be identified by the “-asl” and “-gpl” suffixes in their artifactIDs.
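The suffix convention can be checked mechanically, for example when auditing the dependencies of a project. The following is a minimal sketch, not part of DKPro Core: the class `LicenseSuffix`, the enum `License`, and the method `classify` are hypothetical names introduced here for illustration, and the two artifactIds are used only as sample input. It reports the license of the module itself and says nothing about the licenses of that module's third-party dependencies.

```java
import java.util.List;

// Hypothetical helper (not part of DKPro Core): maps an artifactId to the
// license indicated by its "-asl" or "-gpl" suffix.
public class LicenseSuffix {

    public enum License { ASL, GPL, UNKNOWN }

    // Returns the license implied by the artifactId suffix, or UNKNOWN if
    // the artifactId carries neither suffix.
    public static License classify(String artifactId) {
        if (artifactId.endsWith("-asl")) {
            return License.ASL;
        }
        if (artifactId.endsWith("-gpl")) {
            return License.GPL;
        }
        return License.UNKNOWN;
    }

    public static void main(String[] args) {
        // Sample artifactIds, used here purely as illustrative input.
        List<String> artifactIds = List.of(
                "de.tudarmstadt.ukp.dkpro.core.opennlp-asl",
                "de.tudarmstadt.ukp.dkpro.core.stanfordnlp-gpl");
        for (String id : artifactIds) {
            System.out.println(id + " -> " + classify(id));
        }
    }
}
```

A result of `ASL` only describes the wrapper module. The libraries and models it pulls in still need to be checked individually, as the license note above explains.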

About DKPro Core

This project was initiated by the Ubiquitous Knowledge Processing Lab (UKP) at the Technische Universität Darmstadt, Germany, under the auspices of Prof. Iryna Gurevych. It is now jointly developed by the UKP Lab (Technische Universität Darmstadt), the Language Technology Lab (Universität Duisburg-Essen), and other contributors.

Image sources: LogoJava.png by Christian F. Burprich, Creative Commons (Attribution-Noncommercial-Share Alike 3.0 Unported), color changed; LogoPython.png by IFA; LogoGroovy.png by pictonic.co; IconComponents.png, IconModels.png by Visual Pharm; IconFormatText.png, IconFormatBlank.png by Honza Dousek; IconTypeSystem.png by Designmodo