This document provides detailed information about the DKPro Core type system.

The DKPro Core type system forms the interface between all the integrated components. Components store and retrieve their data from the UIMA CAS based on this type system. The type system design uses a rather flat hierarchy and a mostly loose coupling between annotations. It is offered as a set of modules, not as a single monolithic type system.

Types

Table 1. Top-level Types
Type Description

Anomaly

No description

ArticleInfo

Contains basic information about the article.

Chunk

No description

Compound

This type represents a compound word, e.g. flowerpot.

Constituent

No description

CoreferenceLink

A link in the coreference chain.

DBConfig

Database configuration for the connection to the database from which the CAS data was retrieved.

Dependency

A dependency relation between two tokens.

DiscourseArgument

Discourse argument (arg1, arg2)

DiscourseAttribution

Attribution annotation (see PDTB for details); it is not connected to any particular relation because it may belong to two relations, and is thus covered by DiscourseRelation.

DiscourseConnective

Discourse connective

DiscourseRelation

Discourse relation

Div

Document structure element.

Field

No description

Lemma

No description

LexicalPhrase

No description

MetaDataStringField

A general purpose annotation to store document-wide information in the form of arbitrary key-value string pairs.

Morpheme

No description

MorphologicalFeatures

Morphological categories that can be attached to tokens.

NGram

No description

NamedEntity

Named entities refer to e.g. persons, locations, and organizations.

POS

The part of speech of a word or a phrase.

PennTree

The Penn Treebank-style phrase structure string.

PhoneticTranscription

Represents the phonetic transcription of some textual element (usually a Token).

RSTTreeNode

RST Tree node

ReadabilityScore

No description

SemArg

The SemArg annotation is attached to semantic arguments of semantic predicates.

SemPred

One of the predicates of a sentence (often a main verb, but nouns and adjectives can also be predicates).

SemanticArgument

The SemanticArgument annotation is attached to semantic arguments of semantic predicates.

SemanticField

The SemanticField is a coarse-grained semantic category that can be attached to nouns, verbs or adjectives.

SemanticPredicate

One of the predicates of a sentence (often a main verb, but nouns and adjectives can also be predicates).

Sentence

No description

SofaChangeAnnotation

Encodes an edit operation that can be interpreted by the ApplyChangesAnnotator.

Split

This type represents a part of a compound word.

StanfordSentimentAnnotation

Stanford CoreNLP Sentiment annotation

Stem

No description

StopWord

No description

SuggestedAction

No description

SurfaceForm

This annotation can be used to indicate an alternate surface form.

TagsetDescription

Information about a tagset (controlled vocabulary).

Tfidf

Annotates the tf.idf score of a token, stem, or lemma.

TimerAnnotation

Used for storing timing information (e.g. for performance testing).

Token

Token is one of the two types commonly produced by a segmenter (the other being Sentence).

TokenForm

An alternative token text which should be used instead of the covered text if set on a token.

TopicDistribution

An array representing the topic proportions in a document.

WikipediaLink

Wikipedia link

WikipediaRevision

Represents a revision in Wikipedia.

WordEmbedding

An array representing the word embedding vector.

WordSense

No description

XmlDocument

XML document

XmlNode

Supertype for XmlElements and XmlTextNodes.

Anomalies

Anomaly

Description
Features of Anomaly (3)
description (String)

No description

suggestions (FSArray of SuggestedAction)

An array of the suggested actions to be taken for this anomaly.

category (String)

No description

Table 2. Producers and consumers of Anomaly

Producers

None declared

Consumers

None declared

Table 3. Sub-types of Anomaly (2)
Type Description

GrammarAnomaly

No description

SpellingAnomaly

No description

SuggestedAction

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.anomaly.type.SuggestedAction

Name

de.tudarmstadt.ukp.dkpro.core.api.anomaly.type.SuggestedAction

Supertype

Annotation

Description
Features of SuggestedAction (2)
replacement (String)

The text covered by the Anomaly annotation should be replaced with the contents of this feature.

certainty (Float)

A score representing how certain this suggested action is. Usually in [0,1].

Table 4. Producers and consumers of SuggestedAction

Producers

JazzyChecker

Consumers

None declared
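
As a minimal sketch of how a consumer might use the replacement and certainty features, the following stand-alone Java code picks the suggestion with the highest certainty and applies its replacement to the span covered by the anomaly. The Suggestion class and the applyBest method are hypothetical stand-ins introduced for illustration; they are not part of the DKPro Core API, and in a real pipeline the values would be read from the CAS.

```java
import java.util.Arrays;
import java.util.List;

class SuggestedActionSketch {
    // Hypothetical stand-in for a SuggestedAction: replacement text plus certainty.
    static final class Suggestion {
        final String replacement;
        final float certainty;

        Suggestion(String replacement, float certainty) {
            this.replacement = replacement;
            this.certainty = certainty;
        }
    }

    // Replaces text[begin, end) (the span covered by the anomaly) with the
    // replacement of the most certain suggestion. Returns the text unchanged
    // if there are no suggestions.
    static String applyBest(String text, int begin, int end, List<Suggestion> suggestions) {
        Suggestion best = null;
        for (Suggestion s : suggestions) {
            if (best == null || s.certainty > best.certainty) {
                best = s;
            }
        }
        if (best == null) {
            return text;
        }
        return text.substring(0, begin) + best.replacement + text.substring(end);
    }

    public static void main(String[] args) {
        List<Suggestion> suggestions = Arrays.asList(
                new Suggestion("their", 0.3f),
                new Suggestion("the", 0.9f));
        // "teh" covers offsets 5 to 8 in the sample text.
        System.out.println(applyBest("Read teh manual", 5, 8, suggestions)); // prints "Read the manual"
    }
}
```

Picking the maximum is only one possible policy; a consumer could equally apply a certainty threshold or present all suggestions to a user.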

GrammarAnomaly

Description
Table 5. Producers and consumers of GrammarAnomaly

Producers

LanguageToolChecker

Consumers

None declared

SpellingAnomaly

Description
Table 6. Producers and consumers of SpellingAnomaly

Producers

JazzyChecker

Consumers

SpellingNormalizer

Coreference

Figure 1. Coreference types

This type system contains two types: CoreferenceChain and CoreferenceLink. The CoreferenceChain marks the beginning of a chain. It points to the first CoreferenceLink in the chain. Each CoreferenceLink then points to the next link.
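
The chain structure described above is essentially a singly linked list. The following stand-alone Java sketch illustrates the traversal; the Chain and Link classes are simplified, hypothetical stand-ins for CoreferenceChain and CoreferenceLink (they carry only the covered text and the first/next pointers) and do not use the UIMA or DKPro Core APIs.

```java
import java.util.ArrayList;
import java.util.List;

class CoreferenceChainSketch {
    // Hypothetical stand-in for CoreferenceLink: covered text plus pointer to the next link.
    static final class Link {
        final String coveredText;
        Link next; // null for the last link in the chain

        Link(String coveredText) {
            this.coveredText = coveredText;
        }
    }

    // Hypothetical stand-in for CoreferenceChain: it only points to the first link.
    static final class Chain {
        Link first;
    }

    // Builds a chain whose links cover the given texts, in order.
    static Chain chainOf(String... texts) {
        Chain chain = new Chain();
        Link previous = null;
        for (String text : texts) {
            Link link = new Link(text);
            if (previous == null) {
                chain.first = link;
            } else {
                previous.next = link;
            }
            previous = link;
        }
        return chain;
    }

    // Collects the covered text of every link by following first -> next -> ...
    static List<String> mentions(Chain chain) {
        List<String> result = new ArrayList<>();
        for (Link link = chain.first; link != null; link = link.next) {
            result.add(link.coveredText);
        }
        return result;
    }

    public static void main(String[] args) {
        Chain chain = chainOf("Ada Lovelace", "she", "her");
        System.out.println(mentions(chain)); // prints "[Ada Lovelace, she, her]"
    }
}
```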

CoreferenceChain

Description

Marks the beginning of a chain.

Features of CoreferenceChain (1)
first (CoreferenceLink)

This is the first coreference link in the coreference chain.

Table 7. Producers and consumers of CoreferenceChain

Producers

CoreNlpCoreferenceResolver StanfordCoreferenceResolver Conll2012 (format) Tcf (format)

Consumers

Conll2012 (format) Tcf (format)

CoreferenceLink

Description

A link in the coreference chain.

Features of CoreferenceLink (3)
next (CoreferenceLink)

The next coreference link after the current one, if there is one.

referenceType (String)

The role or type which the covered text has in the coreference chain.

referenceRelation (String)

The type of relation between this link and the next link in the chain.

Table 8. Producers and consumers of CoreferenceLink

Producers

CoreNlpCoreferenceResolver StanfordCoreferenceResolver Tcf (format)

Consumers

Tcf (format)

Discourse

DiscourseArgument

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.discourse.type.pdtb.DiscourseArgument

Name

de.tudarmstadt.ukp.dkpro.core.api.discourse.type.pdtb.DiscourseArgument

Supertype

Annotation

Description

Discourse argument (arg1, arg2)

Features of DiscourseArgument (3)
parentRelationId (Integer)

ID of the parent relation

argumentNumber (Integer)

1 or 2

argumentType (String)

argument type, e.g. Cause, etc.

Table 9. Producers and consumers of DiscourseArgument

Producers

None declared

Consumers

None declared

DiscourseAttribution

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.discourse.type.pdtb.DiscourseAttribution

Name

de.tudarmstadt.ukp.dkpro.core.api.discourse.type.pdtb.DiscourseAttribution

Supertype

Annotation

Description

Attribution annotation (see PDTB for details); it is not connected to any particular relation because it may belong to two relations, and is thus covered by DiscourseRelation.

Features of DiscourseAttribution (1)
attributeId (Integer)

No description

Table 10. Producers and consumers of DiscourseAttribution

Producers

None declared

Consumers

None declared

DiscourseConnective

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.discourse.type.pdtb.DiscourseConnective

Name

de.tudarmstadt.ukp.dkpro.core.api.discourse.type.pdtb.DiscourseConnective

Supertype

Annotation

Description

Discourse connective

Features of DiscourseConnective (2)
connectiveType (String)

connective type

parentRelationId (Integer)

ID of the parent relation

Table 11. Producers and consumers of DiscourseConnective

Producers

None declared

Consumers

None declared

DiscourseRelation

Description
Features of DiscourseRelation (3)
relationType (String)

Relation type (elaboration, contrast, etc.)

arg1 (RSTTreeNode)

No description

arg2 (RSTTreeNode)

No description

Table 12. Producers and consumers of DiscourseRelation

Producers

None declared

Consumers

None declared

DiscourseRelation

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.discourse.type.pdtb.DiscourseRelation

Name

de.tudarmstadt.ukp.dkpro.core.api.discourse.type.pdtb.DiscourseRelation

Supertype

Annotation

Description

Discourse relation

Features of DiscourseRelation (3)
relationId (Integer)

id of the relation

arg1 (DiscourseArgument)

arg 1

arg2 (DiscourseArgument)

arg 2

Table 13. Producers and consumers of DiscourseRelation

Producers

None declared

Consumers

None declared

Table 14. Sub-types of DiscourseRelation (1)
Type Description

ExplicitDiscourseRelation

Discourse relation

EDU

Description
Features of EDU (1)
originalText (String)

No description

Table 15. Producers and consumers of EDU

Producers

None declared

Consumers

None declared

ExplicitDiscourseRelation

Description

Discourse relation

Features of ExplicitDiscourseRelation (2)
discourseConnective1 (DiscourseConnective)

Discourse connective (in case of explicit relations)

discourseConnective2 (DiscourseConnective)

Discourse connective (in case of explicit relations)

Table 16. Producers and consumers of ExplicitDiscourseRelation

Producers

None declared

Consumers

None declared

RSTTreeNode

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.discourse.type.rst.RSTTreeNode

Name

de.tudarmstadt.ukp.dkpro.core.api.discourse.type.rst.RSTTreeNode

Supertype

Annotation

Description

RST Tree node

Features of RSTTreeNode (1)
unitType (String)

N or S (nucleus/satellite)

Table 17. Producers and consumers of RSTTreeNode

Producers

None declared

Consumers

None declared

Table 18. Sub-types of RSTTreeNode (2)
Type Description

DiscourseRelation

No description

EDU

No description

ImplicitDiscourseRelation

Description

Implicit discourse relation

Table 19. Producers and consumers of ImplicitDiscourseRelation

Producers

None declared

Consumers

None declared

Metadata

Recording tagset and tag descriptions in the CAS is still a feature under development. It is not supported by all components and it is not yet well defined. Expect changes and enhancements to this feature in future versions of DKPro Core.

Figure 2. Metadata types

DocumentMetaData

Description

The DocumentMetaData annotation stores information about a single processed document. There can only be one of these annotations per CAS. The annotation is created by readers and contains information to uniquely identify the document from which a CAS was created. Writer components use this information when determining under which filename a CAS is stored.

There are two principal ways of identifying a document:

  • collection id / document id: this simple system identifies a document within a collection. The ID of the collection and the ID of the document are each simple strings without any further semantics such as a hierarchy. For this reason, this identification scheme is not well suited to preserving information about directory structures.
  • document base URI / document URI: this system identifies a document using a URI. The base URI is used to derive the relative path of the document with respect to the base location from which it has been read. For example, if the base URI is file:/texts and the document URI is file:/texts/english/text1.txt, then the relative path of the document is english/text1.txt. This information is used by writers to recreate the directory structure found under the base location in the target location.

It is possible and indeed common for a reader to initialize both systems of identification. If both systems are present, most writers default to using the URI-based system. However, most writers also allow forcing the use of the ID-based system.
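
The derivation of the relative path from the base URI and the document URI can be sketched with the standard java.net.URI class. This is only an illustration of the idea, not the code DKPro Core writers use; the class and method names are hypothetical.

```java
import java.net.URI;

class DocumentPathSketch {
    // Derives the path of the document relative to the base location.
    // Returns null when the document URI does not lie under the base URI.
    static String relativePath(String documentBaseUri, String documentUri) {
        // Make sure the base ends with a slash so that only whole path segments match.
        String base = documentBaseUri.endsWith("/") ? documentBaseUri : documentBaseUri + "/";
        URI relative = URI.create(base).relativize(URI.create(documentUri));
        // relativize() returns the document URI unchanged (still absolute)
        // when it is not located under the base URI.
        return relative.isAbsolute() ? null : relative.getPath();
    }

    public static void main(String[] args) {
        // The example from the text above.
        System.out.println(relativePath("file:/texts", "file:/texts/english/text1.txt"));
        // prints "english/text1.txt"
    }
}
```

A writer could then append this relative path to its target location to recreate the original directory structure.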

In addition to the features given here, there is a language feature inherited from UIMA's DocumentAnnotation. DKPro Core components expect a two letter ISO 639-1 language code there.
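
A quick way to check whether a string looks like a two-letter ISO 639-1 code is to compare it against the list the Java runtime provides. The sketch below uses only java.util.Locale; the class and method names are hypothetical, and it is not implied that DKPro Core performs this check itself.

```java
import java.util.Arrays;
import java.util.Locale;

class LanguageCodeSketch {
    // Returns true if the code is one of the lower-case two-letter
    // ISO 639 language codes known to the Java runtime.
    static boolean isTwoLetterLanguageCode(String code) {
        return code != null && Arrays.asList(Locale.getISOLanguages()).contains(code);
    }

    public static void main(String[] args) {
        System.out.println(isTwoLetterLanguageCode("en"));  // prints "true"
        System.out.println(isTwoLetterLanguageCode("eng")); // prints "false"
    }
}
```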

Features of DocumentMetaData (6)
documentTitle (String)

The human readable title of the document.

documentId (String)

The id of the document.

documentUri (String)

The URI of the document.

collectionId (String)

The ID of the whole document collection.

documentBaseUri (String)

Base URI of the document.

isLastSegment (Boolean)

CAS de-multipliers need to know whether a CAS is the last multiplied segment. Thus CAS multipliers should set this field to true for the last CAS they produce.
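
The role of isLastSegment can be illustrated with a small stand-alone sketch. The Segment class below is a hypothetical stand-in for a CAS produced by a multiplier; the split and merge methods only model the contract described above (flag the last segment, merge once it arrives) and do not use the UIMA API.

```java
import java.util.ArrayList;
import java.util.List;

class LastSegmentSketch {
    // Hypothetical stand-in for one CAS produced by a multiplier.
    static final class Segment {
        final String text;
        final boolean isLastSegment;

        Segment(String text, boolean isLastSegment) {
            this.text = text;
            this.isLastSegment = isLastSegment;
        }
    }

    // Multiplier side: splits a document into chunks of at most `size`
    // characters (size must be positive) and flags the final chunk.
    static List<Segment> split(String document, int size) {
        List<Segment> segments = new ArrayList<>();
        for (int begin = 0; begin < document.length(); begin += size) {
            int end = Math.min(begin + size, document.length());
            segments.add(new Segment(document.substring(begin, end), end == document.length()));
        }
        return segments;
    }

    // De-multiplier side: buffers segments until one flagged as last arrives,
    // then emits the merged document and starts a new buffer.
    static List<String> merge(List<Segment> stream) {
        List<String> documents = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        for (Segment segment : stream) {
            buffer.append(segment.text);
            if (segment.isLastSegment) {
                documents.add(buffer.toString());
                buffer.setLength(0);
            }
        }
        return documents;
    }

    public static void main(String[] args) {
        List<Segment> segments = split("abcdefg", 3); // "abc", "def", "g" (last)
        System.out.println(merge(segments)); // prints "[abcdefg]"
    }
}
```

Without the flag, the merging side would have no way of knowing when a document is complete.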

Table 20. Producers and consumers of DocumentMetaData

Producers

ApplyChangesAnnotator AclAnthology (format) Ancora (format) AnnotatedGigaword (format) BlikiWikipedia (format) Bnc (format) Concrete (format) Conll2000 (format) Conll2002 (format) Conll2003 (format) Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) Html (format) HtmlDocument (format) ImsCwb (format) Jdbc (format) Lcc (format) Lif (format) Lxf (format) NegraExport (format) Nif (format) Nitf (format) Pdf (format) PennTreebankChunked (format) PennTreebankCombined (format) Perseus (format) PubAnnotation (format) RTF (format) Reuters21578Sgml (format) Reuters21578Txt (format) String (format) Tcf (format) Tei (format) Text (format) TigerXml (format) Tika (format) TuebaDZ (format) Tuepp (format) WikipediaArticleInfo (format) WikipediaRevision (format) WikipediaRevisionPair (format) WikipediaTemplateFilteredArticle (format) Xmi (format) Xml (format) XmlDocument (format) XmlText (format) XmlXPath (format)

Consumers

ApplyChangesAnnotator BinaryCas (format) Concrete (format) Conll2000 (format) Conll2002 (format) Conll2003 (format) Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) DiTop (format) ImsCwb (format) InlineXml (format) Json (format) Lif (format) Lxf (format) Nif (format) PennTreebankCombined (format) PubAnnotation (format) SerializedCas (format) Tcf (format) Tei (format) Text (format) TigerXml (format) TokenizedText (format) WebannoTsv3X (format) Xmi (format) XmlDocument (format)

MetaDataStringField

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.metadata.type.MetaDataStringField

Name

de.tudarmstadt.ukp.dkpro.core.api.metadata.type.MetaDataStringField

Supertype

Annotation

Description

A general purpose annotation to store document-wide information in the form of arbitrary key-value string pairs.

Features of MetaDataStringField (2)
key (String)

Name of a metadata field.

value (String)

The field value.

Table 21. Producers and consumers of MetaDataStringField

Producers

MauiKeywordAnnotator

Consumers

None declared

TagDescription

Description

Description of an individual tag.

Features of TagDescription (1)
name (String)

The name of the tag.

Table 22. Producers and consumers of TagDescription

Producers

None declared

Consumers

None declared

TagsetDescription

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.metadata.type.TagsetDescription

Name

de.tudarmstadt.ukp.dkpro.core.api.metadata.type.TagsetDescription

Supertype

Annotation

Description

Information about a tagset (controlled vocabulary).

Features of TagsetDescription (9)
layer (String)

The layer to which the tagset applies. This is typically the name of a UIMA type such as "de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS".

name (String)

The name of the tagset.

tags (FSArray of TagDescription)

Descriptions of the tags belonging to this tagset.

componentName (String)

No description

modelLocation (String)

No description

modelVariant (String)

No description

modelLanguage (String)

No description

modelVersion (String)

No description

input (Boolean)

True if the tagset is used as input by the component/model, otherwise false.

Table 23. Producers and consumers of TagsetDescription

Producers

None declared

Consumers

None declared

Morphology

Figure 3. Morphology types

Morpheme

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.morph.Morpheme

Name

de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.morph.Morpheme

Supertype

Annotation

Description
Features of Morpheme (1)
morphTag (String)

No description

Table 24. Producers and consumers of Morpheme

Producers

MateMorphTagger

Consumers

None declared

MorphologicalFeatures

Description

Morphological categories that can be attached to tokens. The features are supposed to match the Universal Dependency v1 features.

Features of MorphologicalFeatures (19)
gender (String)

No description

number (String)

Singular/plural

case (String)

Nouns: nominative, genitive, dative, …

degree (String)

Adjectives: comparative/superlative

verbForm (String)

No description

tense (String)

Verbs: past tense, present tense, future tense, etc.

mood (String)

Verbs: indicative, imperative, subjunctive

voice (String)

Verbs: active/passive

definiteness (String)

Definite or indefinite

value (String)

The original morphological analysis results as produced by a tool or as recorded in a corpus (if available). If the categories were originally encoded in such a string, the other features are filled by analyzing this string. If the categories were provided separately, e.g. by different attributes in an XML-encoded corpus, this field may remain empty.

person (String)

Verbs: 1st, 2nd, 3rd person

aspect (String)

Verbs: perfective, imperfective

animacy (String)

No description

negative (String)

No description

numType (String)

No description

possessive (String)

No description

pronType (String)

No description

reflex (String)

No description

transitivity (String)

Verbs: transitive/intransitive

@deprecated
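
To illustrate how the individual features might be filled by analyzing the value string, the sketch below parses a string in the Universal Dependencies notation (Name=Value pairs separated by "|"). That the value string uses this notation is an assumption made for the example; tools and corpora may use other encodings. The class and method names are hypothetical and not part of DKPro Core.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MorphFeaturesSketch {
    // Parses a string such as "Case=Nom|Number=Sing" into a map keyed by
    // feature name with a lower-case first letter ("Case" -> "case",
    // "VerbForm" -> "verbForm"). Entries without "=" are skipped.
    static Map<String, String> parse(String value) {
        Map<String, String> features = new LinkedHashMap<>();
        if (value == null || value.isEmpty() || value.equals("_")) {
            return features; // "_" marks "no features" in UD-style files
        }
        for (String pair : value.split("\\|")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                String name = Character.toLowerCase(pair.charAt(0)) + pair.substring(1, eq);
                features.put(name, pair.substring(eq + 1));
            }
        }
        return features;
    }

    public static void main(String[] args) {
        System.out.println(parse("Case=Nom|Number=Sing|VerbForm=Fin"));
        // prints "{case=Nom, number=Sing, verbForm=Fin}"
    }
}
```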

Table 25. Producers and consumers of MorphologicalFeatures

Producers

MateMorphTagger RfTagger SfstAnnotator UDPipePosTagger Conll2006 (format) Conll2008 (format) Conll2009 (format) ConllU (format)

Consumers

UDPipeParser Conll2006 (format) Conll2008 (format) Conll2009 (format) ConllU (format)

POS

Description

The part of speech of a word or a phrase.

Features of POS (2)
PosValue (String)

Fine-grained POS tag. This is the tag as produced by a POS tagger or obtained from a reader.

coarseValue (String)

Coarse-grained POS tag. This may be produced by a POS tagger or reader in addition to the fine-grained tag.

Table 26. Producers and consumers of POS

Producers

ArktweetPosTagger ClearNlpPosTagger CoreNlpPosTagger HepplePosTagger HunPosTagger LingPipePosTagger MatePosTagger MeCabTagger Nlp4JPosTagger OpenNlpPosTagger PosMapper RfTagger SfstAnnotator StanfordPosTagger TreeTaggerPosTagger UDPipePosTagger Ancora (format) Bnc (format) Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) ImsCwb (format) Lxf (format) NegraExport (format) Nif (format) PennTreebankChunked (format) PennTreebankCombined (format) Perseus (format) Tcf (format) Tei (format) TigerXml (format) Tuepp (format)

Consumers

ArktweetPosTaggerTrainer ClearNlpLemmatizer ClearNlpParser ClearNlpSemanticRoleLabeler CoreNlpCoreferenceResolver CoreNlpDependencyParser CoreNlpLemmatizer CoreNlpParser GermanSeparatedParticleAnnotator IxaLemmatizer MaltParser MateParser MateSemanticRoleLabeler MorphaLemmatizer MstParser Nlp4JDependencyParser Nlp4JLemmatizer Nlp4JNamedEntityRecognizer OpenNlpChunker OpenNlpLemmatizer OpenNlpPosTaggerTrainer PosFilter PosMapper SemanticFieldAnnotator StanfordCoreferenceResolver StanfordLemmatizer StanfordParser StanfordPosTaggerTrainer TokenMerger TreeTaggerChunker UDPipeParser Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) ImsCwb (format) Lxf (format) Nif (format) PennTreebankCombined (format) Tcf (format) Tei (format) TigerXml (format) XcesXml (format)

Table 27. Sub-types of POS (43)
Type Description

ADJ

Adjective

@deprecated Use POS_ADJ instead

ADP

Adposition

@deprecated Use POS_ADP instead

ADV

Adverb

@deprecated Use POS_ADV instead

ART

Determiners and articles.

@deprecated Use POS_DET instead

AUX

Auxiliary verb

@deprecated Use POS_AUX instead

CARD

Numerals

@deprecated Use POS_NUM instead

CONJ

Conjunction

@deprecated Use POS_CONJ instead

DET

Determiner

@deprecated Use POS_DET instead

INTJ

Interjection

@deprecated Use POS_INTJ instead

N

Nouns

@deprecated Use POS_NOUN instead

NOUN

Noun

@deprecated Use POS_NOUN instead

NUM

Numeral

@deprecated Use POS_NUM instead

O

Catch-all for other categories such as abbreviations or foreign words

@deprecated Use POS_X instead

PART

Particle

@deprecated Use POS_PART instead

POS_ADJ

Adjective

POS_ADP

Adposition

POS_ADV

Adverb

POS_AUX

Auxiliary verb

POS_CONJ

Conjunction

POS_DET

Determiner

POS_INTJ

Interjection

POS_NOUN

Noun

POS_NUM

Numeral

POS_PART

Particle

POS_PRON

Pronoun

POS_PROPN

Proper noun

POS_PUNCT

Punctuation

POS_SCONJ

Subordinating conjunction

POS_SYM

Symbol

POS_VERB

Verb

POS_X

Other

PP

Prepositions and postpositions

@deprecated Use POS_ADP instead

PR

Pronoun

@deprecated Use POS_PRON instead

PRON

Pronoun

@deprecated Use POS_PRON instead

PROPN

Proper noun

@deprecated Use POS_PROPN instead

PRT

Particles

@deprecated Use POS_PART instead

PUNC

Punctuation marks

@deprecated Use POS_PUNCT instead

PUNCT

Punctuation

@deprecated Use POS_PUNCT instead

SCONJ

Subordinating conjunction

@deprecated Use POS_SCONJ instead

SYM

Symbol

@deprecated Use POS_SYM instead

V

Verbs

@deprecated Use POS_VERB instead

VERB

Verb

@deprecated Use POS_VERB instead

X

Other

@deprecated Use POS_X instead
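
When migrating code away from the deprecated types, the replacements listed in the table above can be captured in a simple lookup. The following sketch covers the entries that the table marks as @deprecated and works on short type names only (the real types are fully qualified); the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class PosMigrationSketch {
    // Deprecated short type name -> replacement, taken from the sub-type table.
    static final Map<String, String> REPLACEMENTS = new HashMap<>();

    static {
        // Deprecated types whose replacement is the same name with a POS_ prefix.
        String[] sameName = {"ADJ", "ADP", "ADV", "AUX", "CONJ", "DET", "INTJ", "NOUN",
                "NUM", "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X"};
        for (String name : sameName) {
            REPLACEMENTS.put(name, "POS_" + name);
        }
        // Deprecated types that map to a differently named replacement.
        REPLACEMENTS.put("CARD", "POS_NUM");
        REPLACEMENTS.put("N", "POS_NOUN");
        REPLACEMENTS.put("O", "POS_X");
        REPLACEMENTS.put("PP", "POS_ADP");
        REPLACEMENTS.put("PR", "POS_PRON");
        REPLACEMENTS.put("PRT", "POS_PART");
        REPLACEMENTS.put("PUNC", "POS_PUNCT");
        REPLACEMENTS.put("V", "POS_VERB");
    }

    // Returns the replacement for a deprecated type name, or the name itself
    // if it is not in the lookup (e.g. because it is already a POS_ type).
    static String migrate(String typeName) {
        return REPLACEMENTS.getOrDefault(typeName, typeName);
    }

    public static void main(String[] args) {
        System.out.println(migrate("CARD"));    // prints "POS_NUM"
        System.out.println(migrate("POS_ADJ")); // prints "POS_ADJ"
    }
}
```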

ADJ

Description

Adjective @deprecated Use POS_ADJ instead

Table 28. Producers and consumers of ADJ

Producers

None declared

Consumers

None declared

ADP

Description

Adposition @deprecated Use POS_ADP instead

Table 29. Producers and consumers of ADP

Producers

None declared

Consumers

None declared

ADV

Description

Adverb @deprecated Use POS_ADV instead

Table 30. Producers and consumers of ADV

Producers

None declared

Consumers

None declared

ART

Description

Determiners and articles. @deprecated Use POS_DET instead

Table 31. Producers and consumers of ART

Producers

None declared

Consumers

None declared

AT

Description

at-mention (indicates another user as a recipient of a tweet) @deprecated Use POS_AT instead

Table 32. Producers and consumers of AT

Producers

None declared

Consumers

None declared

AUX

Description

Auxiliary verb @deprecated Use POS_AUX instead

Table 33. Producers and consumers of AUX

Producers

None declared

Consumers

None declared

CARD

Description

Numerals @deprecated Use POS_NUM instead

Table 34. Producers and consumers of CARD

Producers

None declared

Consumers

None declared

CONJ

Description

Conjunction @deprecated Use POS_CONJ instead

Table 35. Producers and consumers of CONJ

Producers

None declared

Consumers

None declared

DET

Description

Determiner @deprecated Use POS_DET instead

Table 36. Producers and consumers of DET

Producers

None declared

Consumers

None declared

DM

Description

discourse marker, indications of continuation of a message across multiple tweets @deprecated Use POS_DM instead

Table 37. Producers and consumers of DM

Producers

None declared

Consumers

None declared

EMO

Description

emoticon @deprecated Use POS_EMO instead

Table 38. Producers and consumers of EMO

Producers

None declared

Consumers

None declared

HASH

Description

Hashtag (indicates topic/category for tweet) @deprecated Use POS_HASH instead

Table 39. Producers and consumers of HASH

Producers

None declared

Consumers

None declared

INT

Description

proper noun + verbal @deprecated Use POS_INT instead

Table 40. Producers and consumers of INT

Producers

None declared

Consumers

None declared

INTJ

Description

Interjection @deprecated Use POS_INTJ instead

Table 41. Producers and consumers of INTJ

Producers

None declared

Consumers

None declared

N

Description

Nouns @deprecated Use POS_NOUN instead

Table 42. Producers and consumers of N

Producers

None declared

Consumers

None declared

Table 43. Sub-types of N (4)
Type Description

NN

Common noun

@deprecated Use POS_NOUN instead

NNV

nominal + verbal

@deprecated Use POS_NNV instead

NP

Proper noun

@deprecated Use POS_PROPN instead

NPV

proper noun + verbal

@deprecated Use POS_NPV instead

NN

Description

Common noun @deprecated Use POS_NOUN instead

Table 44. Producers and consumers of NN

Producers

None declared

Consumers

None declared

NNV

Description

nominal + verbal @deprecated Use POS_NNV instead

Table 45. Producers and consumers of NNV

Producers

None declared

Consumers

None declared

NOUN

Description

Noun @deprecated Use POS_NOUN instead

Table 46. Producers and consumers of NOUN

Producers

None declared

Consumers

None declared

NP

Description

Proper noun @deprecated Use POS_PROPN instead

Table 47. Producers and consumers of NP

Producers

None declared

Consumers

None declared

NPV

Description

proper noun + verbal @deprecated Use POS_NPV instead

Table 48. Producers and consumers of NPV

Producers

None declared

Consumers

None declared

NUM

Description

Numeral @deprecated Use POS_NUM instead

Table 49. Producers and consumers of NUM

Producers

None declared

Consumers

None declared

O

Description

Catch-all for other categories such as abbreviations or foreign words @deprecated Use POS_X instead

Table 50. Producers and consumers of O

Producers

None declared

Consumers

None declared

Table 51. Sub-types of O (6)
Type Description

AT

at-mention (indicates another user as a recipient of a tweet)

@deprecated Use POS_AT instead

DM

discourse marker, indications of continuation of a message across multiple tweets

@deprecated Use POS_DM instead

EMO

emoticon

@deprecated Use POS_EMO instead

HASH

Hashtag (indicates topic/category for tweet)

@deprecated Use POS_HASH instead

INT

proper noun + verbal

@deprecated Use POS_INT instead

URL

URL or email address

@deprecated Use POS_URL instead

PART

Description

Particle @deprecated Use POS_PART instead

Table 52. Producers and consumers of PART

Producers

None declared

Consumers

None declared

POS_ADJ

Description

Adjective

Table 53. Producers and consumers of POS_ADJ

Producers

None declared

Consumers

None declared

POS_ADP

Description

Adposition

Table 54. Producers and consumers of POS_ADP

Producers

None declared

Consumers

None declared

POS_ADV

Description

Adverb

Table 55. Producers and consumers of POS_ADV

Producers

None declared

Consumers

None declared

POS_AT

Description

at-mention (indicates another user as a recipient of a tweet)

Table 56. Producers and consumers of POS_AT

Producers

None declared

Consumers

None declared

POS_AUX

Description

Auxiliary verb

Table 57. Producers and consumers of POS_AUX

Producers

None declared

Consumers

None declared

POS_CONJ

Description

Conjunction

Table 58. Producers and consumers of POS_CONJ

Producers

None declared

Consumers

None declared

POS_DET

Description

Determiner

Table 59. Producers and consumers of POS_DET

Producers

None declared

Consumers

None declared

POS_DM

Description

discourse marker, indications of continuation of a message across multiple tweets

Table 60. Producers and consumers of POS_DM

Producers

None declared

Consumers

None declared

POS_EMO

Description

emoticon

Table 61. Producers and consumers of POS_EMO

Producers

None declared

Consumers

None declared

POS_HASH

Description

Hashtag (indicates topic/category for tweet)

Table 62. Producers and consumers of POS_HASH

Producers

None declared

Consumers

None declared

POS_INT

Description

proper noun + verbal

Table 63. Producers and consumers of POS_INT

Producers

None declared

Consumers

None declared

POS_INTJ

Description

Interjection

Table 64. Producers and consumers of POS_INTJ

Producers

None declared

Consumers

None declared

POS_NNV

Description

nominal + verbal

Table 65. Producers and consumers of POS_NNV

Producers

None declared

Consumers

None declared

POS_NOUN

Description

Noun

Table 66. Producers and consumers of POS_NOUN

Producers

None declared

Consumers

None declared

Table 67. Sub-types of POS_NOUN (2)
Type Description

POS_NNV

nominal + verbal

POS_NPV

proper noun + verbal

POS_NPV

Description

proper noun + verbal

Table 68. Producers and consumers of POS_NPV

Producers

None declared

Consumers

None declared

POS_NUM

Description

Numeral

Table 69. Producers and consumers of POS_NUM

Producers

None declared

Consumers

None declared

POS_PART

Description

Particle

Table 70. Producers and consumers of POS_PART

Producers

None declared

Consumers

None declared

POS_PRON

Description

Pronoun

Table 71. Producers and consumers of POS_PRON

Producers

None declared

Consumers

None declared

POS_PROPN

Description

Proper noun

Table 72. Producers and consumers of POS_PROPN

Producers

None declared

Consumers

None declared

POS_PUNCT

Description

Punctuation

Table 73. Producers and consumers of POS_PUNCT

Producers

None declared

Consumers

None declared

POS_SCONJ

Description

Subordinating conjunction

Table 74. Producers and consumers of POS_SCONJ

Producers

None declared

Consumers

None declared

POS_SYM

Description

Symbol

Table 75. Producers and consumers of POS_SYM

Producers

None declared

Consumers

None declared

POS_URL

Description

URL or email address

Table 76. Producers and consumers of POS_URL

Producers

None declared

Consumers

None declared

POS_VERB

Description

Verb

Table 77. Producers and consumers of POS_VERB

Producers

None declared

Consumers

None declared

POS_X

Description

Other

Table 78. Producers and consumers of POS_X

Producers

None declared

Consumers

None declared

Table 79. Sub-types of POS_X (6)
Type Description

POS_AT

at-mention (indicates another user as a recipient of a tweet)

POS_DM

discourse marker, indications of continuation of a message across multiple tweets

POS_EMO

emoticon

POS_HASH

Hashtag (indicates topic/category for tweet)

POS_INT

proper noun + verbal

POS_URL

URL or email address

PP

Description

Prepositions and postpositions @deprecated Use POS_ADP instead

Table 80. Producers and consumers of PP

Producers

None declared

Consumers

None declared

PR

Description

Pronoun @deprecated Use POS_PRON instead

Table 81. Producers and consumers of PR

Producers

None declared

Consumers

None declared

PRON

Description

Pronoun @deprecated Use POS_PRON instead

Table 82. Producers and consumers of PRON

Producers

None declared

Consumers

None declared

PROPN

Description

Proper noun @deprecated Use POS_PROPN instead

Table 83. Producers and consumers of PROPN

Producers

None declared

Consumers

None declared

PRT

Description

Particles @deprecated Use POS_PART instead

Table 84. Producers and consumers of PRT

Producers

None declared

Consumers

None declared

PUNC

Description

Punctuation marks @deprecated Use POS_PUNCT instead

Table 85. Producers and consumers of PUNC

Producers

None declared

Consumers

None declared

PUNCT

Description

Punctuation @deprecated Use POS_PUNCT instead

Table 86. Producers and consumers of PUNCT

Producers

None declared

Consumers

None declared

SCONJ

Description

Subordinating conjunction @deprecated Use POS_SCONJ instead

Table 87. Producers and consumers of SCONJ

Producers

None declared

Consumers

None declared

SYM

Description

Symbol @deprecated Use POS_SYM instead

Table 88. Producers and consumers of SYM

Producers

None declared

Consumers

None declared

URL

Description

URL or email address @deprecated Use POS_URL instead

Table 89. Producers and consumers of URL

Producers

None declared

Consumers

None declared

V

Description

Verbs @deprecated Use POS_VERB instead

Table 90. Producers and consumers of V

Producers

None declared

Consumers

None declared

VERB

Description

Verb @deprecated Use POS_VERB instead

Table 91. Producers and consumers of VERB

Producers

None declared

Consumers

None declared

X

Description

Other @deprecated Use POS_X instead

Table 92. Producers and consumers of X

Producers

None declared

Consumers

None declared

NYTArticleMetaData

ArticleMetaData

Description

A document annotation that describes the metadata of a newspaper article.

Features of ArticleMetaData (15)
guid (Integer)

The GUID field specifies a (4-byte) integer that is guaranteed to be unique for every document in the corpus.

alternateUrl (String)

This field specifies the location on nytimes.com of the article. When present, this URL is preferred to the URL field on articles published on or after April 02, 2006, as the linked page will have richer content.

url (String)

This field specifies the location on nytimes.com of the article. The 'Alternative Url' field is preferred to this field on articles published on or after April 02, 2006, as the linked page will have richer content.

publicationDate (String)

This field specifies the date of the article’s publication. This field is specified in the format YYYYMMDD’T’HHMMSS where:

  1. YYYY is the four-digit year.

  2. MM is the two-digit month [01-12].

  3. DD is the two-digit day [01-31].

  4. T is a constant value.

  5. HH is the two-digit hour [00-23].

  6. MM is the two-digit minute-past-the-hour [00-59].

  7. SS is the two-digit seconds-past-the-minute [00-59].

Please note that values for HH, MM, and SS are not defined for this corpus, that is to say HH, MM, and SS are always defined to be '00'.
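As an illustration of the publicationDate format, the following minimal Python sketch parses such a value with the standard library. The function name parse_publication_date and the sample value are assumptions made for this example; they are not part of DKPro Core.

```python
from datetime import datetime

# strptime pattern for YYYYMMDD'T'HHMMSS (the 'T' is a literal character).
NYT_DATE_FORMAT = "%Y%m%dT%H%M%S"


def parse_publication_date(value: str) -> datetime:
    """Parse a publicationDate string such as '20060402T000000'."""
    return datetime.strptime(value, NYT_DATE_FORMAT)


# Hypothetical sample value; the time part is always zero in this corpus.
published = parse_publication_date("20060402T000000")
print(published.date().isoformat())
```

Since the time components are always '00' in this corpus, only the date part of the parsed value carries information.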

typesOfMaterial (StringArray)

This field specifies a normalized list of terms describing the general editorial category of the article. These tags are algorithmically assigned and manually verified by nytimes.com production staff. Examples Include:

  • REVIEW

  • OBITUARY

  • ANALYSIS

headline (String)

This field specifies the headline of the article as it appeared in the print edition of the New York Times.

onlineHeadline (String)

This field specifies the headline displayed with the article on nytimes.com. Often this differs from the headline used in print.

columnName (String)

If the article is part of a regular column, this field specifies the name of that column. Sample Column Names:

  1. World News Briefs

  2. WEDDINGS

  3. The Accessories Channel

author (String)

This field is based on the normalized byline in the original corpus data: "The Normalized Byline field is the byline normalized to the form (last name, first name)".

descriptors (StringArray)

The 'descriptors' field specifies a list of descriptive terms drawn from a normalized controlled vocabulary corresponding to subjects mentioned in the article. These tags are hand-assigned by a team of library scientists working in the New York Times Indexing service. Examples Include:

  • ECONOMIC CONDITIONS AND TRENDS

  • AIRPLANES

  • VIOLINS

onlineDescriptors (StringArray)

This field specifies a list of descriptors from a normalized controlled vocabulary that correspond to topics mentioned in the article. These tags are algorithmically assigned and manually verified by nytimes.com production staff. Examples Include:

  • Marriages

  • Parks and Other Recreation Areas

  • Cooking and Cookbooks

generalOnlineDescriptors (String)

The 'general online descriptors' field specifies a list of descriptors that are at a higher level of generality than the other tags associated with the article. These tags are algorithmically assigned and manually verified by nytimes.com production staff. Examples Include:

  • Surfing

  • Venice Biennale

  • Ranches

onlineSection (String)

This field specifies the section(s) on nytimes.com in which the article is placed. If the article is placed in multiple sections, this field is specified as a ';'-delimited list.

section (String)

This field specifies the section of the paper in which the article appears. This is not the name of the section, but rather a letter or number that indicates the section.

taxonomicClassifiers (StringArray)

This field specifies a list of taxonomic classifiers that place this article into a hierarchy of articles. The individual terms of each taxonomic classifier are separated with the '/' character. These tags are algorithmically assigned and manually verified by nytimes.com production staff. Examples Include:

  • Top/Features/Travel/Guides/Destinations/North America/United States/Arizona

  • Top/News/U.S./Rockies

  • Top/Opinion
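The '/'-separated structure of a taxonomic classifier can be unpacked with plain string operations. The sketch below is only an illustration; the helper names classifier_terms and classifier_ancestors are hypothetical and do not exist in DKPro Core.

```python
from typing import List


def classifier_terms(classifier: str) -> List[str]:
    """Split a taxonomic classifier into its individual terms."""
    return classifier.split("/")


def classifier_ancestors(classifier: str) -> List[str]:
    """Return every prefix of the classifier, from the root down to the full path."""
    terms = classifier_terms(classifier)
    return ["/".join(terms[:i]) for i in range(1, len(terms) + 1)]


print(classifier_terms("Top/News/U.S./Rockies"))
print(classifier_ancestors("Top/Opinion"))
```

Listing the ancestors is useful when articles should be grouped at a coarser level of the hierarchy than the one they were tagged with.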

Table 93. Producers and consumers of ArticleMetaData

Producers

Nitf (format)

Consumers

None declared

Phonetics

Figure 4. Phonetics types

PhoneticTranscription

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.phonetics.type.PhoneticTranscription

Name

de.tudarmstadt.ukp.dkpro.core.api.phonetics.type.PhoneticTranscription

Supertype

Annotation

Description

Represents the phonetic transcription of some textual element (usually a Token). Phonetic transcriptions are e.g. generated by transcription processes like Soundex or Metaphone.

Features of PhoneticTranscription (2)
transcription (String)

The actual transcription

name (String)

The name of the transcription process that was used
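To make the two features concrete, here is a small Python sketch that computes an American Soundex code and stores it together with the name of the process. It is a simplified stand-alone illustration for ASCII letters, not the code used by SoundexPhoneticTranscriptor; the dataclass and function names are assumptions.

```python
from dataclasses import dataclass

# Soundex digit for each consonant group; vowels, H, W and Y have no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _letter in _letters:
        _CODES[_letter] = _digit


def soundex(word: str) -> str:
    """Return the four-character American Soundex code of a word."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = _CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(letter, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the previous code
    return (result + "000")[:4]


@dataclass
class PhoneticTranscription:
    """Stand-in for the annotation: offsets plus the two documented features."""
    begin: int
    end: int
    transcription: str
    name: str


text = "Robert"
annotation = PhoneticTranscription(0, len(text), soundex(text), "Soundex")
print(annotation)
```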

Table 94. Producers and consumers of PhoneticTranscription

Producers

ColognePhoneticTranscriptor DoubleMetaphonePhoneticTranscriptor MetaphonePhoneticTranscriptor SoundexPhoneticTranscriptor

Consumers

None declared

Phrase

LexicalPhrase

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.LexicalPhrase

Name

de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.LexicalPhrase

Supertype

Annotation

Description
Features of LexicalPhrase (1)
text (String)

No description

Table 95. Producers and consumers of LexicalPhrase

Producers

None declared

Consumers

None declared

ReadabilityScore

ReadabilityScore

Description
Features of ReadabilityScore (2)
measureName (String)

No description

score (Double)

No description

Table 96. Producers and consumers of ReadabilityScore

Producers

ReadabilityAnnotator

Consumers

None declared

Segmentation

Figure 5. Segmentation types

The segmentation type system consists of three primary areas: tokenization (including sentences), compound words, and document structure.

The Sentence annotation type is simply a span with no further attributes.

The Token type may be explicitly linked to a part of speech, lemma, and stem. If any of these annotations is present, the token is expected to refer to it explicitly. If more than one annotation of such a type is present, e.g. multiple part-of-speech annotations, the token is expected to link to the most probable one, while the others are only located at the same offsets.
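The linking convention can be sketched in Python as follows. This is a simplified model, not the UIMA API: the classes only mimic the relevant features, and the probability attached to each candidate is an assumption of this example (the POS type documented here does not declare such a feature).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class POS:
    begin: int
    end: int
    pos_value: str


@dataclass
class Token:
    begin: int
    end: int
    pos: Optional[POS] = None


def link_most_probable(token: Token, candidates: List[Tuple[POS, float]]) -> None:
    """Point token.pos at the most probable candidate sharing the token's offsets."""
    same_span = [(pos, score) for pos, score in candidates
                 if (pos.begin, pos.end) == (token.begin, token.end)]
    if same_span:
        token.pos = max(same_span, key=lambda pair: pair[1])[0]


token = Token(0, 4)
noun = POS(0, 4, "NN")
verb = POS(0, 4, "VB")
# Both annotations stay at the same offsets; only one is linked from the token.
link_most_probable(token, [(verb, 0.3), (noun, 0.7)])
print(token.pos.pos_value)
```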

Additionally, the Token can link into the syntactic constituency structure via the parent feature.

The document structure can be encoded using the Div types. The type Div itself is a generic type representing some element of the document structure more closely specified by the divType attribute. The value of divType corresponds to the tag used in some original document format or to the output of a text segmentation tool. E.g. when reading an HTML document, the divType for a paragraph would be p, whereas in a DocBook XML file, it would instead be para.

For typical structural elements, the subtypes Document, Heading, and Paragraph are available. Document is rarely used, since the basic assumption is that a CAS always represents a document.
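A reader that imports document structure might choose a subtype based on the source tag while still preserving that tag in divType. The Python sketch below illustrates the idea under stated assumptions: the tag-to-subtype table and the make_div helper are hypothetical, and the classes are plain stand-ins for the UIMA types.

```python
from dataclasses import dataclass


@dataclass
class Div:
    begin: int
    end: int
    div_type: str  # tag from the source format, e.g. "p" or "para"


class Paragraph(Div):
    pass


class Heading(Div):
    pass


# Hypothetical mapping from source tags to the more specific subtypes.
SUBTYPE_BY_TAG = {"p": Paragraph, "para": Paragraph, "h1": Heading, "title": Heading}


def make_div(tag: str, begin: int, end: int) -> Div:
    """Create the most specific Div subtype known for a tag, keeping the tag itself."""
    cls = SUBTYPE_BY_TAG.get(tag, Div)
    return cls(begin, end, tag)


print(make_div("p", 0, 120))         # HTML paragraph
print(make_div("para", 0, 120))      # DocBook paragraph
print(make_div("blockquote", 0, 40)) # no specific subtype: generic Div
```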

Compound

Description

This type represents a compound word that can be decompounded, e.g. flowerpot. Each Compound has at least two Splits.

Features of Compound (1)
splits (FSArray of Split)

A word that can be decomposed into different parts.

Table 97. Producers and consumers of Compound

Producers

CompoundAnnotator

Consumers

None declared

Div

Description

Document structure element.

Features of Div (2)
divType (String)

No description

id (String)

If this unit had an ID in the source format from which it was imported, it may be stored here. IDs are typically not assigned by DKPro Core components. If an ID is present, it should be respected by writers.

Table 98. Producers and consumers of Div

Producers

None declared

Consumers

None declared

Table 99. Sub-types of Div (3)
Type Description

Document

No description

Heading

Document title, section heading, etc.

Paragraph

No description

JapaneseToken

Description
Features of JapaneseToken (4)
kana (String)

No description

ibo (String)

No description

kei (String)

No description

dan (String)

Specifies the kind of the verb if the current token is a verb. Either it is a vowel stem verb (ichi-dan) or a consonant stem verb (go-dan). Blank if not a verb.

Table 100. Producers and consumers of JapaneseToken

Producers

MeCabTagger

Consumers

None declared

Lemma

Description
Features of Lemma (1)
value (String)

No description

Table 101. Producers and consumers of Lemma

Producers

ClearNlpLemmatizer CoreNlpLemmatizer GateLemmatizer GermanSeparatedParticleAnnotator IxaLemmatizer LanguageToolLemmatizer MateLemmatizer MeCabTagger MorphaLemmatizer Nlp4JLemmatizer OpenNlpLemmatizer StanfordLemmatizer TokenMerger TreeTaggerPosTagger UDPipePosTagger Ancora (format) Bnc (format) Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) ImsCwb (format) Lxf (format) NegraExport (format) Nif (format) Perseus (format) Tcf (format) Tei (format) TigerXml (format) Tuepp (format) XcesXml (format)

Consumers

ClearNlpParser ClearNlpSemanticRoleLabeler CoreNlpCoreferenceResolver GermanSeparatedParticleAnnotator MaltParser MateMorphTagger MateSemanticRoleLabeler Nlp4JNamedEntityRecognizer SemanticFieldAnnotator StanfordCoreferenceResolver TokenMerger UDPipeParser Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) ImsCwb (format) Lxf (format) Nif (format) Tcf (format) Tei (format) TigerXml (format) XcesXml (format)

NGram

Description
Features of NGram (1)
text (String)

No description

Table 102. Producers and consumers of NGram

Producers

NGramAnnotator

Consumers

None declared

Sentence

Description
Features of Sentence (1)
id (String)

If this unit had an ID in the source format from which it was imported, it may be stored here. IDs are typically not assigned by DKPro Core components. If an ID is present, it should be respected by writers.

Table 103. Producers and consumers of Sentence

Producers

BreakIteratorSegmenter ClearNlpSegmenter CoreNlpSegmenter GosenSegmenter IcuSegmenter JTokSegmenter JiebaSegmenter LanguageToolSegmenter LineBasedSentenceSegmenter LingPipeSegmenter MeCabTagger Nlp4JSegmenter OpenNlpSegmenter RegexSegmenter StanfordSegmenter UDPipeSegmenter WhitespaceSegmenter Ancora (format) Bnc (format) Concrete (format) Conll2000 (format) Conll2002 (format) Conll2003 (format) Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) ImsCwb (format) Lif (format) Lxf (format) NegraExport (format) Nif (format) PennTreebankChunked (format) PennTreebankCombined (format) Perseus (format) Tcf (format) Tei (format) TigerXml (format) TuebaDZ (format) Tuepp (format) XcesXml (format)

Consumers

ArktweetPosTaggerTrainer BerkeleyParser ClearNlpLemmatizer ClearNlpParser ClearNlpPosTagger ClearNlpSemanticRoleLabeler CoreNlpCoreferenceResolver CoreNlpDependencyParser CoreNlpLemmatizer CoreNlpNamedEntityRecognizer CoreNlpParser CoreNlpPosTagger DictionaryAnnotator GermanSeparatedParticleAnnotator HepplePosTagger HunPosTagger IxaLemmatizer LanguageToolLemmatizer LingPipePosTagger MaltParser MateLemmatizer MateMorphTagger MateParser MatePosTagger MateSemanticRoleLabeler MorphaLemmatizer MstParser NGramAnnotator Nlp4JDependencyParser Nlp4JLemmatizer Nlp4JNamedEntityRecognizer Nlp4JPosTagger OpenNlpChunker OpenNlpLemmatizer OpenNlpNamedEntityRecognizerTrainer OpenNlpParser OpenNlpPosTagger OpenNlpPosTaggerTrainer OpenNlpSentenceTrainer ReadabilityAnnotator RfTagger SfstAnnotator StanfordCoreferenceResolver StanfordNamedEntityRecognizer StanfordNamedEntityRecognizerTrainer StanfordParser StanfordPosTagger StanfordPosTaggerTrainer StanfordSentimentAnalyzer UDPipeParser UDPipePosTagger Concrete (format) Conll2000 (format) Conll2002 (format) Conll2003 (format) Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) ImsCwb (format) Lif (format) Lxf (format) Nif (format) PennTreebankCombined (format) Tcf (format) Tei (format) TigerXml (format) Web1T (format) WebannoTsv3X (format) XcesXml (format)

Split

Description

This type represents a part of a decompounding word. A Split can be either a CompoundPart or a LinkingMorpheme.

Features of Split (1)
splits (FSArray of Split)

Sub-splits of the current split.

Table 104. Producers and consumers of Split

Producers

CompoundAnnotator

Consumers

None declared

Table 105. Sub-types of Split (2)
Type Description

CompoundPart

A CompoundPart represents one fragment from the compounding word.

LinkingMorpheme

This type represents a linking morpheme between two CompoundParts.

Stem

Description
Features of Stem (1)
value (String)

No description

Table 106. Producers and consumers of Stem

Producers

CisStemmer LancasterStemmer MyStemStemmer OpenNlpSnowballStemmer SmileLancasterStemmer SnowballStemmer Nif (format)

Consumers

Nif (format)

StopWord

Description
Table 107. Producers and consumers of StopWord

Producers

None declared

Consumers

StopWordRemover

SurfaceForm

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.SurfaceForm

Name

de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.SurfaceForm

Supertype

Annotation

Description

This annotation can be used to indicate an alternate surface form. E.g. some corpora consider a normalized form of the text with resolved contractions as the canonical form and only maintain the original surface form as secondary information. One example is the CoNLL-U format.

Features of SurfaceForm (1)
value (String)

Alternate surface form.

Table 108. Producers and consumers of SurfaceForm

Producers

None declared

Consumers

None declared

Token

Description

Token is one of the two types commonly produced by a segmenter (the other being Sentence). A Token usually represents a word, although it may be used to represent multiple tightly connected words (e.g. "New York") or parts of a word (e.g. the possessive "'s"). One may choose to split compound words into multiple tokens, e.g. ("CamelCase" -> "Camel", "Case"; "Zauberstab" -> "Zauber", "stab"). Most processing components operate on Tokens, usually within the limits of the surrounding Sentence. E.g. a part-of-speech tagger analyses each Token in a Sentence and assigns a part-of-speech to each Token.

Features of Token (9)
parent (Annotation)

The parent of this token. This feature is meant to be used when the token participates in a constituency parse; it then refers to a constituent containing this token. The type of this feature is Annotation to avoid adding a dependency on the syntax API module.

lemma (Lemma)

No description

stem (Stem)

No description

pos (POS)

No description

morph (MorphologicalFeatures)

The morphological feature associated with this token.

id (String)

If this unit had an ID in the source format from which it was imported, it may be stored here. IDs are typically not assigned by DKPro Core components. If an ID is present, it should be respected by writers.

form (TokenForm)

Potentially normalized form of the token text that should be used instead of the covered text if set.

syntacticFunction (String)

No description

order (Integer)

Disambiguates the token order for tokens which have the same offsets, e.g. when the contraction "à" is analyzed as two tokens "a" and "a".
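The role of the order feature can be shown with a short Python sketch that sorts tokens by offsets first and by order second. The Token class is a simplified stand-in, and the concrete order values (1 and 2) are an assumption of this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    begin: int
    end: int
    id: str
    order: int = 0


def in_document_order(tokens: List[Token]) -> List[Token]:
    """Sort by offsets; 'order' breaks ties between tokens with identical offsets."""
    return sorted(tokens, key=lambda t: (t.begin, t.end, t.order))


# "Vou à praia": the contraction at offsets 4-5 is analyzed as two tokens.
tokens = [
    Token(6, 11, "t4"),
    Token(4, 5, "t3", order=2),
    Token(0, 3, "t1"),
    Token(4, 5, "t2", order=1),
]
print([t.id for t in in_document_order(tokens)])
```

Without the order feature, the relative position of t2 and t3 would be undefined because both cover exactly the same span.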

Table 109. Producers and consumers of Token

Producers

BreakIteratorSegmenter CamelCaseTokenSegmenter ClearNlpSegmenter CoreNlpSegmenter GosenSegmenter IcuSegmenter JTokSegmenter JiebaSegmenter LanguageToolSegmenter LingPipeSegmenter Nlp4JSegmenter OpenNlpSegmenter PatternBasedTokenSegmenter PosMapper RegexSegmenter StanfordSegmenter TokenTrimmer TrailingCharacterRemover UDPipeSegmenter WhitespaceSegmenter Ancora (format) Bnc (format) Concrete (format) Conll2000 (format) Conll2002 (format) Conll2003 (format) Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) ImsCwb (format) Lif (format) Lxf (format) NegraExport (format) Nif (format) PennTreebankChunked (format) PennTreebankCombined (format) Perseus (format) Tcf (format) Tei (format) TigerXml (format) TuebaDZ (format) Tuepp (format) XcesXml (format)

Consumers

ArktweetPosTagger ArktweetPosTaggerTrainer BerkeleyParser CamelCaseTokenSegmenter CapitalizationNormalizer ClearNlpLemmatizer ClearNlpParser ClearNlpPosTagger ClearNlpSemanticRoleLabeler ColognePhoneticTranscriptor CompoundAnnotator CoreNlpCoreferenceResolver CoreNlpDependencyParser CoreNlpLemmatizer CoreNlpNamedEntityRecognizer CoreNlpParser CoreNlpPosTagger DictionaryAnnotator DoubleMetaphonePhoneticTranscriptor ExpressiveLengtheningNormalizer GateLemmatizer GermanSeparatedParticleAnnotator HepplePosTagger HunPosTagger IxaLemmatizer JazzyChecker LancasterStemmer LanguageToolLemmatizer LingPipeNamedEntityRecognizer LingPipePosTagger MalletEmbeddingsAnnotator MalletLdaTopicModelInferencer MaltParser MateLemmatizer MateMorphTagger MateParser MatePosTagger MateSemanticRoleLabeler MetaphonePhoneticTranscriptor MorphaLemmatizer MstParser MyStemStemmer NGramAnnotator Nlp4JDependencyParser Nlp4JLemmatizer Nlp4JNamedEntityRecognizer Nlp4JPosTagger NorvigSpellingCorrector OpenNlpChunker OpenNlpLemmatizer OpenNlpNamedEntityRecognizer OpenNlpNamedEntityRecognizerTrainer OpenNlpParser OpenNlpPosTagger OpenNlpPosTaggerTrainer OpenNlpTokenTrainer PatternBasedTokenSegmenter PosMapper ReadabilityAnnotator RegexBasedTokenTransformer RegexTokenFilter ReplacementFileNormalizer RfTagger SemanticFieldAnnotator SfstAnnotator SmileLancasterStemmer SoundexPhoneticTranscriptor StanfordCoreferenceResolver StanfordDependencyConverter StanfordLemmatizer StanfordNamedEntityRecognizer StanfordNamedEntityRecognizerTrainer StanfordParser StanfordPosTagger StanfordPosTaggerTrainer StanfordSentimentAnalyzer TokenMerger TokenTrimmer TrailingCharacterRemover TreeTaggerPosTagger UDPipeParser UDPipePosTagger UmlautNormalizer Concrete (format) Conll2000 (format) Conll2002 (format) Conll2003 (format) Conll2006 (format) Conll2008 (format) Conll2009 (format) Conll2012 (format) ConllCoreNlp (format) ConllU (format) ImsCwb (format) Lif (format) Lxf (format) Nif (format) PennTreebankCombined (format) Tcf (format) Tei (format) TigerXml (format) WebannoTsv3X (format) XcesXml (format)

Table 110. Sub-types of Token (1)
Type Description

JapaneseToken

No description

TokenForm

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.TokenForm

Name

de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.TokenForm

Supertype

Annotation

Description

An alternative token text which should be used instead of the covered text if set on a token.

Features of TokenForm (1)
value (String)

No description
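The intended lookup ("use the form if set, otherwise the covered text") can be written as a small Python helper. The classes below are simplified stand-ins for the UIMA types, and the helper name token_text and the sample text are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenForm:
    value: str


@dataclass
class Token:
    begin: int
    end: int
    form: Optional[TokenForm] = None


def token_text(document_text: str, token: Token) -> str:
    """Return the token's form value if set, otherwise the covered text."""
    if token.form is not None:
        return token.form.value
    return document_text[token.begin:token.end]


document = "Colour me"
plain = Token(7, 9)
normalized = Token(0, 6, form=TokenForm("Color"))
print(token_text(document, plain), token_text(document, normalized))
```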

Table 111. Producers and consumers of TokenForm

Producers

None declared

Consumers

None declared

CompoundPart

Description

A CompoundPart represents one fragment from the compounding word. Besides that, it can store other CompoundParts if it can be split again. The way it stores a decompounding word represents a decompounding tree.
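The decompounding tree described here can be modelled as a recursive structure. The following Python sketch is an illustration only: the classes mirror the documented types in simplified form, the leaves helper is hypothetical, and the check for at least two splits reflects the description of Compound rather than any validation known to be performed by DKPro Core.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Split:
    begin: int
    end: int
    splits: List["Split"] = field(default_factory=list)  # sub-splits, if any


class CompoundPart(Split):
    pass


class LinkingMorpheme(Split):
    pass


@dataclass
class Compound:
    begin: int
    end: int
    splits: List[Split]

    def __post_init__(self) -> None:
        if len(self.splits) < 2:
            raise ValueError("a Compound needs at least two Splits")


def leaves(split: Split) -> List[Split]:
    """Flatten a split into its smallest parts, left to right."""
    if not split.splits:
        return [split]
    result: List[Split] = []
    for child in split.splits:
        result.extend(leaves(child))
    return result


text = "Arbeitsamtsleiter"
inner = CompoundPart(0, 10, [CompoundPart(0, 6), LinkingMorpheme(6, 7), CompoundPart(7, 10)])
compound = Compound(0, 17, [inner, LinkingMorpheme(10, 11), CompoundPart(11, 17)])
parts = [text[s.begin:s.end] for top in compound.splits for s in leaves(top)]
print(parts)
```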

Table 112. Producers and consumers of CompoundPart

Producers

CompoundAnnotator

Consumers

None declared

Document

Description
Table 113. Producers and consumers of Document

Producers

None declared

Consumers

None declared

Heading

Description

Document title, section heading, etc.

Table 114. Producers and consumers of Heading

Producers

Html (format) HtmlDocument (format) Nif (format) Pdf (format)

Consumers

Nif (format)

LinkingMorpheme

Description

This type represents a linking morpheme between two CompoundParts.

Table 115. Producers and consumers of LinkingMorpheme

Producers

CompoundAnnotator

Consumers

None declared

Paragraph

Description
Table 116. Producers and consumers of Paragraph

Producers

JTokSegmenter ParagraphSplitter Html (format) HtmlDocument (format) Lif (format) Nif (format) Pdf (format) Tei (format) XcesBasicXml (format) XcesXml (format)

Consumers

Lif (format) Nif (format) Tei (format) XcesBasicXml (format) XcesXml (format)

Semantics

Figure 6. Semantics types

NamedEntity

Description

Named entities refer e.g. to persons, locations, organizations and so on. They often consist of multiple tokens.

Features of NamedEntity (2)
value (String)

The class/category of the named entity, e.g. person, location, etc.

identifier (String)

Identifier of the named entity, e.g. a reference into a person database.

Table 117. Producers and consumers of NamedEntity

Producers

CoreNlpNamedEntityRecognizer LingPipeNamedEntityRecognizer Nlp4JNamedEntityRecognizer OpenNlpNamedEntityRecognizer SemanticFieldAnnotator StanfordNamedEntityRecognizer Conll2002 (format) Conll2003 (format) Conll2012 (format) ConllCoreNlp (format) Lif (format) Nif (format) Tcf (format) Tei (format)

Consumers

CoreNlpCoreferenceResolver OpenNlpNamedEntityRecognizerTrainer StanfordCoreferenceResolver StanfordNamedEntityRecognizerTrainer Conll2002 (format) Conll2003 (format) Conll2012 (format) ConllCoreNlp (format) Lif (format) Nif (format) Tcf (format) Tei (format)

Table 118. Sub-types of NamedEntity (30)
Type Description

Animal

No description

Cardinal

No description

ContactInfo

No description

Date

No description

Disease

No description

Event

No description

Fac

No description

FacDesc

No description

Game

No description

Gpe

No description

GpeDesc

No description

Language

No description

Law

No description

Location

No description

Money

No description

Nationality

No description

Norp

No description

Ordinal

No description

OrgDesc

No description

Organization

No description

PerDesc

No description

Percent

No description

Person

No description

Plant

No description

Product

No description

ProductDesc

No description

Quantity

No description

Substance

No description

Time

No description

WorkOfArt

No description

SemArg

Description

The SemArg annotation is attached to semantic arguments of semantic predicates. Semantic arguments are characterized by their semantic role, e.g. Agent, Experiencer, Topic. The semantic role of an argument is related to its semantic type (for communication verbs, the Agent can be a person or an organization, but typically not food).

Table 119. Producers and consumers of SemArg

Producers

ClearNlpSemanticRoleLabeler MateSemanticRoleLabeler Conll2008 (format) Conll2009 (format) Conll2012 (format) TigerXml (format)

Consumers

Conll2008 (format) Conll2009 (format) Conll2012 (format)

SemArgLink

Description

The SemArgLink type is used to attach SemPred annotations to their respective SemArg annotations while giving each link a role.

Features of SemArgLink (2)
role (String)

The role which the argument takes. The value depends on the theory being used, e.g. Arg0, Arg1, etc. or Buyer, Seller, etc.

target (SemArg)

The target argument.

Table 120. Producers and consumers of SemArgLink

Producers

None declared

Consumers

None declared

SemPred

Description

One of the predicates of a sentence (often a main verb, but nouns and adjectives can also be predicates). The SemPred annotation can be attached to predicates in a sentence. Semantic predicates express events or situations and take semantic arguments expressing the participants in these events or situations. All forms of main verbs can be annotated with a SemPred. However, there are also many nouns and adjectives that take arguments and can thus be annotated with a SemPred, e.g. event nouns, such as "suggestion" (with arguments what and by whom), or relational adjectives, such as "proud" (with arguments who and of what).

Features of SemPred (2)
arguments (FSArray of SemArgLink)

The predicate’s arguments.

category (String)

A more detailed specification of the predicate type depending on the theory being used, e.g. a frame name.
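The relation between SemPred, SemArgLink and SemArg can be pictured with a small Python model. The classes are simplified stand-ins for the UIMA types; the sentence, the category value and the roles_of helper are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SemArg:
    begin: int
    end: int


@dataclass
class SemArgLink:
    role: str       # e.g. Arg0, Arg1 or Buyer, Seller, depending on the theory
    target: SemArg


@dataclass
class SemPred:
    begin: int
    end: int
    category: str   # e.g. a frame name
    arguments: List[SemArgLink] = field(default_factory=list)


def roles_of(text: str, predicate: SemPred) -> Dict[str, str]:
    """Map each role of a predicate to the text covered by its argument."""
    return {link.role: text[link.target.begin:link.target.end]
            for link in predicate.arguments}


text = "Mary sold the car"
sold = SemPred(5, 9, "sell", [
    SemArgLink("Seller", SemArg(0, 4)),
    SemArgLink("Goods", SemArg(10, 17)),
])
print(roles_of(text, sold))
```

Keeping the role on the link rather than on the argument allows the same SemArg span to take different roles for different predicates.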

Table 121. Producers and consumers of SemPred

Producers

ClearNlpSemanticRoleLabeler MateSemanticRoleLabeler Conll2008 (format) Conll2009 (format) Conll2012 (format) TigerXml (format)

Consumers

Conll2008 (format) Conll2009 (format) Conll2012 (format)

SemanticArgument

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.semantics.type.SemanticArgument

Name

de.tudarmstadt.ukp.dkpro.core.api.semantics.type.SemanticArgument

Supertype

Annotation

Description

The SemanticArgument annotation is attached to semantic arguments of semantic predicates. Semantic arguments are characterized by their semantic role, e.g. Agent, Experiencer, Topic. The semantic role of an argument is related to its semantic type (for communication verbs, the Agent can be a person or an organization, but typically not food). The semantic type of arguments is not yet covered by the SemanticType. @deprecated Use SemArg instead.

Features of SemanticArgument (1)
role (String)

The role which the argument takes. The value depends on the theory being used, e.g. Arg0, Arg1, etc. or Buyer, Seller, etc.

Table 122. Producers and consumers of SemanticArgument

Producers

None declared

Consumers

None declared

SemanticField

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.semantics.type.SemanticField

Name

de.tudarmstadt.ukp.dkpro.core.api.semantics.type.SemanticField

Supertype

Annotation

Description

The SemanticField is a coarse-grained semantic category that can be attached to nouns, verbs or adjectives. Semantic field information is present e.g. in WordNet as lexicographer file names. Previously, this kind of semantic information has also been called supersenses or semantic types.

Features of SemanticField (1)
value (String)

The value or name of the semantic field. Examples of semantic field values are: location, artifact, event, communication, attribute

Table 123. Producers and consumers of SemanticField

Producers

None declared

Consumers

None declared

SemanticPredicate

Description

One of the predicates of a sentence (often a main verb, but nouns and adjectives can also be predicates). The SemanticPredicate annotation can be attached to predicates in a sentence. Semantic predicates express events or situations and take semantic arguments expressing the participants in these events or situations. All forms of main verbs can be annotated with a SemanticPredicate. However, there are also many nouns and adjectives that take arguments and can thus be annotated with a SemanticPredicate, e.g. event nouns, such as "suggestion" (with arguments what and by whom), or relational adjectives, such as "proud" (with arguments who and of what). @deprecated use SemPred instead

Features of SemanticPredicate (2)
category (String)

A more detailed specification of the predicate type depending on the theory being used, e.g. a frame name.

arguments (FSArray of SemanticArgument)

The predicate’s arguments.

Table 124. Producers and consumers of SemanticPredicate

Producers

None declared

Consumers

None declared

StanfordSentimentAnnotation

Description

Stanford CoreNLP Sentiment annotation

Features of StanfordSentimentAnnotation (5)
veryNegative (Double)

Value of veryNegative

negative (Double)

Value of negative

neutral (Double)

Value of neutral

positive (Double)

Value of positive

veryPositive (Double)

Value of veryPositive
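A consumer will often want the single most strongly scored class out of the five features. The Python sketch below picks it with a plain maximum; the scores are invented, and treating the five values as comparable per-class scores is an assumption, since the reference does not state how they are scaled.

```python
from typing import Dict

# Feature names of StanfordSentimentAnnotation, from most negative to most positive.
LABELS = ("veryNegative", "negative", "neutral", "positive", "veryPositive")


def dominant_label(scores: Dict[str, float]) -> str:
    """Return the feature name with the highest value."""
    return max(LABELS, key=lambda label: scores[label])


# Hypothetical feature values for one annotation.
scores = {
    "veryNegative": 0.02,
    "negative": 0.08,
    "neutral": 0.15,
    "positive": 0.55,
    "veryPositive": 0.20,
}
print(dominant_label(scores))
```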

Table 125. Producers and consumers of StanfordSentimentAnnotation

Producers

StanfordSentimentAnalyzer

Consumers

None declared

WordSense

Description
Features of WordSense (1)
value (String)

The sense identifier.

Table 126. Producers and consumers of WordSense

Producers

Conll2012 (format)

Consumers

Conll2012 (format)

Animal

Description
Table 127. Producers and consumers of Animal

Producers

None declared

Consumers

None declared

Cardinal

Description
Table 128. Producers and consumers of Cardinal

Producers

None declared

Consumers

None declared

ContactInfo

Description
Table 129. Producers and consumers of ContactInfo

Producers

None declared

Consumers

None declared

Date

Description
Table 130. Producers and consumers of Date

Producers

None declared

Consumers

None declared

Disease

Description
Table 131. Producers and consumers of Disease

Producers

None declared

Consumers

None declared

Event

Description
Table 132. Producers and consumers of Event

Producers

None declared

Consumers

None declared

Fac

Description
Table 133. Producers and consumers of Fac

Producers

None declared

Consumers

None declared

FacDesc

Description
Table 134. Producers and consumers of FacDesc

Producers

None declared

Consumers

None declared

Game

Description
Table 135. Producers and consumers of Game

Producers

None declared

Consumers

None declared

Gpe

Description
Table 136. Producers and consumers of Gpe

Producers

None declared

Consumers

None declared

GpeDesc

Description
Table 137. Producers and consumers of GpeDesc

Producers

None declared

Consumers

None declared

Language

Description
Table 138. Producers and consumers of Language

Producers

None declared

Consumers

None declared

Law

Description
Table 139. Producers and consumers of Law

Producers

None declared

Consumers

None declared

Location

Description
Table 140. Producers and consumers of Location

Producers

None declared

Consumers

None declared

Money

Description
Table 141. Producers and consumers of Money

Producers

None declared

Consumers

None declared

Nationality

Description
Table 142. Producers and consumers of Nationality

Producers

None declared

Consumers

None declared

Norp

Description
Table 143. Producers and consumers of Norp

Producers

None declared

Consumers

None declared

Ordinal

Description
Table 144. Producers and consumers of Ordinal

Producers

None declared

Consumers

None declared

OrgDesc

Description
Table 145. Producers and consumers of OrgDesc

Producers

None declared

Consumers

None declared

Organization

Description
Table 146. Producers and consumers of Organization

Producers

None declared

Consumers

None declared

PerDesc

Description
Table 147. Producers and consumers of PerDesc

Producers

None declared

Consumers

None declared

Percent

Description
Table 148. Producers and consumers of Percent

Producers

None declared

Consumers

None declared

Person

Description
Table 149. Producers and consumers of Person

Producers

None declared

Consumers

None declared

Plant

Description
Table 150. Producers and consumers of Plant

Producers

None declared

Consumers

None declared

Product

Description
Table 151. Producers and consumers of Product

Producers

None declared

Consumers

None declared

ProductDesc

Description
Table 152. Producers and consumers of ProductDesc

Producers

None declared

Consumers

None declared

Quantity

Description
Table 153. Producers and consumers of Quantity

Producers

None declared

Consumers

None declared

Substance

Description
Table 154. Producers and consumers of Substance

Producers

None declared

Consumers

None declared

Time

Description
Table 155. Producers and consumers of Time

Producers

None declared

Consumers

None declared

WorkOfArt

Description
Table 156. Producers and consumers of WorkOfArt

Producers

None declared

Consumers

None declared

Structure

Field

Description
Features of Field (1)
name (String)

The name of the tag.

Table 157. Producers and consumers of Field

Producers

Xml (format) XmlXPath (format)

Consumers

None declared

Syntax

Figure 7. Syntax types

Chunk

Description
Features of Chunk (1)
chunkValue (String)

No description

Table 158. Producers and consumers of Chunk

Producers

OpenNlpChunker TreeTaggerChunker Conll2000 (format) Conll2003 (format) PennTreebankChunked (format) TuebaDZ (format)

Consumers

Conll2000 (format) Conll2003 (format)

Table 159. Sub-types of Chunk (10)
Type Description

ADJC

adjective chunks

ADVC

adverb chunks

CONCJ

complex coordinating conjunctions such as "as well (as)" or "rather (than)"

INTJ

interjection

LST

enumeration symbol

NC

noun chunk (non-recursive noun phrase)

O

other or outside a chunk

PC

prepositional chunk

PRT

verb particle

VC

verb complex
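
To illustrate how Chunk annotations relate to a column format such as CoNLL 2000, the following Python sketch converts chunk spans into IOB tags. It is not DKPro Core code: the Chunk dataclass, the to_iob helper, the token offsets, and the tag values "NP" and "VP" are hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Chunk:
    begin: int        # character offset where the chunk starts
    end: int          # character offset where the chunk ends (exclusive)
    chunk_value: str  # stands in for the chunkValue feature


def to_iob(tokens: List[Tuple[int, int]], chunks: List[Chunk]) -> List[str]:
    """Return one IOB tag per token, based on the chunk that covers it."""
    tags = []
    for tok_begin, tok_end in tokens:
        tag = "O"  # default: the token is outside any chunk
        for chunk in chunks:
            if chunk.begin <= tok_begin and tok_end <= chunk.end:
                prefix = "B" if tok_begin == chunk.begin else "I"
                tag = f"{prefix}-{chunk.chunk_value}"
                break
        tags.append(tag)
    return tags


text = "The dog barked loudly"
tokens = [(0, 3), (4, 7), (8, 14), (15, 21)]       # token offsets into text
chunks = [Chunk(0, 7, "NP"), Chunk(8, 14, "VP")]   # "loudly" is left uncovered
iob_tags = to_iob(tokens, chunks)
print(list(zip([text[b:e] for b, e in tokens], iob_tags)))
```

Tokens not covered by any chunk receive the tag "O", which matches the role of the O sub-type listed above.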

Constituent

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.syntax.type.constituent.Constituent

Name

de.tudarmstadt.ukp.dkpro.core.api.syntax.type.constituent.Constituent

Supertype

Annotation

similar

http://vocab.lappsgrid.org/Constituent (LAPPS)

similar

http://w3id.org/meta-share/omtd-share/Constituent (OMTD-SHARE)

similar

http://purl.org/olia/olia-top.owl#Constituent (OLiA)

Description
Features of Constituent (4)
constituentType (String)

No description

parent (Annotation)

The parent constituent

children (FSArray of Annotation)

No description

syntacticFunction (String)

No description
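
The parent and children features together describe a tree. The Python sketch below mimics that structure with plain classes to show how the two links are kept consistent and how they can be traversed. The class names, the add helper, and the example sentence are assumptions made for illustration; the real type is a UIMA annotation and is accessed through the CAS.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(eq=False, repr=False)
class Token:
    text: str
    parent: Optional["Constituent"] = None


@dataclass(eq=False, repr=False)
class Constituent:
    constituent_type: str                   # stands in for constituentType
    parent: Optional["Constituent"] = None  # stands in for parent
    children: List[Union["Constituent", Token]] = field(default_factory=list)

    def add(self, child):
        """Attach a child and keep its parent link in sync."""
        child.parent = self
        self.children.append(child)
        return child


def leaves(node) -> List[str]:
    """Collect the token texts below a node, left to right."""
    if isinstance(node, Token):
        return [node.text]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


def path_to_root(node) -> List[str]:
    """Follow parent links upwards and return the constituent types seen."""
    labels = []
    current = node.parent
    while current is not None:
        labels.append(current.constituent_type)
        current = current.parent
    return labels


root = Constituent("ROOT")
s = root.add(Constituent("S"))
np = s.add(Constituent("NP"))
np.add(Token("Dogs"))
vp = s.add(Constituent("VP"))
bark = vp.add(Token("bark"))
print(leaves(root), path_to_root(bark))
```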

Table 160. Producers and consumers of Constituent

Producers

BerkeleyParser CoreNlpParser OpenNlpParser StanfordParser Lif (format) NegraExport (format) PennTreebankCombined (format) Tei (format) TigerXml (format)

Consumers

CoreNlpCoreferenceResolver StanfordCoreferenceResolver StanfordDependencyConverter Lif (format) PennTreebankCombined (format) Tei (format) TigerXml (format)

Table 161. Sub-types of Constituent (29)
Type Description

ADJP

No description

ADVP

No description

CONJP

No description

FRAG

No description

INTJ

No description

LST

No description

NAC

No description

NP

No description

NX

No description

PARN

This category is called PRN in the Penn Treebank tagset.

PP

No description

PRN

This type is no longer used and no JCas wrapper is generated for it because on Windows, it conflicts with the reserved device name for printers.

PRP

No description

PRT

No description

QP

No description

ROOT

No description

RRC

No description

S

No description

SBAR

No description

SBARQ

No description

SINV

No description

SQ

No description

UCP

No description

VP

No description

WHADJP

No description

WHADVP

No description

WHNP

No description

WHPP

No description

X

No description

Dependency

Description

A dependency relation between two tokens. The dependency annotation begin and end offsets correspond to those of the dependent.

Features of Dependency (4)
Governor (Token)

The governor word

Dependent (Token)

The dependent word

DependencyType (String)

The dependency type

flavor (String)

Flavor of the dependency relation (basic, collapsed, enhanced, etc.)
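
The description above states that a dependency's begin and end offsets are those of the dependent. The following Python sketch models that rule with stand-in classes. It is not the DKPro Core API: the class layout and the helper function are hypothetical, and the convention that the root relation points from a token to itself is an assumption made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    begin: int
    end: int
    text: str


@dataclass(frozen=True)
class Dependency:
    governor: Token
    dependent: Token
    dependency_type: str
    flavor: str = "basic"

    @property
    def begin(self) -> int:
        # Offsets are taken from the dependent, not from the governor.
        return self.dependent.begin

    @property
    def end(self) -> int:
        return self.dependent.end


def dependents_by_governor(deps):
    """Group (type, dependent text) pairs under the governor's text."""
    index = defaultdict(list)
    for dep in deps:
        if dep.governor is dep.dependent:
            continue  # assumed root convention: skip the self-referencing relation
        index[dep.governor.text].append((dep.dependency_type, dep.dependent.text))
    return dict(index)


dogs, bark, loudly = Token(0, 4, "Dogs"), Token(5, 9, "bark"), Token(10, 16, "loudly")
nsubj = Dependency(bark, dogs, "NSUBJ")
deps = [nsubj, Dependency(bark, loudly, "ADVMOD"), Dependency(bark, bark, "ROOT")]
grouped = dependents_by_governor(deps)
print((nsubj.begin, nsubj.end), grouped)
```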

Table 162. Producers and consumers of Dependency

Producers

ClearNlpParser CoreNlpDependencyParser CoreNlpParser MaltParser MateParser MstParser Nlp4JDependencyParser StanfordDependencyConverter StanfordParser UDPipeParser Conll2006 (format) Conll2008 (format) Conll2009 (format) ConllCoreNlp (format) ConllU (format) Lif (format) Lxf (format) Perseus (format) Tcf (format)

Consumers

ClearNlpSemanticRoleLabeler MateSemanticRoleLabeler Conll2006 (format) Conll2008 (format) Conll2009 (format) ConllCoreNlp (format) ConllU (format) Lif (format) Lxf (format) Tcf (format)

Table 163. Sub-types of Dependency (57)
Type Description

ABBREV

No description

ACOMP

No description

ADVCL

No description

ADVMOD

No description

AGENT

No description

AMOD

No description

APPOS

No description

ATTR

No description

AUX0

No description

AUXPASS

No description

CC

No description

CCOMP

No description

COMPLM

No description

CONJ

No description

CONJP

No description

CONJ_YET

No description

COP

No description

CSUBJ

No description

CSUBJPASS

No description

DEP

No description

DET

No description

DOBJ

No description

EXPL

No description

INFMOD

No description

IOBJ

No description

MARK

No description

MEASURE

No description

MWE

No description

NEG

No description

NN

No description

NPADVMOD

No description

NSUBJ

No description

NSUBJPASS

No description

NUM

No description

NUMBER

No description

PARATAXIS

No description

PARTMOD

No description

PCOMP

No description

POBJ

No description

POSS

No description

POSSESSIVE

No description

PRECONJ

No description

PRED

No description

PREDET

No description

PREP

No description

PREPC

No description

PRT

No description

PUNCT

No description

PURPCL

No description

QUANTMOD

No description

RCMOD

No description

REF

No description

REL

No description

ROOT

Dependency tree root.

TMOD

No description

XCOMP

No description

XSUBJ

No description

PennTree

Description

The Penn Treebank-style phrase structure string.

Features of PennTree (2)
PennTree (String)

Contains a Penn Treebank-style representation of a tree.

TransformationNames (String)

The name(s) of the transformation(s) that have been performed on the PennTree
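
A Penn Treebank-style string as stored in the PennTree feature can be read back into a nested structure with a small recursive parser. The Python sketch below is a minimal illustration, assuming well-formed, balanced input; the function names and the example tree are hypothetical and not part of DKPro Core.

```python
def parse_penn(s):
    """Parse a bracketed tree string into nested lists: [label, child, ...]."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token          # a label or a word
        node = []
        while tokens[pos] != ")":
            node.append(read())
        pos += 1                  # consume the closing ")"
        return node

    tree = read()
    if pos != len(tokens):
        raise ValueError("unexpected input after the end of the tree")
    return tree


def words(tree):
    """Return the leaf words of a parsed tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree[1:]:        # tree[0] is the node label
        result.extend(words(child))
    return result


penn_tree = parse_penn("(ROOT (S (NP (NNS Dogs)) (VP (VBP bark))))")
print(penn_tree, words(penn_tree))
```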

Table 164. Producers and consumers of PennTree

Producers

BerkeleyParser OpenNlpParser

Consumers

TGrep (format)

ABBREV

Description
Table 165. Producers and consumers of ABBREV

Producers

None declared

Consumers

None declared

ACOMP

Description
Table 166. Producers and consumers of ACOMP

Producers

None declared

Consumers

None declared

ADJC

Description

adjective chunks

Table 167. Producers and consumers of ADJC

Producers

None declared

Consumers

None declared

ADJP

Description
Table 168. Producers and consumers of ADJP

Producers

None declared

Consumers

None declared

ADVC

Description

adverb chunks

Table 169. Producers and consumers of ADVC

Producers

None declared

Consumers

None declared

ADVCL

Description
Table 170. Producers and consumers of ADVCL

Producers

None declared

Consumers

None declared

ADVMOD

Description
Table 171. Producers and consumers of ADVMOD

Producers

None declared

Consumers

None declared

ADVP

Description
Table 172. Producers and consumers of ADVP

Producers

None declared

Consumers

None declared

AGENT

Description
Table 173. Producers and consumers of AGENT

Producers

None declared

Consumers

None declared

AMOD

Description
Table 174. Producers and consumers of AMOD

Producers

None declared

Consumers

None declared

APPOS

Description
Table 175. Producers and consumers of APPOS

Producers

None declared

Consumers

None declared

ATTR

Description
Table 176. Producers and consumers of ATTR

Producers

None declared

Consumers

None declared

AUX0

Description
Table 177. Producers and consumers of AUX0

Producers

None declared

Consumers

None declared

AUXPASS

Description
Table 178. Producers and consumers of AUXPASS

Producers

None declared

Consumers

None declared

CC

Description
Table 179. Producers and consumers of CC

Producers

None declared

Consumers

None declared

CCOMP

Description
Table 180. Producers and consumers of CCOMP

Producers

None declared

Consumers

None declared

COMPLM

Description
Table 181. Producers and consumers of COMPLM

Producers

None declared

Consumers

None declared

CONCJ

Description

complex coordinating conjunctions such as "as well (as)" or "rather (than)"

Table 182. Producers and consumers of CONCJ

Producers

None declared

Consumers

None declared

CONJ

Description
Table 183. Producers and consumers of CONJ

Producers

None declared

Consumers

None declared

CONJP

Description
Table 184. Producers and consumers of CONJP

Producers

None declared

Consumers

None declared

CONJP

Description
Table 185. Producers and consumers of CONJP

Producers

None declared

Consumers

None declared

CONJ_YET

Description
Table 186. Producers and consumers of CONJ_YET

Producers

None declared

Consumers

None declared

COP

Description
Table 187. Producers and consumers of COP

Producers

None declared

Consumers

None declared

CSUBJ

Description
Table 188. Producers and consumers of CSUBJ

Producers

None declared

Consumers

None declared

CSUBJPASS

Description
Table 189. Producers and consumers of CSUBJPASS

Producers

None declared

Consumers

None declared

DEP

Description
Table 190. Producers and consumers of DEP

Producers

None declared

Consumers

None declared

DET

Description
Table 191. Producers and consumers of DET

Producers

None declared

Consumers

None declared

DOBJ

Description
Table 192. Producers and consumers of DOBJ

Producers

None declared

Consumers

None declared

EXPL

Description
Table 193. Producers and consumers of EXPL

Producers

None declared

Consumers

None declared

FRAG

Description
Table 194. Producers and consumers of FRAG

Producers

None declared

Consumers

None declared

INFMOD

Description
Table 195. Producers and consumers of INFMOD

Producers

None declared

Consumers

None declared

INTJ

Description

interjection

Table 196. Producers and consumers of INTJ

Producers

None declared

Consumers

None declared

INTJ

Description
Table 197. Producers and consumers of INTJ

Producers

None declared

Consumers

None declared

IOBJ

Description
Table 198. Producers and consumers of IOBJ

Producers

None declared

Consumers

None declared

LST

Description

enumeration symbol

Table 199. Producers and consumers of LST

Producers

None declared

Consumers

None declared

LST

Description
Table 200. Producers and consumers of LST

Producers

None declared

Consumers

None declared

MARK

Description
Table 201. Producers and consumers of MARK

Producers

None declared

Consumers

None declared

MEASURE

Description
Table 202. Producers and consumers of MEASURE

Producers

None declared

Consumers

None declared

MWE

Description
Table 203. Producers and consumers of MWE

Producers

None declared

Consumers

None declared

NAC

Description
Table 204. Producers and consumers of NAC

Producers

None declared

Consumers

None declared

NC

Description

noun chunk (non-recursive noun phrase)

Table 205. Producers and consumers of NC

Producers

None declared

Consumers

None declared

NEG

Description
Table 206. Producers and consumers of NEG

Producers

None declared

Consumers

None declared

NN

Description
Table 207. Producers and consumers of NN

Producers

None declared

Consumers

None declared

NP

Description
Table 208. Producers and consumers of NP

Producers

None declared

Consumers

None declared

NPADVMOD

Description
Table 209. Producers and consumers of NPADVMOD

Producers

None declared

Consumers

None declared

NSUBJ

Description
Table 210. Producers and consumers of NSUBJ

Producers

None declared

Consumers

None declared

NSUBJPASS

Description
Table 211. Producers and consumers of NSUBJPASS

Producers

None declared

Consumers

None declared

NUM

Description
Table 212. Producers and consumers of NUM

Producers

None declared

Consumers

None declared

NUMBER

Description
Table 213. Producers and consumers of NUMBER

Producers

None declared

Consumers

None declared

NX

Description
Table 214. Producers and consumers of NX

Producers

None declared

Consumers

None declared

O

Description

other or outside a chunk

Table 215. Producers and consumers of O

Producers

None declared

Consumers

None declared

PARATAXIS

Description
Table 216. Producers and consumers of PARATAXIS

Producers

None declared

Consumers

None declared

PARN

Description

This category is called PRN in the Penn Treebank tagset. However, PRN is a reserved device name on Windows, so the category had to be renamed. The old PRN type is still present in the DKPro Core type system, but it is deprecated, no longer used, and no JCas classes are generated for it.

Table 217. Producers and consumers of PARN

Producers

None declared

Consumers

None declared

PARTMOD

Description
Table 218. Producers and consumers of PARTMOD

Producers

None declared

Consumers

None declared

PC

Description

prepositional chunk

Table 219. Producers and consumers of PC

Producers

None declared

Consumers

None declared

PCOMP

Description
Table 220. Producers and consumers of PCOMP

Producers

None declared

Consumers

None declared

POBJ

Description
Table 221. Producers and consumers of POBJ

Producers

None declared

Consumers

None declared

POSS

Description
Table 222. Producers and consumers of POSS

Producers

None declared

Consumers

None declared

POSSESSIVE

Description
Table 223. Producers and consumers of POSSESSIVE

Producers

None declared

Consumers

None declared

PP

Description
Table 224. Producers and consumers of PP

Producers

None declared

Consumers

None declared

PRECONJ

Description
Table 225. Producers and consumers of PRECONJ

Producers

None declared

Consumers

None declared

PRED

Description
Table 226. Producers and consumers of PRED

Producers

None declared

Consumers

None declared

PREDET

Description
Table 227. Producers and consumers of PREDET

Producers

None declared

Consumers

None declared

PREP

Description
Table 228. Producers and consumers of PREP

Producers

None declared

Consumers

None declared

PREPC

Description
Table 229. Producers and consumers of PREPC

Producers

None declared

Consumers

None declared

PRN

Description

This type is no longer used and no JCas wrapper is generated for it because on Windows, it conflicts with the reserved device name for printers. Deprecated: use PARN instead.

Table 230. Producers and consumers of PRN

Producers

None declared

Consumers

None declared

PRP

Description
Table 231. Producers and consumers of PRP

Producers

None declared

Consumers

None declared

PRT

Description
Table 232. Producers and consumers of PRT

Producers

None declared

Consumers

None declared

PRT

Description

verb particle

Table 233. Producers and consumers of PRT

Producers

None declared

Consumers

None declared

PRT

Description
Table 234. Producers and consumers of PRT

Producers

None declared

Consumers

None declared

PUNCT

Description
Table 235. Producers and consumers of PUNCT

Producers

None declared

Consumers

None declared

PURPCL

Description
Table 236. Producers and consumers of PURPCL

Producers

None declared

Consumers

None declared

QP

Description
Table 237. Producers and consumers of QP

Producers

None declared

Consumers

None declared

QUANTMOD

Description
Table 238. Producers and consumers of QUANTMOD

Producers

None declared

Consumers

None declared

RCMOD

Description
Table 239. Producers and consumers of RCMOD

Producers

None declared

Consumers

None declared

REF

Description
Table 240. Producers and consumers of REF

Producers

None declared

Consumers

None declared

REL

Description
Table 241. Producers and consumers of REL

Producers

None declared

Consumers

None declared

ROOT

Description

Dependency tree root.

Table 242. Producers and consumers of ROOT

Producers

None declared

Consumers

None declared

ROOT

Description
Table 243. Producers and consumers of ROOT

Producers

None declared

Consumers

None declared

RRC

Description
Table 244. Producers and consumers of RRC

Producers

None declared

Consumers

None declared

S

Description
Table 245. Producers and consumers of S

Producers

None declared

Consumers

None declared

SBAR

Description
Table 246. Producers and consumers of SBAR

Producers

None declared

Consumers

None declared

SBARQ

Description
Table 247. Producers and consumers of SBARQ

Producers

None declared

Consumers

None declared

SINV

Description
Table 248. Producers and consumers of SINV

Producers

None declared

Consumers

None declared

SQ

Description
Table 249. Producers and consumers of SQ

Producers

None declared

Consumers

None declared

TMOD

Description
Table 250. Producers and consumers of TMOD

Producers

None declared

Consumers

None declared

UCP

Description
Table 251. Producers and consumers of UCP

Producers

None declared

Consumers

None declared

VC

Description

verb complex

Table 252. Producers and consumers of VC

Producers

None declared

Consumers

None declared

VP

Description
Table 253. Producers and consumers of VP

Producers

None declared

Consumers

None declared

WHADJP

Description
Table 254. Producers and consumers of WHADJP

Producers

None declared

Consumers

None declared

WHADVP

Description
Table 255. Producers and consumers of WHADVP

Producers

None declared

Consumers

None declared

WHNP

Description
Table 256. Producers and consumers of WHNP

Producers

None declared

Consumers

None declared

WHPP

Description
Table 257. Producers and consumers of WHPP

Producers

None declared

Consumers

None declared

X

Description
Table 258. Producers and consumers of X

Producers

None declared

Consumers

None declared

XCOMP

Description
Table 259. Producers and consumers of XCOMP

Producers

None declared

Consumers

None declared

XSUBJ

Description
Table 260. Producers and consumers of XSUBJ

Producers

None declared

Consumers

None declared

Tfidf

Tfidf

Description

Annotates the tf.idf score of a token, stem, or lemma.

Features of Tfidf (2)
tfidfValue (Double)

The tf.idf score.

term (String)

The string that was used to compute this tf.idf score. If a stem or lemma was used, the covered text of this annotation does not need to be equal to this string.

This string can be used to construct a vector space with the right terms without having to access the indexes again.
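
The relation between the covered text, the term feature, and the score can be sketched as follows. This Python example is an illustration only: it uses one common weighting (raw term frequency times the natural logarithm of N/df), which is an assumption and not necessarily what TfIdfAnnotator computes, and the Tfidf dataclass, the helper, and the toy corpus are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Tfidf:
    begin: int
    end: int
    term: str           # the string the score was computed for (here: a lemma)
    tfidf_value: float


def tfidf_scores(doc_terms, corpus):
    """doc_terms: (begin, end, term) triples; corpus: one term list per document."""
    n_docs = len(corpus)
    tf = Counter(term for _, _, term in doc_terms)
    scores = []
    for begin, end, term in doc_terms:
        df = sum(1 for doc in corpus if term in doc)   # document frequency
        idf = math.log(n_docs / df) if df else 0.0     # assumed idf variant
        scores.append(Tfidf(begin, end, term, tf[term] * idf))
    return scores


text = "Dogs bark"
doc_terms = [(0, 4, "dog"), (5, 9, "bark")]            # lemmas, not surface forms
corpus = [["dog", "bark"], ["cat", "meow"], ["dog", "run"], ["bird", "sing"]]
scores = tfidf_scores(doc_terms, corpus)
for s in scores:
    print(text[s.begin:s.end], s.term, round(s.tfidf_value, 3))
```

The first annotation covers "Dogs" while its term is "dog", which is the situation the term feature is meant to handle.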

Table 261. Producers and consumers of Tfidf

Producers

TfIdfAnnotator

Consumers

None declared

Topic Modeling

TopicDistribution

Description

An array representing the topic proportions in a document.

Features of TopicDistribution (2)
TopicProportions (DoubleArray)

Each topic’s proportion in the document.

TopicAssignment (IntegerArray)

Pointers to topics the document has been assigned to.
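
One possible way to fill both features from raw topic weights is sketched below in Python. The rule used for TopicAssignment (every topic whose proportion reaches a minimum threshold) is an assumption made for illustration, as are the dataclass, the helper name, and the threshold value; the source does not specify how MalletLdaTopicModelInferencer derives the assignment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TopicDistribution:
    topic_proportions: List[float]  # one proportion per topic
    topic_assignment: List[int]     # indices of the topics assigned to the document


def make_distribution(weights, min_proportion=0.2):
    """Normalise raw weights and assign topics at or above the threshold."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    proportions = [w / total for w in weights]
    assigned = [i for i, p in enumerate(proportions) if p >= min_proportion]
    return TopicDistribution(proportions, assigned)


dist = make_distribution([6, 1, 3])
print(dist.topic_proportions, dist.topic_assignment)
```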

Table 262. Producers and consumers of TopicDistribution

Producers

MalletLdaTopicModelInferencer

Consumers

DiTop (format)

WordEmbedding

Description

An array representing the word embedding vector.

Features of WordEmbedding (1)
WordEmbedding (FloatArray)

A word embedding vector.
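
A typical use of such vectors is to compare two words by cosine similarity. The Python sketch below shows that computation on plain lists of floats; the function name and the example vectors are made up for illustration and do not come from DKPro Core or Mallet.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```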

Table 263. Producers and consumers of WordEmbedding

Producers

MalletEmbeddingsAnnotator

Consumers

None declared

Transformation

SofaChangeAnnotation

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.api.transform.type.SofaChangeAnnotation

Name

de.tudarmstadt.ukp.dkpro.core.api.transform.type.SofaChangeAnnotation

Supertype

Annotation

Description

Encodes an edit operation that can be interpreted by the ApplyChangesAnnotator.

Features of SofaChangeAnnotation (3)
value (String)

In case of an "insert" or "replace" operation, this feature indicates the value to be inserted or replaced.

operation (String)

Operation to perform: "insert", "replace", "delete"

reason (String)

The reason for the change.
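
The three operations can be illustrated with a small Python sketch that applies a list of edits to a string. This is not the ApplyChangesAnnotator implementation: the SofaChange dataclass and the apply_changes helper are hypothetical, and the sketch assumes that the edits do not overlap. Edits are applied from the end of the text backwards so that the offsets of earlier edits stay valid.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SofaChange:
    begin: int
    end: int
    operation: str                # "insert", "replace" or "delete"
    value: Optional[str] = None   # text to insert or to use as replacement
    reason: Optional[str] = None


def apply_changes(text: str, changes: List[SofaChange]) -> str:
    """Apply non-overlapping edits, last position first."""
    result = text
    for change in sorted(changes, key=lambda c: c.begin, reverse=True):
        if change.operation == "insert":
            result = result[:change.begin] + (change.value or "") + result[change.begin:]
        elif change.operation == "replace":
            result = result[:change.begin] + (change.value or "") + result[change.end:]
        elif change.operation == "delete":
            result = result[:change.begin] + result[change.end:]
        else:
            raise ValueError(f"unknown operation: {change.operation}")
    return result


original = "Teh quick  fox"
changes = [
    SofaChange(0, 3, "replace", "The", reason="spelling"),
    SofaChange(9, 10, "delete", reason="duplicate space"),
    SofaChange(14, 14, "insert", ".", reason="missing punctuation"),
]
corrected = apply_changes(original, changes)
print(corrected)
```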

Table 264. Producers and consumers of SofaChangeAnnotation

Producers

ApplyChangesAnnotator NorvigSpellingCorrector ReplacementFileNormalizer Tcf (format)

Consumers

ApplyChangesAnnotator Tcf (format)

Utility

TimerAnnotation

URI

http://dkpro.github.io/dkpro-core/releases/2.2.0/docs/typesystem-reference.html#de.tudarmstadt.ukp.dkpro.core.performance.type.TimerAnnotation

Name

de.tudarmstadt.ukp.dkpro.core.performance.type.TimerAnnotation

Supertype

Annotation

Description

Used for storing timing information (e.g. for performance testing).

Features of TimerAnnotation (3)
startTime (Long)

No description

endTime (Long)

No description

name (String)

The name of the timer. Used to automatically determine whether this is an upstream or downstream timer.

Table 265. Producers and consumers of TimerAnnotation

Producers

None declared

Consumers

None declared

Wikipedia

WikipediaLink

Description

Wikipedia link

Features of WikipediaLink (3)
LinkType (String)

The type of the link, e.g. internal, external, image, etc.

Target (String)

The link target URL

Anchor (String)

The anchor of the link

Table 266. Producers and consumers of WikipediaLink

Producers

WikipediaLink (format)

Consumers

None declared

Wikipedia (JWPL)

ArticleInfo

Description

Contains basic information about the article.

Features of ArticleInfo (4)
Authors (Integer)

Number of unique authors of this article

Revisions (Integer)

Number of revisions of this article.

FirstAppearance (Long)

The Timestamp of the first appearance of this article.

LastAppearance (Long)

The Timestamp of the last appearance of this article.

Table 267. Producers and consumers of ArticleInfo

Producers

WikipediaArticleInfo (format)

Consumers

None declared

DBConfig

Description

Database configuration for the connection to the database from which the CAS data was retrieved.

Features of DBConfig (5)
Host (String)

DB Host

DB (String)

Database

User (String)

Username

Password (String)

User password

Language (String)

Wikipedia Language Versions

Table 268. Producers and consumers of DBConfig

Producers

WikipediaDiscussion (format) WikipediaLink (format) WikipediaPage (format) WikipediaRevision (format) WikipediaRevisionPair (format) WikipediaTemplateFilteredArticle (format)

Consumers

None declared

WikipediaRevision

Description

Represents a revision in Wikipedia.

Features of WikipediaRevision (7)
revisionId (Integer)

The ID of the revision.

pageId (Integer)

The pageId of the Wikipedia page of this revision.

contributorName (String)

The username of the user/contributor who edited this revision.

comment (String)

The comment that the editor entered for this revision.

contributorId (Integer)

The userId of the user/contributor who created this revision.

timestamp (Long)

The timestamp of the revision, given in milliseconds since the standard base time (January 1, 1970, 00:00:00 GMT)

minor (Boolean)

Whether this revision has been marked as minor edit by its contributor.
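
Because the timestamp feature counts milliseconds since January 1, 1970, 00:00:00 GMT, it can be turned into a calendar date with standard library functions. The Python sketch below shows one way to do this; the function name and the sample value are chosen for illustration only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def revision_time(timestamp_ms: int) -> datetime:
    """Convert milliseconds since the epoch to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=timestamp_ms)


example = revision_time(1_000_000_000_000)
print(example.isoformat())
```

Adding a timedelta to the epoch avoids floating-point division and keeps millisecond precision.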

Table 269. Producers and consumers of WikipediaRevision

Producers

WikipediaRevision (format)

Consumers

None declared

XML

Figure 8. XML structure types

XmlAttribute

Description
Features of XmlAttribute (5)
uri (String)

Namespace URI of the attribute.

localName (String)

Local name of the attribute.

value (String)

Value of the XML attribute.

qName (String)

No description

valueType (String)

No description

Table 270. Producers and consumers of XmlAttribute

Producers

HtmlDocument (format) XmlDocument (format)

Consumers

XmlDocument (format)

XmlDocument

Description

XML document

Features of XmlDocument (1)
root (XmlElement)

Root element of the XML document.

Table 271. Producers and consumers of XmlDocument

Producers

HtmlDocument (format) XmlDocument (format)

Consumers

XmlDocument (format)

XmlElement

Description

XML element

Features of XmlElement (5)
uri (String)

Namespace URI of the element.

localName (String)

Local name of the XML element.

attributes (FSArray of XmlAttribute)

Array of attributes of the XML element.

children (FSArray of XmlNode)

Children of this XML element.

qName (String)

No description

Table 272. Producers and consumers of XmlElement

Producers

HtmlDocument (format) XmlDocument (format)

Consumers

XmlDocument (format)

XmlNode

Description

Supertype for XmlElements and XmlTextNodes.

Features of XmlNode (1)
parent (XmlElement)

No description

Table 273. Producers and consumers of XmlNode

Producers

HtmlDocument (format) XmlDocument (format)

Consumers

XmlDocument (format)

Table 274. Sub-types of XmlNode (2)
Type Description

XmlElement

XML element

XmlTextNode

XML text node.

XmlTextNode

Description

XML text node.

Features of XmlTextNode (2)
text (String)

No description

captured (Boolean)

Whether the text node has been added to the document text.
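
The XML types above form a tree of elements and text nodes. The Python sketch below rebuilds a reduced version of that structure with stand-in classes to show how the parent and children links, the attributes, and the captured flag fit together. It is an illustration under simplifying assumptions: namespaces (uri, localName) and valueType are omitted, and the class layout and helper functions are hypothetical rather than the DKPro Core API.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from xml.sax.saxutils import escape, quoteattr


@dataclass(eq=False, repr=False)
class XmlNode:
    parent: Optional["XmlElement"] = None


@dataclass(eq=False, repr=False)
class XmlTextNode(XmlNode):
    text: str = ""
    captured: bool = True   # whether the text is part of the document text


@dataclass(eq=False, repr=False)
class XmlAttribute:
    q_name: str
    value: str


@dataclass(eq=False, repr=False)
class XmlElement(XmlNode):
    q_name: str = ""
    attributes: List[XmlAttribute] = field(default_factory=list)
    children: List[XmlNode] = field(default_factory=list)

    def append(self, child: XmlNode) -> XmlNode:
        """Attach a child and set its parent link."""
        child.parent = self
        self.children.append(child)
        return child


def serialize(node: XmlNode) -> str:
    """Write the subtree back out as an XML string."""
    if isinstance(node, XmlTextNode):
        return escape(node.text)
    attrs = "".join(f" {a.q_name}={quoteattr(a.value)}" for a in node.attributes)
    inner = "".join(serialize(child) for child in node.children)
    return f"<{node.q_name}{attrs}>{inner}</{node.q_name}>"


def captured_text(node: XmlNode) -> str:
    """Concatenate only the text nodes flagged as captured."""
    if isinstance(node, XmlTextNode):
        return node.text if node.captured else ""
    return "".join(captured_text(child) for child in node.children)


root = XmlElement(q_name="p", attributes=[XmlAttribute("class", "intro")])
root.append(XmlTextNode(text="Tom & "))
bold = root.append(XmlElement(q_name="b"))
bold.append(XmlTextNode(text="Jerry"))
root.append(XmlTextNode(text="\n", captured=False))
print(serialize(root), repr(captured_text(root)))
```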

Table 275. Producers and consumers of XmlTextNode

Producers

HtmlDocument (format) XmlDocument (format)

Consumers

XmlDocument (format)

Subtype tables

Table 276. Sub-types of Dependency
Type Description

ABBREV

No description

ACOMP

No description

ADVCL

No description

ADVMOD

No description

AGENT

No description

AMOD

No description

APPOS

No description

ATTR

No description

AUX0

No description

AUXPASS

No description

CC

No description

CCOMP

No description

COMPLM

No description

CONJ

No description

CONJP

No description

CONJ_YET

No description

COP

No description

CSUBJ

No description

CSUBJPASS

No description

DEP

No description

DET

No description

DOBJ

No description

EXPL

No description

INFMOD

No description

IOBJ

No description

MARK

No description

MEASURE

No description

MWE

No description

NEG

No description

NN

No description

NPADVMOD

No description

NSUBJ

No description

NSUBJPASS

No description

NUM

No description

NUMBER

No description

PARATAXIS

No description

PARTMOD

No description

PCOMP

No description

POBJ

No description

POSS

No description

POSSESSIVE

No description

PRECONJ

No description

PRED

No description

PREDET

No description

PREP

No description

PREPC

No description

PRT

No description

PUNCT

No description

PURPCL

No description

QUANTMOD

No description

RCMOD

No description

REF

No description

REL

No description

ROOT

Dependency tree root.

TMOD

No description

XCOMP

No description

XSUBJ

No description

Table 277. Sub-types of POS
Type Description

ADJ

Adjective

@deprecated Use POS_ADJ instead

ADP

Adposition

@deprecated Use POS_ADP instead

ADV

Adverb

@deprecated Use POS_ADV instead

ART

Determiners and articles.

AUX

Auxiliary verb

@deprecated Use POS_AUX instead

CARD

Numerals

@deprecated Use POS_NUM instead

CONJ

Conjunction

@deprecated Use POS_CONJ instead

DET

Determiner

@deprecated Use POS_DET instead

INTJ

Interjection

@deprecated Use POS_INTJ instead

N

Nouns

@deprecated Use POS_NOUN instead

NOUN

Noun

@deprecated Use POS_NOUN instead

NUM

Numeral

@deprecated Use POS_NUM instead

O

Catch-all for other categories such as abbreviations or foreign words

@deprecated Use POS_X instead

PART

Particle

@deprecated Use POS_PART instead

POS_ADJ

Adjective

POS_ADP

Adposition

POS_ADV

Adverb

POS_AUX

Auxiliary verb

POS_CONJ

Conjunction

POS_DET

Determiner

POS_INTJ

Interjection

POS_NOUN

Noun

POS_NUM

Numeral

POS_PART

Particle

POS_PRON

Pronoun

POS_PROPN

Proper noun

POS_PUNCT

Punctuation

POS_SCONJ

Subordinating conjunction

POS_SYM

Symbol

POS_VERB

Verb

POS_X

Other

PP

Prepositions and postpositions

@deprecated Use POS_ADP instead

PR

Pronoun

@deprecated Use POS_PRON instead

PRON

Pronoun

@deprecated Use POS_PRON instead

PROPN

Proper noun

@deprecated Use POS_PROPN instead

PRT

Particles

@deprecated Use POS_PART instead

PUNC

Punctuation marks

@deprecated Use POS_PUNCT instead

PUNCT

Punctuation

@deprecated Use POS_PUNCT instead

SCONJ

Subordinating conjunction

@deprecated Use POS_SCONJ instead

SYM

Symbol

@deprecated Use POS_SYM instead

V

Verbs

@deprecated Use POS_VERB instead

VERB

Verb

@deprecated Use POS_VERB instead

X

Other

@deprecated Use POS_X instead
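
The deprecation notes in Table 277 amount to a mapping from old type names to their replacements. The Python sketch below records that mapping in a dictionary so that legacy names can be translated. The dictionary contents are taken from the table above; the variable and function names are hypothetical, and ART is left unmapped because the table gives no replacement for it.

```python
# Deprecated POS sub-type -> replacement, as listed in Table 277.
DEPRECATED_POS = {
    "ADJ": "POS_ADJ", "ADP": "POS_ADP", "ADV": "POS_ADV", "AUX": "POS_AUX",
    "CARD": "POS_NUM", "CONJ": "POS_CONJ", "DET": "POS_DET", "INTJ": "POS_INTJ",
    "N": "POS_NOUN", "NOUN": "POS_NOUN", "NUM": "POS_NUM", "O": "POS_X",
    "PART": "POS_PART", "PP": "POS_ADP", "PR": "POS_PRON", "PRON": "POS_PRON",
    "PROPN": "POS_PROPN", "PRT": "POS_PART", "PUNC": "POS_PUNCT",
    "PUNCT": "POS_PUNCT", "SCONJ": "POS_SCONJ", "SYM": "POS_SYM",
    "V": "POS_VERB", "VERB": "POS_VERB", "X": "POS_X",
}


def modernize(type_name: str) -> str:
    """Return the replacement for a deprecated name, or the name unchanged."""
    return DEPRECATED_POS.get(type_name, type_name)


print([modernize(name) for name in ["CARD", "PP", "POS_VERB", "ART"]])
```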

Table 278. Sub-types of Chunk
Type Description

ADJC

adjective chunks

ADVC

adverb chunks

CONCJ

complex coordinating conjunctions such as "as well (as)" or "rather (than)"

INTJ

interjection

LST

enumeration symbol

NC

noun chunk (non-recursive noun phrase)

O

other or outside a chunk

PC

prepositional chunk

PRT

verb particle

VC

verb complex

Table 279. Sub-types of Constituent
Type Description

ADJP

No description

ADVP

No description

CONJP

No description

FRAG

No description

INTJ

No description

LST

No description

NAC

No description

NP

No description

NX

No description

PARN

This category is called PRN in the Penn Treebank tagset.

PP

No description

PRN

This type is no longer used and no JCas wrapper is generated for it because on Windows, it conflicts with the reserved device name for printers.

PRP

No description

PRT

No description

QP

No description

ROOT

No description

RRC

No description

S

No description

SBAR

No description

SBARQ

No description

SINV

No description

SQ

No description

UCP

No description

VP

No description

WHADJP

No description

WHADVP

No description

WHNP

No description

WHPP

No description

X

No description

Table 280. Sub-types of O
Type Description

AT

at-mention (indicates another user as a recipient of a tweet)

@deprecated Use POS_AT instead

DM

discourse marker, indications of continuation of a message across multiple tweets

@deprecated Use POS_DM instead

EMO

emoticon

@deprecated Use POS_EMO instead

HASH

Hashtag (indicates topic/category for tweet)

@deprecated Use POS_HASH instead

INT

proper noun + verbal

@deprecated Use POS_INT instead

URL

URL or email address

@deprecated Use POS_URL instead

Table 281. Sub-types of NamedEntity
Type Description

Animal

No description

Cardinal

No description

ContactInfo

No description

Date

No description

Disease

No description

Event

No description

Fac

No description

FacDesc

No description

Game

No description

Gpe

No description

GpeDesc

No description

Language

No description

Law

No description

Location

No description

Money

No description

Nationality

No description

Norp

No description

Ordinal

No description

OrgDesc

No description

Organization

No description

PerDesc

No description

Percent

No description

Person

No description

Plant

No description

Product

No description

ProductDesc

No description

Quantity

No description

Substance

No description

Time

No description

WorkOfArt

No description

Table 282. Sub-types of AnnotationBase
Type Description

ArticleMetaData

A document annotation that describes the metadata of a newspaper article.

CoreferenceChain

Marks the beginning of a chain.

Table 283. Sub-types of Split
Type Description

CompoundPart

A CompoundPart represents one fragment from the compounding word.

LinkingMorpheme

This type represents a linking morpheme between two CompoundParts.

Table 284. Sub-types of RSTTreeNode
Type Description

DiscourseRelation

No description

EDU

No description

Table 285. Sub-types of Div
Type Description

Document

No description

Heading

Document title, section heading, etc.

Paragraph

No description

Table 286. Sub-types of DocumentAnnotation
Type Description

DocumentMetaData

The DocumentMetaData annotation stores information about a single processed document.

Table 287. Sub-types of DiscourseRelation
Type Description

ExplicitDiscourseRelation

Discourse relation

Table 288. Sub-types of Anomaly
Type Description

GrammarAnomaly

No description

SpellingAnomaly

No description

Table 289. Sub-types of DiscourseRelation
Type Description

ImplicitDiscourseRelation

Implicit discourse relation

Table 290. Sub-types of Token
Type Description

JapaneseToken

No description

Table 291. Sub-types of N
Type Description

NN

Common noun

@deprecated Use POS_NOUN instead

NNV

nominal + verbal

@deprecated Use POS_NNV instead

NP

Proper noun

@deprecated Use POS_PROPN instead

NPV

proper noun + verbal

@deprecated Use POS_NPV instead

Table 292. Sub-types of POS_X
Type Description

POS_AT

at-mention (indicates another user as a recipient of a tweet)

POS_DM

discourse marker, indications of continuation of a message across multiple tweets

POS_EMO

emoticon

POS_HASH

Hashtag (indicates topic/category for tweet)

POS_INT

proper noun + verbal

POS_URL

URL or email address

Table 293. Sub-types of POS_NOUN
Type Description

POS_NNV

nominal + verbal

POS_NPV

proper noun + verbal

Table 294. Sub-types of TOP
Type Description

SemArgLink

The SemArgLink type is used to attach SemPred annotations to their respective SemArg annotations while giving each link a role.

TagDescription

Description of an individual tag.

XmlAttribute

No description

Table 295. Sub-types of XmlNode
Type Description

XmlElement

XML element

XmlTextNode

XML text node.