If n-grams are required by some downstream component, the NGramAnnotator can be used to add NGram annotations to the CAS.
Quickly Creating n-grams from Annotations
If the n-grams are only needed locally and not by downstream components, it is more efficient to use the NGramIterable, which constructs the n-grams from an arbitrary set of annotations and returns them without adding them to the CAS.
If you are only interested in the String representation of the n-grams, you can use NGramStringIterable instead.
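To illustrate what building n-grams locally amounts to, here is a minimal plain-Java sketch that turns a token list into n-gram strings without storing them anywhere else. It is not the DKPro API: the class `NGramSketch`, the method `ngrams`, and the `minN`/`maxN` parameters are hypothetical names chosen for this example, and the tokens are plain strings rather than CAS annotations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; this is NOT NGramIterable or NGramStringIterable.
final class NGramSketch {

    // Returns all n-grams with minN <= n <= maxN as space-joined strings,
    // ordered by start position and, for each position, by increasing length.
    static List<String> ngrams(List<String> tokens, int minN, int maxN) {
        if (minN < 1 || maxN < minN) {
            throw new IllegalArgumentException("require 1 <= minN <= maxN");
        }
        List<String> result = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            // Stop growing the n-gram once it would run past the last token.
            for (int n = minN; n <= maxN && start + n <= tokens.size(); n++) {
                result.add(String.join(" ", tokens.subList(start, start + n)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "brown", "fox");
        // prints [the, the quick, quick, quick brown, brown, brown fox, fox]
        System.out.println(ngrams(tokens, 1, 2));
    }
}
```

The sketch returns a list for simplicity; an iterable that produces the n-grams lazily would avoid holding all of them in memory at once.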
n-gram Frequency Counts
Many applications need to determine how often a certain n-gram (or phrase) occurs in a collection. DKPro supports this by providing special external resources.
A component that wants to use frequency counts should declare this by specifying an external resource.
The frequency count can then be queried through that resource.
The user of the component then only needs to pass the resource to the component as a configuration parameter.
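The pattern described above (a component declares that it needs a frequency resource, queries it, and the user supplies a concrete resource) can be sketched in plain Java as follows. This is an assumption-laden illustration rather than the DKPro or UIMA wiring: `FrequencyLookup`, `InMemoryFrequencyLookup`, `PhraseScorer`, and `FrequencyDemo` are hypothetical names, the resource is passed through a constructor instead of a framework configuration parameter, and the counts come from a small in-memory token list instead of a real corpus.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a frequency count resource; not a DKPro interface.
interface FrequencyLookup {
    long getFrequency(String phrase);
}

// Hypothetical in-memory resource: counts every n-gram up to maxN tokens long.
final class InMemoryFrequencyLookup implements FrequencyLookup {
    private final Map<String, Long> counts = new HashMap<>();

    InMemoryFrequencyLookup(List<String> tokens, int maxN) {
        for (int start = 0; start < tokens.size(); start++) {
            for (int n = 1; n <= maxN && start + n <= tokens.size(); n++) {
                String phrase = String.join(" ", tokens.subList(start, start + n));
                counts.merge(phrase, 1L, Long::sum);
            }
        }
    }

    @Override
    public long getFrequency(String phrase) {
        // Unseen phrases have a frequency of zero.
        return counts.getOrDefault(phrase, 0L);
    }
}

// Hypothetical component: it depends only on the interface,
// so the caller decides which concrete resource backs it.
final class PhraseScorer {
    private final FrequencyLookup provider;

    PhraseScorer(FrequencyLookup provider) {
        this.provider = provider;
    }

    boolean isFrequent(String phrase, long threshold) {
        return provider.getFrequency(phrase) >= threshold;
    }
}

final class FrequencyDemo {
    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("to", "be", "or", "not", "to", "be");
        // The "user" of the component supplies the resource here.
        PhraseScorer scorer = new PhraseScorer(new InMemoryFrequencyLookup(tokens, 2));
        System.out.println(scorer.isFrequent("to be", 2));  // prints true
        System.out.println(scorer.isFrequent("or not", 2)); // prints false
    }
}
```

Because the component only sees the interface, the backing resource can be swapped (for example for one that reads counts from a large n-gram corpus) without changing the component itself.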