If n-grams are required by some downstream component, the NGramAnnotator can be used to add NGram annotations to the CAS.
Quickly Creating n-grams from Annotations
If the n-grams are only needed locally and not by downstream components, it is more efficient to use NGramIterable, which returns the n-grams constructed from an arbitrary set of annotations without adding them to the CAS.
If you are just interested in the String representation of the n-grams, you can also use NGramStringIterable.
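To illustrate what iterating over string n-grams yields, here is a minimal, library-independent sketch. It does not use the DKPro API: the class name NGramSketch and the method ngrams are hypothetical, and plain strings stand in for token annotations. It assumes n-grams are enumerated by start position and then by increasing length, which may differ from the order DKPro uses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of DKPro: builds string n-grams from a token list. */
final class NGramSketch {

    /** Returns all n-grams with minN <= n <= maxN as space-joined strings. */
    static List<String> ngrams(List<String> tokens, int minN, int maxN) {
        if (minN < 1 || maxN < minN) {
            throw new IllegalArgumentException("require 1 <= minN <= maxN");
        }
        List<String> result = new ArrayList<>();
        // For every start position, emit each n-gram that still fits in the list.
        for (int start = 0; start < tokens.size(); start++) {
            for (int n = minN; n <= maxN && start + n <= tokens.size(); n++) {
                result.add(String.join(" ", tokens.subList(start, start + n)));
            }
        }
        return result;
    }
}
```

For the tokens "the", "quick", "fox" with n between 1 and 2, this sketch returns "the", "the quick", "quick", "quick fox", "fox". Nothing is stored anywhere else, which mirrors the idea of building n-grams locally instead of adding them to the CAS.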
n-gram Frequency Counts
Many applications need to determine the number of occurrences of a given n-gram (or phrase) in a collection. DKPro supports this by providing dedicated external resources.
A component that wants to use frequency counts should declare this by specifying an external resource.
The frequency count can then be accessed through that resource.
The user of the component then only needs to supply the resource as a configuration parameter.
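The pattern described above (declare a dependency on a count provider, query it, and let the user supply the implementation) can be sketched without any DKPro or UIMA dependency. All names below (CountProvider, MapCountProvider, PhraseScorer) are hypothetical and only mimic the shape of the pattern. Constructor injection stands in for the external resource mechanism, and an in-memory map stands in for a real corpus-backed resource.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a frequency count resource; not the DKPro interface. */
interface CountProvider {
    long getFrequency(String phrase);
}

/** In-memory implementation; a real resource would query a corpus index instead. */
final class MapCountProvider implements CountProvider {
    private final Map<String, Long> counts = new HashMap<>();

    /** Records one more occurrence of the given phrase. */
    void add(String phrase) {
        counts.merge(phrase, 1L, Long::sum);
    }

    @Override
    public long getFrequency(String phrase) {
        // Unseen phrases have a count of zero.
        return counts.getOrDefault(phrase, 0L);
    }
}

/** Hypothetical component that declares its dependency through the constructor. */
final class PhraseScorer {
    private final CountProvider provider;

    PhraseScorer(CountProvider provider) {
        this.provider = provider;
    }

    /** True if the phrase occurs at least {@code threshold} times. */
    boolean isFrequent(String phrase, long threshold) {
        return provider.getFrequency(phrase) >= threshold;
    }
}
```

Because PhraseScorer depends only on the interface, the caller decides which count source to plug in, which is the same separation the external resource approach aims for.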