Uses the LeftToRightSplitter as the splitter resource and no ranker resource. The script tokenizes a sentence, decompounds the compounds in it, then prints the tokens and each compound part.
@Grab(group='de.tudarmstadt.ukp.dkpro.core', version='1.7.0',
      module='de.tudarmstadt.ukp.dkpro.core.decompounding-asl')
@Grab(group='de.tudarmstadt.ukp.dkpro.core', version='1.7.0',
      module='de.tudarmstadt.ukp.dkpro.core.opennlp-asl')
import de.tudarmstadt.ukp.dkpro.core.decompounding.uima.annotator.*;
import de.tudarmstadt.ukp.dkpro.core.decompounding.uima.resource.*;
import de.tudarmstadt.ukp.dkpro.core.opennlp.*;
import static org.apache.uima.fit.pipeline.SimplePipeline.*;
import static org.apache.uima.fit.factory.JCasFactory.*;
import static org.apache.uima.fit.factory.AnalysisEngineFactory.*;
import static org.apache.uima.fit.factory.ExternalResourceFactory.*;
import static org.apache.uima.fit.util.JCasUtil.*;
import de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.*;

// Create a document holding one German sentence.
def doc = createJCas();
doc.documentText = "Wir brauchen einen Aktionsplan.";
doc.documentLanguage = "de";

// Tokenize with OpenNLP, then decompound with the left-to-right splitter.
// No ranker resource is configured.
runPipeline(doc,
  createEngineDescription(OpenNlpSegmenter),
  createEngineDescription(
    CompoundAnnotator,
    CompoundAnnotator.PARAM_SPLITTING_ALGO,
    createExternalResourceDescription(
      LeftToRightSplitterResource,
      (Object) LeftToRightSplitterResource.PARAM_DICT_RESOURCE,
      createExternalResourceDescription(SharedDictionary),
      LeftToRightSplitterResource.PARAM_MORPHEME_RESOURCE,
      createExternalResourceDescription(SharedLinkingMorphemes))));

// Print the covered text of every token, then of every compound part.
println select(doc, Token).collect { it.coveredText }
println select(doc, CompoundPart).collect { it.coveredText }
Example output:
Jul 02, 2014 4:52:49 PM de.tudarmstadt.ukp.dkpro.core.api.resources.ResourceObjectProviderBase info
Information: :: loading settings :: url = jar:file:/usr/share/groovy/lib/ivy.jar!/org/apache/ivy/core/settings/ivysettings.xml
Jul 02, 2014 4:52:49 PM de.tudarmstadt.ukp.dkpro.core.api.resources.ResourceObjectProviderBase configure
Information: Producing resource from [jar:file:/home/santos/.ivy2/cache/de.tudarmstadt.ukp.dkpro.core/de.tudarmstadt.ukp.dkpro.core.opennlp-upstream-sentence-de-maxent/jars/de.tudarmstadt.ukp.dkpro.core.opennlp-upstream-sentence-de-maxent-20120616.jar!/de/tudarmstadt/ukp/dkpro/core/opennlp/lib/sentence-de-maxent.bin] redirected from [jar:file:/home/santos/.ivy2/cache/de.tudarmstadt.ukp.dkpro.core/de.tudarmstadt.ukp.dkpro.core.opennlp-model-sentence-de-maxent/jars/de.tudarmstadt.ukp.dkpro.core.opennlp-model-sentence-de-maxent-20120616.1.jar!/de/tudarmstadt/ukp/dkpro/core/opennlp/lib/sentence-de-maxent.properties]
Jul 02, 2014 4:52:49 PM de.tudarmstadt.ukp.dkpro.core.api.resources.ResourceObjectProviderBase loadResource
Information: Producing resource took 55ms
Jul 02, 2014 4:52:49 PM de.tudarmstadt.ukp.dkpro.core.api.resources.ResourceObjectProviderBase info
Information: :: loading settings :: url = jar:file:/usr/share/groovy/lib/ivy.jar!/org/apache/ivy/core/settings/ivysettings.xml
Jul 02, 2014 4:52:50 PM de.tudarmstadt.ukp.dkpro.core.api.resources.ResourceObjectProviderBase configure
Information: Producing resource from [jar:file:/home/santos/.ivy2/cache/de.tudarmstadt.ukp.dkpro.core/de.tudarmstadt.ukp.dkpro.core.opennlp-upstream-token-de-maxent/jars/de.tudarmstadt.ukp.dkpro.core.opennlp-upstream-token-de-maxent-20120616.jar!/de/tudarmstadt/ukp/dkpro/core/opennlp/lib/token-de-maxent.bin] redirected from [jar:file:/home/santos/.ivy2/cache/de.tudarmstadt.ukp.dkpro.core/de.tudarmstadt.ukp.dkpro.core.opennlp-model-token-de-maxent/jars/de.tudarmstadt.ukp.dkpro.core.opennlp-model-token-de-maxent-20120616.1.jar!/de/tudarmstadt/ukp/dkpro/core/opennlp/lib/token-de-maxent.properties]
Jul 02, 2014 4:52:50 PM de.tudarmstadt.ukp.dkpro.core.api.resources.ResourceObjectProviderBase loadResource
Information: Producing resource took 257ms
[Wir, brauchen, einen, Aktionsplan, .]
[Aktion, plan]
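
The linking "s" of "Aktionsplan" does not show up among the compound parts above. To give a rough idea of how a left-to-right split using a dictionary and linking morphemes could produce such a result, here is a simplified toy sketch. It is not DKPro Core's implementation: the closure name splitCompound, the four-word dictionary and the morpheme list are all assumptions made up for this illustration, and the sketch lowercases its input, so its parts are lowercase.

```groovy
// Toy left-to-right decompounding sketch (NOT the DKPro Core algorithm).
// Both collections are made-up sample data, not DKPro's shipped resources.
def dictionary = ['aktion', 'plan', 'wir', 'brauchen'] as Set
def linkingMorphemes = ['s', 'es', 'en', 'n']

// Returns the parts of a word that can be split into dictionary words,
// optionally joined by a linking morpheme; returns null if no split is found.
def splitCompound
splitCompound = { String word ->
    def lower = word.toLowerCase()
    // Try every split position, moving from left to right.
    for (int i = 1; i < lower.length(); i++) {
        def head = lower.substring(0, i)
        if (!(head in dictionary)) continue
        def rest = lower.substring(i)
        // The remainder may start with a word directly or with a linking morpheme.
        for (morpheme in [''] + linkingMorphemes) {
            if (!rest.startsWith(morpheme)) continue
            def tail = rest.substring(morpheme.length())
            if (tail in dictionary) return [head, tail]
            // Otherwise try to split the remainder further.
            def deeper = tail ? splitCompound(tail) : null
            if (deeper != null) return [head] + deeper
        }
    }
    return null
}

println splitCompound('Aktionsplan')   // prints [aktion, plan]
```

In this sketch the first dictionary hit from the left ("aktion") wins, and the linking morpheme "s" is dropped from the result. A real splitter would rely on a full dictionary and, optionally, a ranker to choose among competing splits.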