DKPro Core - OpenNLP Named Entity Recognition pipeline

Reads all text files (*.txt) in the specified folder and prints the named entities found in each file.

Call with groovy pipeline <inputfolder> <language>, e.g. groovy pipeline input en.

Note that using . (the current directory) as the input folder is currently not supported.
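The two constraints above (exactly two arguments, and no . as the input folder) could be checked before the pipeline is assembled. The following is only a minimal sketch: checkArgs is a hypothetical helper that is not part of DKPro Core or of the original recipe, and the message texts are assumptions.

```groovy
// Hypothetical helper (not part of DKPro Core): validates the command line
// arguments and returns an error message, or null if they look usable.
def checkArgs(String[] args) {
  if (args.length != 2) {
    // The recipe expects an input folder and a language code.
    return "Usage: groovy pipeline <inputfolder> <language>"
  }
  if (args[0] == ".") {
    // The recipe states that "." is not supported as the input folder.
    return "Using . as the input folder is not supported; name the folder explicitly."
  }
  return null
}
```

In the script, the result of checkArgs(args) could be inspected before iteratePipeline is called, printing the message and stopping if it is not null.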

@Grab(group='de.tudarmstadt.ukp.dkpro.core', 
      module='de.tudarmstadt.ukp.dkpro.core.io.text-asl', 
      version='1.8.0')
@Grab(group='de.tudarmstadt.ukp.dkpro.core', 
      module='de.tudarmstadt.ukp.dkpro.core.opennlp-asl', 
      version='1.8.0')
import static org.apache.uima.fit.pipeline.SimplePipeline.*;
import static org.apache.uima.fit.util.JCasUtil.*;
import static org.apache.uima.fit.factory.CollectionReaderFactory.*;
import static org.apache.uima.fit.factory.AnalysisEngineFactory.*;

import de.tudarmstadt.ukp.dkpro.core.io.text.*;
import de.tudarmstadt.ukp.dkpro.core.opennlp.*;
import de.tudarmstadt.ukp.dkpro.core.api.ner.type.*;
import de.tudarmstadt.ukp.dkpro.core.api.metadata.type.*;

// Assemble and run pipeline
def pipeline = iteratePipeline(
  createReaderDescription(TextReader,
    TextReader.PARAM_PATH, args[0],     //first command line parameter
    TextReader.PARAM_LANGUAGE, args[1], //second command line parameter
    TextReader.PARAM_PATTERNS, "[+]*.txt"),
  createEngineDescription(OpenNlpSegmenter),
  createEngineDescription(OpenNlpNamedEntityRecognizer, "modelVariant", "person"),
  createEngineDescription(OpenNlpNamedEntityRecognizer, "modelVariant", "organization"),
  createEngineDescription(OpenNlpNamedEntityRecognizer, "modelVariant", "location"));

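// Print each document's URI followed by the text of its named entities
for (def document : pipeline) {
  def dmd = DocumentMetaData.get(document);
  println "${dmd.documentUri}:";
  // All NamedEntity annotations added by the three recognizers
  for (def ne : select(document, NamedEntity)) {
    println "  ${ne.coveredText}";
  }
}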

Example input example.txt:

John Miller works at the IBM headquarters in the United States.

Example output:

file:/Users/john/example.txt:
  John Miller
  IBM
  United States
