Installing Java and Groovy
These steps install the basic system requirements needed to implement DKPro Core pipelines using the [http://groovy.codehaus.org Groovy] language. They need to be performed only once.
- Download and install the Java SE Development Kit 7 from the Oracle Java Site
- Windows: download and run the Windows Installer from the Groovy homepage
- Linux/OS X: open a terminal and install SDKMAN! (formerly known as gvm), which we will use to install Groovy
curl -s "https://get.sdkman.io" | bash
- Open a new terminal window to activate SDKMAN! and in the new window enter
sdk install groovy
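To confirm that the installation worked, it can help to check that both tools are reachable from the command line. The following is a minimal sketch; the helper name `check_tool` is made up for this illustration and is not part of Java, Groovy, or DKPro Core.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: NOT found"
    return 1
  fi
}

# Check the two tools this guide relies on; keep going even if one is missing.
check_tool java || true
check_tool groovy || true
```

If either tool is reported as not found, opening a fresh terminal window or repeating the corresponding installation step is usually the first thing to try.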
Running the pipeline
For a start, let’s try a simple analysis pipeline:
- Read an English text file called “document.txt”
- Perform tokenization and sentence boundary detection using OpenNLP
- Perform lemmatization using LanguageTool
- Perform dependency parsing using MaltParser
- Write the result to disk in CoNLL 2006 format
Here is how to run that:
- Open a text editor and copy/paste the following script into it.
- Save the file under the name pipeline.groovy.
- Create another text file in the editor, write some English text into it, and save under the name document.txt.
- Open a command line in the directory in which you saved the two files.
- Invoke the script using the command
- The first run takes quite a while because the software components and models have to be downloaded.
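As a rough sketch of the steps above, the following shell snippet writes a possible pipeline.groovy and a small document.txt into the current directory. Several things in it are assumptions rather than confirmed details: the DKPro Core version (1.7.0), the Maven module names, the class names, and the parameter constants reflect how DKPro Core 1.x with uimaFIT is commonly used and may differ in your release. The part-of-speech tagger is an extra step not listed above, added on the assumption that the parser needs part-of-speech tags as input.

```shell
# Write a sketch of the pipeline script (version and module names are assumptions).
cat > pipeline.groovy <<'EOF'
#!/usr/bin/env groovy
@Grab(group='de.tudarmstadt.ukp.dkpro.core', version='1.7.0',
      module='de.tudarmstadt.ukp.dkpro.core.opennlp-asl')
@Grab(group='de.tudarmstadt.ukp.dkpro.core', version='1.7.0',
      module='de.tudarmstadt.ukp.dkpro.core.languagetool-asl')
@Grab(group='de.tudarmstadt.ukp.dkpro.core', version='1.7.0',
      module='de.tudarmstadt.ukp.dkpro.core.maltparser-asl')
@Grab(group='de.tudarmstadt.ukp.dkpro.core', version='1.7.0',
      module='de.tudarmstadt.ukp.dkpro.core.io.text-asl')
@Grab(group='de.tudarmstadt.ukp.dkpro.core', version='1.7.0',
      module='de.tudarmstadt.ukp.dkpro.core.io.conll-asl')
import de.tudarmstadt.ukp.dkpro.core.opennlp.*
import de.tudarmstadt.ukp.dkpro.core.languagetool.*
import de.tudarmstadt.ukp.dkpro.core.maltparser.*
import de.tudarmstadt.ukp.dkpro.core.io.text.*
import de.tudarmstadt.ukp.dkpro.core.io.conll.*
import static org.apache.uima.fit.factory.AnalysisEngineFactory.*
import static org.apache.uima.fit.factory.CollectionReaderFactory.*
import static org.apache.uima.fit.pipeline.SimplePipeline.*

runPipeline(
  // Read the English text file
  createReaderDescription(TextReader,
    TextReader.PARAM_SOURCE_LOCATION, "document.txt",
    TextReader.PARAM_LANGUAGE, "en"),
  // Tokenization and sentence boundary detection
  createEngineDescription(OpenNlpSegmenter),
  // Part-of-speech tagging (assumed prerequisite for the parser)
  createEngineDescription(OpenNlpPosTagger),
  // Lemmatization
  createEngineDescription(LanguageToolLemmatizer),
  // Dependency parsing
  createEngineDescription(MaltParser),
  // Write CoNLL 2006 output next to the input
  createEngineDescription(Conll2006Writer,
    Conll2006Writer.PARAM_TARGET_LOCATION, "."))
EOF

# Write a small English sample document to analyze.
printf '%s\n' 'The quick brown fox jumps over the lazy dog. It was not amused.' > document.txt
```

With both files in place, the script would typically be started with `groovy pipeline.groovy` from the same directory.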
The result is written to a file called document.txt.conll and could look something like this:
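The CoNLL 2006 (CoNLL-X) format stores one token per line in ten tab-separated columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) and separates sentences with a blank line. To get a compact view of the output, a small helper along the following lines may be useful; the function name `conll_summary` is hypothetical and only serves this illustration.

```shell
# Hypothetical helper: print ID, FORM, LEMMA, HEAD and DEPREL
# (columns 1, 2, 3, 7 and 8) of a CoNLL 2006 file, keeping sentence breaks.
conll_summary() {
  awk -F '\t' '
    NF >= 8 { print $1, $2, $3, $7, $8 }  # token line: selected columns
    NF == 0 { print "" }                  # blank line: sentence boundary
  ' "$1"
}

# Summarize the pipeline output if it exists in the current directory.
if [ -f document.txt.conll ]; then
  conll_summary document.txt.conll
fi
```

Each printed line then shows a token's position, surface form, lemma, the position of its syntactic head (0 for the root), and the dependency relation to that head.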
Where to go from here?
You can find many more examples of what you can do with DKPro Core and Groovy on our "Groovy recipes for DKPro Core pipelines" page.