DKPro WSD

Lexical semantic resources

This page lists the lexical semantic resources (sense inventories) most commonly used for word sense disambiguation experiments, and includes links to where they can be downloaded.

WordNet

Versions 1.5 to 3.1 of WordNet can be obtained from the WordNet home page.

DKPro WSD can automatically convert between versions of WordNet. To use this feature, you need to obtain synset mapping files, such as the WN-Map mappings from the UPC TALP Research Center.

To access WordNet through the de.tudarmstadt.ukp.dkpro.wsd.si.wordnet module, you need to prepare an extJWNL properties file which points to your WordNet installation.
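Before pointing a properties file at a WordNet installation, it can help to confirm that the dictionary directory actually contains the database files. The following is a minimal sketch of such a check, not part of DKPro WSD or extJWNL. The class name WordNetDirCheck and the method missingFiles are hypothetical, and the sketch assumes the file naming used by the Unix distributions of Princeton WordNet (index.noun, data.noun, and so on); other distributions may name the files differently.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of DKPro WSD or extJWNL): reports which of the
// expected Princeton WordNet database files are absent from a directory.
final class WordNetDirCheck {
    // Assumption: Unix-style file names, i.e. index.<pos> and data.<pos>.
    private static final String[] PREFIXES = {"index", "data"};
    private static final String[] PARTS_OF_SPEECH = {"noun", "verb", "adj", "adv"};

    private WordNetDirCheck() {
    }

    /** Returns the names of expected database files that are missing from dictDir. */
    static List<String> missingFiles(Path dictDir) {
        List<String> missing = new ArrayList<>();
        for (String prefix : PREFIXES) {
            for (String pos : PARTS_OF_SPEECH) {
                String name = prefix + "." + pos;
                // A directory or a non-existent path counts as missing.
                if (!Files.isRegularFile(dictDir.resolve(name))) {
                    missing.add(name);
                }
            }
        }
        return missing;
    }
}
```

Calling WordNetDirCheck.missingFiles with the path to the dictionary directory returns an empty list when all eight files are present. A non-empty result suggests that the path is wrong or that the installation uses a different layout.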

You can also access WordNet through the de.tudarmstadt.ukp.dkpro.wsd.si.lsr module (in which case you need to prepare a DKPro LSR resources.xml file which points to your WordNet installation) or the de.tudarmstadt.ukp.dkpro.wsd.si.uby module (in which case you need to produce or download a UBY database containing WordNet).

EuroWordNet

EuroWordNet (EWN) sense inventories are generally not freely available; they must be purchased from the individual publishers. The de.tudarmstadt.ukp.dkpro.wsd.si.wordnet module can read EWN resources, provided they are in Princeton WordNet format. You can use JMWNL to convert EWN sense inventories to Princeton WordNet format.

WordNet++

WordNet++ is no longer available for download from its authors; it has been superseded by the BabelNet project. If you happen to have a copy of WordNet++ already, the de.tudarmstadt.ukp.dkpro.wsd.si.wordnet module can read it.

Turk Bootstrap Word Sense Inventory (TWSI)

The TWSI sense inventory is available for download from its authors.

Wiktionary and Wikipedia

Database dumps of Wiktionary and Wikipedia for use with the de.tudarmstadt.ukp.dkpro.wsd.si.lsr module are available from the Wikimedia Downloads page. Alternatively, you can download a UBY database containing Wiktionary and Wikipedia for use with the de.tudarmstadt.ukp.dkpro.wsd.si.uby module.

GermaNet

Instructions for obtaining GermaNet can be found on the GermaNet home page. DKPro WSD can access GermaNet directly through its de.tudarmstadt.ukp.dkpro.wsd.si.germanet module, or via UBY and the de.tudarmstadt.ukp.dkpro.wsd.si.uby module. (Since GermaNet is not freely redistributable, you need to produce your own UBY database containing it.)

FrameNet

You can download a UBY database containing FrameNet for use with the de.tudarmstadt.ukp.dkpro.wsd.si.uby module.

VerbNet

You can download a UBY database containing VerbNet for use with the de.tudarmstadt.ukp.dkpro.wsd.si.uby module.

OmegaWiki

You can download a UBY database containing OmegaWiki for use with the de.tudarmstadt.ukp.dkpro.wsd.si.uby module.

OpenThesaurus

You can download OpenThesaurus for use with the de.tudarmstadt.ukp.dkpro.wsd.si.lsr module.

UBY

UBY is a network of interlinked lexical resources. It supports the English WordNet, Wiktionary, Wikipedia, FrameNet, and VerbNet; the German Wikipedia, Wiktionary, GermaNet, and IMSLex-Subcat; and the multilingual OmegaWiki. You can download a UBY database containing most of these resources.