public class MalletEmbeddingsTrainer extends MalletModelTrainer
Set MalletModelTrainer.PARAM_TOKEN_FEATURE_PATH to define what is considered a token (tokens, lemmas, etc.).
Set MalletModelTrainer.PARAM_COVERING_ANNOTATION_TYPE to define what is considered a document (sentences, paragraphs, etc.).
Nested classes/interfaces inherited from class JCasFileWriter_ImplBase: JCasFileWriter_ImplBase.NamedOutputStream
Modifier and Type | Field and Description |
---|---|
static String | PARAM_DIMENSIONS: The dimensionality of the output word embeddings (default: 50). |
static String | PARAM_EXAMPLE_WORD: An example word that is output with its nearest neighbours once in a while (default: null, i.e. |
static String | PARAM_MIN_DOCUMENT_LENGTH: Ignore documents with fewer tokens than this value (default: 10). |
static String | PARAM_NUM_NEGATIVE_SAMPLES: The number of negative samples to be generated for each token (default: 5). |
static String | PARAM_WINDOW_SIZE: The context size when generating embeddings (default: 5). |
Fields inherited from class MalletModelTrainer: PARAM_COVERING_ANNOTATION_TYPE, PARAM_FILTER_REGEX, PARAM_FILTER_REGEX_REPLACEMENT, PARAM_LOWERCASE, PARAM_MIN_TOKEN_LENGTH, PARAM_NUM_THREADS, PARAM_STOPWORDS_FILE, PARAM_STOPWORDS_REPLACEMENT, PARAM_TOKEN_FEATURE_PATH, PARAM_USE_CHARACTERS
Fields inherited from class JCasFileWriter_ImplBase: JAR_PREFIX, PARAM_COMPRESSION, PARAM_ESCAPE_DOCUMENT_ID, PARAM_OVERWRITE, PARAM_SINGULAR_TARGET, PARAM_STRIP_EXTENSION, PARAM_TARGET_LOCATION, PARAM_USE_DOCUMENT_ID
Constructor and Description |
---|
MalletEmbeddingsTrainer() |
Modifier and Type | Method and Description |
---|---|
void | collectionProcessComplete() |
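Methods inherited from class MalletModelTrainer: getInstanceList, getNumThreads, initialize, process
Methods inherited from class JCasFileWriter_ImplBase: getCompressionMethod, getOutputStream, getOutputStream, getRelativePath, getTargetLocation, isStripExtension, isUseDocumentId
Methods inherited from class JCasAnnotator_ImplBase: getRequiredCasInterface, process
Methods inherited from class Annotator_ImplBase: getCasInstancesRequired, hasNext, next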
public static final String PARAM_NUM_NEGATIVE_SAMPLES
public static final String PARAM_DIMENSIONS
public static final String PARAM_WINDOW_SIZE
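To make the notion of "context size" more concrete, the following is a minimal sketch of how (target, context) pairs could be enumerated for a given window size. It is not the Mallet implementation: the class `ContextWindowSketch` and the method `contextPairs` are hypothetical, and the sketch assumes a symmetric window of `windowSize` tokens on each side of the target, which this page does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical helper, not part of DKPro Core or Mallet. */
final class ContextWindowSketch {

    /**
     * Returns "target|context" pairs for every token and each neighbour
     * at most windowSize positions away (assumed symmetric window).
     */
    static List<String> contextPairs(List<String> tokens, int windowSize) {
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            // Clamp the window to the document boundaries.
            int from = Math.max(0, i - windowSize);
            int to = Math.min(tokens.size() - 1, i + windowSize);
            for (int j = from; j <= to; j++) {
                if (j != i) {
                    pairs.add(tokens.get(i) + "|" + tokens.get(j));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> document = Arrays.asList("the", "cat", "sat", "down");
        // With a window of 1, each token is paired with its direct neighbours only.
        System.out.println(contextPairs(document, 1));
    }
}
```

Under this assumption, a larger window yields more pairs per token, so training sees broader but less specific contexts.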
public static final String PARAM_EXAMPLE_WORD
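As an illustration of what "nearest neighbours" of an example word might mean, here is a small sketch that ranks words by cosine similarity over toy vectors. The similarity measure is an assumption, since this page does not state which one the trainer uses. The class `NearestNeighbourSketch`, its methods, and the vectors are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical helper, not part of DKPro Core or Mallet. */
final class NearestNeighbourSketch {

    /** Cosine similarity of two equal-length, non-zero vectors. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns up to k words most similar to query; assumes query is a key of embeddings. */
    static List<String> nearest(Map<String, double[]> embeddings, String query, int k) {
        double[] q = embeddings.get(query);
        List<String> candidates = new ArrayList<>(embeddings.keySet());
        candidates.remove(query);
        // Sort by descending similarity to the query vector.
        candidates.sort(Comparator.comparingDouble((String w) -> -cosine(q, embeddings.get(w))));
        return new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    }

    public static void main(String[] args) {
        Map<String, double[]> toy = new LinkedHashMap<>();
        toy.put("cat", new double[] {1.0, 0.9});
        toy.put("dog", new double[] {0.9, 1.0});
        toy.put("car", new double[] {-1.0, 0.2});
        System.out.println(nearest(toy, "cat", 1));
    }
}
```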
public static final String PARAM_MIN_DOCUMENT_LENGTH
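The length filter described for this parameter can be sketched as follows. This is a hypothetical helper (`MinDocumentLengthSketch.keepLongEnough`), not the trainer's code. It follows the wording "fewer tokens than this value", so a document with exactly the minimum number of tokens is kept.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical helper, not part of DKPro Core or Mallet. */
final class MinDocumentLengthSketch {

    /** Keeps documents with at least minLength tokens; shorter ones are ignored. */
    static List<List<String>> keepLongEnough(List<List<String>> documents, int minLength) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> document : documents) {
            if (document.size() >= minLength) {
                kept.add(document);
            }
        }
        return kept;
    }
}
```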
public void collectionProcessComplete() throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: collectionProcessComplete in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: collectionProcessComplete in class JCasFileWriter_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
Copyright © 2007–2018 Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt. All rights reserved.