Extract Vocabulary

Synopsis

This operator exports the word vectors of a given Word2Vec model into an example set.

Description

A Word2Vec model can be thought of as a dictionary or hash map that contains a word vector for every word in the training corpus. This operator exports the words and their vectors to an example set. The resulting set can be used for further calculations, e.g. synonym detection.
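
The dictionary analogy can be illustrated with a minimal Python sketch. It is not the operator's implementation: the toy vectors, the `model` dictionary, and the helper names `to_rows`, `cosine`, and `most_similar` are made up for illustration. The sketch flattens a word-to-vector mapping into table rows (one row per word, comparable to an example set) and runs a cosine-similarity lookup of the kind used for synonym detection.

```python
import math

# Hypothetical toy model: word -> vector. Real models hold far more
# words and usually a hundred or more dimensions per vector.
model = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.95],
}

def to_rows(model):
    """Flatten the mapping into rows of the form [word, dim_0, dim_1, ...]."""
    return [[word, *vector] for word, vector in model.items()]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(model, word):
    """Return the other word whose vector is closest by cosine similarity."""
    target = model[word]
    candidates = (w for w in model if w != word)
    return max(candidates, key=lambda w: cosine(target, model[w]))

rows = to_rows(model)
nearest = most_similar(model, "king")
```

With these toy vectors, `nearest` is `"queen"`, because its vector points in almost the same direction as the one for `"king"`.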

Input

tree

A trained Word2Vec model

Output

exa

The example set containing the exported words and their word vectors.

original tree

The original Word2Vec model, passed through.

Parameters

Get full vocabulary

If checked, all words are exported. This can require a lot of memory for large vocabularies.

Take random words

If checked, randomly selected words are exported. If not checked, the model's internal order is preserved, which is usually the order of word frequency in the vocabulary.

Number of words to pull

This specifies the number of words that are pulled from the model.
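
How the three parameters might interact can be sketched in Python as follows. This is an assumption based on the parameter descriptions above, not the operator's actual code: the function `select_words` and its argument names are hypothetical, and the precedence of "Get full vocabulary" over the other two parameters is assumed.

```python
import random

def select_words(vocabulary, get_full_vocabulary, take_random_words,
                 number_of_words, seed=None):
    """Pick the words to export from a vocabulary list.

    `vocabulary` is assumed to be in the model's internal order
    (usually most frequent first).
    """
    if get_full_vocabulary:
        # Assumed: a full export ignores the other two parameters.
        return list(vocabulary)
    count = min(number_of_words, len(vocabulary))
    if take_random_words:
        # Random sample without replacement; the seed is only for reproducibility.
        return random.Random(seed).sample(vocabulary, count)
    # Otherwise keep the internal order and take the first `count` words.
    return list(vocabulary[:count])

vocab = ["the", "of", "and", "king", "queen"]
first_three = select_words(vocab, False, False, 3)
sampled = select_words(vocab, False, True, 3, seed=42)
everything = select_words(vocab, True, False, 3)
```

Here `first_three` holds the first three words in internal order, `sampled` holds three distinct randomly chosen words, and `everything` holds the whole vocabulary regardless of the requested count.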