Apply Word2Vec (Documents)

Synopsis

This operator applies a Word2Vec model to a collection of tokenized documents.

Description

The operator takes the tokenized documents delivered at the doc port and applies the Word2Vec model to them. The result is a table with one row per token. Besides the vector itself, each row also includes a document_id and the word. Words with no vector representation in the model are not included in the output.
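The behavior described above can be sketched in a few lines of Python. This is only an illustration, not the operator's implementation: the function name `apply_word_vectors`, the use of a plain dictionary as a stand-in for the Word2Vec model, and the zero-based document_id are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

# One output row: (document_id, word, vector)
Row = Tuple[int, str, List[float]]


def apply_word_vectors(
    documents: Sequence[Sequence[str]],
    vectors: Dict[str, List[float]],
) -> List[Row]:
    """Return one row per token that has a vector in the model.

    `vectors` is a hypothetical stand-in for a trained Word2Vec model:
    a mapping from word to its embedding.
    """
    rows: List[Row] = []
    # Assumption: documents are numbered from 0 in input order.
    for document_id, tokens in enumerate(documents):
        for token in tokens:
            vector = vectors.get(token)
            if vector is None:
                # Words without a vector representation are skipped.
                continue
            rows.append((document_id, token, vector))
    return rows


# Two tokenized documents; "sat" has no vector in the toy model.
documents = [["the", "cat", "sat"], ["the", "dog"]]
vectors = {"the": [0.1, 0.2], "cat": [0.3, 0.4], "dog": [0.5, 0.6]}

rows = apply_word_vectors(documents, vectors)
for document_id, word, vector in rows:
    print(document_id, word, vector)
```

In this sketch a word that occurs several times produces several rows, one per occurrence, which matches the "one row per token" description; the token "sat" produces no row because the toy model has no vector for it.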

Input

doc

A tokenized collection of documents.

mod

The Word2Vec model.

Output

exa

The resulting example set, containing the document_id, the word, and the word vector for every token that has a vector representation.

doc

The tokenized collection of documents, passed through unchanged.

mod

The Word2Vec model, passed through unchanged.

Parameters