Apply Word2Vec (Documents)

Synopsis

This operator applies a Word2Vec model to a collection of tokenized documents.

Description

The operator takes the tokenized documents delivered at the doc port and applies the Word2Vec model to each token. The result is a table with one row per token. Each row contains the document_id, the word, and the word's vector. Words that have no vector representation in the model are not included in the output.
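
The lookup described above can be illustrated with a minimal Python sketch. This is not the operator's implementation: the function name apply_word_vectors, the dictionary standing in for the Word2Vec model, and the document_id numbering starting at 0 are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Row = Tuple[int, str, Vector]


def apply_word_vectors(documents: List[List[str]],
                       model: Dict[str, Vector]) -> List[Row]:
    """Return one (document_id, word, vector) row per token known to the model.

    `model` is a hypothetical stand-in for a trained Word2Vec model:
    a plain mapping from word to vector.
    """
    rows: List[Row] = []
    # Assumption: documents are numbered in input order, starting at 0.
    for document_id, tokens in enumerate(documents):
        for word in tokens:
            vector = model.get(word)
            if vector is None:
                # Words without a vector representation are left out.
                continue
            rows.append((document_id, word, vector))
    return rows


# Toy data, invented for this example only.
toy_model = {"deep": [0.1, 0.3], "learning": [0.2, -0.4]}
docs = [["deep", "learning"], ["deep", "unknownword"]]

rows = apply_word_vectors(docs, toy_model)
for row in rows:
    print(row)
```

In this sketch a repeated token yields one row per occurrence, and "unknownword" produces no row because the toy model has no vector for it.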

Input

doc

A tokenized collection of documents.

mod

The Word2Vec model.

Output

exa

The resulting example set, containing the document_id, the word, and the word vector for every token that has a vector representation.

doc

The tokenized collection of documents, passed through unchanged.

mod

The Word2Vec model, passed through unchanged.
