Apply Word2Vec (Documents)
Synopsis
This operator applies a Word2Vec model to a collection of tokenized documents.
Description
The operator takes the tokenized documents delivered at the doc port and applies the Word2Vec model to them. The result is a table with one row per token. Besides the vector itself, each row also contains a document_id and the word. Words with no vector representation are not included in the output.
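The behavior described above can be illustrated with a minimal Python sketch. It is not the operator's implementation: the function name apply_word2vec, the use of a plain dict as the model, the dim_0, dim_1, ... column names, and document ids starting at 0 are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def apply_word2vec(
    documents: Sequence[Sequence[str]],
    model: Dict[str, Sequence[float]],
) -> List[dict]:
    """Return one row per token that has a vector in the model.

    `model` is a hypothetical stand-in for a trained Word2Vec model:
    a plain mapping from word to vector.
    """
    rows = []
    for document_id, tokens in enumerate(documents):
        for word in tokens:
            vector = model.get(word)
            if vector is None:
                # Words with no vector representation are left out.
                continue
            row = {"document_id": document_id, "word": word}
            # One column per vector component (column names are assumed).
            for i, value in enumerate(vector):
                row[f"dim_{i}"] = value
            rows.append(row)
    return rows


# Toy data: "sat" has no vector, so it produces no row.
toy_model = {"cat": [0.1, 0.2], "dog": [0.3, 0.4]}
docs = [["cat", "sat"], ["dog", "cat"]]
rows = apply_word2vec(docs, toy_model)
```

In this sketch the two input documents contain four tokens, but only three rows are produced, because the token without a vector is skipped.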
Input
doc
A tokenized collection of documents.
mod
The Word2Vec model.
Output
exa
The resulting example set, with document_id, word, and word vector for every token that has a vector representation.
doc
The passed-through tokenized collection of documents.
mod
The passed-through Word2Vec model.