"apply weka:W-HierarchicalClusterer"
Hello friends of the community, I have a question.
I'm working on text mining, specifically clustering.
I pre-processed the text files, created the TF-IDF vectors, and filtered the stop words. As the next step I need to apply the algorithm "Weka: W-HierarchicalClusterer", but I get the following error:
Jan 28, 2013 6:01:32 PM SEVERE: Process failed: W-HierarchicalClusterer caused an error: com.rapidminer.operator.UserError: caused an error: java.lang.ArrayIndexOutOfBoundsException: 10
Jan 28, 2013 6:01:32 PM SEVERE: Here: Process[1] (Process)
subprocess 'Main Process'
+- Process Documents from Files[1] (Process Documents from Files)
subprocess 'Vector Creation'
| +- Transform Cases[6] (Transform Cases)
| +- Tokenize[6] (Tokenize)
| +- Filter stopwords_pronombres_preposiciones[6] (Filter Stopwords (Dictionary))
| +- Filter stopwords_caratula[6] (Filter Stopwords (Dictionary))
| +- Filter Stopwords (English)[6] (Filter Stopwords (English))
| +- Filter Tokens (by Length)[6] (Filter Tokens (by Length))
==> +- W-HierarchicalClusterer[1] (W-HierarchicalClusterer)
My process XML is below.
<macros/>
<operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
<operator activated="true" class="text:process_document_from_file" compatibility="5.2.004" expanded="true" height="76" name="Process Documents from Files" width="90" x="112" y="75">
<operator activated="true" class="text:transform_cases" compatibility="5.2.004" expanded="true" height="60" name="Transform Cases" width="90" x="45" y="30"/>
<operator activated="true" class="text:tokenize" compatibility="5.2.004" expanded="true" height="60" name="Tokenize" width="90" x="45" y="120"/>
<operator activated="true" class="text:filter_stopwords_dictionary" compatibility="5.2.004" expanded="true" height="76" name="Filter stopwords_pronombres_preposiciones" width="90" x="45" y="210">
<operator activated="true" class="text:filter_stopwords_dictionary" compatibility="5.2.004" expanded="true" height="76" name="Filter stopwords_caratula" width="90" x="45" y="300">
<operator activated="true" class="text:filter_stopwords_english" compatibility="5.2.004" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="179" y="30"/>
<operator activated="true" class="text:filter_by_length" compatibility="5.2.004" expanded="true" height="60" name="Filter Tokens (by Length)" width="90" x="179" y="120">
<operator activated="true" class="weka:W-HierarchicalClusterer" compatibility="5.1.001" expanded="true" height="76" name="W-HierarchicalClusterer" width="90" x="309" y="173"/>
If that still does not help, you could try one of RapidMiner's built-in operators for hierarchical clustering. You'll find all clustering algorithms in the group Modelling / Clustering and Segmentation.
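As a rough sketch of that suggestion, the W-HierarchicalClusterer operator could be swapped for the built-in Agglomerative Clustering operator, fed by the example set output of Process Documents from Files. The operator key, parameter keys, and values below are written from memory of RapidMiner 5 process files, and the compatibility value is simply copied from the process above, so please treat all of them as assumptions. Dragging the operator into the process in the GUI and looking at the XML view will give the authoritative form.

```xml
<!-- Sketch only: the class key, parameter keys, and values are assumptions, not verified against this RapidMiner version -->
<operator activated="true" class="agglomerative_clustering" compatibility="5.2.008" expanded="true" name="Clustering">
  <!-- linkage strategy used when merging clusters -->
  <parameter key="mode" value="AverageLink"/>
  <!-- TF-IDF vectors are numerical, so a numerical measure is used; cosine similarity is a common choice for text -->
  <parameter key="measure_types" value="NumericalMeasures"/>
  <parameter key="numerical_measure" value="CosineSimilarity"/>
</operator>
```

If this operator runs without the ArrayIndexOutOfBoundsException on the same TF-IDF example set, the problem is more likely in the Weka wrapper than in your pre-processing.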
Best regards,