"Exception: java.lang.NullPointerException while clustering - More help please."

Diana_Wegner Member Posts: 4 Contributor I
edited June 2019 in Help
I am new to data mining and RM, and was able to get started very quickly on a simple text mining problem. I ran into problems once I started clustering. I have almost 3000 text strings to cluster and none are blank. Each has a unique ID and a description. I'm getting the following error...

Exception: java.lang.NullPointerException error.

I tried re-installing the software per a previously reported issue, but that didn't help. I turned on debugging, but the trace is not helping me either. When I click "send bug report", I get the following error...

Cannot connect to BugZilla server. Please check your internet connection and try again.

My internet connection works fine (I'm able to run "updates and extensions", for example).

Here is my XML...

//Local Repository/Result 1 Process Document Cluster
//Local Repository/Result 2 clustering
//Local Repository/Result 3

Any help is appreciated. Thanks!!

Answers

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,984 RM Engineering
    Hi,

    thank you for the report! I've added it to our internal bug tracker.

    Regards,
    Marco
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Diana,

    Although the bug that occurs in your situation is user-unfriendly, it results only from a wrongly configured Process Documents operator. You deselected the parameter "create word vector". This produces an ExampleSet without any columns except the text column, which is special. K-Means should give a proper error message, but it cannot run in this situation.

    To solve this problem, simply enable the creation of the word vectors in the parameters of Process Documents from Data.

    Greetings,
    Sebastian
  • Diana_Wegner Member Posts: 4 Contributor I
    Thank you for the quick response! K-means now runs. :D

    However, there is a follow-up question. K-means, like many of the other clustering modules, places all 3000 records in one cluster. I've tried different parameters with no luck. Do you have any hints to resolve this issue?

    Again, thanks for your help!

    Diana
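
To illustrate Sebastian's point about the missing word vector, here is a minimal Python sketch. It is not RapidMiner code: the functions `word_vectors` and `check_clusterable` are hypothetical names, and the simple term-count weighting is an assumption made only for illustration. The flag `create_word_vector` mirrors the parameter named in the answer. The sketch shows that with the flag off every row has zero feature columns, and how a guard could report that clearly before clustering starts.

```python
from collections import Counter


def word_vectors(texts, create_word_vector=True):
    """Turn texts into term-count rows (hypothetical stand-in for a word vector).

    With create_word_vector=False every row has zero columns, which mirrors
    an example set that holds only the special text column.
    """
    if not create_word_vector:
        return [[] for _ in texts], []
    # Vocabulary: every distinct lowercase token, in sorted order.
    vocab = sorted({tok for text in texts for tok in text.lower().split()})
    rows = []
    for text in texts:
        counts = Counter(text.lower().split())
        rows.append([counts[word] for word in vocab])
    return rows, vocab


def check_clusterable(rows):
    """Return the number of feature columns, or raise a readable error."""
    if not rows:
        raise ValueError("no examples to cluster")
    if len(rows[0]) == 0:
        raise ValueError(
            "no regular attributes: enable word vector creation before clustering"
        )
    return len(rows[0])


if __name__ == "__main__":
    rows, vocab = word_vectors(["pump leak", "pump noise"])
    print(vocab, rows, check_clusterable(rows))

    empty_rows, _ = word_vectors(["pump leak"], create_word_vector=False)
    try:
        check_clusterable(empty_rows)
    except ValueError as err:
        print("refused to cluster:", err)
```

A distance-based clusterer has nothing to compare when the rows are empty, so checking the column count up front gives a clearer message than a failure deep inside the algorithm.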