"StopWordFilterFile" doesn't work
IngoRM
Original message from SourceForge forum at http://sourceforge.net/forum/forum.php?thread_id=2039566&forum_id=390413
Hi,
I want to use the "StopWordFilterFile"-operator in a Java-application to filter terms based on a list in an external file.
My code looks like the following:
=====================================================
...
StopWordFilterFile stopfilter = new StopWordFilterFile(new FileReader(new File(Constants.STOPWORDS_PATH + filename)), false);
config.setConfigurationRule(WVTConfiguration.STEP_WORDFILTER, new WVTConfigurationFact(stopfilter));
...
=====================================================
But when I use the constructor that takes the stopword file and the case-sensitive flag, which should be provided in version 4.1 of RapidMiner, I only get the following error message:
=====================================================
cannot find symbol
[javac] symbol : constructor StopWordFilterFile(java.io.FileReader,boolean)
[javac] location: class edu.udo.cs.wvtool.generic.wordfilter.StopWordFilterFile
[javac] StopWordFilterFile stopfilter = new StopWordFilterFile(new FileReader(new File(Constants.STOPWORDS_PATH + filename)), false);
=====================================================
Also the method "setMinNumChars" isn't provided anymore.
=====================================================
((AbstractStopWordFilter) filter).setMinNumChars(1);
=====================================================
I need this because my application should also give the user the option to use no stopword filter. Otherwise, if no WVTConfiguration.STEP_WORDFILTER is set, a default stopword filter is executed that filters all 3-character words, which isn't what I want.
Hope someone can help me.
Greetings,
Mary-Anne
Answer by Ingo Mierswa:
Hello Mary-Anne,
I must admit that I don't see any problem with the line
StopWordFilterFile stopfilter = new StopWordFilterFile(new FileReader(new File(Constants.STOPWORDS_PATH + filename)), false);
The constructor is still there, so this should not be a problem. However, the problem might be that we changed the way text processing should be performed: inner operators should now be used instead of the one big WVToolOperator of previous releases. The single big operator is now deprecated and will no longer be supported in future versions. Maybe this is the reason for the problem.
Instead of using the method "setMinNumChars" you could now use the operator "TokenLengthFilter".
I would really recommend redefining your text processing process in the GUI using inner operators (have a look at the samples delivered together with the text plugin) and changing your program according to this new architecture. It is actually more convenient and more powerful now.
Cheers,
Ingo
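For readers who only need the behaviour discussed in this thread (a stopword list read from a file, a case-sensitivity flag, and a minimum token length), here is a minimal sketch in plain Java. It deliberately does not use any WVTool or RapidMiner classes, so it says nothing about the StopWordFilterFile constructor or the TokenLengthFilter operator. The class name StopWordSketch and its methods are hypothetical, and the one-word-per-line file format is an assumption.
=====================================================
```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of WVTool or RapidMiner.
final class StopWordSketch {

    // Reads one stopword per line (assumed format); blank lines are skipped.
    // When caseSensitive is false, the words are stored in lower case.
    static Set<String> loadStopWords(Reader source, boolean caseSensitive) throws IOException {
        Set<String> words = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(caseSensitive ? word : word.toLowerCase(Locale.ROOT));
                }
            }
        }
        return words;
    }

    // Keeps tokens that have at least minChars characters and are not stopwords.
    // An empty stopword set combined with minChars = 0 keeps every token.
    static List<String> filter(List<String> tokens, Set<String> stopWords,
                               boolean caseSensitive, int minChars) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String key = caseSensitive ? token : token.toLowerCase(Locale.ROOT);
            if (token.length() >= minChars && !stopWords.contains(key)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the FileReader used in the question.
        Set<String> stopWords = loadStopWords(new StringReader("the\nand\n"), false);
        List<String> tokens = Arrays.asList("The", "cat", "and", "a", "dog");
        System.out.println(filter(tokens, stopWords, false, 1));
    }
}
```
=====================================================
Passing an empty set and a minimum length of 0 gives the "no stopword filter" option asked for above, without relying on any default filter.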