Difficulties using Filter Tokens (by Region) operator
I am using the Text Processing extension to extract information from patent files. If I use tokenization and some other filters (like the stopword filter), it works fine.
If I work with the Filter Tokens (by Region) operator, I get zero results. The condition is: contains "Klebstoff", not case sensitive. This expression appears many times in the documents being read. Interestingly, if I select the "contains" option, the program complains that the regular expression must be specified. I thought I would need this regular expression only if I select the match condition. Am I wrong here?
My goal is to automatically extract the content around a given subject from patent files. Any help is welcome; I am working on my master's thesis.
For the test, I entered the same expression in both the regular expression and the search string fields. Without defining the regular expression, the filter does not work.
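To make clear what I expect from the two conditions, here is a small sketch in plain Python (not RapidMiner). The function names `contains` and `matches` are my own and hypothetical, and whether the operator behaves like this internally is only my assumption. "Contains" should be a simple substring test, while "matches" should require the whole token to fit the regular expression.

```python
import re


def contains(token: str, search: str) -> bool:
    """Case-insensitive substring test; no regular expression involved."""
    return search.casefold() in token.casefold()


def matches(token: str, pattern: str) -> bool:
    """Case-insensitive test that the whole token fits the regular expression."""
    return re.fullmatch(pattern, token, flags=re.IGNORECASE) is not None


# Hypothetical tokens, as they might come out of a tokenized patent text.
tokens = ["Klebstoff", "Schmelzklebstoffe", "klebstoff", "Substrat"]

contains_hits = [t for t in tokens if contains(t, "Klebstoff")]
matches_hits = [t for t in tokens if matches(t, "Klebstoff")]
wildcard_hits = [t for t in tokens if matches(t, ".*Klebstoff.*")]

print(contains_hits)
print(matches_hits)
print(wildcard_hits)
```

In this sketch, a plain "Klebstoff" used as a whole-token regular expression misses compound words such as "Schmelzklebstoffe", whereas ".*Klebstoff.*" gives the same hits as the substring test. If the operator really evaluates my input as a whole-token regular expression, that could be one reason for missing results.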