"Extract Information Regular Expression query type failed (Text Processing)"
CharlieFirpo
Member · Posts: 48 · Contributor II
Dear All!
I have a simple process: Create Document + Extract Information. I create a simple text, "string1 string2 string3 string4", and use a simple regular expression, ^\S*, to extract the first string from my document. RapidMiner then gives the following error: Process Failed. No group 1.
If I use the String Matching query type instead of Regular Expression and set string2 and string4 as the query expression, I get string3 as the result. So String Matching works well, but Regular Expression does not.
Could anyone check why this query type does not work? Or did I make a mistake (and if so, what)?
If I use Regular Region and set, e.g., ^\S* and .* as the region delimiters, then RapidMiner gives the correct result: string1 string2 string3 string4.
Only the plain Regular Expression query type does not work.
Of course, if I use Regular Region with ^\S* and '\ ' as the two delimiters, then I get the result I want: string1
But why does the Regular Expression query type not work?
Thank you for reading it and trying to help me!
Answers
But the second Extract Information operator works on the whole original document. I checked that the input document of the second Extract Information operator is 'string2 string3 string4'.
So why does the second Extract Information operator extract from the original 'string1 string2 string3 string4' rather than from this input?
Thank you!
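You have to wrap the expression in brackets (parentheses) when using the Regular Expression query type in the Extract Information operator.
E.g.:
wrong: ^\S*
good: (^\S*)
These brackets do not change what the expression matches; they only define capturing group 1, which appears to be what the operator returns (hence the "No group 1" error without them).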
Nice day!
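The bracket requirement can be reproduced outside RapidMiner. The sketch below is only an analogy in Python, under the assumption that the operator reads capture group 1 of the match; `extract_group_1` is a hypothetical helper written for this illustration, not a RapidMiner function. Asking for group 1 of a pattern that defines no group raises an error, which is consistent with the "No group 1" message.

```python
import re

text = "string1 string2 string3 string4"


def extract_group_1(pattern, document):
    """Hypothetical helper: return capture group 1 of the first match.

    Assumption: this mimics an extractor that always reads group 1.
    Returns None if the pattern does not match at all.
    """
    match = re.search(pattern, document)
    if match is None:
        return None
    # Raises IndexError when the pattern defines no capture group 1.
    return match.group(1)


# With brackets, the pattern defines group 1, so extraction succeeds.
print(extract_group_1(r"(^\S*)", text))

# Without brackets, the pattern still matches, but there is no group 1.
try:
    extract_group_1(r"^\S*", text)
except IndexError as err:
    print("no capture group 1:", err)
```

Both patterns match the same text; only the bracketed one exposes it as group 1, which is why adding brackets fixes the error without changing the match.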
Does anyone know why this 'bracket solution' is required in RapidMiner? As far as I am aware, a regular expression does not need brackets unless it uses capture groups.
Thank you