How can we check whether a word is present in the list or not?
Hi All!
I have a dataset with 3 columns. The first 2 columns each contain a comma-separated list of words, and the 3rd column has a single word in each row. I need to check whether the word in the 3rd column is present in the 1st column's list or in the 2nd column's list.
Source Data:
list1 list2 ch
shape,size,type,endi toldis,umbr,oilv,poll type
shape,size,type,endi toldis,umbr,oilv,poll oilv
shape,size,type,endi toldis,umbr,oilv,poll umbr
Desired output:
list1 list2 ch flag_1(list1) flag_2(list2)
shape,size,type,endi toldis,umbr,oilv,poll type 1 0
shape,size,type,endi toldis,umbr,oilv,poll oilv 0 1
shape,size,type,endi toldis,umbr,oilv,poll umbr 0 1
Since "type" is present in list1, flag_1 should be "1" and flag_2 should be "0".
"oilv" and "umbr" are present in the list2 column, so flag_2 should be "1" for them.
I have tried array_contains, IN, NOT IN, and looping over the values, but I was unable to get the required result. Can anyone help me resolve this?
Thanks in Advance!
Best Answer
BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 953 Unicorn
Hi,
it is easy with Generate Attributes. I tried two different approaches:
if(contains(list1, ch), 1, 0)
if(matches(list2, "(^|.*,)" + ch + "($|,.*)"), 1, 0)
The solution with "contains()" is simpler but not exactly foolproof: it could also match substrings (for example, "typ" would match inside "type").
The regular expression with matches() checks for "either the start of the string or a text followed by a comma", the search string, and "either the end of the string or a text after a comma".
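To make the difference between the two approaches easier to see outside RapidMiner, here is a rough Python sketch of the same logic. It is only an analogue, not RapidMiner code: the function names `flag_contains` and `flag_exact` are made up for illustration, `re.fullmatch` is assumed to stand in for the whole-string behaviour of matches(), and the `re.escape` call is an extra precaution that is not part of the expressions above.

```python
import re


def flag_contains(word_list: str, word: str) -> int:
    """Substring check, analogous to if(contains(list1, ch), 1, 0)."""
    return 1 if word in word_list else 0


def flag_exact(word_list: str, word: str) -> int:
    """Whole-item check, analogous to the matches() expression."""
    # re.escape is an addition here: it keeps characters such as '.' or '+'
    # in the word from being interpreted as regex syntax.
    pattern = "(^|.*,)" + re.escape(word) + "($|,.*)"
    # fullmatch requires the pattern to cover the entire string.
    return 1 if re.fullmatch(pattern, word_list) else 0


list1 = "shape,size,type,endi"
list2 = "toldis,umbr,oilv,poll"

for ch in ["type", "oilv", "umbr"]:
    print(ch, flag_exact(list1, ch), flag_exact(list2, ch))

# The substring pitfall: "typ" is not an item of list1, yet contains-style
# matching still reports a hit, while the anchored pattern does not.
print(flag_contains(list1, "typ"), flag_exact(list1, "typ"))
```

For the three sample rows this gives flags 1/0, 0/1 and 0/1, matching the desired output in the question. If the values in ch can contain regex special characters, the matches() expression would need similar care.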
Here's an example process:
Regards,
Balázs