Filtering with whitespace regex not working

AK Member Posts: 2 Contributor I
edited November 2018 in Help

I have two nominal attributes, Entity and EntityType. Entity contains names of people and organizations. EntityType gives the type of each Entity (ORGANIZATION or PERSON). I am trying to keep only the rows where EntityType is PERSON and Entity contains a full name. I want to omit single names.

I am using Filter Examples with attribute_value_filter. The string parameter is set as below.

EntityType = PERSON && Entity = [a-zA-Z]*\s.*

This does not select any of the names. I think I have isolated the issue to the whitespace character class within Entity. Any filtering on whitespace fails. For example, all of the below parameters failed to catch Bart Simpson.

Entity = Bart\sSimpson

Entity = Bart[ ]Simpson

Entity = Bart Simpson

However, the below worked.

Entity = Bart.+Simpson

Any ideas on how to match full names?

Below is an example of how I want my filter to work.

Input
Entity              EntityType
Fox Network         ORGANIZATION
Homer J. Simpson    PERSON
Homer               PERSON
Marge               PERSON
Bart Simpson        PERSON
Lisa Simpson        PERSON

Desired Output
Entity              EntityType
Homer J. Simpson    PERSON
Bart Simpson        PERSON
Lisa Simpson        PERSON
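
For reference, here is a minimal Python sketch of the filter described above, run on the sample rows outside RapidMiner. It assumes the regex has to match the whole Entity value (hence `fullmatch`), which is an assumption about how the filter behaves, and the names `ROWS`, `FULL_NAME` and `full_name_persons` are made up for illustration.

```python
import re

# Sample rows copied from the post: (Entity, EntityType).
ROWS = [
    ("Fox Network", "ORGANIZATION"),
    ("Homer J. Simpson", "PERSON"),
    ("Homer", "PERSON"),
    ("Marge", "PERSON"),
    ("Bart Simpson", "PERSON"),
    ("Lisa Simpson", "PERSON"),
]

# The pattern from the question: letters, one whitespace character, then anything.
FULL_NAME = re.compile(r"[a-zA-Z]*\s.*")


def full_name_persons(rows):
    """Keep PERSON rows whose Entity matches the full-name pattern as a whole."""
    return [
        (entity, entity_type)
        for entity, entity_type in rows
        if entity_type == "PERSON" and FULL_NAME.fullmatch(entity)
    ]


for entity, entity_type in full_name_persons(ROWS):
    print(entity, entity_type)
```

With the rows typed exactly as shown, the pattern keeps the three full names, so the pattern itself looks reasonable for this sample.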


Answers

  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn

    I'm finding that Bart\sSimpson works in the data sample you gave. What if it isn't a single space?

    Have you tried Bart\s+Simpson for example?
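
One way to check what the separator really is: the sketch below lists the non-alphanumeric characters in a value with their Unicode names. It is written in Python rather than RapidMiner, and the non-breaking space in `suspect` is purely a hypothetical example of a character that looks like a space; nothing in the thread confirms the real data contains one. The `re.ASCII` flag is used to approximate the default Java meaning of `\s` (ASCII whitespace only), since RapidMiner is Java-based as far as I know. The helper name `describe_separators` is made up for illustration.

```python
import re
import unicodedata


def describe_separators(value):
    """Return (code point, Unicode name) for every non-alphanumeric character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in value
        if not ch.isalnum()
    ]


plain = "Bart Simpson"          # ordinary space, as typed in the post
suspect = "Bart\u00a0Simpson"   # hypothetical: a non-breaking space instead

print(describe_separators(plain))
print(describe_separators(suspect))

# With ASCII-only \s, the ordinary space matches but the non-breaking space does not.
print(bool(re.fullmatch(r"Bart\sSimpson", plain, re.ASCII)))
print(bool(re.fullmatch(r"Bart\sSimpson", suspect, re.ASCII)))

# Bart.+Simpson matches either value, because "." accepts any character.
print(bool(re.fullmatch(r"Bart.+Simpson", suspect)))

# Listing the extra character explicitly makes the whitespace pattern match again.
print(bool(re.fullmatch(r"Bart[\s\u00a0]+Simpson", suspect, re.ASCII)))
```

If the real values turn out to contain such a character, that would explain why `\s`, `[ ]` and a literal space all fail while `Bart.+Simpson` works.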
