[Solved] Weighting examples

qwertz · August 2012

Dear all,

Does anyone happen to know whether there is a good way to weight examples?
I would like newer examples to be weighted higher.

ID att1 att2 weight
a 12 45 1
b 10 27 2
c 33 17 3

I tried to loop over all examples and write the iteration macro into a new generated attribute but that didn't work.

Thank you for sharing your ideas...
Sachs

Nils_Woehler · August 2012

Hi,

you can try to use the Generate ID operator.

Best,
Nils

qwertz · August 2012

Hi Nils,

Thanks for your idea. I was not precise enough in my formulation. The weight is not supposed to be incremented by one for each example but by a value that could be different each time the process is run.

So it could also be that weights like this have to be applied:

ID att1 att2 weight
a 12 45 2
b 10 27 4
c 33 17 6

Bye for now & take care
Sachs

haddock · August 2012

Just add a "Generate Attributes" operator to Nils' answer, then you can have whatever you want.
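
The two-step idea (number the examples, then derive the weight from that number) can be sketched outside RapidMiner in plain Python. This is only an illustration of the logic, not RapidMiner code: the function `add_weights`, the column name `row_number`, and the `step` parameter are hypothetical names chosen for this sketch, and the rows are the sample data from the first post.

```python
def add_weights(examples, step=1):
    """Return copies of the rows with a 1-based row number and a derived weight.

    'row_number' plays the part of a generated ID; 'weight' is a function
    of it (here: row_number * step). Both names are assumptions of this sketch.
    """
    weighted = []
    for position, row in enumerate(examples, start=1):
        new_row = dict(row)                   # leave the input rows untouched
        new_row["row_number"] = position      # step 1: generate a running number
        new_row["weight"] = position * step   # step 2: derive the weight from it
        weighted.append(new_row)
    return weighted


rows = [
    {"ID": "a", "att1": 12, "att2": 45},
    {"ID": "b", "att1": 10, "att2": 27},
    {"ID": "c", "att1": 33, "att2": 17},
]

# The step can differ on every run; 2 reproduces the second table above.
print([row["weight"] for row in add_weights(rows, step=2)])  # [2, 4, 6]
```

Later rows get larger weights, so this only weights newer examples higher if the rows are ordered from oldest to newest.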

Best
H

qwertz · August 2012

Ok, that means that I have to create an ID first. In a second step I can generate another attribute then which is a function of ID.
Thank you very much.

Kind regards
Sachs

qwertz · August 2012

When I generate a new ID the former ID is being removed. Therefore, I have to set another role to the former ID first, generate a new ID, set role of the new ID to weight and finally set role of the former ID back to ID. Just wanted to share that...

All the best
Sachs

haddock · August 2012

No, just set the role of your new attribute to weight.

qwertz · August 2012

Hi haddock,

I tried your proposal and found that it works if the former id is a number.
However, in my data set the id is a date and in this case it doesn't work. No idea why ???

The attached sample process implements your proposal. The result is that the "date" id attribute is missing.
Connect and activate the two "set role" operators as described in my last post and it works.
Seems to be a bug related to the date type.

http://datahost.bplaced.net/sample4.xls

Best regards
Sachs

haddock · August 2012

Hi again,

Don't want to sound like the Thought Police, so here are some tips towards RM Nirvana.

1. Treat dates as dates!
2. Observe Marius' etiquette on questions.
3. Be careful about bug calling.

That being said, here's some code.

Best

H

qwertz · August 2012

Hi haddock,

You are right, bug calling was probably a little too hasty.

Referring to the issue again: in my process the date column has type "date" and role "id", so my understanding is that it is already treated as a date. Consequently, I don't understand why a column cannot have both type "date" and role "id" at the same time in the given setup. (For the code, see my last post.)

All the best
Sachs

haddock · August 2012

Hi

Fair enough, you can always declare "date" as an "id" later. That avoids your double id issue, and saves an operator, because you can use the "no double ids" property to advantage, like this..

I spend most of my time CUDA programming, and am probably a bit obsessed by speed and clarity!

Best

H

PS Ignore (nearly always) the warnings, they are only warnings, just press the green j!
PPS Green if running on RA, Blue on RM.

qwertz · August 2012

So it seems to be a kind of a hidden feature that RapidMiner only allows a single ID in the data set and removes the others automatically.
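
The single-ID behaviour discussed in this thread can be mimicked with a small toy model in Python. It is not RapidMiner's implementation: the class `ToyExampleSet`, its methods, and the sample dates are all made up for illustration, and the eviction rule (a special role can be held by one column only, and the previous holder is removed) is an assumption taken from the posts above.

```python
class ToyExampleSet:
    """Toy table: column values plus one role per column ('regular' by default)."""

    def __init__(self, columns):
        self.columns = {name: list(values) for name, values in columns.items()}
        self.roles = {name: "regular" for name in self.columns}

    def set_role(self, name, role):
        # Assumed rule: a special role is unique, so any other column that
        # currently holds it is removed from the table.
        if role != "regular":
            holders = [n for n, r in self.roles.items() if r == role and n != name]
            for other in holders:
                del self.columns[other]
                del self.roles[other]
        self.roles[name] = role

    def generate_id(self, name="id"):
        # Add a 1-based running number and give it the id role.
        size = len(next(iter(self.columns.values())))
        self.columns[name] = list(range(1, size + 1))
        self.set_role(name, "id")

    def generate_attribute(self, name, source, function):
        # Derive a new regular column from an existing one.
        self.columns[name] = [function(value) for value in self.columns[source]]
        self.roles[name] = "regular"


data = ToyExampleSet({
    "date": ["2012-08-01", "2012-08-02", "2012-08-03"],  # invented sample dates
    "att1": [12, 10, 33],
})
data.generate_id()                          # "date" is still regular, nothing is lost
data.generate_attribute("weight", "id", lambda i: i * 2)
data.set_role("weight", "weight")
data.set_role("date", "id")                 # declared last: evicts the helper id

print(sorted(data.columns))    # ['att1', 'date', 'weight']
print(data.columns["weight"])  # [2, 4, 6]
```

In this model, declaring "date" as the id only at the end keeps the date column and drops the helper id in the same step, which is one reading of how the ordering avoids the double-id problem.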

Thanks & have a nice day
Sachs

haddock · August 2012

Hey!!

Indeedy, data doesn't make much sense when it has more than one identity, a bit like humans.

On the other hand we each contributed to a neat solution, so grouping is cool 8)

Best

H
