Classification Problem
Hi
I am using an instance-based algorithm for classification.
My learning data has 3 dimensions and looks like this:
Learning Data:
Username (Polynomial), Count_of_connections (Integer), Destination_IP (Polynomial), Status (as label)
Alex, 100, 172.16.1.11, normal
Alfred, 8, 172.16.10.50, anomaly
Mat, 200, 172.16.5.1, normal
Angelo, 50, 172.16.4.11, normal
Test Data:
Alexis, 8, 172.16.1.10
Result:
Alexis, 8, 172.16.1.10, normal
I want the algorithm not to compare the text in Username and Destination_IP, and to decide based on Count_of_connections only.
That is, the characters in the Username and Destination_IP values must not affect the algorithm; each of these dimensions should be treated as a single unit (entity). I don't know how to solve this problem!
Thanks, friends.
Answers
Just my 5 cents to add to Rodrigo's detailed answer. All models use only attributes of type 'regular' for training and prediction, so to quickly exclude any subset of variables from modelling I often use this hack: I change the type of the unwanted variables from 'regular' to 'id' or 'cluster', and convert them back to 'regular' afterwards.
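To show the idea outside the tool, here is a minimal sketch in plain Python. It is not the tool's own API: the `ROLES` table and the helpers `regular_features` and `predict_1nn` are hypothetical names made up for this illustration, and a simple 1-nearest-neighbour rule stands in for whatever instance-based learner is actually used. Only attributes marked 'regular' enter the distance, so Username and Destination_IP cannot influence the result.

```python
# Hypothetical role table: only 'regular' attributes are used by the model.
ROLES = {
    "Username": "id",                   # excluded from modelling
    "Count_of_connections": "regular",  # the only attribute the model sees
    "Destination_IP": "id",             # excluded from modelling
    "Status": "label",
}

# Learning data from the question.
TRAIN = [
    {"Username": "Alex", "Count_of_connections": 100,
     "Destination_IP": "172.16.1.11", "Status": "normal"},
    {"Username": "Alfred", "Count_of_connections": 8,
     "Destination_IP": "172.16.10.50", "Status": "anomaly"},
    {"Username": "Mat", "Count_of_connections": 200,
     "Destination_IP": "172.16.5.1", "Status": "normal"},
    {"Username": "Angelo", "Count_of_connections": 50,
     "Destination_IP": "172.16.4.11", "Status": "normal"},
]


def regular_features(row, roles=ROLES):
    """Return the values of the attributes whose role is 'regular'."""
    return [row[name] for name, role in roles.items() if role == "regular"]


def predict_1nn(train, row, roles=ROLES):
    """Label of the training example closest to `row` on regular attributes."""
    label_name = next(name for name, role in roles.items() if role == "label")
    target = regular_features(row, roles)

    def distance(example):
        # Squared Euclidean distance over regular attributes only.
        feats = regular_features(example, roles)
        return sum((a - b) ** 2 for a, b in zip(feats, target))

    return min(train, key=distance)[label_name]


test_row = {"Username": "Alexis", "Count_of_connections": 8,
            "Destination_IP": "172.16.1.10"}
print(predict_1nn(TRAIN, test_row))
```

With the data from the question, the nearest example by Count_of_connections alone is Alfred (8 connections), so this sketch prints `anomaly` no matter what the Username or Destination_IP strings contain.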
Vladimir
http://whatthefraud.wtf