Mining Time Series Data Using RapidMiner
Dear All,
This is regarding the mining of time series data.
I have time series data as follows:
Date Feature Time
1/1/2013 Add 1:00:00 PM
1/1/2013 Sub 1:01:00 PM
1/1/2013 Add 1:02:00 PM
1/1/2013 Equals 1:03:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:06:00 PM
1/1/2013 Equals 1:07:00 PM
1/1/2013 Add 1:08:00 PM
1/1/2013 Equals 1:09:00 PM
1/1/2013 Add 1:10:00 PM
1/1/2013 Equals 1:11:00 PM
1/1/2013 Add 1:12:00 PM
1/1/2013 Equals 1:13:00 PM
1/1/2013 Add 1:14:00 PM
1/1/2013 Equals 1:15:00 PM
1/1/2013 Add 1:16:00 PM
1/1/2013 Equals 1:17:00 PM
1/1/2013 Add 1:18:00 PM
1/1/2013 Equals 1:19:00 PM
1/1/2013 Add 1:20:00 PM
1/1/2013 Equals 1:21:00 PM
1/1/2013 Add 1:22:00 PM
1/1/2013 Equals 1:23:00 PM
By observing the data, it is clear that most of the time "Add" is followed by "Equals".
Please help me identify the appropriate mining technique to arrive at this kind of result, and the procedure for doing so.
Thanks in Advance
Regards,
Uday.
Answers
You should install the Series extension from the Marketplace and use the Windowing operator to bring the data into the correct format. Define the Feature attribute as the label, and remove the Date and Time columns if they are not important for the prediction.
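To illustrate what windowing does to this kind of data, here is a minimal Python sketch. It is not the Windowing operator's implementation; the function name `window_events` and the column names are assumptions chosen to match the "Feature-1 / Feature-0" naming that appears later in this thread, where the higher index is the older event.

```python
from typing import Dict, List


def window_events(events: List[str], window_size: int = 2) -> List[Dict[str, str]]:
    """Turn a sequence of events into rows of lagged values.

    "Feature-0" holds the most recent event in each window,
    "Feature-1" the event before it, and so on.
    """
    rows = []
    for end in range(window_size - 1, len(events)):
        row = {}
        for lag in range(window_size):
            row[f"Feature-{lag}"] = events[end - lag]
        rows.append(row)
    return rows


# Hypothetical short excerpt of the Feature column, in time order.
events = ["Add", "Sub", "Add", "Equals", "Add", "Equals"]
for row in window_events(events):
    print(row)
```

Each output row pairs an event with the one that preceded it, which is the shape a learner or FP-Growth needs in order to see "previous action" and "current action" side by side.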
If you need any help, please let us know.
Best regards,
Marius
Could you please help me with modeling this? When I applied the model to another data set, it did not give accurate results.
Please help me in this regard.
Thanks in Advance
Regards,
Uday.
Please help me interpret the following data table:
Rows  Items                                        Size  Freq   Support              Score
1     Feature-1 = Equals                           1.0   158.0  0.48466257668711654  1.0
2     Feature-0 = Equals                           1.0   158.0  0.48466257668711654  1.0
3     Feature-1 = Addition                         1.0   108.0  0.3312883435582822   1.0
4     Feature-0 = Addition                         1.0   108.0  0.3312883435582822   1.0
5     Feature-1 = Subtraction                      1.0   56.0   0.17177914110429449  1.0
6     Feature-0 = Subtraction                      1.0   56.0   0.17177914110429449  1.0
7     Feature-1 = Equals, Feature-0 = Addition     2.0   107.0  0.3282208588957055   2.044186591654946
8     Feature-1 = Equals, Feature-0 = Subtraction  2.0   50.0   0.15337423312883436  1.842224231464738
9     Feature-0 = Equals, Feature-1 = Addition     2.0   102.0  0.3128834355828221   1.9486638537271452
10    Feature-0 = Equals, Feature-1 = Subtraction  2.0   52.0   0.15950920245398773  1.9159132007233273
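One way to read the table: the numbers are consistent with Support being Freq divided by the number of windowed rows, and with Score for a two-item set being the set's support divided by the product of its items' supports. The sketch below checks this against rows 7 and 8. The total of 326 rows is an assumption inferred from 158 / 0.4846..., it is not stated in the thread, and the Score formula is likewise inferred from the numbers rather than taken from the RapidMiner documentation.

```python
TOTAL_ROWS = 326  # assumption: inferred from Freq / Support, not stated in the thread

# Frequencies of single items, copied from rows 1, 4 and 6 of the table.
freq = {
    ("Feature-1", "Equals"): 158,
    ("Feature-0", "Addition"): 108,
    ("Feature-0", "Subtraction"): 56,
}

# Frequencies of two-item sets, copied from rows 7 and 8 of the table.
pair_freq = {
    (("Feature-1", "Equals"), ("Feature-0", "Addition")): 107,
    (("Feature-1", "Equals"), ("Feature-0", "Subtraction")): 50,
}


def support(count, total=TOTAL_ROWS):
    """Fraction of rows that contain the item set."""
    return count / total


def score(pair, count):
    """Support of the pair relative to what independent items would give."""
    a, b = pair
    return support(count) / (support(freq[a]) * support(freq[b]))


for pair, count in pair_freq.items():
    print(pair, support(count), score(pair, count))
```

Under this reading, a Score above 1 means the two items occur together more often than they would if they were unrelated.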
Best regards,
Marius
The process is as follows :
<operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
<operator activated="true" breakpoints="after" class="read_excel" compatibility="5.2.008" expanded="true" height="60" name="Read Excel" width="90" x="83" y="136">
<operator activated="true" class="set_role" compatibility="5.2.008" expanded="true" height="76" name="Set Role" width="90" x="246" y="120">
<operator activated="true" breakpoints="after" class="series:windowing" compatibility="5.2.000" expanded="true" height="76" name="Windowing" width="90" x="380" y="75">
<operator activated="true" breakpoints="after" class="select_attributes" compatibility="5.2.008" expanded="true" height="76" name="Select Attributes" width="90" x="447" y="210">
<operator activated="true" breakpoints="after" class="nominal_to_binominal" compatibility="5.2.008" expanded="true" height="94" name="Nominal to Binominal" width="90" x="581" y="300">
<operator activated="true" breakpoints="after" class="fp_growth" compatibility="5.2.008" expanded="true" height="76" name="FP-Growth" width="90" x="648" y="165">
<operator activated="true" class="multiply" compatibility="5.2.008" expanded="true" height="94" name="Multiply" width="90" x="715" y="300"/>
<operator activated="true" breakpoints="after" class="item_sets_to_data" compatibility="5.2.008" expanded="true" height="76" name="Item Sets to Data" width="90" x="849" y="300"/>
<operator activated="true" breakpoints="after" class="create_association_rules" compatibility="5.2.008" expanded="true" height="76" name="Create Association Rules" width="90" x="782" y="75">
Thanks & Regards,
Uday
Please do let me know if any further information is required.
Thanks & Regards,
Uday.
If you want to predict the next value based on the current and/or previous values, frequent item sets and association rules are probably not the best choice. Try a classification algorithm instead.
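As a rough illustration of treating this as a prediction problem, the sketch below uses the simplest possible baseline: for each current value, predict the value that most often followed it in the training sequence. This is a frequency-based stand-in, not one of RapidMiner's learners, and the names `fit_next_event` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def fit_next_event(events):
    """Count, for every event, which events followed it and how often."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(events, events[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, current):
    """Return the most frequent successor of `current`, or None if unseen."""
    if current not in counts:
        return None
    return counts[current].most_common(1)[0][0]


# Hypothetical short training sequence in time order.
training = ["Add", "Equals", "Add", "Equals", "Add", "Sub"]
model = fit_next_event(training)
print(predict_next(model, "Add"))
```

A real classifier trained on the windowed data can use more than one previous value, but a baseline like this is a useful sanity check for whatever model is chosen.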
Best regards,
Marius
Thanks for the reply. I just need one clarification: can we represent the output of the process in a graphical format, such as a tree view?
Kindly help me in this regard.
Thanks & Regards,
Uday.
If you create a tree model, just connect the corresponding model output to the process output and you will get a visualization of the tree.
Best regards,
Marius
Sorry for the delay in the response.
Thanks for the reply.
I just need one more clarification, regarding the filtering of input data.
Consider, for example, input data in the following format:
Date Feature Time
1/1/2013 Add 1:00:00 PM
1/1/2013 Add 1:00:01 PM
1/1/2013 Add 1:00:02 PM
1/1/2013 Add 1:00:03 PM
1/1/2013 Add 1:00:04 PM
1/1/2013 Add 1:00:05 PM
1/1/2013 Add 1:00:06 PM
1/1/2013 Add 1:00:07 PM
1/1/2013 Add 1:00:08 PM
1/1/2013 Sub 1:01:00 PM
1/1/2013 Add 1:02:00 PM
1/1/2013 Equals 1:03:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Add 1:16:00 PM
1/1/2013 Equals 1:17:00 PM
1/1/2013 Add 1:18:00 PM
1/1/2013 Equals 1:19:00 PM
1/1/2013 Add 1:20:00 PM
1/1/2013 Equals 1:21:00 PM
1/1/2013 Add 1:22:00 PM
1/1/2013 Equals 1:23:00 PM
1/1/2013 Sub 1:23:01 PM
1/1/2013 Sub 1:23:01 PM
1/1/2013 Sub 1:23:02 PM
1/1/2013 Sub 1:23:03 PM
1/1/2013 Sub 1:23:04 PM
1/1/2013 Sub 1:23:05 PM
1/1/2013 Sub 1:23:06 PM
After applying the filtering or transformations, the data should look as follows:
Date Feature Time
1/1/2013 Add 1:00:08 PM
1/1/2013 Sub 1:01:00 PM
1/1/2013 Add 1:02:00 PM
1/1/2013 Equals 1:03:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Add 1:16:00 PM
1/1/2013 Equals 1:17:00 PM
1/1/2013 Add 1:18:00 PM
1/1/2013 Equals 1:19:00 PM
1/1/2013 Add 1:20:00 PM
1/1/2013 Equals 1:21:00 PM
1/1/2013 Add 1:22:00 PM
1/1/2013 Equals 1:23:00 PM
1/1/2013 Sub 1:23:06 PM
If the same feature appears repeatedly within seconds on the same date, I need to keep only its last occurrence.
Please help in this regard.
Kindly let me know what transformations or filters are available in RapidMiner.
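The filtering rule described above can be sketched in Python as follows. This is only an illustration of the logic, not a RapidMiner process; the function name `keep_last_of_bursts` and the 5-second threshold are assumptions, since the post only says "within seconds".

```python
from datetime import datetime, timedelta


def keep_last_of_bursts(rows, max_gap=timedelta(seconds=5)):
    """Collapse bursts of the same feature, keeping the last occurrence.

    rows: list of (timestamp, feature) tuples sorted by timestamp.
    A row replaces the previously kept row when it has the same feature,
    falls on the same date, and is at most `max_gap` later.
    """
    result = []
    for ts, feature in rows:
        if result:
            prev_ts, prev_feature = result[-1]
            same_burst = (
                feature == prev_feature
                and ts.date() == prev_ts.date()
                and ts - prev_ts <= max_gap
            )
            if same_burst:
                result[-1] = (ts, feature)  # keep the later occurrence
                continue
        result.append((ts, feature))
    return result


# Shortened version of the example data from the post.
rows = [(datetime(2013, 1, 1, 13, 0, s), "Add") for s in range(9)]
rows += [
    (datetime(2013, 1, 1, 13, 1, 0), "Sub"),
    (datetime(2013, 1, 1, 13, 4, 0), "Add"),
    (datetime(2013, 1, 1, 13, 16, 0), "Add"),
]
rows += [(datetime(2013, 1, 1, 13, 23, s), "Sub") for s in range(1, 7)]

for ts, feature in keep_last_of_bursts(rows):
    print(ts, feature)
```

Because each row is compared with the most recently kept row, a long burst collapses step by step into its final entry, while two "Add" rows that are minutes apart are both kept.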
Could you please help me create a process that takes data in the following format and identifies the frequently used patterns in the forward direction?
This is regarding the mining of time series data.
I have time series data as follows:
Date Feature Time
1/1/2013 Add 1:00:00 PM
1/1/2013 Sub 1:01:00 PM
1/1/2013 Add 1:02:00 PM
1/1/2013 Equals 1:03:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:04:00 PM
1/1/2013 Equals 1:05:00 PM
1/1/2013 Add 1:06:00 PM
1/1/2013 Equals 1:07:00 PM
1/1/2013 Add 1:08:00 PM
1/1/2013 Equals 1:09:00 PM
1/1/2013 Add 1:10:00 PM
1/1/2013 Equals 1:11:00 PM
1/1/2013 Add 1:12:00 PM
1/1/2013 Equals 1:13:00 PM
1/1/2013 Add 1:14:00 PM
1/1/2013 Equals 1:15:00 PM
1/1/2013 Add 1:16:00 PM
1/1/2013 Equals 1:17:00 PM
1/1/2013 Add 1:18:00 PM
1/1/2013 Equals 1:19:00 PM
1/1/2013 Add 1:20:00 PM
1/1/2013 Equals 1:21:00 PM
1/1/2013 Add 1:22:00 PM
1/1/2013 Equals 1:23:00 PM
By observing the data, it is clear that most of the time "Add" is followed by "Equals".
To arrive at this conclusion, as you suggested, I selected Date as the ID, applied Windowing, then the Nominal to Binominal operator, followed by the FP-Growth operator to identify the frequent item sets.
But I just want the result Add -> Equals 10 (count).
If I set the window size to 2, I get:
FeatureName-1 = Add -> FeatureName-0 = Equals (13)
FeatureName-0 = Add -> FeatureName-1 = Equals (10)
Which one should I consider? I only want the forward rules.
Thanks in Advance
Regards,
Uday
Please do reply; this is very urgent.
Sorry if I sound demanding.
Thanks in Advance
Regards,
Uday
I have been on holidays. For very urgent questions we offer commercial support.
Anyway, your output is already what you requested, and even a bit more:
FeatureName-1 = Add -> FeatureName-0 = Equals (13)
FeatureName-0 = Add -> FeatureName-1 = Equals (10)
This tells you that if FeatureName-1 (the previous action) is "Add", then FeatureName-0 (the current action) is likely to be "Equals". Of course there is also the other direction, represented by the second rule, i.e. if the current action is "Add" then it is likely that the previous action was "Equals".
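If only the forward direction (previous action -> current action) is of interest, the counts can also be obtained directly from the ordered sequence. The following Python sketch is an illustration outside RapidMiner; `forward_rules` is a hypothetical helper, and "confidence" here means the pair count divided by how often the first action is followed by anything.

```python
from collections import Counter


def forward_rules(events):
    """Count ordered pairs (previous -> next) in a time-ordered sequence.

    Returns a list of (previous, next, count, confidence) tuples,
    sorted by count in descending order.
    """
    pair_counts = Counter(zip(events, events[1:]))
    antecedent_counts = Counter(events[:-1])  # the last event has no successor
    rules = [
        (prev, nxt, count, count / antecedent_counts[prev])
        for (prev, nxt), count in pair_counts.items()
    ]
    rules.sort(key=lambda rule: rule[2], reverse=True)
    return rules


# Hypothetical short sequence in time order.
events = ["Add", "Equals", "Add", "Equals", "Add", "Sub", "Add"]
for prev, nxt, count, confidence in forward_rules(events):
    print(f"{prev} -> {nxt}: count={count}, confidence={confidence:.2f}")
```

In terms of the windowed attributes, these forward pairs correspond to rules whose premise is on FeatureName-1 and whose conclusion is on FeatureName-0.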
Best regards,
Marius