time series creation

data1maths (Member, Posts: 27, Contributor I)
edited December 2018 in Help

Hello everyone,

I have this date-time attribute, from which I want to create an appropriate time series by counting the number of events per hour, so that I can plot the evolution of the time series and apply ARIMA to it later on.

ID Date-time
1 31/12/2017 22:55
2 31/12/2017 22:27
3 31/12/2017 22:17
4 31/12/2017 21:55
5 31/12/2017 20:48
6 31/12/2017 11:03
7 31/12/2017 09:58
8 31/12/2017 09:52
9 31/12/2017 09:16
10 31/12/2017 08:11
11 31/12/2017 07:55
12 31/12/2017 06:54
13 31/12/2017 04:10
14 30/12/2017 12:15
15 30/12/2017 11:20
16 30/12/2017 10:50
17 30/12/2017 05:40
18 30/12/2017 04:49
19 30/12/2017 04:30
20 30/12/2017 04:00
21 30/12/2017 02:34
22 29/12/2017 23:02
23 29/12/2017 18:38
24 29/12/2017 08:04
25 29/12/2017 01:30
26 28/12/2017 23:59
27 28/12/2017 16:22
28 28/12/2017 14:43
29 28/12/2017 14:39
30 28/12/2017 14:39
31 28/12/2017 14:02
32 28/12/2017 13:46
33 28/12/2017 13:28
34 28/12/2017 12:51
35 28/12/2017 12:45
36 28/12/2017 12:24
37 28/12/2017 12:06
ETC

I do really need your help.

Thank you in advance.

Best regards


Best Answers

  • lionelderkrikor (Moderator, RapidMiner Certified Analyst, Member, Posts: 1,195, Unicorn)
    Solution Accepted

    Hi @data1maths,

    I have no idea how to perform your task with RapidMiner's native operators.

    So I propose a solution using a Python script:

    Time_series_count.png

    To execute this process, you have to:

    - Install Python on your computer

    - Install the Python Scripting extension from the Marketplace

    - Set the name of your datetime attribute, with quotes, in the parameters of the Set Macros operator:

    Time_series_count_2.png
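
    For readers without the screenshots, the hourly count that such a script needs to produce can be sketched roughly as follows. This is a minimal standard-library sketch, not the script from the process: the function name `count_events_per_hour` and the constant `DATE_FORMAT` are made up for illustration, and the date format is assumed from the sample rows in the question.

```python
from collections import Counter
from datetime import datetime

# Assumed format of the sample data in the question, e.g. "31/12/2017 22:55".
DATE_FORMAT = "%d/%m/%Y %H:%M"


def count_events_per_hour(timestamps):
    """Return {hour_start: number_of_events} for an iterable of date-time strings."""
    counts = Counter()
    for text in timestamps:
        moment = datetime.strptime(text, DATE_FORMAT)
        # Truncate to the start of the hour so 22:17, 22:27 and 22:55 share one key.
        hour_start = moment.replace(minute=0, second=0, microsecond=0)
        counts[hour_start] += 1
    return dict(counts)


# First five rows of the sample data from the question.
sample = [
    "31/12/2017 22:55",
    "31/12/2017 22:27",
    "31/12/2017 22:17",
    "31/12/2017 21:55",
    "31/12/2017 20:48",
]
hourly = count_events_per_hour(sample)
for hour_start in sorted(hourly):
    print(hour_start.strftime(DATE_FORMAT), hourly[hour_start])
```

    With these five rows, the hour starting at 31/12/2017 22:00 gets a count of 3, and the hours starting at 20:00 and 21:00 get 1 each.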

    The process:

    [RapidMiner process XML not preserved]

    I hope it helps,

    Regards,

    Lionel

  • MartinLiebig (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor, Posts: 3,404, RM Data Scientist)
    Solution Accepted

    Hi,

    something like this should do it.

    BR,

    Martin

    [RapidMiner process not preserved]

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • tftemme (Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member, Posts: 164, RM Research)
    Solution Accepted

    Hi @data1maths,

    For creating timestamps for time series data, the 'Create ExampleSet' operator from the Operator Toolbox extension is the way to go, as @mschmitz pointed out.

    If I understand you correctly, you already have timestamps and want to count the events per hour, right?

    You can use Generate Attributes with date_get(timestamp, DATE_UNIT_HOUR) to extract the hour of your timestamp, and then the Aggregate operator with the new hour attribute as the group-by attribute and count(id) to get the number of events in the corresponding hour. If you have events in every hour, you are already finished. If not, you should create hourly timestamps with Create ExampleSet, join them with your results, and then use Replace Missing Values to fill the gaps with 0.
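
    The same three steps (count per hour, build a complete hourly grid, fill the empty hours with 0) can be sketched outside RapidMiner as below. This is only an illustration under assumptions: `hourly_series` is a hypothetical helper name, the date format is taken from the sample in the question, and the sketch keys on the full date plus hour rather than on the hour of day alone, so that events from different days are not merged.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumed format of the sample data in the question.
DATE_FORMAT = "%d/%m/%Y %H:%M"


def hourly_series(timestamps):
    """Return a gap-free list of (hour_start, count) pairs, with 0 for empty hours."""
    hours = [
        datetime.strptime(text, DATE_FORMAT).replace(minute=0)
        for text in timestamps
    ]
    counts = Counter(hours)  # corresponds to the aggregate-and-count step
    if not counts:
        return []
    series = []
    current, last = min(counts), max(counts)
    while current <= last:
        # A Counter yields 0 for keys it has never seen, which plays the role
        # of joining with the full hourly grid and replacing missing values by 0.
        series.append((current, counts[current]))
        current += timedelta(hours=1)
    return series


# Rows 10 to 13 of the sample data: no event falls between 05:00 and 06:00.
series = hourly_series([
    "31/12/2017 08:11",
    "31/12/2017 07:55",
    "31/12/2017 06:54",
    "31/12/2017 04:10",
])
for hour_start, count in series:
    print(hour_start.strftime(DATE_FORMAT), count)
```

    A gap-free series matters for the later ARIMA step, since the model expects equally spaced observations.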

    Best regards,
    Fabian

  • IngoRM (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor, Posts: 1,751, RM Founder)
    Solution Accepted

    Hi,

    And last but not least, you can also get to the desired result with Turbo Prep. In fact, there are multiple ways to achieve this. Here is one I like personally:

    1. Import your data into the repository and open it in Turbo Prep. Please note that I kept the date column in my example below as a categorical (nominal) column. You may need to adapt this a bit if you already have a proper date column.

    tp_timestamps1.png: Load timestamp data into Turbo Prep

    2. Copy the date column in Transform. I called the new column "Hour".

    tp_timestamps2.png: Copy date column

    3. Change the type of the new Hour column to "date and time" using the "Change Type" function in the Transform group. For the date format you show above, the correct date format string would be "dd/MM/yyyy HH:mm". Don't forget to change the output type to "date and time" as well!

    tp_timestamps3.png: Change type to date

    4. Now change the type of the new column again, but this time to "number". Extract the "hour" relative to the "epoch". You now should have three columns: ID, the date / time column (still categorical), and a numerical column with all the hours (starting with a high number, but that should not bother you - if it does, you can change it later on. This column will be useful for plotting though...)

    tp_timestamps4.png: Extract hour as number

    5. Now comes the cool trick and a bit of regular expression magic. You select your original date column (you can do all of this in the same transformation session BTW) and use the "Replace" function. You should activate the regular expressions and replace "(.*):.*" with "$1:00". This will replace all minutes in this column by "00" which allows you to use these new values as a nice representation in your aggregation.

    tp_timestamps5.png: Use a regular expression to replace the minutes by "00"

    6. The final result should look like the table below. Commit it.

    tp_timestamps6.png: Final result of transformations, ready for the aggregation

    7. Now go into the "Pivot" group and use the date column (the one without the minutes) and the numerical hours column as "Group-By" columns. Use the ID column as the aggregate at the bottom, but change the function to "count". The result is shown below.

    tp_timestamps7.png: Create the Pivot table

    8. You can sort the data now according to the Hours column in the Transform group. Then you can plot it to your heart's content. Of course you can also build a process from this etc.

    tp_timestamps8.png: The final chart as bars
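
    For anyone who wants to check the logic of steps 3 to 8 outside Turbo Prep, here is a small sketch of the same transformations. It is an illustration only: the helper names `hours_since_epoch`, `strip_minutes` and `pivot_counts` are made up, the timestamps are treated as UTC (Turbo Prep may use a different time zone, which would shift the hour numbers), and Python writes the back-reference as `\1` where the Replace function uses `$1`.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Python equivalent of the "dd/MM/yyyy HH:mm" format string from step 3.
DATE_FORMAT = "%d/%m/%Y %H:%M"


def hours_since_epoch(text):
    """Whole hours since 1970-01-01 00:00, treating the timestamp as UTC (step 4)."""
    moment = datetime.strptime(text, DATE_FORMAT).replace(tzinfo=timezone.utc)
    return int(moment.timestamp() // 3600)


def strip_minutes(text):
    """Replace the minutes by "00", mirroring '(.*):.*' -> '$1:00' (step 5)."""
    return re.sub(r"(.*):.*", r"\1:00", text)


def pivot_counts(timestamps):
    """Count rows grouped by (date without minutes, hour number), sorted by hour (steps 7 and 8)."""
    counts = Counter(
        (strip_minutes(text), hours_since_epoch(text)) for text in timestamps
    )
    return sorted(counts.items(), key=lambda item: item[0][1])


rows = [
    "31/12/2017 22:55",
    "31/12/2017 22:27",
    "31/12/2017 22:17",
    "31/12/2017 21:55",
]
table = pivot_counts(rows)
for (label, hour_number), count in table:
    print(label, hour_number, count)
```

    The hour number is large because it counts from the epoch, as mentioned in step 4; it is only used for sorting and plotting, while the text label is what you read on the axis.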

    Hope this helps. I will post the created process below as well.

    Best,

    Ingo











    [Process XML not preserved; remaining step descriptions:]

    - Loading timestamps
    - Unify column types
      <parameter key="attribute_name" value="Date-time"/>
    - Copy Date-time to Hour
    - Parse dates in Hour
    - Extract hour of epoch for Hour
    - Replace '(.*):.*' by '$1:00' in Date-time
    - Aggregate ID grouped by Date-time, Hour
    - Ensure order of columns matches the table view.















Answers

  • MartinLiebig (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor, Posts: 3,404, RM Data Scientist)

    Hi @data1maths,

    Have a look at "Create Example Set" in the Operator Toolbox. One of the examples shows a similar way of using it.

    BR,

    Martin

  • data1maths (Member, Posts: 27, Contributor I)

    Hi mschmitz,

    Honestly, I couldn't find what I needed in the examples.

    What I want is, for example, for the hour between 22:00 and 23:00 on 31/12/2017, to get 3 as the number of records in that hour.

    Thank you for your help, sir.

    Best regards

  • data1maths (Member, Posts: 27, Contributor I)

    Thank you so much to all of you.

    Best Regards
