"Send xls file to Rapidanalytics service"

julio, Member, Posts: 17, Contributor II
edited June 2019 in Help
Dear all,

Is it possible to process a plain .xls file from a RapidAnalytics service (external input and Read Excel operator)? There is a good video explaining how .csv files work, but what about .xls? Is it supported? Are there any tricks to take into account?

I am getting a "file cannot be loaded" error.
With RapidMiner and local .xls files, the process works fine.

Thank you,

Julio

Answers

  • JEdward, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member, Posts: 578, Unicorn
    Hi Julio,

    Do you mean using the Read Excel operator within a process set up as a service? Or do you mean loading the file directly within the RA web interface?
    The Read Excel method works (to the best of my knowledge).

    J
  • julio, Member, Posts: 17, Contributor II
    Hi J,

    Using Read Excel directly works fine. However, we could not make it work when posting a .xls file to a service.
    FYI, as an additional test, I read the posted file and saved it. Excel then warned that the file was not correct, although it was able to recover most of the data.

    We are on RapidMiner 5.2.8 and RapidAnalytics 1.2, running on Linux. The Excel file was generated on Windows. We tried different encodings, but it made no difference.

    Any help will be appreciated.

    Thank you,

    Julio
  • JEdward, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member, Posts: 578, Unicorn
    Hi Julio,

    You tried several encodings, but did you try all of them? ;)

    I used the Loop Parameters and Handle Exception operators to go through every encoding: encode, read the .xls file, then store it in the repository.
    Handle Exception skips the encodings that fail to open the file and, once an encoding that can be read is found, the result is stored.
    For the sample .xls I used, the winning encoding was ISO-8859-1, so try that in your process first. If it does not work, my loop is included in the sample process below; you just need to re-enable it.
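The loop described above ("try each encoding, skip the ones that throw, keep the first that works") can be sketched outside RapidMiner as follows. This is only an illustration of the control flow, not the RapidMiner operators themselves: `first_working_encoding` is a hypothetical helper, and plain text decoding stands in for the Read Excel step.

```python
import codecs


def first_working_encoding(raw, candidates):
    """Return the first candidate that decodes `raw` without error, else None.

    Mirrors a Loop Parameters + Handle Exception setup: each failure is
    swallowed and the loop moves on to the next candidate.
    """
    for name in candidates:
        try:
            codecs.lookup(name)   # unknown encoding names raise LookupError
            raw.decode(name)      # stand-in for the "read the file" step
        except (LookupError, UnicodeDecodeError):
            continue              # same role as Handle Exception: skip and go on
        return name
    return None


# Hypothetical sample: one Latin-1 encoded character (the single byte 0xE9).
sample = "é".encode("iso-8859-1")
print(first_working_encoding(sample, ["us-ascii", "utf-8", "iso-8859-1"]))
# prints "iso-8859-1"
```

Note that a permissive single-byte encoding such as ISO-8859-1 accepts any byte sequence, so "first encoding that does not fail" is not proof that it is the right one.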

    Hope it helps,
    J

    [Sample process XML, version 5.3.012; only the following fragments survive.]

    <process version="5.3.012">
    Macro: mparam = "1"
    URL: http://www.econ.yale.edu/~shiller/data/ie_data.xls
    <parameter key="Write Document.encoding" value="…ISO-8859-1,ISO-8859-13,ISO-8859-15,ISO-8859-2,ISO-8859-3,ISO-8859-4,ISO-8859-5,ISO-8859-6,ISO-8859-7,ISO-8859-8,ISO-8859-9,JIS_X0201,JIS_X0212-1990,KOI8-R,KOI8-U,Shift_JIS,TIS-620,US-ASCII,UTF-16,UTF-16BE,UTF-16LE,UTF-32,UTF-32BE,UTF-32LE,UTF-8,windows-1250,windows-1251,windows-1252,windows-1253,windows-1254,windows-1255,windows-1256,windows-1257,windows-1258,windows-31j,x-Big5-HKSCS-2001,x-Big5-Solaris,x-euc-jp-linux,x-EUC-TW,x-eucJP-Open,x-IBM1006,x-IBM1025,x-IBM1046,x-IBM1097,x-IBM1098,x-IBM1112,x-IBM1122,x-IBM1123,x-IBM1124,x-IBM1364,x-IBM1381,x-IBM1383,x-IBM33722,x-IBM737,x-IBM833,x-IBM834,x-IBM856,x-IBM874,x-IBM875,x-IBM921,x-IBM922,x-IBM930,x-IBM933,x-IBM935,x-IBM937,x-IBM939,x-IBM942,x-IBM942C,x-IBM943,x-IBM943C,x-IBM948,x-IBM949,x-IBM949C,x-IBM950,x-IBM964,x-IBM970,x-ISCII91,x-ISO-2022-CN-CNS,x-ISO-2022-CN-GB,x-iso-8859-11,x-JIS0208,x-JISAutoDetect,x-Johab,x-MacArabic,x-MacCentralEurope,x-MacCroatian,x-MacCyrillic,x-MacDingbat,x-MacGreek,x-MacHebrew,x-MacIceland,x-MacRoman,x-MacRomania,x-MacSymbol,x-MacThai,x-MacTurkish,x-MacUkraine,x-MS932_0213,x-MS950-HKSCS,x-MS950-HKSCS-XP,x-mswin-936,x-PCK,x-SJIS_0213,x-UTF-16LE-BOM,X-UTF-32BE-BOM,X-UTF-32LE-BOM,x-windows-50220,x-windows-50221,x-windows-874,x-windows-949,x-windows-950,x-windows-iso2022jp"/>
    <parameter key="8" value="Dividend.true.real.attribute"/> (appears twice)

  • julio, Member, Posts: 17, Contributor II
    Hi J,

    Sorry (again) about the misunderstanding. Apparently I am just not using enough words!

    Fetching an .xls from a remote web page with RapidMiner works fine (RapidMiner picks it up), with no encoding issues. That is the example you published. We actually explored this as a potential workaround, but we need basic authentication/site login from RapidAnalytics, which we were not able to figure out (though I just saw that someone answered on that other thread).

    What we cannot make work is the following:

    We define a RapidMiner process. We publish the process in RapidAnalytics as a service. We then POST a .xls file to the RapidAnalytics service. That does not work.
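For reference, here is a minimal sketch of what "POST a .xls file to the service" can look like when the file bytes are sent unchanged as the request body. Whether RapidAnalytics expects a raw body or a form upload is not established in this thread, so treat that as an assumption; `build_raw_xls_request` and the URL are hypothetical, and the request is only built, not sent.

```python
import urllib.request


def build_raw_xls_request(url, xls_bytes):
    """Build (but do not send) a POST whose body is the unmodified file bytes."""
    return urllib.request.Request(
        url,
        data=xls_bytes,                 # raw bytes, no form encoding applied
        method="POST",
        headers={"Content-Type": "application/vnd.ms-excel"},
    )


# Placeholder URL and payload, for illustration only.
req = build_raw_xls_request("http://localhost:8080/example_service", b"\xd0\xcf\x11\xe0")
print(req.get_method(), len(req.data))
# To actually send it: urllib.request.urlopen(req)
```

Sending the bytes this way keeps the body byte-for-byte identical to the file on disk, which makes it easy to compare with what the server reports receiving.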

    Still, we will experiment with the encoding you mentioned on the sending side and see if that helps!

    Will be back!

    Julio
  • JEdward, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member, Posts: 578, Unicorn
    Hi Julio,

    Can you post a sample process as an example?

    Thanks,
    J.
  • julio, Member, Posts: 17, Contributor II
    Hi!

    To keep it simple:
    We created a one-operator process that just reads an Excel sheet with formatting.
    We published this as a service. (How can you tell RapidAnalytics which file to connect to which input port if you have multiple files?)

    [Process XML; only the macro mparam = "1" survives.]

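    We then created a PHP script to send a test.xls file:

    <?php
    // Endpoint of the published RapidAnalytics service.
    $target_url = 'http://192.168.0.30:8090/RA/public_process/PostXLSWS';

    $revisionFile = realpath('test.xls');

    // Attach the file as a multipart/form-data field named "revision".
    // CURLFile exists from PHP 5.5 on; older versions rely on the '@' prefix.
    $postData = array(
        'revision' => class_exists('CURLFile')
            ? new CURLFile($revisionFile)
            : '@' . $revisionFile,
    );

    $response = sendPost($target_url, $postData);

    // The HTML tags below are a guess; the original markup did not survive.
    echo "<p>Response Code:</p>\n";
    echo "<p>" . $response['code'] . "</p>\n";
    echo "<p>Response Content:</p>\n";
    echo "<pre>" . htmlspecialchars($response['content']) . "</pre>\n";
    echo "<p><a href=\"\">REFRESH</a></p>\n";

    // POST $data to $address; return the HTTP status code and response body.
    function sendPost($address, $data) {

        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, $address);
        curl_setopt($ch, CURLOPT_POST, 1);
        // Passing an array makes cURL send multipart/form-data.
        curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

        $responseContent = curl_exec($ch);
        $responseCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

        $response = array();
        $response['content'] = $responseContent;
        $response['code'] = $responseCode;

        curl_close($ch);

        return $response;

    }

    ?>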
    We are getting the following error:
    de.rapidanalytics.ejb.service.ServiceDataSourceException: Error executing process /home/anonymous/CRMDocuments/PostXLS for service PostXLSWS: Could not read file 'null': Unable to recognize OLE stream.

    I assume that the file we post should be in standard format, and not a blob...?

    Thank you!

    Julio
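One way to look into the "Unable to recognize OLE stream" error above: a binary .xls file is an OLE2 compound document and starts with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1. The sketch below checks for that signature and shows that the same bytes, once wrapped in a multipart/form-data body, no longer start with it. Whether the service was handed the multipart body instead of the bare file is an assumption, not something the thread confirms; `looks_like_ole2`, `wrap_in_multipart` and the boundary string are hypothetical names for illustration.

```python
OLE2_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(data):
    """True if the bytes begin with the OLE2 compound-document signature."""
    return data.startswith(OLE2_SIGNATURE)


def wrap_in_multipart(field, filename, payload, boundary="example-boundary"):
    """Wrap a payload the way a single-file multipart/form-data upload would."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + payload + tail


# Stand-in for a real .xls: the signature followed by padding bytes.
fake_xls = OLE2_SIGNATURE + b"\x00" * 504
body = wrap_in_multipart("revision", "test.xls", fake_xls)

print(looks_like_ole2(fake_xls))  # True
print(looks_like_ole2(body))      # False: the body starts with the boundary line
```

Running `looks_like_ole2` on the file as saved on the server side (the copy Excel complained about) would show whether extra bytes were prepended before the reader saw it.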