You are viewing the RapidMiner Studio documentation for version 9.3.
Read SPSS (Advanced File Connectors)
Synopsis
This operator is used for reading SPSS files.
Description
The Read SPSS operator can read the data files created by SPSS (Statistical Package for the Social Sciences), an application used for statistical analysis. SPSS files are saved in a proprietary binary format and contain a dataset as well as a dictionary that describes the dataset. These files save data by 'cases' (rows) and 'variables' (columns).
These files have a '.SAV' file extension. SAV files are often used for storing datasets extracted from databases and Microsoft Excel spreadsheets. SPSS datasets can be manipulated in a variety of ways, but they are most commonly used to perform statistical analysis tests such as regression analysis, analysis of variance, and factor analysis.
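The "dataset plus dictionary" layout described above can be pictured with a small data structure. The following is only an illustrative sketch, not the SPSS file format or RapidMiner's internal representation: the class names `Variable` and `SpssLikeDataset`, the `decode` method, and the sample values are all hypothetical. It shows cases as rows, variables as columns, and a dictionary that carries value labels and user-defined missing codes for each variable.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """One dictionary entry (hypothetical model, not the real SAV layout)."""
    name: str
    label: str = ""
    value_labels: dict = field(default_factory=dict)  # raw code -> label
    user_missing: set = field(default_factory=set)    # codes meaning "missing"


@dataclass
class SpssLikeDataset:
    variables: list  # the dictionary: one Variable per column
    cases: list      # the data: one list of raw values per row

    def decode(self, use_value_labels=True, recode_user_missings=True):
        """Return the cases as dicts, optionally applying the dictionary."""
        rows = []
        for case in self.cases:
            row = {}
            for var, raw in zip(self.variables, case):
                if recode_user_missings and raw in var.user_missing:
                    row[var.name] = None  # treat user-missing code as missing
                elif use_value_labels:
                    # Fall back to the raw value when no label is defined.
                    row[var.name] = var.value_labels.get(raw, raw)
                else:
                    row[var.name] = raw
            rows.append(row)
        return rows


# Made-up example data: two variables, three cases.
dataset = SpssLikeDataset(
    variables=[
        Variable("gender", "Gender of respondent",
                 value_labels={1: "male", 2: "female"}, user_missing={9}),
        Variable("age", "Age in years"),
    ],
    cases=[[1, 34], [2, 51], [9, 27]],
)

for row in dataset.decode():
    print(row)
```

The two keyword arguments are named after the operator's use_value_labels and recode_user_missings parameters, but how the operator itself applies them is not specified here, so the behavior shown is an assumption.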
Input
file (File)
This optional port expects a file object.
Output
output (IOObject)
Data from the SPSS file is delivered through this port, usually in the form of an ExampleSet.
Parameters
- filename: This parameter specifies the path of the SPSS file. It can be selected using the choose a file button. Range: filename
- datamanagement: This parameter determines how the data is represented internally. This is an expert parameter. Several options are available, and users can choose any of them. Range: selection
- attribute_naming_mode: This parameter determines which SPSS variable properties should be used for naming the attributes. Range: selection
- use_value_labels: This parameter specifies if the SPSS value labels should be used as values. Range: boolean
- recode_user_missings: This parameter specifies if the SPSS user missings should be recoded to missing values. Range: boolean
- sample_ratio: This parameter specifies the fraction of the data set which should be read. If it is set to 1, the complete data set is read. If it is set to -1, then the sample size parameter is used for determining the size of the data to read. Range: real
- sample_size: This parameter specifies the exact number of samples which should be read. If it is set to -1, then the sample ratio parameter is used for determining the size of the data to read. If both are set to -1, then the complete data set is read. Range: integer
- use_local_random_seed: This parameter indicates if a local random seed should be used for randomization. Using the same value of local random seed will produce the same randomization. Range: boolean
- local_random_seed: This parameter specifies the local random seed. This parameter is only available if the use local random seed parameter is set to true. Range: integer
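The interaction between sample_ratio, sample_size, and the random seed parameters can be sketched in a few lines of Python. This is a minimal illustration of the rules stated above, not the operator's implementation. The function names `resolve_sample_count` and `sample_rows` are hypothetical, and three points are assumptions not stated in this documentation: that sample_ratio takes precedence when both values are set, that rows are drawn uniformly at random, and that an unseeded generator is used when use_local_random_seed is false.

```python
import random


def resolve_sample_count(total_rows, sample_ratio=1.0, sample_size=-1):
    """Number of rows to read, following the documented -1 conventions."""
    if sample_ratio == -1 and sample_size == -1:
        return total_rows  # both disabled: read the complete data set
    if sample_ratio != -1:
        # Assumption: the ratio wins if both parameters are set.
        return int(round(total_rows * sample_ratio))
    # Ratio disabled: use the exact size, capped at the available rows.
    return min(sample_size, total_rows)


def sample_rows(rows, sample_ratio=1.0, sample_size=-1,
                use_local_random_seed=False, local_random_seed=1992):
    """Pick a subset of rows; a fixed seed makes the choice repeatable."""
    count = resolve_sample_count(len(rows), sample_ratio, sample_size)
    if count >= len(rows):
        return list(rows)
    # Assumption: uniform random sampling that keeps the original row order.
    rng = random.Random(local_random_seed) if use_local_random_seed else random.Random()
    chosen = sorted(rng.sample(range(len(rows)), count))
    return [rows[i] for i in chosen]


rows = list(range(100))  # stand-in for 100 cases
first = sample_rows(rows, sample_ratio=0.1,
                    use_local_random_seed=True, local_random_seed=7)
second = sample_rows(rows, sample_ratio=0.1,
                     use_local_random_seed=True, local_random_seed=7)
print(len(first), first == second)
```

Running the sampling twice with the same seed value yields the same rows, which mirrors the statement that the same local random seed produces the same randomization.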
Tutorial Processes
Reading an SPSS file
You need an SPSS file for this process. Here, the SPSS file is named airline_passengers.sav and is located on the D drive of the computer. The file is read using the Read SPSS operator, with all parameters left at their default values. After the process has run, you can see the resulting ExampleSet in the Results Workspace.