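Writing an input filter for a One-Class SVM

niccay · October 2010

Hi,

I am working with a dataset that has a binary (nominal) label attribute, and I want to classify it with a One-Class SVM. In my process, I import the dataset with the CSV import operator and use it as input to the X-Validation operator.

The X-Validation contains only a simple One-Class (Lib)SVM learner for training and, for testing, an Apply Model plus a Performance operator.

Unfortunately, the One-Class LibSVM learner expects my training set to have only one label value (I have two different values…), so I thought that writing a filter which discards the examples of the positive (minority) class would be enough to do the job (see the source code below). Unfortunately, it is not. Well, I guess I have to fix the nominal mapping of my label attribute, but I don't know where to start or how to do it.

Does anyone know how to solve my problem? (But please do me a small favour and don't refer me to the white paper.)

Thanks,
Nico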


/**
 * Removes the minority-class examples from the example set
 * so that a one-class SVM can be used.
 */
public ExampleSet apply(ExampleSet trainingsSet) throws OperatorException {
    List<Example> majorityExamples = new LinkedList<Example>();
    NominalAttribute label = (NominalAttribute) trainingsSet.getAttributes().getLabel();
    double minorityLabelValue = getMinorityClass(trainingsSet);

    // only care about example sets that contain 2 different classes
    if (label.isNominal() && label.getMapping().size() == 2) {
        for (Example e : trainingsSet) {
            // keep every example whose label is not the minority class
            if (e.getLabel() != minorityLabelValue) {
                majorityExamples.add(e);
            }
        }
    }

    // doesn't work :-(
    // NominalMapping labelMapping = label.getMapping();
    // String majorityLabelMapping = label.getMapping().getNegativeString();
    // labelMapping.clear();
    // labelMapping.setMapping(majorityLabelMapping, 0);

    return computeExampleSet(majorityExamples);
}

protected ExampleSet computeExampleSet(List<Example> examples)
{
    int anzBsp = examples.size();                          // number of examples
    int anzAtr = examples.get(0).getAttributes().size();   // number of regular attributes
    double[][] values = new double[anzBsp][anzAtr];
    double[] labels = new double[anzBsp];
    String[] attrNames = new String[anzAtr];
    String labelName;
    int i = 0, n = 0;

    // collect the attribute names from the first example
    for (Attribute a : examples.get(0).getAttributes()) {
        attrNames[i] = a.getName();
        i++;
    }
    Attribute label = examples.get(0).getAttributes().getLabel();
    labelName = label.getName();
    i = 0;
    // copy attribute values and label values row by row
    for (Example e : examples) {
        n = 0;
        for (Attribute a : e.getAttributes()) {
            values[i][n] = e.getValue(a);
            n++;
        }
        labels[i] = e.getLabel();
        i++;
    }

    return ExtendedExampleSetFactory.createExampleSet(values, labels, attrNames, labelName, label.getMapping());
}

public static ExampleSet createExampleSet(double[][] data, double[] labels, String[] attrNames, String labelName, NominalMapping lblMapping) {
    if (data.length == 0) {
        throw new RuntimeException("ExampleSetFactory.createExampleSet(double[][], double[]): data matrix is not allowed to be empty.");
    }

    // create attributes
    int numberOfAttributes = data[0].length;
    List<Attribute> attributeList = new ArrayList<Attribute>(numberOfAttributes + (labels != null ? 1 : 0));
    for (int a = 0; a < numberOfAttributes; a++) {
        attributeList.add(AttributeFactory.createAttribute(attrNames[a], Ontology.NUMERICAL));
    }
    Attribute labelAttribute = null;

    if (labels != null) {
        labelAttribute = AttributeFactory.createAttribute(labelName, Ontology.NOMINAL);
        labelAttribute.setMapping(lblMapping);
        attributeList.add(labelAttribute);
    }

    // create table
    MemoryExampleTable table = new MemoryExampleTable(attributeList);
    for (int e = 0; e < data.length; e++) {
        double[] dataRow = data[e];
        if (labelAttribute != null) {
            // append the label value as the last column of the row
            dataRow = new double[numberOfAttributes + 1];
            System.arraycopy(data[e], 0, dataRow, 0, data[e].length);
            dataRow[dataRow.length - 1] = labels[e];
        }
        table.addDataRow(new DoubleArrayDataRow(dataRow));
    }

    return table.createExampleSet(labelAttribute);
}
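
The post suspects that dropping the minority examples is not enough because the label's nominal mapping still lists both values. The following is a minimal plain-Java sketch of that idea, not RapidMiner code: `Row`, `minorityLabel`, `dropMinority` and `rebuildMapping` are hypothetical names invented for illustration. It assumes the fix is to build a fresh value-to-index mapping from the labels that actually remain, instead of reusing the old two-value mapping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the idea in the post; none of these types are RapidMiner classes.
class MajorityFilterSketch {

    /** One labelled example: numeric feature values plus a nominal label. */
    static final class Row {
        final double[] features;
        final String label;

        Row(double[] features, String label) {
            this.features = features;
            this.label = label;
        }
    }

    /** Returns the least frequent label, or null if fewer than two distinct labels occur. */
    static String minorityLabel(List<Row> rows) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (Row r : rows) {
            Integer c = counts.get(r.label);
            counts.put(r.label, c == null ? 1 : c + 1);
        }
        if (counts.size() < 2) {
            return null; // nothing to drop
        }
        String minority = null;
        int best = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() < best) { // ties go to the label seen first
                best = e.getValue();
                minority = e.getKey();
            }
        }
        return minority;
    }

    /** Keeps only the rows whose label differs from the minority label. */
    static List<Row> dropMinority(List<Row> rows) {
        String minority = minorityLabel(rows);
        List<Row> kept = new ArrayList<Row>();
        for (Row r : rows) {
            if (!r.label.equals(minority)) {
                kept.add(r);
            }
        }
        return kept;
    }

    /** Builds a fresh label-to-index mapping from the labels that are still present. */
    static Map<String, Integer> rebuildMapping(List<Row> rows) {
        Map<String, Integer> mapping = new LinkedHashMap<String, Integer>();
        for (Row r : rows) {
            if (!mapping.containsKey(r.label)) {
                mapping.put(r.label, mapping.size());
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<Row>();
        rows.add(new Row(new double[] {0.1, 0.2}, "normal"));
        rows.add(new Row(new double[] {0.3, 0.1}, "normal"));
        rows.add(new Row(new double[] {5.0, 7.0}, "outlier"));
        rows.add(new Row(new double[] {0.2, 0.2}, "normal"));

        List<Row> kept = dropMinority(rows);
        Map<String, Integer> mapping = rebuildMapping(kept);
        System.out.println(kept.size() + " rows kept, mapping = " + mapping);
    }
}
```

In the poster's code, the analogous step would be to pass a mapping containing only the majority value to the factory method rather than `label.getMapping()`. Whether that alone satisfies the learner is not confirmed anywhere in this thread.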

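land · October 2010

Hi,
actually there is a white paper… :P Just kidding.

Take a look at the Community Extension and download a process for one-class SVMs. In another thread, Marco Stolpe describes a process he created for training and validating a one-class SVM on a training set with a binominal label.

Greetings,
Sebastian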
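
The process mentioned in the reply is not reproduced in this thread, so the following is only a guess at the general shape of such a validation, written in plain Java rather than as a RapidMiner process. It assumes that each fold trains on examples of one target class only, while the test part of the fold keeps both classes so that performance can still be measured. `OneClassFoldSketch`, `Fold` and `split` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: index-based fold split for validating a one-class learner.
class OneClassFoldSketch {

    /** Row indices used for training and for testing in one fold. */
    static final class Fold {
        final List<Integer> train = new ArrayList<Integer>();
        final List<Integer> test = new ArrayList<Integer>();
    }

    /**
     * Every numFolds-th row (offset foldIndex) goes to the test part with its
     * original label; of the remaining rows, only those carrying targetLabel
     * are used for training.
     */
    static Fold split(List<String> labels, String targetLabel, int numFolds, int foldIndex) {
        Fold fold = new Fold();
        for (int i = 0; i < labels.size(); i++) {
            if (i % numFolds == foldIndex) {
                fold.test.add(i);   // test rows keep both classes
            } else if (labels.get(i).equals(targetLabel)) {
                fold.train.add(i);  // training rows: target class only
            }
        }
        return fold;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("a", "a", "b", "a", "b", "a");
        Fold fold = split(labels, "a", 3, 0);
        System.out.println("train = " + fold.train + ", test = " + fold.test);
    }
}
```

The split here is a simple deterministic stride; a real validation would more likely shuffle or stratify the rows first.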
