Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/15834182/

Time: 2020-10-31 20:59:32  Source: igfitidea

Integration of RapidMiner in Java application

java, rapidminer

Asked by ArmMiner

I have a text classification process in RapidMiner. It reads the test data from a specified Excel sheet and does the classification. I also have a small Java application which just runs this process. Now I want to move the file input part into my application, so that every time I can specify the Excel file from my application (not from RapidMiner). Any hints?

This is the code:

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.example.Attribute;
import com.rapidminer.example.Example;
import com.rapidminer.example.ExampleSet;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.io.ExcelExampleSource;
import com.rapidminer.tools.XMLException;

import java.io.File;
import java.io.IOException;
import java.util.Iterator;

public class Classification {

    public static void main(String[] args) throws Exception {
        ExampleSet resultSet1 = null;
        IOContainer ioInput = null;
        IOContainer ioResult;
        try {
            RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
            RapidMiner.init();

            // Load the process and point its "Read Excel" operator at the test data.
            // Backslashes in Windows paths must be escaped in Java string literals.
            Process pr = new Process(new File("C:\\Users\\MP-TEST\\Desktop\\Rapid_Test\\Wieder_Model.rmp"));
            Operator op = pr.getOperator("Read Excel");
            op.setParameter(ExcelExampleSource.PARAMETER_EXCEL_FILE, "C:\\Users\\MP-TEST\\Desktop\\Rapid_Test\\HaendlerRatings_neu.xls");
            ioResult = pr.run(ioInput);

            if (ioResult.getElementAt(0) instanceof ExampleSet) {
                resultSet1 = (ExampleSet) ioResult.getElementAt(0);

                // Print every attribute value of every example
                for (Example example : resultSet1) {
                    Iterator<Attribute> allAtts = example.getAttributes().allAttributes();
                    while (allAtts.hasNext()) {
                        Attribute a = allAtts.next();
                        if (a.isNumerical()) {
                            double value = example.getValue(a);
                            System.out.println(value);
                        } else {
                            String value = example.getValueAsString(a);
                            System.out.println(value);
                        }
                    }
                }
            }
        } catch (IOException | XMLException | OperatorException e) {
            e.printStackTrace();
        }
    }
}

This is the error:

Apr 09, 2013 9:06:05 AM com.rapidminer.Process run
INFO: Process C:\Users\MP-TEST\Desktop\Rapid_Test\Wieder_Model.rmp starts
com.rapidminer.operator.UserError: A value for the parameter 'excel_file' must be specified! 
    at com.rapidminer.operator.nio.model.ExcelResultSetConfiguration.makeDataResultSet(ExcelResultSetConfiguration.java:316)
    at com.rapidminer.operator.nio.model.AbstractDataResultSetReader.createExampleSet(AbstractDataResultSetReader.java:127)
    at com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:52)
    at com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:1)
    at com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:126)
    at com.rapidminer.operator.Operator.execute(Operator.java:855)
    at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
    at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:711)
    at com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
    at com.rapidminer.operator.Operator.execute(Operator.java:855)
    at com.rapidminer.Process.run(Process.java:949)
    at com.rapidminer.Process.run(Process.java:873)
    at com.rapidminer.Process.run(Process.java:832)
    at com.rapidminer.Process.run(Process.java:827)
    at Classification.main(Classification.java:29)

Best regards

Armen

Accepted answer by Josef Borkovec

I see two ways to do that.

The first one would be to programmatically change the XML definition of your process. RapidMiner processes are specified by an XML file with the .rmp extension. In the file you will find the definition of the operator you wish to change. This is an excerpt from a simple process specifying the Read Excel operator:

<operator activated="true" class="read_excel" compatibility="5.3.005" expanded="true" height="60" name="Read Excel" width="90" x="313" y="75">
    <parameter key="excel_file" value="D:\file.xls"/>    <!-- HERE IS THE FILE PATH -->
    <parameter key="sheet_number" value="1"/>
    <parameter key="imported_cell_range" value="A1"/>
    <parameter key="encoding" value="SYSTEM"/>
    <parameter key="first_row_as_names" value="true"/>
    <list key="annotations"/>
    <parameter key="date_format" value=""/>
    <parameter key="time_zone" value="SYSTEM"/>
    <parameter key="locale" value="English (United States)"/>
    <list key="data_set_meta_data_information"/>
    <parameter key="read_not_matching_values_as_missings" value="true"/>
    <parameter key="datamanagement" value="double_array"/>
</operator>

I highlighted the part where the path to the excel file is. You can overwrite that in your application. Just be careful not to break the XML file.
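
The rewrite described here can be sketched with the JDK's built-in DOM parser instead of string replacement, which reduces the risk of breaking the XML. This is only a minimal sketch: the class name RmpExcelPath and the method setExcelFile are made up for illustration, and it assumes the process file has the same operator/parameter layout as the excerpt above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of RapidMiner.
public class RmpExcelPath {

    /**
     * Returns the process XML with the "excel_file" parameter of the operator
     * called operatorName set to newPath.
     */
    public static String setExcelFile(String processXml, String operatorName, String newPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(processXml)));

            boolean updated = false;
            NodeList operators = doc.getElementsByTagName("operator");
            for (int i = 0; i < operators.getLength(); i++) {
                Element operator = (Element) operators.item(i);
                if (!operatorName.equals(operator.getAttribute("name"))) {
                    continue;
                }
                // Only inspect direct children so nested operators stay untouched
                for (Node child = operator.getFirstChild(); child != null; child = child.getNextSibling()) {
                    if (child instanceof Element
                            && "parameter".equals(child.getNodeName())
                            && "excel_file".equals(((Element) child).getAttribute("key"))) {
                        ((Element) child).setAttribute("value", newPath);
                        updated = true;
                    }
                }
            }
            if (!updated) {
                throw new IllegalArgumentException(
                        "No excel_file parameter found for operator: " + operatorName);
            }

            // Serialize the modified document back to a string
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            throw new IllegalStateException("Could not rewrite the process XML", e);
        }
    }

    public static void main(String[] args) {
        // Tiny stand-in for a real .rmp file, reduced to the relevant elements
        String xml = "<process><operator class=\"read_excel\" name=\"Read Excel\">"
                + "<parameter key=\"excel_file\" value=\"D:\\file.xls\"/>"
                + "<parameter key=\"sheet_number\" value=\"1\"/>"
                + "</operator></process>";
        System.out.println(setExcelFile(xml, "Read Excel", "D:\\other.xls"));
    }
}
```

For a real process you would read the .rmp file into a string, call the helper, and write the result to a new file before loading it. Matching on the operator name rather than on its position keeps the change local to the one operator you care about.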

The other way is to modify the operator after you load the process in your Java application. You can get a reference to your operator with Process#getOperator(String name) or Process#getAllOperators(). I guess it should be an instance of one of these classes:

com.rapidminer.operator.io.ExcelExampleSource
com.rapidminer.operator.nio.ExcelExampleSource

When you find the correct operator, you modify the path with Operator#setParameter(String key, String value).

This code works for me with RapidMiner 5.3: (the process is just a Read Excel operator and a Write CSV operator)

package sorapid;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.io.ExcelExampleSource;
import com.rapidminer.tools.XMLException;
import java.io.File;
import java.io.IOException;

public class SOrapid {

  public static void main(String[] args) {
    try {
      RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
      RapidMiner.init();

      // Backslashes in Windows paths must be escaped in Java string literals
      Process process = new Process(new File("c:\\Users\\Matlab\\.RapidMiner5\\repositories\\Local Repository\\processes\\test.rmp"));

      // Override the Excel file configured in the process, then run it
      Operator op = process.getOperator("Read Excel");
      op.setParameter(ExcelExampleSource.PARAMETER_EXCEL_FILE, "d:\\excel.xls");
      process.run();

    } catch (IOException | XMLException | OperatorException ex) {
      ex.printStackTrace();
    }
  }
}
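
The Excel path passed to setParameter does not have to be hard-coded; it could come from the command line of your application. Below is a minimal sketch using only the JDK. The class ExcelPathArg and the method resolveExcelFile are hypothetical names, and the RapidMiner call appears only in a comment because it needs the RapidMiner JARs on the classpath.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of RapidMiner.
public class ExcelPathArg {

    /**
     * Turns a user-supplied path into an absolute path string and rejects
     * anything that is not an existing regular file.
     */
    public static String resolveExcelFile(String rawPath) {
        Path excel = Paths.get(rawPath).toAbsolutePath().normalize();
        if (!Files.isRegularFile(excel)) {
            throw new IllegalArgumentException("Not an existing file: " + excel);
        }
        return excel.toString();
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("Usage: java ExcelPathArg <excel-file>");
            return;
        }
        String excelFile = resolveExcelFile(args[0]);
        // Assumption: the resolved path would then be handed to the operator, e.g.
        // op.setParameter(ExcelExampleSource.PARAMETER_EXCEL_FILE, excelFile);
        System.out.println("Excel file: " + excelFile);
    }
}
```

Checking the file before the process starts gives a clearer error than one raised deep inside the operator, and a path taken from the command line needs no backslash escaping.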

Answer by Toto

Works fine for me:

  • Download RapidMiner (and unzip the file)
  • From the "lib" directory, you need:
    1. rapidminer.jar
    2. launcher.jar
    3. All JARs in the "/lib/freehep" directory
  • Put libraries 1, 2 and 3 on the classpath of your Java project
  • Copy this code and run it:


    import com.rapidminer.Process;
    import com.rapidminer.RapidMiner;
    import com.rapidminer.operator.Operator;
    import com.rapidminer.operator.OperatorException;
    import com.rapidminer.operator.io.ExcelExampleSource;
    import com.rapidminer.tools.XMLException;
    import java.io.File;
    import java.io.IOException;
    import java.lang.Object;

    public class ReadRapidminerProcess {
      public static void main(String[] args) {
        try {
          RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
          RapidMiner.init();

          Process process = new Process(new File("/your_path/your_file.rmp"));
          process.run();

        } catch (IOException | XMLException | OperatorException ex) {
          ex.printStackTrace();
        }
      }
    }

I hope this helps; I searched a lot before finding the answer.

Answer by Maxim

Try this:

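// Imports assume the RapidMiner 5.3 package layout
import com.rapidminer.Process;
import com.rapidminer.example.set.SimpleExampleSet;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.IOObject;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.nio.file.SimpleFileObject;
import com.rapidminer.tools.XMLException;
import java.io.File;
import java.io.IOException;

// Runs the process with the Excel file handed in as a process input.
// Assumption: the process passes its first input to the Read Excel operator.
private SimpleExampleSet ReadExcel( File processXMLFile_, File excelFile_ ) throws IOException, XMLException, OperatorException
{
    Process     readExcel       = new Process( processXMLFile_ );
    IOObject    inObject        = new SimpleFileObject( excelFile_ );
    IOContainer inParameters    = new IOContainer( inObject );

    IOContainer outParameters   = readExcel.run( inParameters );

    // The first process output is expected to be the example set
    return (SimpleExampleSet) outParameters.getElementAt( 0 );
}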

Sorry, I cannot post an image of the RapidMiner process here; if you need it, I can send it by email.