How to find JAR: /home/hadoop/contrib/streaming/hadoop-streaming.jar

Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/32543734/

Date: 2020-11-02 20:21:35  Source: igfitidea

Tags: java, python, hadoop, amazon-web-services, emr

Asked by harshil bhatt

I'm following a Pluralsight video tutorial about Amazon EMR. I'm stuck and cannot proceed because I am getting this error:

Not a valid JAR: /home/hadoop/contrib/streaming/hadoop-streaming.jar

Please note that the tutorial is old and uses an older EMR version. I am using the latest version. Is that a problem?

These are the steps I took after entering the credentials in PuTTY:

1) Hadoop

2) mkdir streamingCode

3) wget -o ./streamingCode/wordSplitter.py s3://elasticmapreduce/samples/wordcount/wordSplitter.py

4) hadoop jar contrib/streaming/hadoop-streaming.jar -files streamingCode/wordSplitter.py -mapper wordSplitter.py input s3://elasticmapreduce/samples/wordcount/input -output streamingCode/wordCountOut -reducer aggregate

I cannot execute step 4 because I get the error below:

Not a valid JAR: /home/hadoop/contrib/streaming/hadoop-streaming.jar

Answered by ChristopherB

The Hadoop streaming jar is still available in the latest release of EMR Hadoop. Starting with EMR release 4.0.0 it can be found at /usr/lib/hadoop-mapreduce/hadoop-streaming.jar.
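
As a rough sketch of how a script could cope with both locations mentioned in this thread, the shell function below prints the first candidate path that exists. The function name pick_streaming_jar and the variable STREAMING_JAR are hypothetical names chosen for illustration; only the two paths come from the question and from this answer.

```shell
#!/bin/sh
# Hypothetical helper: print the first existing regular file (or symlink to one)
# among the paths given as arguments; return 1 if none of them exists.
pick_streaming_jar() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidates: the EMR 4.0.0+ path from this answer first,
# then the older path used in the question.
STREAMING_JAR=$(pick_streaming_jar \
    /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
    /home/hadoop/contrib/streaming/hadoop-streaming.jar) || STREAMING_JAR=""

if [ -n "$STREAMING_JAR" ]; then
    echo "Streaming jar found at: $STREAMING_JAR"
else
    echo "Streaming jar not found in the known locations" >&2
fi
```

If a path is found, it would take the place of contrib/streaming/hadoop-streaming.jar in step 4 of the question, for example: hadoop jar "$STREAMING_JAR" followed by the same streaming options.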

Another good resource for differences between versions can be found at http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-release-differences.html.

Answered by Nikhil B Agarwal

For the HADOOP_STREAMING variable, obtaining the path is a bit more complicated and depends on the HDP you are using.

Search for where it is located via command: find / -name 'hadoop-streaming*.jar'
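
A slightly more defensive variant of that search can be sketched as a small shell function. The function name find_streaming_jar and the default search root /usr/lib are assumptions made for illustration; the -name pattern is the one given above. Redirecting stderr hides the "Permission denied" messages that find typically prints when a non-root user scans system directories.

```shell
#!/bin/sh
# Hypothetical wrapper around the find command from the answer.
# $1: directory to search. The default /usr/lib is an assumption;
#     pass / to scan the whole filesystem as in the original command.
find_streaming_jar() {
    search_root=${1:-/usr/lib}
    # 2>/dev/null suppresses errors for directories that cannot be read.
    find "$search_root" -name 'hadoop-streaming*.jar' 2>/dev/null
}

# Example call; prints nothing if no matching jar exists under /usr/lib.
# "|| true" ignores find's non-zero exit status on unreadable directories.
find_streaming_jar /usr/lib || true
```

Each matching path is printed on its own line, so the result can be assigned to a variable such as HADOOP_STREAMING once the right jar has been identified.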

Src: http://thecoatlessprofessor.com/programming/installing-r-studio-server-on-hortonworks-virtual-box-image-and-rmr2-a-k-a-rhadoop-r-package/
