hadoop - Input path does not exist: file:/D:/pigsample_1749383998_1377684507424


I am facing a weird issue. I am running Pig 0.11 on a Windows 7 64-bit machine with the latest version of Cygwin.

I have a weblog that I want to order by username, so that all activities of the same user are together when I feed them to the next stage of processing.

I start from the command prompt -> cygwin.bat -> on the Cygwin console go to d:/ -> pig, and type the following script in the Grunt shell (local mode). (Note: I've set PIG_HOME and PIG_CLASSPATH correctly.)

Script:

useractivities = LOAD '/d:/path/of/logs/useractivities' USING org.apache.pig.piggybank.storage.CSVExcelStorage(',') AS (datetimeunprocessed:chararray, username:chararray, request:chararray);
useractivities_ordered = ORDER useractivities BY username;
STORE useractivities_ordered INTO '/d:/readyfornextinput/useractivities' USING org.apache.pig.piggybank.storage.CSVExcelStorage(',');

When I ILLUSTRATE useractivities_ordered, everything goes smoothly. When I STORE or DUMP it, I hit the weird issue.

It fails saying: java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/D:/pigsample_1749383998_1377684507424

When I searched for the pigsample_<number> file, I found it in: d:/tmp//mapred/local/localrunner
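The check described above (take the path named in the exception, then look for where a file of that name really lives) can be scripted. The following is only a minimal sketch in Python, not part of Pig or Hadoop; the helper names `missing_input_path` and `find_by_name` are hypothetical, and the search root passed in is whatever directory you suspect, such as the tmp directory mentioned above.

```python
import os
import re
from typing import List, Optional

# Matches the URI in a Hadoop "Input path does not exist: file:/..." message.
# "does " is optional so that slightly mangled copies of the message still match.
_MISSING_PATH = re.compile(r"input path (?:does )?not exist: (file:\S+)", re.IGNORECASE)


def missing_input_path(message: str) -> Optional[str]:
    """Return the file: URI named in the error message, or None if there is none."""
    match = _MISSING_PATH.search(message)
    return match.group(1) if match else None


def find_by_name(search_root: str, uri: str) -> List[str]:
    """Walk search_root and return every file or directory whose base name
    equals the last path component of the given URI."""
    wanted = uri.rstrip("/").rsplit("/", 1)[-1]
    hits = []
    for dirpath, dirnames, filenames in os.walk(search_root):
        for name in dirnames + filenames:
            if name == wanted:
                hits.append(os.path.join(dirpath, name))
    return hits


if __name__ == "__main__":
    # Hypothetical usage: the message is copied from the exception text,
    # and "D:/tmp" is an assumed search root.
    msg = "Input path does not exist: file:/D:/pigsample_1749383998_1377684507424"
    uri = missing_input_path(msg)
    if uri is not None:
        print("Expected at:", uri)
        print("Found at:", find_by_name("D:/tmp", uri))
```

If the second list is non-empty while the expected location is missing, the sample file was written somewhere other than where the later job looks for it, which matches the symptom described here.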

I am not sure how this is happening. I am also not sure whether this is a Windows/Cygwin-related issue or whether it has been seen on Linux as well.

For reference, you can find the stack trace here:

2013-08-28 15:38:28,863 [Thread-46] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0004
java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/D:/pigsample_1749383998_1377684507424
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:214)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/D:/pigsample_1288777582_1377684802262
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
    at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:190)
    at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:126)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:131)
    ... 6 more

Any help on this would be useful.

This looks reproducible in the Cygwin environment. I've documented the root cause and the solution here

