Retaining logs from a Hadoop job after it's executed
I'm wondering if there's an easy way to grab the job logs / task attempt logs of a particular run and persist them somewhere (HDFS, perhaps)?
I know the logs live on the local filesystem at /var/log/hadoop-0.20-mapreduce/userlogs for a particular job's task attempts, and I could write a script to ssh to each of the slave nodes and scoop them up. However, I'm trying to avoid that if it makes sense to - perhaps there's a built-in function of Hadoop that I'm not aware of?
I did find this link; it's old but contains some helpful information. It did not, however, include the answer I'm looking for.
mapreduce.job.userlog.retain.hours is set to 24 by default, so any job's logs are automatically purged after 1 day. Is there anything I can do, besides increasing the value of the retain.hours parameter, to get these to persist?
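For reference, a minimal sketch of what raising that retention window looks like in mapred-site.xml; the value below is just an example, and on older MRv1 releases the property may be named mapred.userlog.retain.hours instead:

```xml
<!-- Example only: keep task-attempt userlogs for 7 days instead of 1.
     Older MRv1 clusters may use mapred.userlog.retain.hours instead. -->
<property>
  <name>mapreduce.job.userlog.retain.hours</name>
  <value>168</value>
</property>
```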
I don't know of anything that exists out of the box, but I have done something similar manually.
We set up cron jobs that ran every 20 minutes looking for new logs from task attempts, then pumped them all into HDFS into a specific directory. We modified the file names so that the hostname they came from was appended. Then we had MapReduce jobs try to find issues, calculate stats like runtimes, etc. It was pretty neat. We did something similar with the NameNode logs, too.
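Here's a minimal sketch of that kind of cron-driven shipper, assuming the default userlog location from the question and a hypothetical HDFS target directory (/logs/userlogs); both are assumptions to adjust for your cluster:

```sh
#!/bin/sh
# Hypothetical sketch of the cron-based approach described above.
# Assumptions: userlogs live under the path from the question, and
# /logs/userlogs is the HDFS directory you want to collect into.

HOST=$(hostname)
SRC=/var/log/hadoop-0.20-mapreduce/userlogs
DEST=/logs/userlogs

# Make sure the target directory exists (ignore the error if it already does).
hadoop fs -mkdir "$DEST" 2>/dev/null

# Copy task-attempt logs touched in the last 20 minutes into HDFS,
# flattening the path and appending the local hostname so files from
# different slaves don't collide.
find "$SRC" -type f -mmin -20 | while read -r f; do
  name=$(echo "$f" | sed "s|^$SRC/||" | tr '/' '_')
  hadoop fs -put "$f" "$DEST/${name}_${HOST}" 2>/dev/null
done
```

A crontab entry such as `*/20 * * * * /usr/local/bin/ship_userlogs.sh` (script name hypothetical) would match the 20-minute cadence described above.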