Retaining logs from a Hadoop job after it's executed
I'm wondering if there's an easy way to grab the job logs / task attempt logs of a particular run and persist them somewhere (HDFS, perhaps)?
I know the logs live on the local filesystem at /var/log/hadoop-0.20-mapreduce/userlogs for a particular job's task attempts, and I could write a script to ssh to each of the slave nodes and scoop them up. However, I'm trying to avoid that if it makes sense to - perhaps there's a built-in function of Hadoop that I'm not aware of?
I did find this link; it's old but contains some helpful information. It did not, however, include the answer I'm looking for.
mapreduce.job.userlog.retain.hours is set to 24 by default, so any job's logs are automatically purged after 1 day. Is there anything I can do, besides increasing the value of the retain.hours parameter, to get these to persist?
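For reference, a minimal sketch of what raising that retention window looks like in mapred-site.xml; the value below is just an example, and on older MRv1 releases the property may be named mapred.userlog.retain.hours instead:

```xml
<!-- Example only: keep task-attempt userlogs for 7 days instead of 1.
     Older MRv1 clusters may use mapred.userlog.retain.hours instead. -->
<property>
  <name>mapreduce.job.userlog.retain.hours</name>
  <value>168</value>
</property>
```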
I don't know of anything that exists out of the box, but I have done something similar manually.
We set up cron jobs that ran every 20 minutes looking for new logs from task attempts, then pumped them all into HDFS into a specific directory. We modified the file names so that the hostname they came from was appended. Then we had MapReduce jobs try to find issues, calculate stats like runtimes, etc. It was pretty neat. We did something similar with the NameNode logs, too.
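Here's a minimal sketch of that kind of cron-driven shipper, assuming the default userlog location from the question and a hypothetical HDFS target directory (/logs/userlogs); both are assumptions to adjust for your cluster:

```sh
#!/bin/sh
# Hypothetical sketch of the cron-based approach described above.
# Assumptions: userlogs live under the path from the question, and
# /logs/userlogs is the HDFS directory you want to collect into.

HOST=$(hostname)
SRC=/var/log/hadoop-0.20-mapreduce/userlogs
DEST=/logs/userlogs

# Make sure the target directory exists (ignore the error if it already does).
hadoop fs -mkdir "$DEST" 2>/dev/null

# Copy task-attempt logs touched in the last 20 minutes into HDFS,
# flattening the path and appending the local hostname so files from
# different slaves don't collide.
find "$SRC" -type f -mmin -20 | while read -r f; do
  name=$(echo "$f" | sed "s|^$SRC/||" | tr '/' '_')
  hadoop fs -put "$f" "$DEST/${name}_${HOST}" 2>/dev/null
done
```

A crontab entry such as `*/20 * * * * /usr/local/bin/ship_userlogs.sh` (script name hypothetical) would match the 20-minute cadence described above.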