javascript - How to impersonate a JS-enabled browser?


I need to download a webpage using a script (PHP, Python, Bash), not a GUI browser. The problem is that the web page checks up front whether it is dealing with a JS-enabled browser. What I get by naively downloading the given URL is only the initial page (in this case I think it is the Coursera courses page: http://pastebin.com/4tjjrmtu).

How can I download the "real" content using a script? So far I can think of these solutions (some of them crazy):

  • figuring out what the JS on the startup page does, and mimicking it in the script when loading the page
  • scanning the network traffic with Wireshark to find a pattern, e.g. that a request for page abc1.html ends up fetching page abc1body.html
  • instead of using the native (for the given language) download feature, launching an external browser to download the page (exec firefox --dump http://foo.bar/x.html; I am making this up, I don't know whether any browser has such a scripting capability).
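The second idea in the list above can be sketched roughly as follows. This is only an illustration: it assumes the traffic capture really does reveal the abc1.html to abc1body.html pattern mentioned above, and the names `bodyUrlFor` and `downloadRealContent` are hypothetical. It also assumes a runtime with a global `fetch` (for example Node.js 18 or later).

```javascript
// Hypothetical pattern taken from the list above:
// a request for abc1.html ends up fetching abc1body.html.
function bodyUrlFor(pageUrl) {
  const url = new URL(pageUrl);
  // Insert "body" in front of the ".html" extension of the path.
  url.pathname = url.pathname.replace(/\.html$/, 'body.html');
  return url.toString();
}

// Download the content page directly instead of the JS bootstrap page.
// Assumes a global fetch is available (e.g. Node.js 18+).
async function downloadRealContent(pageUrl) {
  const response = await fetch(bodyUrlFor(pageUrl));
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  return response.text();
}
```

A real site will probably use a different pattern (an XHR endpoint, cookies, or a token set by the startup script), so the rewrite rule in `bodyUrlFor` has to be adapted to whatever the capture actually shows.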

Any other ideas? I would be grateful for tested ones.

Dropping the script and writing a browser plugin instead is one of the options, but since I have already put time into writing the scripts, it seems quicker to fix them than to write something from scratch.

Have a look at PhantomJS. It is a headless browser, mimicking all the functionality of a regular one.

Using node and the phantomjs module you can download the page and have full control over it, including complete access to its JavaScript.

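    var page = require('webpage').create();
    var url = 'http://www.phantomjs.org/';
    page.open(url, function (status) {
        // Called once the page has finished loading
        if (status === 'success') {
            // page.content holds the HTML of the loaded page
            console.log(page.content);
        }
        phantom.exit();
    });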
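Since the question asks for something that can be driven from a script, here is a minimal sketch of calling PhantomJS from Node.js as an external process. It is not part of the original answer. It assumes that a `phantomjs` binary is on the PATH and that a PhantomJS script (the hypothetical `render.js`) reads the URL from its command-line arguments and prints the rendered HTML to standard output. The names `buildCommand` and `renderWithPhantom` are made up for illustration.

```javascript
const { execFile } = require('child_process');

// Assumption: the PhantomJS executable is available under this name.
const PHANTOM_BINARY = 'phantomjs';

// Build the command line: phantomjs <script> <url>.
function buildCommand(scriptPath, url) {
  return { file: PHANTOM_BINARY, args: [scriptPath, url] };
}

// Run the PhantomJS script and pass whatever it printed to
// standard output (expected: the rendered HTML) to the callback.
function renderWithPhantom(scriptPath, url, callback) {
  const command = buildCommand(scriptPath, url);
  const options = { maxBuffer: 16 * 1024 * 1024 }; // allow large pages
  execFile(command.file, command.args, options, function (error, stdout) {
    if (error) {
      callback(error);
      return;
    }
    callback(null, stdout);
  });
}
```

Usage would look like `renderWithPhantom('render.js', 'http://foo.bar/x.html', function (err, html) { ... })`. The same approach works from PHP, Python or Bash by executing the `phantomjs` command and reading its output.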
