javascript - How to impersonate a JS-enabled browser?
I need to download a webpage using a script (PHP, Python, Bash), not a GUI browser. The problem is that the web page checks up front whether it is dealing with a JS-enabled browser. This is what I got by naively downloading the given URL: the initial page (in this case, I think, the Coursera courses page: http://pastebin.com/4tjjrmtu).
How can I download the "real" content using a script? So far I can think of these solutions (some of them crazy):
- figuring out what the JS on the startup page does, and mimicking that in the script when loading the page
- scanning the network traffic with Wireshark to find a pattern, e.g. that a request for page abc1.html ends with fetching pageabc1body.html
- instead of using the native (for the given language) download feature, launching an external browser to download the page (exec firefox --dump http://foo.bar/x.html -- I am making this up, I don't know if there is a browser with such scripting capability).
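The third idea can be sketched as follows. This is a minimal sketch, not a tested recipe for any particular site: it assumes a Chromium-based browser is installed, since those accept `--headless --dump-dom` and print the DOM after the page's scripts have run (the `firefox --dump` flag in the question is made up, as the asker says). The executable name `chromium` and the helper names `dumpDomArgs` and `fetchRendered` are assumptions for illustration.

```javascript
const { execFile } = require('child_process');

// Build the arguments for a headless Chromium/Chrome run that prints the
// rendered DOM to stdout. Kept separate so it can be checked without a browser.
function dumpDomArgs(url) {
  return ['--headless', '--dump-dom', url];
}

// Run the browser and resolve with the serialized DOM, i.e. the page after
// its scripts have run. `browserBin` is an assumption; adjust it per system
// (for example 'google-chrome' or 'chromium-browser').
function fetchRendered(url, browserBin = 'chromium') {
  return new Promise((resolve, reject) => {
    execFile(
      browserBin,
      dumpDomArgs(url),
      { maxBuffer: 64 * 1024 * 1024, timeout: 60000 },
      (err, stdout) => (err ? reject(err) : resolve(stdout))
    );
  });
}
```

Calling `fetchRendered('http://foo.bar/x.html').then(html => console.log(html))` would print the rendered HTML. The same command line can be invoked from PHP, Python, or Bash, so existing scripts only need to swap their download step.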
Any other ideas? I would be grateful for tested ones.
Dropping the script and writing a browser plugin instead is not one of the options: since I have already put time into writing the scripts, it seems quicker to fix them than to write them from scratch.
Have a look at PhantomJS. It is a headless browser, so it mimics exactly this functionality.
Using Node and the phantomjs module, you can download the page and have full control over it, including complete access to its JavaScript.
var page = require('webpage').create();
var url = 'http://www.phantomjs.org/';
page.open(url, function (status) {
    // Page loaded; page.content now holds the rendered HTML.
    console.log(page.content);
    phantom.exit();
});
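To use the result from an existing script, one option is to run PhantomJS as a child process and capture what it prints. The sketch below does this from Node under several assumptions: the `phantomjs` executable is on the PATH, a PhantomJS script is saved as `dump.js` (a hypothetical file name), and that script reads the URL from `require('system').args[1]` and prints `page.content`. The helper names `phantomArgs` and `renderWithPhantom` are made up for illustration.

```javascript
const { execFile } = require('child_process');

// Arguments for the phantomjs executable: the script path comes first, then
// the script's own arguments (the script sees the URL as system.args[1]).
function phantomArgs(scriptPath, url) {
  return [scriptPath, url];
}

// Run PhantomJS and resolve with whatever the script wrote to stdout.
// 'dump.js' is a hypothetical script name, not something PhantomJS ships.
function renderWithPhantom(url, scriptPath = 'dump.js') {
  return new Promise((resolve, reject) => {
    execFile(
      'phantomjs',
      phantomArgs(scriptPath, url),
      { maxBuffer: 64 * 1024 * 1024, timeout: 60000 },
      (err, stdout) => (err ? reject(err) : resolve(stdout))
    );
  });
}
```

The equivalent from Bash would be `phantomjs dump.js http://www.phantomjs.org/ > page.html`, which may be the smallest change to a script that currently downloads the page directly.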