Wednesday, June 1, 2016

Building a parser for the Stack Exchange review item screen

I've decided that it would be pretty neat to make YouTube videos of myself doing reviews on Stack Exchange. Since I'd be showing other people's work on screen, though, I need to comply with the attribution guidelines. That means I need some way of producing an easily browsable list of the posts I viewed and their authors, with links to the authors' user profiles.

So today I spent a bit of time throwing together an ASP.NET web site that runs on localhost and gets invoked by a Chrome extension for each review item. The site uses the HTML Agility Pack to scrape the page for the relevant names and links. At the end of the session, it spits out a new HTML page that satisfies the attribution requirements.
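To give a rough idea of the scrape-then-attribute step, here is a minimal sketch. It is not the .NET code from this project: it uses Python's standard-library HTML parser in place of the HTML Agility Pack, and the sample markup, the "/users/" link prefix, the site.example base URL, and the function names are all assumptions made up for illustration.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin


class UserLinkParser(HTMLParser):
    """Collects (display name, profile URL) pairs from user-profile links."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.users = []      # (name, absolute URL) pairs, in page order
        self._href = None    # href of the user link currently open, if any
        self._text = []      # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Assumption: profile links are relative and start with /users/.
            if href.startswith("/users/"):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            name = "".join(self._text).strip()
            entry = (name, urljoin(self.base_url, self._href))
            # Skip empty link text and authors already recorded.
            if name and entry not in self.users:
                self.users.append(entry)
            self._href = None


def extract_users(html, base_url):
    """Return the unique (name, profile URL) pairs found in html."""
    parser = UserLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.users


def render_attribution(users):
    """Build a small HTML list crediting each author with a profile link."""
    items = "\n".join(
        f'<li><a href="{escape(url)}">{escape(name)}</a></li>'
        for name, url in users
    )
    return f"<ul>\n{items}\n</ul>"


# Hypothetical markup standing in for one review item.
sample = """
<div class="post">
  <p>Some answer text.</p>
  <a href="/users/12345/example-user">Example User</a>
  <a href="/questions/1/some-question">Some question</a>
</div>
"""
users = extract_users(sample, "https://site.example")
print(render_attribution(users))
```

A parser like this only sees whatever markup is present in the HTML it is handed, so it returns an empty list for a page whose links are added later by script.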

I did run into a major snag. The review item UI is rendered dynamically with JavaScript, so the names and links I need aren't actually in the HTML that my program downloads. I'm now working on rigging up a horror that involves the WebBrowser control, which loads the page in an embedded browser so the scripts get a chance to run before I scrape it.
