Social Web Scraper

Problem #160

Tags: web-related strings


Thanks to my colleague Zhanna Khaymedinova for the idea of this exercise!

Nowadays there is a huge amount of information on the internet. It is no wonder that such information, even when it is meant for humans, is often collected and processed by robots.

In this task you are to write a small program that collects data from a social network. Start from here:

John Doe at Fake Social Network

Each page represents a person with their own name, date of birth and net worth. Each page also provides links to a few other people somehow related to that person, so that from John Doe you can navigate to Dan Wagner (via "Friends") and from there to Dave Johnson (via messages on the "Wall").

The Task

The goal is to sum up the net worth figures of all people with a specific last name (e.g. Johnson) who are reachable (via any number of links) from John Doe.

View the source of the page (by pressing Ctrl-U, or by using the Inspect element feature in Google Chrome or the Firebug plugin for Firefox) to see how the elements of the text can be distinguished (with regexps or some other method).
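As an illustration of the regexp route, here is a minimal Python sketch. The markup in SAMPLE is hypothetical (the tag and class names are assumptions, not taken from the real pages), so the patterns must be adjusted to whatever the page source actually shows. The function name parse_person is likewise made up for this example.

```python
import re

# Hypothetical markup: the real pages may use different tags and attributes,
# so inspect the page source and adapt the three patterns below.
SAMPLE = """
<h1 class="name">John Doe</h1>
<span class="worth">$50,000</span>
<a href="dan-wagner.html">Dan Wagner</a>
<a href="dave-johnson.html">Dave Johnson</a>
"""

def parse_person(html):
    """Return (full_name, net_worth, link_targets) for one person page."""
    name = re.search(r'<h1 class="name">([^<]+)</h1>', html).group(1).strip()
    worth_text = re.search(r'<span class="worth">([^<]+)</span>', html).group(1)
    # Drop the currency sign and thousands separators, keeping digits only.
    worth = int(re.sub(r"\D", "", worth_text))
    # Collect the target of every link on the page.
    links = re.findall(r'<a href="([^"]+)"', html)
    return name, worth, links

print(parse_person(SAMPLE))
```

An HTML parser would be more robust than regular expressions if the markup turns out to be irregular; the regexps are used here only because the statement suggests them.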

A typical approach is as follows:

There are also some things to note:

Input data will contain the last name (lowercased) that we are interested in.
Answer should contain the total net worth of all people with that last name who are reachable from the initial page.
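One way to compute such a total is a breadth-first walk over the pages with a set of already visited ones, so that links leading back to earlier people do not cause endless loops or double counting. The Python sketch below runs on a toy in-memory network with made-up names and figures (not the real test data). It assumes that each person has exactly one page and that the last name is the final word of the full name. The function total_net_worth and its get_person parameter are hypothetical names; in a real solution get_person would download and parse a page.

```python
from collections import deque

def total_net_worth(start, last_name, get_person):
    """Sum net worth over reachable people whose last name matches.

    get_person(page_id) must return (full_name, net_worth, linked_page_ids).
    last_name is expected in lowercase, as in the input data.
    """
    seen = {start}            # pages already queued, to survive cycles
    queue = deque([start])
    total = 0
    while queue:
        page = queue.popleft()
        name, worth, links = get_person(page)
        # Assumption: the last name is the final word of the full name.
        if name.split()[-1].lower() == last_name:
            total += worth
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return total

# Toy network: made-up people, keyed by an arbitrary page id.
PEOPLE = {
    "a": ("Ann Smith", 100, ["b", "c"]),
    "b": ("Bob Jones", 250, ["a", "c"]),   # links back to "a": a cycle
    "c": ("Cy Smith", 40, ["b"]),
    "d": ("Di Smith", 999, ["a"]),         # nobody links here: unreachable from "a"
}

print(total_net_worth("a", "smith", PEOPLE.__getitem__))  # 140
```

Starting from "a", only Ann Smith and Cy Smith are counted; Di Smith shares the last name but is not reachable, so her figure is excluded.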

Example:

input data:
doe

answer:
130000