Intro to HTQL with Python (2)

Following on from part 1, here is an example of using HTQL to pull data from a table on a web page. We’ll use the Wikipedia list of most expensive football transfers as our source page. You can check out the list here. Looking at the page and its HTML source, you’ll see that the first row of the table is a header row, and that the “player”, “from” and “to” columns contain quite a bit of extra HTML to provide a link to the player or team and a graphical link to their country. Our HTQL will need to cut through this markup to get just the data that we want. ...
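To make the problem concrete, here is a minimal sketch of what "cutting through" the cell markup means. Note that this is not HTQL: it uses only Python's standard-library `html.parser`, and the `TableText` class and the `SAMPLE` markup are made up for illustration (the real Wikipedia table is larger and messier). It keeps just the visible text of each cell and treats the first row as the header.

```python
from html.parser import HTMLParser


class TableText(HTMLParser):
    """Collect the visible text of each table cell, ignoring nested tags."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside links still arrives here; the tags themselves do not.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical sample markup, shaped like the table described above.
SAMPLE = """
<table>
<tr><th>Player</th><th>From</th><th>To</th></tr>
<tr>
  <td><a href="/flag"><img src="flag.png" alt=""></a> <a href="/p">Player A</a></td>
  <td><a href="/c1">Club X</a></td>
  <td><a href="/c2">Club Y</a></td>
</tr>
</table>
"""

parser = TableText()
parser.feed(SAMPLE)
header, *data = parser.rows  # the first row is the header row
print(header)
print(data)
```

An HTQL query expresses the same selection in a single line, which is the appeal of the language over hand-written parsing like this.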

September 20, 2014 · 3 min · Steve

Intro to HTQL with Python (1)

HTQL – Hyper-Text Query Language – is a language for querying and extracting content from HTML pages. If SQL is a language to get data from tables within a database, then HTQL is a language to get data from webpages on the internet. It is useful when you need to pull data from the web and there is no web service available to use. An example might be to pull population statistics from Wikipedia. ...

September 7, 2014 · 3 min · Steve