Monday, 16 December 2013

The “Ultimate Guide to Web Scraping” is Now Available

I wrote an article on web scraping last winter that has since been viewed almost 100,000 times. Clearly there are people who want to learn about this stuff, so I decided I’d write a book.

A few months later, I’m happy to announce: The Ultimate Guide to Web Scraping.

No prior knowledge of web scraping is necessary to follow along. The book is designed to walk you from beginner to expert, honing your skills and helping you become a master craftsman in the art of web scraping.

The book explains why web scraping is a valid way to harvest information, despite common complaints. It examines the various ways information is sent from a website to your computer, and how you can intercept and parse it. We’ll also look at common traps and anti-scraping tactics and how you might be able to thwart them.
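To give a taste of what "parsing" means in practice, here is a minimal sketch (not taken from the book) that pulls links out of an HTML document using only Python's standard library. The `LinkExtractor` class, the `extract_links` helper, and the sample markup are all made up for illustration; a real page would be fetched over HTTP first and would usually be messier.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag we are currently inside, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <ul id="listings">
    <li><a href="/biz/1">First Cafe</a></li>
    <li><a href="/biz/2">Second <b>Diner</b></a></li>
  </ul>
</body></html>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, text)
```

The standard-library parser keeps the example dependency-free. For real scraping work, a dedicated HTML parsing library is usually more forgiving of broken markup.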

There are code samples in both Ruby and Python. I had to learn Ruby just so I could write them! If anyone’s willing to translate the sample code into PHP or JavaScript, I’ll give you a free copy of the book. Get in touch.



Check out the table of contents:

    Introduction to Web Scraping

    Web Scraping as a Legitimate Data Collection Tool

    Understand Web Technologies: A Brief Introduction to HTTP and the DOM

    Finding The Data: Discovering Your “API”

    Extracting the Data: Finding Structure in an HTML Document

    Sample Code to Get You Started

    Avoiding Common Scraping Traps

    Being a Good Web Scraping Citizen

As a special deal for my blog subscribers, get 20% off with the code BLOGSUB. That coupon code is only good for a limited time, so order your copy today!

Source: http://blog.hartleybrody.com/web-scraping-guide/
