Josh Software

scraping

josh ruby

Raspar – Build a html parser in 5 minutes – Josh Software

jiren, 2014-10-02 14:05:59. Raspar is an HTML parsing library that parses HTML pages and converts them into Ruby objects, driven by a map of CSS or XPath selectors that you define. The gem can also manage parsers for multiple websites. The sample output looks something like this: { product: [ <Raspar::Result:0x007ffc91e4d640 @attrs …
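The idea of turning HTML into Ruby objects through a selector map can be sketched roughly as follows. This is not Raspar's API: it is a minimal illustration using REXML (shipped with standard Ruby installs) on well-formed markup, and the names `SELECTORS`, `Product`, `parse_products`, and the sample HTML are all hypothetical.

```ruby
require 'rexml/document'

# Hypothetical selector map: attribute name => XPath relative to an item node.
SELECTORS = {
  name:  ".//span[@class='name']",
  price: ".//span[@class='price']"
}.freeze

# Hypothetical result type holding the extracted attributes.
Product = Struct.new(:name, :price)

# Find every item node, apply each selector to it, and build one Product per node.
def parse_products(html, item_xpath: "//div[@class='item']", selectors: SELECTORS)
  doc = REXML::Document.new(html)
  REXML::XPath.match(doc, item_xpath).map do |node|
    values = selectors.transform_values do |xpath|
      found = REXML::XPath.first(node, xpath)
      found && found.text.to_s.strip
    end
    Product.new(values[:name], values[:price])
  end
end

# Made-up sample page, for illustration only.
SAMPLE_HTML = <<~HTML
  <html><body>
    <div class="item"><span class="name">Test1</span><span class="price">10</span></div>
    <div class="item"><span class="name">Test2</span><span class="price">20</span></div>
  </body></html>
HTML

products = parse_products(SAMPLE_HTML)
products.each { |p| puts "#{p.name}: #{p.price}" }
```

Keeping the selectors in a plain hash means a second website could, in principle, be supported by swapping in a different map while the extraction loop stays the same. REXML only accepts well-formed XML, so real-world pages would need a tolerant HTML parser instead.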


josh ruby

Hpricot scraping in ruby – Josh Software

Include the required gems/libraries before getting started:

    require 'hpricot'
    require 'net/http'
    require 'rio'

    # URL of the website to be scraped (rio needs the scheme to treat it as a URL)
    url = "http://www.funonrails.com"
    # Local filename to store the page in
    file = "temp.html"
    # Save the page locally: rio copies in the direction the operator points
    rio(url) > rio(file)
    # Open the saved page through Hpricot
    doc = Hpricot(open(file))
    …
