PHP Web Grabber

Advanced PHP tag-based web extraction engine

Changelog

Version 1.1.1, released on 2011-04-27

Initial release

Version 1.2.2, released on 2011-05-06

  • New Feature: added tag stripping for grabbed content: you can now choose which tags are stripped from the extracted content;
  • added new class methods for more flexible usage;
  • small bug fixes;
  • documentation update.
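
The tag stripping idea can be illustrated with plain PHP. This is only a minimal sketch: stripChosenTags() is a hypothetical helper, not a PHP Web Grabber method, and it assumes that "stripping" means removing the tag markup while keeping the text inside it. The regular expression is deliberately simple and would not cope with attribute values that contain ">".

```php
<?php
// Hypothetical helper (not part of PHP Web Grabber): removes the markup of
// the listed tags from $html but keeps whatever text they enclose.
function stripChosenTags(string $html, array $tags): string
{
    foreach ($tags as $tag) {
        $name = preg_quote($tag, '~');
        // Matches <tag>, <tag attr="...">, </tag> and <tag/>, case-insensitively.
        $html = preg_replace('~</?' . $name . '(?:\s[^>]*)?/?>~i', '', $html);
    }
    return $html;
}

$html = '<p>Read <a href="/more">more</a> <b>now</b></p>';
echo stripChosenTags($html, ['a', 'b']), "\n"; // <p>Read more now</p>
```

PHP's built-in strip_tags() works the other way round (you list the tags to keep), which is why the sketch uses a regular expression instead.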

Version 1.3.1, released on 2011-05-16

  • smarter cURL: URLs are now opened and parsed using a real web browser signature;
  • documentation update.

Version 2.3.3, released on 2012-01-07

  • grabbing engine updated: added cookies support for cURL class;
  • grabbing engine updated: improved HTML DOM parser;
  • improved cache processing;
  • Important: Starting with version 2.3.3, the grabbing engine can grab all matching tags found (not only the first one), so the return format of the wlWgProcessor::get() method has changed accordingly.
  • Important: Starting with version 2.3.3, the wlWgParam constructor accepts only the $tagSlice parameter, because the wlWgParam class has been extended with new methods and parameters (see below).
    From now on, parameters must be set through the corresponding set method.
  • New Feature: added setTagInstanceIndex method for the wlWgParam class.
    Sets the tag instance occurrence index to be grabbed.
    Use this if you want to grab the contents of a tag that cannot be uniquely identified but whose instance occurrence index you know.
    Suppose the target page contains several <p> tags with no distinguishing marks (such as ID attributes or inline CSS styles); you can set this value to 2 to grab the contents of the 3rd paragraph (the instance occurrence index is zero-based, so zero represents the first occurrence).
  • New Feature: added setTagContains method for the wlWgParam class.
    Sets the string used to filter the tags that will be extracted; only tags containing this string are grabbed.
    Use this if you want to grab the contents of a tag that cannot be uniquely identified but that you know contains this value.
    Suppose the target page contains several <p> tags with no distinguishing marks (such as ID attributes or inline CSS styles); you can set this value to '<img ' to grab only the paragraphs containing the '<img ' string, that is, images in this particular case.
  • documentation update.
    Note:
    If this update causes some inconvenience (such as small changes required in your existing code), we apologize; version 2.3.3 is a major improvement because of the new features it includes, and there was no other way to add them while keeping the programming model of this PHP script easy to use and understand.
    Thank you.
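
The two selection features introduced in 2.3.3 (instance index and "contains" filter) can be sketched with PHP's standard DOM extension. The grabTags() function below is hypothetical and is not the wlWgParam or wlWgProcessor API; it only illustrates the zero-based index and the substring filter described above. Applying the index after the filter is an assumption of this sketch, not something the changelog specifies.

```php
<?php
// Hypothetical helper (not the library's API): returns the inner HTML of
// every <$tagName> element, optionally narrowed by a substring filter and
// by a zero-based instance index.
function grabTags(string $html, string $tagName, ?int $instanceIndex = null, ?string $contains = null): array
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on real-world HTML
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $matches = [];
    foreach ($dom->getElementsByTagName($tagName) as $node) {
        // Rebuild the inner HTML from the element's child nodes.
        $inner = '';
        foreach ($node->childNodes as $child) {
            $inner .= $dom->saveHTML($child);
        }
        if ($contains !== null && strpos($inner, $contains) === false) {
            continue; // the tag does not contain the filter string
        }
        $matches[] = $inner;
    }

    if ($instanceIndex !== null) {
        // Zero-based: index 2 selects the third remaining match.
        return isset($matches[$instanceIndex]) ? [$matches[$instanceIndex]] : [];
    }
    return $matches;
}

$html = '<div><p>First</p><p>Second <img src="a.png"></p><p>Third</p></div>';
print_r(grabTags($html, 'p', 2));            // the third paragraph: "Third"
print_r(grabTags($html, 'p', null, '<img ')); // only the paragraph holding an image
```

Returning an array in every case mirrors the changelog's remark that all matching tags are now returned, not only the first one.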

Version 3.1.1, released on 2013-03-31

  • New Feature: added proxy support: the grabber engine cycles through a given proxy list, so that for the same request the next available proxy is used;
  • documentation update.
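
Cycling through a proxy list can be pictured as a simple round-robin. The ProxyRotator class below is a hypothetical sketch, not the class shipped with PHP Web Grabber, and the proxy addresses are placeholders from the documentation-only 192.0.2.0/24 range.

```php
<?php
// Hypothetical round-robin rotator (not part of PHP Web Grabber).
final class ProxyRotator
{
    /** @var string[] */
    private array $proxies;
    private int $position = 0;

    public function __construct(array $proxies)
    {
        if ($proxies === []) {
            throw new InvalidArgumentException('The proxy list must not be empty.');
        }
        $this->proxies = array_values($proxies);
    }

    // Returns the current proxy and advances, wrapping around at the end.
    public function next(): string
    {
        $proxy = $this->proxies[$this->position];
        $this->position = ($this->position + 1) % count($this->proxies);
        return $proxy;
    }
}

$rotator = new ProxyRotator(['192.0.2.10:8080', '192.0.2.11:8080']);
echo $rotator->next(), "\n"; // 192.0.2.10:8080
echo $rotator->next(), "\n"; // 192.0.2.11:8080
echo $rotator->next(), "\n"; // 192.0.2.10:8080 (wrapped around)
```

With the cURL extension, the selected proxy would typically be applied with curl_setopt($ch, CURLOPT_PROXY, $rotator->next()) before the request is executed.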

Version 4.1.1, released on 2013-04-08

  • New Feature: added charset conversion support: the grabber engine can convert the grabbed content to the charset used on the local host, so that special characters such as diacritics are displayed correctly;
  • documentation update.
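
As a rough illustration of charset conversion, the sketch below uses PHP's iconv() function. The convertCharset() wrapper is hypothetical and the ISO-8859-2 source charset is only an example; how PHP Web Grabber performs the conversion internally is not stated here.

```php
<?php
// Hypothetical wrapper (not the library's API): converts $content from one
// charset to another and fails loudly instead of returning false.
function convertCharset(string $content, string $from, string $to = 'UTF-8'): string
{
    $converted = iconv($from, $to, $content);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $from to $to.");
    }
    return $converted;
}

// "ap\xE3" is the word "apă" encoded in ISO-8859-2 (0xE3 is "ă").
$utf8 = convertCharset("ap\xE3", 'ISO-8859-2');
echo bin2hex($utf8), "\n"; // 6170c483 ("ă" becomes the two bytes C4 83 in UTF-8)
```

Without such a conversion, a page served in ISO-8859-2 and embedded in a UTF-8 page would show broken characters in place of the diacritics.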

Version 4.2.1, released on 2013-10-15

  • New Feature: added clear cache operation; use wlWgUtils::clearCache() to empty the cache directory;
  • minor bug fixes;
  • documentation update.
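
Emptying a cache directory, as wlWgUtils::clearCache() is described as doing, might look roughly like the sketch below. The emptyDirectory() function is a hypothetical stand-in written with standard SPL iterators; it is not the library's implementation, and the temporary directory name is made up for the demonstration.

```php
<?php
// Hypothetical stand-in (not wlWgUtils::clearCache()): deletes everything
// inside $dir but keeps $dir itself. Returns the number of files removed.
function emptyDirectory(string $dir): int
{
    $removed = 0;
    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::CHILD_FIRST // visit children before their parent directory
    );
    foreach ($items as $item) {
        if ($item->isDir() && !$item->isLink()) {
            rmdir($item->getPathname()); // already emptied, thanks to CHILD_FIRST
        } else {
            unlink($item->getPathname());
            $removed++;
        }
    }
    return $removed;
}

// Demonstration on a throwaway directory.
$cacheDir = sys_get_temp_dir() . '/wg_cache_demo_' . uniqid();
mkdir($cacheDir);
file_put_contents($cacheDir . '/page.html', '<p>cached</p>');
echo emptyDirectory($cacheDir), "\n"; // 1
rmdir($cacheDir);
```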

Version 5.1.1, released on 2015-04-09

  • added wlWgContentProcessor class that allows grabbing and parsing custom content;
  • documentation update.

Version 5.1.2, released on 2015-05-25

  • small bug fixes;
  • documentation update.

Regular License $10.00
Use by you or one client, in a single end product which end users are not charged for.

Extended License $50.00
Use by you or one client, in a single end product which end users can be charged for.

Short Information

If you want to retrieve content from a public website and display it on your own site, PHP Web Grabber is a tool built for exactly that.
No frames or iframes involved! Real HTML content, grabbed from the web and displayed within your web page just as if it had been generated by your site.
