Goutte, a simple PHP Web Scraper
================================

Goutte is a screen scraping and web crawling library for PHP.

Goutte provides a nice API to crawl websites and extract data from the HTML/XML
responses.

Requirements
------------

Goutte depends on PHP 5.5+ and Guzzle 6+.

.. tip::

    If you need support for PHP 5.4 or Guzzle 4-5, use Goutte 2.x.
    If you need support for PHP 5.3 or Guzzle 3, use Goutte 1.x.

Installation
------------

Add ``fabpot/goutte`` as a required dependency in your ``composer.json`` file with `Composer`_:

.. code-block:: bash

    composer require fabpot/goutte

.. tip::

    You can also download the `Goutte.phar`_ file:

    .. code-block:: php

        require_once '/path/to/goutte.phar';

    The phars for Goutte 1.x are also available for `download
    <http://get.sensiolabs.org/goutte-v1.0.7.phar>`_.

Usage
-----

Create a Goutte Client instance (which extends
``Symfony\Component\BrowserKit\Client``):

.. code-block:: php

    use Goutte\Client;

    $client = new Client();

Make requests with the ``request()`` method:

.. code-block:: php

    // Go to the symfony.com website
    $crawler = $client->request('GET', 'http://www.symfony.com/blog/');

The method returns a ``Crawler`` object
(``Symfony\Component\DomCrawler\Crawler``).

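Fine-tune cURL options by passing a preconfigured Guzzle client (Guzzle 6
clients are immutable, so default options are set at construction time):

.. code-block:: php

    // Replace the underlying Guzzle client with one that has a 60 second cURL timeout
    $client->setClient(new \GuzzleHttp\Client(array('curl' => array(CURLOPT_TIMEOUT => 60))));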

Click on links:

.. code-block:: php

    // Click on the "Security Advisories" link
    $link = $crawler->selectLink('Security Advisories')->link();
    $crawler = $client->click($link);
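
To see which URL ``click()`` would request, the ``Link`` object can be
inspected first. The following is a minimal sketch that runs without network
access by building a ``Crawler`` from an inline HTML string. The markup, the
``http://example.com`` base URI and the autoloader path are assumptions made
for illustration:

.. code-block:: php

    require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

    use Symfony\Component\DomCrawler\Crawler;

    // Hypothetical markup standing in for a fetched page
    $html = '<html><body>'
        .'<a href="/blog/category/security-advisories">Security Advisories</a>'
        .'</body></html>';

    // The second argument is the URI the document is assumed to come from
    $crawler = new Crawler($html, 'http://example.com/blog/');

    $link = $crawler->selectLink('Security Advisories')->link();

    // The relative href is resolved against the URI given above
    print $link->getUri()."\n";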

Extract data:

.. code-block:: php

    // Get the latest posts in this category and display their titles
    $crawler->filter('h2 > a')->each(function ($node) {
        print $node->text()."\n";
    });
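
Besides ``text()``, each node exposes ``attr()``, and ``each()`` returns an
array of the closure's return values, so data can be collected rather than
printed. A minimal sketch on inline markup follows; the HTML and the
autoloader path are hypothetical, and ``filter()`` needs the CssSelector
component to be installed:

.. code-block:: php

    require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

    use Symfony\Component\DomCrawler\Crawler;

    // Hypothetical markup standing in for a blog listing
    $html = '<html><body>'
        .'<h2><a href="/blog/post-1">First post</a></h2>'
        .'<h2><a href="/blog/post-2">Second post</a></h2>'
        .'</body></html>';

    $crawler = new Crawler($html, 'http://example.com/blog/');

    // Collect the title and href of every matching link
    $posts = $crawler->filter('h2 > a')->each(function ($node) {
        return array('title' => $node->text(), 'href' => $node->attr('href'));
    });

    foreach ($posts as $post) {
        print $post['title'].' -> '.$post['href']."\n";
    }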

Submit forms:

.. code-block:: php

    $crawler = $client->request('GET', 'http://github.com/');
    $crawler = $client->click($crawler->selectLink('Sign in')->link());
    $form = $crawler->selectButton('Sign in')->form();
    $crawler = $client->submit($form, array('login' => 'fabpot', 'password' => 'xxxxxx'));
    $crawler->filter('.flash-error')->each(function ($node) {
        print $node->text()."\n";
    });
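
Before submitting, it can help to check what a ``Form`` object will send.
The sketch below builds a form from inline markup, sets field values through
array access (an alternative to passing them to ``submit()``) and prints the
resulting request data. The markup, field names, URIs and autoloader path are
made up for illustration:

.. code-block:: php

    require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

    use Symfony\Component\DomCrawler\Crawler;

    // Hypothetical login form
    $html = '<html><body>'
        .'<form action="/session" method="post">'
        .'<input type="text" name="login" />'
        .'<input type="password" name="password" />'
        .'<input type="submit" value="Sign in" />'
        .'</form></body></html>';

    $crawler = new Crawler($html, 'http://example.com/login');
    $form = $crawler->selectButton('Sign in')->form();

    // Fill in the fields by name
    $form['login'] = 'fabpot';
    $form['password'] = 'xxxxxx';

    // Show the HTTP method, the resolved target URI and the field values
    print $form->getMethod().' '.$form->getUri()."\n";
    print_r($form->getValues());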

More Information
----------------

Read the documentation of the BrowserKit and `DomCrawler
<http://symfony.com/doc/any/components/dom_crawler.html>`_ Symfony Components
for more information about what you can do with Goutte.

Pronunciation
-------------

Goutte is pronounced ``goot``, i.e. it rhymes with ``boot`` and not ``out``.

Technical Information
---------------------

Goutte is a thin wrapper around the following fine PHP libraries:

* Symfony Components: BrowserKit, CssSelector and DomCrawler;
* `Guzzle`_ HTTP Component.

License
-------

Goutte is licensed under the MIT license.

.. _`Composer`:    http://getcomposer.org
.. _`Goutte.phar`: http://get.sensiolabs.org/goutte.phar
.. _`Guzzle`:      http://docs.guzzlephp.org