Using Curl 101

To use Curl in PHP, you must have the Curl extension compiled in (--with-curl), and you’ll want --with-openssl as well if you need to be able to hit https pages.

I could explain how to get PHP working with Curl, but in this article I am focusing on how to use it once it is installed.
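
If you are not sure whether your PHP build has what it needs, a quick check along these lines can tell you. This is only a sketch: curl_requirements() is a made-up helper name, and it assumes nothing beyond the standard extension_loaded() and curl_version() functions.

```php
<?php
// Sketch: report whether this PHP build can run the code in this article.
// curl_requirements() is a hypothetical helper name, not part of PHP.
function curl_requirements()
{
	$problems = array();
	if (!extension_loaded('curl')) {
		$problems[] = 'curl extension is not loaded';
		return $problems;
	}
	$info = curl_version();
	// 'protocols' lists the URL schemes this libcurl build supports
	if (!in_array('https', $info['protocols'])) {
		$problems[] = 'libcurl was built without https support';
	}
	return $problems;
}

$problems = curl_requirements();
print count($problems) ? implode("\n", $problems) . "\n" : "Curl looks usable\n";
```

An empty list means both the extension and https support are present.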

The code below is for PHP5, but I’m sure you could modify it to work with PHP4; you would just have to change the syntax a bit.

Here is a good Curl class that I wrote. I call it class.Curl.php. Don’t worry about the details of this file; just create it, then look at how easy it is to use (example follows):

class.Curl.php

<?php
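	// false turns off SSL certificate and host name checks (see the constructor);
	// that is convenient for scraping experiments but unsafe for sensitive data
	define('VERIFYHOST', false);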
	define('MAXREDIRS', 10);
	define('USERAGENT', "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; DKCRL 1.0)");

class Curl
{
	public $url;
	public $response_code;
	public $response_header;
	public $response_headers;
	public $response_body;
	public $cookieJar;
	private $response;
	private $ch;
	static $stderr = null;

	public function __construct($cookieJar=false)
	{
		$this->cookieJar = $cookieJar ? $cookieJar : tempnam("/tmp", "cookieJar");
		$this->ch = curl_init();
		curl_setopt($this->ch, CURLOPT_SSL_VERIFYPEER, VERIFYHOST);
		curl_setopt($this->ch, CURLOPT_SSL_VERIFYHOST, VERIFYHOST);
		curl_setopt ($this->ch, CURLOPT_USERAGENT, USERAGENT);
		curl_setopt ($this->ch, CURLOPT_COOKIEJAR, $this->cookieJar);
		curl_setopt ($this->ch, CURLOPT_COOKIEFILE, $this->cookieJar);
		curl_setopt ($this->ch, CURLOPT_CRLF, true);
		curl_setopt ($this->ch, CURLOPT_HEADER, true);
		curl_setopt ($this->ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
		curl_setopt ($this->ch, CURLOPT_ENCODING, "gzip"); // "" means all supported
		curl_setopt($this->ch, CURLOPT_FOLLOWLOCATION, true);
		curl_setopt($this->ch, CURLOPT_MAXREDIRS, MAXREDIRS);
		curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, true);
		curl_setopt($this->ch, CURLOPT_BINARYTRANSFER, true);
		curl_setopt($this->ch, CURLOPT_CONNECTTIMEOUT, 300);
		curl_setopt($this->ch, CURLINFO_HEADER_OUT, true);
		

		curl_setopt($this->ch, CURLOPT_HTTPHEADER,
								array(
								"Accept: */*",
								"Accept-Language: en-US"
								));
	}

	public function __destruct()
	{
		// curl_close($this->ch);
	}

	public function nextpage($url, $method='GET', $data=false, $referer=false, $extraPost=false)
	{
		$this->url = $url;
		if ( $referer )
			curl_setopt ($this->ch, CURLOPT_REFERER, $referer);
		if (strtoupper($method)=='POST')
		{
			curl_setopt($this->ch, CURLOPT_POST, 1);
			$postdata = array();
			// $data defaults to false, so only loop when an array was passed
			if (is_array($data))
			{
				foreach ($data as $key=>$val)
					$postdata[] = urlencode($key)."=".urlencode($val);
			}
			curl_setopt($this->ch, CURLOPT_POSTFIELDS, implode("&", $postdata).($extraPost ? (count($postdata)>0 ? '&' : '').$extraPost : ""));
			if ( $extraPost && count($postdata)==0)
			curl_setopt($this->ch, CURLOPT_HTTPHEADER,
								array(
								"Accept: */*",
								"Accept-Language: en-US",
								"Content-Type: application/json; charset=utf-8"
								));			
		}
		else
			curl_setopt($this->ch, CURLOPT_HTTPGET, true);

		curl_setopt($this->ch, CURLOPT_URL, $url);

		$this->response = curl_exec($this->ch);
		$this->parse_response();

		$this->url = $this->getUrl();


		return $this->response_body;
	}

	private function parse_response()
	{
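		// Split response into header and body sections
		// (preg_split() replaces split(), which was removed in PHP 7)
		list($this->response_header, $this->response_body) = array_pad(preg_split("/\r?\n\r?\n/", (string) $this->response, 2), 2, '');
		$response_header_lines = preg_split("/\r?\n/", $this->response_header);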

		// First line of headers is the HTTP response code
		$http_response_line = array_shift($response_header_lines);
		if(preg_match('@^HTTP/[0-9]\.[0-9] ([0-9]{3})@',$http_response_line, $matches))
		{
			$this->response_code = $matches[1];
		}

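		// put the rest of the headers in an array
		$this->response_headers = array();
		$header = null;
		foreach($response_header_lines as $header_line)
		{
			if ( preg_match("/^\w/", $header_line) )
				list($header,$value) = array_pad(explode(': ', $header_line, 2), 2, '');
			else
				$value = $header_line; // continuation of the previous header
			if ( $header === null )
				continue;
			// repeated headers are joined with newlines
			if ( isset($this->response_headers[$header]) )
				$this->response_headers[$header] .= "\n" . $value;
			else
				$this->response_headers[$header] = $value;
		}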
	}

	public function getUrl()
	{
		return curl_getinfo ( $this->ch, CURLINFO_EFFECTIVE_URL );
	}
}
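
One thing to be aware of: with both CURLOPT_HEADER and CURLOPT_FOLLOWLOCATION turned on, Curl returns the headers of every response in a redirect chain, one block after another, before the final body. The sketch below shows one way to peel those off. It is a standalone illustration and not part of class.Curl.php; split_final_response() is a hypothetical helper, and it uses a heuristic (a "body" that starts with an HTTP status line is really the next response).

```php
<?php
// Hypothetical helper, not part of class.Curl.php: strip the header blocks
// of intermediate responses (such as redirects) from a raw response.
function split_final_response($raw)
{
	do {
		$parts = preg_split('/\r?\n\r?\n/', $raw, 2);
		$header = $parts[0];
		$body = isset($parts[1]) ? $parts[1] : '';
		// Heuristic: if the "body" starts with a status line, it is really the next response
		$more = (bool) preg_match('@^HTTP/[0-9.]+ [0-9]{3}@', $body);
		$raw = $body;
	} while ($more);
	return array($header, $body);
}

// Made-up sample: a redirect followed by the final page
$raw = "HTTP/1.1 301 Moved Permanently\r\nLocation: /new\r\n\r\n"
     . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hi</html>";
list($header, $body) = split_final_response($raw);
print "$header\n---\n$body\n";
```

When you have the Curl handle available, curl_getinfo() with CURLINFO_HEADER_SIZE gives the total length of all received headers and avoids the heuristic.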

OK, so now you have class.Curl.php. Here is how to use it; let’s do a USPS Track and Confirm:

<?php
  require_once("class.Curl.php");

  $url = "http://www.usps.gov/";
  $aCurl = new Curl();

  $homePage = $aCurl->nextpage($url);

  print "HOMEPAGE: $homePage\n";

  $url = "http://trkcnfrm1.smi.usps.com/PTSInternetWeb/InterLabelInquiry.do";
  $data = array();
  $data["CAMEFROM"] = "OK";
  $data["strOrigTrackNum"] = "9101150134711177503513";
  $data["Go to Track & Confirm"] = "Go";
  $resultPage = $aCurl->nextpage($url, 'POST', $data);

  print "Result Page: $resultPage\n";

?>
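
To see what actually goes over the wire for a POST, here is a small standalone sketch of the encoding step that nextpage() performs on $data, next to PHP’s built-in http_build_query(). The variable names $pairs, $manual and $builtin are only for illustration.

```php
<?php
// Same form fields as the Track and Confirm example
$data = array();
$data["CAMEFROM"] = "OK";
$data["strOrigTrackNum"] = "9101150134711177503513";
$data["Go to Track & Confirm"] = "Go";

// What nextpage() does by hand: urlencode each key and value, join with "&"
$pairs = array();
foreach ($data as $key => $val)
	$pairs[] = urlencode($key) . "=" . urlencode($val);
$manual = implode("&", $pairs);

// PHP's built-in equivalent for a flat array of strings
$builtin = http_build_query($data, '', '&');

print "$manual\n";
// prints CAMEFROM=OK&strOrigTrackNum=9101150134711177503513&Go+to+Track+%26+Confirm=Go
```

For a flat array like this one the two strings are identical, so either approach could be used to build the POST body.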

Pretty easy, eh? The hard part is knowing what data to put into the $data array. I suggest getting the Firefox HTTP LiveHeaders extension from Mozilla: you can manually perform the steps you want to automate and log all the GET/POST data you sent along the way. The POST data is in URL-encoded format, and you need to convert it to PHP array format. For that, I suggest this little utility I wrote:

I call this “urldecode.php”:

#!/usr/local/bin/php -q
<?php

  $query = $argv[1];

  $parts = explode("&", $query);
  foreach ($parts as $part)
  {
    list($key, $val) = explode("=", $part);
    print "\$data[\"".urldecode($key)."\"] = \"".urldecode($val)."\";\n";
  }

?>

I use it like this, passing the string I got from HTTP LiveHeaders after performing the search manually with Firefox:

./urldecode.php 'CAMEFROM=OK&strOrigTrackNum=555555555555&Go+to+Track+%26+Confirm.x=21&Go+to+Track+%26+Confirm.y=9&Go+to+Track+%26+Confirm=Go'

which results in:

$data["CAMEFROM"] = "OK";
$data["strOrigTrackNum"] = "555555555555";
$data["Go to Track & Confirm.x"] = "21";
$data["Go to Track & Confirm.y"] = "9";
$data["Go to Track & Confirm"] = "Go";

I then copy and paste this result into my PHP code.
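
If you would rather skip the copy and paste step, the same decoding can be done at run time. The following is a sketch only; query_to_array() is a hypothetical helper and not part of the class above. It splits the string by hand because PHP’s parse_str() rewrites spaces and dots in key names to underscores, which would change field names such as "Go to Track & Confirm.x".

```php
<?php
// Hypothetical helper: turn a captured POST string into an array at run time.
// parse_str() is avoided because it rewrites spaces and dots in key names.
function query_to_array($query)
{
	$data = array();
	foreach (explode("&", $query) as $part) {
		if ($part === '')
			continue;
		// limit of 2 keeps any "=" inside the value intact
		$pair = explode("=", $part, 2);
		$data[urldecode($pair[0])] = isset($pair[1]) ? urldecode($pair[1]) : '';
	}
	return $data;
}

$data = query_to_array('CAMEFROM=OK&strOrigTrackNum=555555555555&Go+to+Track+%26+Confirm.x=21&Go+to+Track+%26+Confirm.y=9&Go+to+Track+%26+Confirm=Go');
var_export($data);
```

The resulting array has the same keys and values as the generated $data lines, and can be passed to nextpage() as the POST data.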
