If you need to validate a URI, not just checking that it’s well-formed with a regular expression like this:
/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i
…but actually verifying that the URI points to a functioning web page, the following code will do the trick. It prepends a scheme if one is missing, feeds the resulting URL to PHP’s parse_url(), then uses cURL to follow any redirects (10 maximum) and checks whether the final response has a 200 status. Note that only the host portion is requested, so the check confirms that the site responds rather than that a specific page exists. I’ve tried a number of different methods, and this one has worked best for me. I ran into a problem where an OpenDNS wildcard caused all my bad URIs to return a 200 status, so the code also looks for the term “opendns” in the response and returns false if it finds it. Here’s the code:
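function validate_url($url) {
    if (empty($url)) { return false; }
    // Prepend a scheme only when none is present; the pattern is anchored and
    // accepts https so that "https://..." input is not mangled.
    $url = preg_match('/^https?:\/\//i', $url) ? $url : 'http://' . $url;
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) { return false; }
    // Only the host is requested, so this checks the site, not a specific page.
    $url = $parts['host'];
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, true);          // include response headers in the output
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request: skip the body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the output instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    $data = curl_exec($ch);
    curl_close($ch);
    // Bail out before parsing if the request failed or returned nothing.
    if (!$data) { return false; }
    // An OpenDNS wildcard answers for hosts that do not exist; treat it as a failure.
    if (stristr($data, 'opendns')) { return false; }
    // One status line per hop; the last one belongs to the final response.
    preg_match_all('/^HTTP\/\d(?:\.\d)?\s+(\d{3})/m', $data, $matches);
    $code = end($matches[1]);
    return $code == 200;
}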
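The two pieces of pure logic described above, the well-formedness regex and the "take the status of the last hop" step, can be tried out without touching the network. What follows is only a minimal sketch: the helper names is_well_formed_url() and last_status_code() are my own (hypothetical, not part of the function above), the sample header dump is made up for illustration, and the type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: wraps the well-formedness regex quoted at the top of the post.
function is_well_formed_url(string $url): bool
{
    $pattern = '/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i';
    return preg_match($pattern, $url) === 1;
}

// Hypothetical helper: given the header dump cURL returns when CURLOPT_HEADER
// and CURLOPT_FOLLOWLOCATION are both on (one header block per hop), return
// the status code of the last hop, or null if no status line is present.
function last_status_code(string $headers): ?int
{
    if (!preg_match_all('/^HTTP\/\d(?:\.\d)?\s+(\d{3})/m', $headers, $matches)) {
        return null;
    }
    return (int) end($matches[1]);
}

// Made-up header dump for illustration: one redirect, then a 200.
$sample = "HTTP/1.1 301 Moved Permanently\r\nLocation: http://www.example.com/\r\n\r\n"
        . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";

var_dump(is_well_formed_url('http://example.com')); // bool(true)
var_dump(is_well_formed_url('example.com'));        // bool(false): no scheme
var_dump(last_status_code($sample));                // int(200)
```

One design note: the regex insists on a scheme, while validate_url() happily adds one. If you chain them, as in is_well_formed_url($input) && validate_url($input), scheme-less input such as "example.com" is rejected before any request is made, which may or may not be what you want.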