Posted on Tuesday 13th of November 2007 at 05:21 in Tutorials

How to: change your PHP user agent to avoid being blocked when using Curl

Back in March I wrote a tutorial on how to scrape website content with PHP using Curl (rather than the riskier fopen). I have since stumbled across some more useful information on the topic, and here it is.

On my daily travels across the Internet I came across a short piece that I think a lot of PHP devs could find useful; let me explain. When you fetch content with Curl in PHP, it's not uncommon for site owners to get upset, because you're either being clever and avoiding paying for something, or you're just flat out stealing their content.

The easiest way for them to block you is by checking the user-agent, and that's your biggest enemy. If you look in your php.ini file you'll probably find the user_agent setting either commented out or set to 'PHP', and a request that identifies as 'PHP' (or sends no user-agent at all) is not only obvious but easy to block. If you've got visitors to your domain identified as PHP, someone is probably trying to steal your stuff.
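To make the blocking side concrete, here is a rough sketch of the kind of check a site owner might run. The function name and the list of prefixes are my own assumptions for illustration, not something taken from any particular site or from the original piece.

```php
<?php
// Hypothetical helper: returns true when a User-Agent header looks like a script
// rather than a browser. The prefix list is illustrative only.
function is_suspicious_agent(?string $userAgent): bool
{
    // A missing or blank header is a common sign of an automated client.
    if ($userAgent === null || trim($userAgent) === '') {
        return true;
    }

    // Flag agents whose string starts with a well-known tool name.
    foreach (['php', 'curl', 'wget', 'libwww'] as $prefix) {
        if (stripos($userAgent, $prefix) === 0) {
            return true;
        }
    }

    return false;
}

// Possible use at the top of a page (commented out so the sketch has no side effects):
// if (is_suspicious_agent($_SERVER['HTTP_USER_AGENT'] ?? null)) {
//     http_response_code(403);
//     exit;
// }
```

A check like this is why a browser-style string gets through: it only looks at what the client claims to be.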

Fortunately there are numerous ways round this: you can modify your .htaccess file, set a PHP variable or set the agent in Curl itself. Below are examples of how to identify your requests as a Mozilla browser:

.htaccess
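Add the following line to your .htaccess file and that should do the trick. PHP needs to be running as an Apache module for php_value to work, and the value has to be wrapped in quotes because it contains spaces:

php_value user_agent "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9"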
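If you want to confirm the .htaccess directive was actually picked up, a tiny check script in the same directory can print what PHP sees. This is only a sketch; the helper name describe_user_agent is made up for the example.

```php
<?php
// Hypothetical helper: reports what PHP currently holds for the user_agent setting.
function describe_user_agent(): string
{
    $agent = ini_get('user_agent');

    // ini_get() gives false for an unknown option and an empty string for an unset one.
    if ($agent === false || $agent === '') {
        return 'No user_agent is configured';
    }

    return 'user_agent is set to: ' . $agent;
}

echo describe_user_agent(), "\n";
```

Load the script through the web server rather than the command line, since .htaccess only applies to requests that Apache handles.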

PHP set
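You could also use ini_set to define the user agent; just place the line below at the top of your script. Bear in mind that the user_agent setting (whether set here or via .htaccess) is read by PHP's own HTTP functions such as fopen and file_get_contents. Curl ignores it, so for Curl requests use the Curl option described in the next section: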

ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9');
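As a rough illustration of where this setting is picked up, the sketch below sets it for the whole script and also shows a per-request alternative using a stream context. The helper name make_agent_context, the 10 second timeout and the example.com URL are assumptions of mine, not part of the original tip.

```php
<?php
$agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9';

// Option 1: script-wide default, used by PHP's http:// stream wrapper
// (fopen, file_get_contents and friends).
ini_set('user_agent', $agent);

// Option 2 (hypothetical helper): a stream context that carries the agent
// for a single request, leaving the script-wide setting alone.
function make_agent_context(string $agent)
{
    return stream_context_create([
        'http' => [
            'user_agent' => $agent,
            'timeout'    => 10, // seconds; an arbitrary choice for the example
        ],
    ]);
}

// Example call, commented out so the sketch makes no network request:
// $html = file_get_contents('http://example.com/', false, make_agent_context($agent));
```

The context version is handy when only one request in a larger script should look like a browser.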

Set it in Curl
You can also set the option on the Curl handle itself, meaning that only this request is identified as Mozilla. Just add the following line to your Curl script, before the call to curl_exec (not forgetting to change the $curl variable to whatever you're using):

curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9');
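For context, here is how that line might sit inside a complete fetch. Treat it as a minimal sketch: the function names, the timeout and the choice to follow redirects are my own assumptions, and the URL in the usage comment is a placeholder.

```php
<?php
// Hypothetical helper: collects the Curl options in one place so they are easy to inspect.
function curl_options_for(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent, // the line from the tip above
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects
        CURLOPT_TIMEOUT        => 10,         // seconds; arbitrary for the example
    ];
}

// Hypothetical helper: fetches a URL and returns the body, or null on failure.
function fetch_as_browser(string $url, string $userAgent): ?string
{
    $curl = curl_init();
    curl_setopt_array($curl, curl_options_for($url, $userAgent));
    $body = curl_exec($curl);

    return $body === false ? null : $body;
}

// Example call, commented out so the sketch makes no network request:
// $html = fetch_as_browser('http://example.com/', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9');
```

Because the agent is set on the handle, other Curl requests in the same script keep whatever agent they were given.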

For more information read the original piece of content at TorrentialWebDev as they deserve all the praise for this. If you'd like to learn how to scrape website content using Curl, check out my tutorial.

 

Who is Seopher?

This is me. I'm a 27 year old web developer, blogger and entrepreneur from near London.

I've done work for people like Samsung, Vauxhall, Cadburys, Chevrolet, Center Parcs and TKMaxx.

I've been running this blog since 2006 and have reached more than 1.7 million readers.

I'm passionate about the web, heavy metal, zombies and cats.
