Posted on Thursday 8th of March 2007 at 12:07 in Tutorials

Scraping website content with PHP using Curl

I've been building a side project that happens to need this functionality, so I thought I'd document it on the site as I go.

The function below takes a URL, connects to it, and returns the full contents of the response.
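
A minimal sketch of such a function, assuming PHP's cURL extension is installed. The name get_content() is taken from the call shown in the comments further down; the timeout values are illustrative assumptions rather than anything from the original listing.

```php
<?php
// Sketch only: fetch a URL with cURL and return the response body.
// Returns the body as a string on success, or false on failure.
function get_content($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    // Hand the response back as a string instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Assumed limits so a slow or dead host cannot hang the script.
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);

    // The handle is released once $ch goes out of scope.
    return curl_exec($ch);
}
```

Because CURLOPT_RETURNTRANSFER is set, curl_exec() returns the page rather than echoing it, which is what lets the caller store the result in a variable.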



How do I use it?
See the code below for an example call to the function.
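
A hedged usage sketch: the URL is the placeholder used in the comments, and the fallback definition is an assumed reconstruction that only exists so the snippet runs on its own. It is skipped when get_content() is already defined.

```php
<?php
// Fallback so this snippet is self-contained; an assumed reconstruction,
// skipped if get_content() has already been defined elsewhere.
if (!function_exists('get_content')) {
    function get_content($url)
    {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
        curl_setopt($ch, CURLOPT_TIMEOUT, 15);
        return curl_exec($ch);
    }
}

// Placeholder URL; swap in the page you actually want to fetch.
$content = get_content("http://www.somewebsite.com");

// curl_exec() returns false on failure, so check before printing.
if ($content === false) {
    echo "Could not fetch the page.";
} else {
    echo $content;
}
```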



$content will then hold the full HTML of the page, so outputting it will replicate the page in its entirety. Do with it what you will.

If you'd like to get in touch with me, email me at steven.york@seopher.com
Comments

Form broke the code I posted.

Inside the two forward slashes you would put something like this:

< a h r e f = ' ( . * ? ) ' >

Hopefully the form doesn't break this. If it doesn't, then just remove the spaces.
Try preg_match:

$content = get_content("http://www.somewebsite.com");
preg_match("//",$content,$output);

echo $output[1];

// That will output the first url it comes across within an a href tag.
You can also use preg_match_all to grab them all and lots of other things.
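
Putting the two comments above together, with the spaces removed from the pattern, the complete snippet might look like the sketch below. The sample markup and its URLs are made up so the example runs without a network call; in practice $content would come from get_content().

```php
<?php
// Made-up sample markup standing in for the HTML returned by get_content().
$content = "<p><a href='http://example.com/one'>One</a> and "
         . "<a href='http://example.com/two'>Two</a></p>";

// The commenter's pattern with the spaces removed.
$pattern = "/<a href='(.*?)'>/";

// preg_match stops at the first match; $output[1] is the captured URL.
$first = null;
if (preg_match($pattern, $content, $output)) {
    $first = $output[1];
}

// preg_match_all collects every match; $matches[1] lists the captured URLs.
preg_match_all($pattern, $content, $matches);
$all = $matches[1];

echo $first, "\n";
echo implode("\n", $all), "\n";
```

Note that this pattern only matches links written exactly as `<a href='...'>`, with single quotes and no other attributes, so real pages will usually need a looser pattern.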
That's fine, but how can I parse that to get only the required contents and save them?
Thanks
http://apexvideo.blogspot.com
shaid
Any articles or snippets on how to refine the $content and get only certain information from it?
Nice Article