Posted on Monday 9th of April 2007 at 12:23 in Tutorials

Writing a PHP Google Sitemap generator without using fopen

The importance of the Google sitemap is often discussed, but if you don't use a pre-built solution (like WordPress), how do you keep your sitemap.xml file up to date? This tutorial explains how to write your own sitemap.xml generator in PHP (without using fopen).

(And yes I am fully aware the formatting of comments is rather nasty but it'll do)

The Google sitemap is a sitemap.xml file that you place in the document root of your website and then tell Google about. It lets the search engine index the pages on your site more accurately, rather than relying on Googlebot to do all the hard work.

Automate the process
There are plugins for WordPress that update the sitemap.xml file every time you publish (as there are for other content-managed solutions), but if you build your own site then you have to generate the file manually. Previously I relied on giving the URL to a sitemap generator, saving its output and uploading it to the server via FTP. I got bored of this, so I wrote my own generator that I could give to Google and then forget about. I'll explain how...

#1 - Set the header
Create a new file called sitemap.php. You need to set the header so that when you view the sitemap.php file it outputs as XML:
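
A minimal sketch of this step. The `application/xml` content type and the `$contentType` variable name are my assumptions; `text/xml` is also commonly used for this purpose.

```php
<?php
// sitemap.php
// Send the XML content type before anything else is written to the response;
// header() only takes effect while no output has been sent yet.
$contentType = 'application/xml; charset=utf-8'; // assumed value
header('Content-Type: ' . $contentType);
```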



#2 - Open the connection to your database and do the query
However you store your information, it will be in a database of some kind, so the normal PHP connection/query code applies. This code is lifted directly off my site:
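
As a hedged sketch of the connection and query, the example below uses PDO rather than the older mysql_* functions, which were removed in PHP 7. The table name `articles`, the columns `id`, `title`, `posted_at` and `published`, the function name, and the connection details are all hypothetical; substitute whatever your own site uses.

```php
<?php
// Select every published article, newest first.
// Table and column names are hypothetical placeholders.
function fetchPublishedArticles(PDO $pdo): array
{
    $sql = 'SELECT id, title, posted_at
              FROM articles
             WHERE published = 1
          ORDER BY posted_at DESC';

    // Each row comes back as an associative array keyed by column name.
    return $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
}

// Usage inside sitemap.php (placeholder DSN and credentials):
// $pdo = new PDO(
//     'mysql:host=localhost;dbname=mysite;charset=utf8mb4',
//     'db_user',
//     'db_password',
//     [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
// );
// $articles = fetchPublishedArticles($pdo);
```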



#3 - Start the XML document
Now that we've got all the current/published articles selected into a dataset we need to write the start of the XML document:
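
A sketch of the opening lines, assuming the sitemaps.org 0.9 namespace (the one Google documents for sitemap files). The `$xmlOpen` variable is just an illustrative name.

```php
<?php
// XML declaration followed by the opening root element of the sitemap.
// The xmlns value is the sitemaps.org 0.9 protocol namespace.
$xmlOpen = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";

echo $xmlOpen;
```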



#4 - Work out the URL-Product
This is the thing that's most likely to differ depending on how you've developed your site. At Seopher.com I use URL-rewriting to associate a clean-URL with the ID of the article. Therefore what is realistically "http://www.seopher.com/viewarticle.php?id=5" becomes "http://www.seopher.com/articles/an_article_title". So the code I use to produce the URL-product is:
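
One way this could look, as a sketch only: the exact rule used on Seopher.com is not shown here, so the slug logic below (lowercase the title, replace anything that is not a letter or digit with an underscore) and the function name `articleUrlFromTitle` are assumptions chosen to reproduce the example URL above.

```php
<?php
// Turn an article row into a clean URL based on its title.
// $article is assumed to be an associative array with a 'title' key.
function articleUrlFromTitle(array $article): string
{
    // Lowercase the title, then collapse each run of characters that are not
    // a-z or 0-9 into a single underscore.
    $slug = preg_replace('/[^a-z0-9]+/', '_', strtolower($article['title']));

    // Strip any underscore left at either end (e.g. from trailing punctuation).
    return 'http://www.seopher.com/articles/' . trim($slug, '_');
}

// Usage:
// echo articleUrlFromTitle(['title' => 'An Article Title']);
```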



Whereas if you're using WordPress-esque conventions (i.e. www.seopher.com/article.php?id=5) then the code would look more like:
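
A sketch of the ID-based variant, assuming each row carries an `id` column. The function name `articleUrlFromId` is hypothetical.

```php
<?php
// Build a query-string URL from the article's numeric ID.
// Casting to int guards against anything unexpected ending up in the URL.
function articleUrlFromId(array $article): string
{
    return 'http://www.seopher.com/article.php?id=' . (int) $article['id'];
}

// Usage:
// echo articleUrlFromId(['id' => 5]);
```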



So all you need to do is work out how to make a real URL out of your databased content and then move on to step 5.

#5 - Output a list of your databased URLs
Now that you've worked out how to create a working URL-product of your content, it's time to output that into an XML schema that Google can make sense of.
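
A hedged sketch of the output loop. It assumes each row has a `posted_at` value that strtotime() can parse, and it takes the URL-building logic from step 4 as a callable. The function name `sitemapEntries` and the column name are hypothetical.

```php
<?php
// Build one <url> element per article.
// $articles: rows from the database; $urlFor: callable that returns the URL for a row.
function sitemapEntries(array $articles, callable $urlFor): string
{
    $xml = '';
    foreach ($articles as $article) {
        // Escape &, <, > and quotes so the URL is valid inside an XML element.
        $loc = htmlspecialchars($urlFor($article), ENT_QUOTES | ENT_XML1, 'UTF-8');

        // Re-format the stored timestamp as YYYY-MM-DD, one of the W3C date
        // formats accepted for <lastmod>.
        $lastmod = date('Y-m-d', strtotime($article['posted_at']));

        $xml .= "  <url>\n"
              . "    <loc>{$loc}</loc>\n"
              . "    <lastmod>{$lastmod}</lastmod>\n"
              . "  </url>\n";
    }
    return $xml;
}

// Usage inside sitemap.php, with $articles holding the selected rows and
// $urlFor being whichever URL function suits your site:
// echo sitemapEntries($articles, $urlFor);
```

Passing the URL logic in as a callable keeps the loop the same whichever URL scheme your site uses.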



The above code loops through the result set and outputs the content into the XML schema that Google expects. The "lastmod" field is populated by re-formatting the timestamp you *should* have stored for when your article was posted. The "loc" element is populated using the URL-product we made earlier.

#6 - Close everything down
It's just a case of closing the connection and ending the XML document.
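
A sketch of the closing lines, assuming a PDO connection held in a variable called `$pdo` (a hypothetical name). PDO has no explicit close method; the connection is released once the last reference to the object is dropped.

```php
<?php
// Close the root element that was opened at the start of the document.
echo "</urlset>\n";

// Release the database connection by dropping the reference to it.
// If $pdo was never set, this simply assigns null.
$pdo = null;
```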



And that's it as far as outputting everything in XML format. So you can upload the sitemap.php file to your server, load it into the browser and you should (hopefully) see a mess of all your content. View the source of the page and you should see something like:



Obviously that's my sitemap.php output (which has more than two items in it, I might add) but you should see something to that effect. If you don't, you'll need to troubleshoot what's causing the problem. Once sitemap.php is outputting something like the example above, you can move on to step #7.

#7 - Modify the .htaccess file so sitemap.php becomes sitemap.xml
This is a crucial step because Google needs to see a SITEMAP.XML file and all you've got is SITEMAP.PHP. What you need to do is either edit or create a .htaccess file with the following logic in it:
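
A minimal sketch of such a rule, assuming Apache with mod_rewrite enabled and a .htaccess file in the same directory as sitemap.php:

```apache
# Turn on the rewrite engine for this directory.
RewriteEngine On

# Answer requests for sitemap.xml with the output of sitemap.php.
# This is an internal rewrite, so the address bar still shows sitemap.xml.
RewriteRule ^sitemap\.xml$ sitemap.php [L]
```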



This turns on the rewrite engine (which, essentially, lets you modify how URLs are handled) and adds a rule so that a request for file.xml is answered by file.php.

This now means that if you put sitemap.xml into your browser you'll be viewing the output from sitemap.php and that's crucial because now when Google looks for sitemap.xml it's viewing live data from your PHP script. This means that your sitemap.xml file will never be inaccurate.

Conclusion of what you should have
A sitemap.php file on your server that you can access by entering "www.yourwebsite.com/sitemap.php" or "www.yourwebsite.com/sitemap.xml" into your browser. This means that you now have a constantly up-to-date sitemap.xml file because you're not having to get it generated by a third party and upload it to your server.

How to improve it
My sitemap generator doesn't index my static pages (or even the homepage) because the homepage is already indexed sufficiently and I consider the other pages (contact, about etc) to be of no use to search engines. They're easily accessible from the navigation too so Googlebot shouldn't have any problems indexing them anyway.

Why it's good
Many hosts disable the PHP function fopen, or restrict it, and you need it to write a physical sitemap.xml file. This method bypasses creating the physical file and instead serves the output of the PHP file as the XML document.

Hope this was useful.

 
