Source: http://www.devx.com/xml/Article/10790

You can set up outbound syndication for your Web site and expose your content to the world in one afternoon using an XML technology called RSS. For anyone who ever wanted to increase audience and traffic while maintaining content and presentation control, RSS is the answer. Find out how to do it, step by step.

An RSS XML feed is an extremely simple way to let external sites link to content on your Web site. If you’re looking for a fast, painless way to do outbound syndication and find a larger audience for your content, RSS is going to be worth your time.

At its most basic, an RSS feed is a programmatically generated list of links to resources on your site, with a description of the content behind each link. The feed is just a publicly accessible text file, so you can easily set it up to regenerate automatically at regular intervals. This way, new content is added to the list without any human effort. With RSS, your news feed can be as dynamic as you like.

And, because it’s in XML, other sites can easily consume and display your content feed automatically, thereby driving traffic back to your site. This is a win-win situation for sites that want to increase their traffic while maintaining control of their content.

In this article, you’ll learn what you need to create a feed, see a sample feed structure, and walk through the code that creates the XML (in this case we’ve used ColdFusion). I’ll also discuss additional steps to take for validating and implementing syndication for your site.

What You Need
To create the system to generate an RSS feed and make it publicly accessible, you will need three things: a server on the Internet where you can put the feed, a database that has descriptions of your content, and a server-side scripting language with access to your database.

  1. Decide where you will put the file on your Web site. You may want to have multiple feeds in the future, so consider putting it in a directory called “rss.” Decide on a filename that you will not change and put it on your Web site in a location that can remain the same for the foreseeable future. Here’s an example location and filename:
    
    http://www.myWebSite.com/rss/myWebSite.xml
    
  2. Decide what content you want to put in the feed. Keep in mind that you will need, at a minimum, two pieces of information for each piece of content you want to syndicate: the title of the content and the URL. For example:

    Title: “Choosing the Right Web Services Management Platform”
    Link: http://www.devx.com/content/id/10549

    Optionally, there are additional fields you can include for each piece of content, for example:

    Description: When you deploy any application, you are expected to have a plan for management and maintenance of that codebase; that’s part of the job. But developers have been so busy learning Web services that management issues have taken a back seat. Use these requirements as a checklist for investigating products that should underlie your company’s vital Web services and facilitate their long-term management.

    If you include an author or a publication date, remember that the author field must contain a valid email address for the feed to validate as RSS, and the pubDate field must be a properly formatted date and time. For example:

    Author: justin.murray@hp.com (Justin Murray)
    PubDate: Tue, 21 Jan 2003 14:20:36 PST

  3. You can use whatever scripting language you like, anything from Perl to ASP. You must be able to do simple string manipulation, pull data from your database, and set the MIME type of the file being served to the browser.
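
As a rough illustration of the author and pubDate formats from step 2, here is a minimal sketch in Python rather than ColdFusion (any scripting language will do). The helper names rss_date and rss_author are hypothetical, and the sketch assumes you store publication times as timezone-aware values. It writes the zone as a numeric offset such as +0000, which the RFC 822 date format also allows, instead of an abbreviation like PST.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def rss_date(moment: datetime) -> str:
    """Format a timezone-aware datetime in the RFC 822 style used by pubDate."""
    return format_datetime(moment)


def rss_author(email: str, name: str) -> str:
    """Combine an email address and a display name as 'address (Name)'."""
    return f"{email} ({name})"


# Hypothetical publication time, expressed in UTC
example = datetime(2003, 1, 21, 22, 20, 36, tzinfo=timezone.utc)
print(rss_date(example))    # Tue, 21 Jan 2003 22:20:36 +0000
print(rss_author("justin.murray@hp.com", "Justin Murray"))
```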
 
A Sample RSS Feed
Once you have completed steps 1 and 2 above, it’s time to dynamically build the text file. This file will be formatted as XML and has two distinct sections. The top section contains basic information about your feed, such as the title and the time the file was generated. The bottom section has information on each specific piece of content that you wish to syndicate (an “item”).

First, I’ll show an example feed, and then I’ll show how to generate and write out the file in ColdFusion. The example feed follows; replace the site-specific values (titles, URLs, descriptions, and dates) with information for your site.


<?xml version="1.0" ?>
<!-- RSS generated by DevX.com on Fri, 24 Jan 2003 12:38:45 PST -->
<rss version="2.0">
<channel> <!-- The 'channel' tag is the area where you specify
general information about your feed -->
    <title>DevX Featured Content</title>
    <link>http://www.devx.com</link>
    <description>Latest DevX Content</description>
    <language>en-us</language>
    <copyright>Copyright 2003 DevX</copyright>
    <docs>http://backend.userland.com/rss</docs>
    <lastBuildDate>Fri, 24 Jan 2003 12:38:45 PST</lastBuildDate>

If you want to show an image with your content feed, use the optional image section below:


<image>
    <title>DevX</title>
    <url>http://www.devx.com/assets/devx/3182.gif</url>
    <link>http://www.DevX.com</link>
</image>

The next section is where each piece of content is identified and described:


<item>
      <title>Attend to your future. </title>
      <description>A future where millions of
users are waiting.</description>
      <link>http://www.devx.com/content/id/10559</link>
      <author> editorial@devx.com </author>
      <pubDate>Wed, 22 Jan 2003 11:19:28 PST</pubDate>
</item>
<item>
      <title> etc  </title>
      <description> etc  </description>
      <link> etc  </link>
      <pubDate> etc </pubDate>
</item>

You can include as many items as you want. When you’re finished, close the channel and rss tags.


</channel>
</rss>
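
For readers not using ColdFusion, the same two-part structure (channel information on top, items below) can be sketched in Python with the standard library's xml.etree.ElementTree module. This is only an illustration under assumptions: build_feed and the sample dictionaries are hypothetical names, not part of the article's download. One convenience of building the tree this way is that the library escapes “&” and “<” in text for you.

```python
import xml.etree.ElementTree as ET


def build_feed(channel_info: dict, items: list) -> str:
    """Return an RSS 2.0 document as a string (without the XML declaration)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Top section: general information about the feed
    for tag, text in channel_info.items():
        ET.SubElement(channel, tag).text = text
    # Bottom section: one <item> per piece of syndicated content
    for entry in items:
        item = ET.SubElement(channel, "item")
        for tag, text in entry.items():
            ET.SubElement(item, tag).text = text
    return ET.tostring(rss, encoding="unicode")


sample_channel = {
    "title": "DevX Featured Content",
    "link": "http://www.devx.com",
    "description": "Latest DevX Content",
}
sample_items = [
    {
        "title": "Attend to your future.",
        "description": "A future where millions of users are waiting.",
        "link": "http://www.devx.com/content/id/10559",
    }
]
print(build_feed(sample_channel, sample_items))
```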
 
Creating the File
To create the file, you first need to query the database to grab the content descriptions you want to syndicate. The SQL code below is a quick, simple method of doing this. You will need to modify it to fit your situation. The <cfquery> tags are specific to ColdFusion. The code in this section of the article is taken from the complete sample code, which is available for download here.


<cfset NumberOfFeedItems = 10>

<cfquery name="getContent"
datasource="yourDB">
SELECT     TOP #numberOfFeedItems# *
FROM         yourContent
ORDER BY dateFirstPublished DESC
</cfquery>

Next, set a date variable.


<cfset theDatetime = "#dateformat(now(),
"ddd, dd mmm yyyy")# #timeformat(now(),
"HH:mm:ss")# PST">

Now, save the output to a variable. In my ColdFusion version I saved the output to a variable called “theXML.” It is a string that will contain all my XML.


<cfsetting enablecfoutputonly="yes">
<cfsavecontent variable="theXML">

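Now write out the XML text. Here is the top portion of the file. Keep the XML declaration on the same line as the opening cfoutput tag, because nothing, not even a line break, may precede the declaration in the finished file:


<cfoutput><?xml version="1.0" encoding="ISO-8859-1" ?>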
<!--  RSS generated by DevX on #theDatetime# -->
<rss version="2.0">
<channel>
    <title>DevX Featured Content</title>
    <link>http://www.devx.com</link>
    <description>Latest DevX Content</description>
    <language>en-us</language>
    <copyright>Copyright 2002 DevX</copyright>
    <docs>http://backend.userland.com/rss</docs>
    <lastBuildDate>#theDatetime#</lastBuildDate>
    <image>
    <title>DevX</title>
    <url>http://www.devx.com/assets/devx/3182.gif</url>
    <link>http://www.DevX.com</link>
    </image>
</cfoutput>

The next step is to start adding content items that you want to syndicate. In ColdFusion, I did this with a simple loop. In other words, I loop over the values returned from the first query I did:


<cfloop from="1" to="#getContent.recordCount#" index="ctr">

Now set up variables with the data for each item. This is also a good place to “massage” the data so that it contains no illegal characters or tags; the output must be well-formed XML. I used the Replace function to replace “<” signs with “&lt;”. This gets around the problem of embedding HTML in your feed: if HTML tags are embedded in your fields, the XML will not validate unless you do this kind of replacement. Also, if you have URLs with URL parameters embedded in your fields, you must replace the “&” signs with “&amp;” so the XML will validate correctly. For example, in the URL http://www.aWebSite.com/index.htm?value1=one&value2=two, the “&” sign in the string will cause trouble. Replace the “&” signs first; otherwise the “&” in each “&lt;” you just inserted would itself be turned into “&amp;”.


<cfscript>
  // Escape "&" first, then "<", so the "&" in "&lt;" is not escaped a second time
  title = replace(getContent.title[ctr], "&", "&amp;", "ALL");
  title = replace(title, "<", "&lt;", "ALL");
  description = replace(getContent.abstract[ctr], "&", "&amp;", "ALL");
  description = replace(description, "<", "&lt;", "ALL");
  date = dateformat(getContent.dateFirstPublished[ctr], "ddd, dd mmm yyyy");
  time = timeformat(getContent.dateFirstPublished[ctr], "HH:mm:ss") & " PST";
  author = replace(getContent.author[ctr], "<", "&lt;", "ALL");
  pubDate = date & " " & time;
</cfscript>
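
The escaping step is easy to get subtly wrong, so here is a minimal Python sketch of the same idea, shown only as an illustration. escape_by_hand is a hypothetical helper that mirrors the Replace calls; escape is the standard library function from xml.sax.saxutils, which also converts “>”.

```python
from xml.sax.saxutils import escape


def escape_by_hand(text: str) -> str:
    """Escape "&" first and "<" second, so "&lt;" is not escaped again."""
    return text.replace("&", "&amp;").replace("<", "&lt;")


url = "http://www.aWebSite.com/index.htm?value1=one&value2=two"
print(escape_by_hand(url))
print(escape_by_hand("<b>Q&A</b>"))
# The library helper handles "&", "<", and ">" in one call
print(escape("<b>Q&A</b>"))
```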

Then output the fields for each item:


<cfoutput>
    <item>
        <title>#title#</title>
        <description>#description#</description>
        <link>http://www.devx.com/content/id/#getContent.content_id[ctr]#</link>
        <author>#author#</author>
        <pubDate>#pubDate#</pubDate>
    </item>
</cfoutput>

Now close the loop, close the channel and rss tags, and close the cfsavecontent tag.


</cfloop>
<cfoutput>
</channel>
</rss>
</cfoutput>
</cfsavecontent>

The XML is stored in “theXml” variable, and you can write it to a publicly accessible Web directory using this line:


<cffile action="write"
file="c:\devx\syndication\outgoing\devxFeed.xml" output="#theXml#">

Finally, set the MIME type of the document so the browser knows the output is in XML. This is an optional step and simply allows you to see the XML you generate in a browser.


<cfcontent type="text/xml">
<cfoutput>#theXml#</cfoutput>
 
Finishing Touches
If you’ve done everything to this point, you now have a finished RSS file. You aren’t quite done though. These last few cleanup items accomplish some important tasks:

  • getting the file to update periodically
  • ensuring that the format is valid
  • ensuring that the character encoding is appropriate for your feed
  • registering your feed so it can be found
  • categorizing your content

Update the File at a Scheduled Time
You could generate the file every time it is called, but ideally you want it to update periodically, for example once a day. Depending on your platform, there are a variety of methods to do this. With ColdFusion, one method is to create a ColdFusion Scheduled Task that calls the script that generates the RSS file. You can schedule it to run as often as your needs require.

Other methods might be using the cachedWithin attribute of the cfquery tag, creating a cron job (on a GNU/Linux system), using a Windows Scheduled Task, or setting a timestamp in the application and checking it each time the script is run.
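
The timestamp approach can be sketched in a few lines of Python. This is an illustration only: needs_rebuild is a hypothetical helper, and it assumes the feed file's modification time is a good enough record of when the feed was last generated.

```python
import os
import time


def needs_rebuild(path, max_age_seconds=24 * 60 * 60, now=None):
    """Return True if the feed file is missing or older than max_age_seconds."""
    if not os.path.exists(path):
        return True
    current = time.time() if now is None else now
    return current - os.path.getmtime(path) > max_age_seconds


# Hypothetical location of the generated feed
if needs_rebuild("rss/myWebSite.xml"):
    print("regenerate the feed")
else:
    print("serve the existing file")
```

Checking on each request keeps everything inside the application, while a scheduled task keeps the check out of the request path; which one fits better depends on how much control you have over the server.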

Validate your RSS
There are many places on the Internet that will check the validity of an RSS feed. I like the one below because it gives you a detailed breakdown of any errors it finds:


http://feeds.archive.org/validator/check?url=<your RSS URL here>

All you do is go to the link with your RSS URL appended, like this:


http://feeds.archive.org/validator/check?url=http://services.devx.com:333/outgoing/devxfeed.xml

If your RSS feed is valid, the validator says so. Otherwise you see a detailed breakdown of your feed, with line numbers, errors, and a help link that explains each error and how to fix it.
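
Before submitting a feed to an online validator, it can help to run a quick local check. The Python sketch below is a rough pre-flight test, not a replacement for a real validator: check_feed is a hypothetical helper that only confirms the text is well-formed XML and that the elements RSS 2.0 requires (title, link, and description in the channel, and a title or description in each item) are present.

```python
import xml.etree.ElementTree as ET


def check_feed(xml_text):
    """Return a list of problems found; an empty list means none were detected."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return [f"not well-formed: {error}"]
    problems = []
    if root.tag != "rss":
        problems.append("root element is not <rss>")
    channel = root.find("channel")
    if channel is None:
        return problems + ["missing <channel>"]
    for required in ("title", "link", "description"):
        if channel.find(required) is None:
            problems.append(f"channel is missing <{required}>")
    for number, item in enumerate(channel.findall("item"), start=1):
        if item.find("title") is None and item.find("description") is None:
            problems.append(f"item {number} has neither <title> nor <description>")
    return problems


# An unescaped "&" in a URL is enough to make the whole feed unreadable
broken = "<rss><channel><link>http://x/?a=1&b=2</link></channel></rss>"
print(check_feed(broken))
```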

Character Encoding
XML is, by default, encoded as Unicode (UTF-8 unless the document declares otherwise). This can cause some problems if you have Windows character codes within the XML, such as the trademark character (™). Characters like this can cause a lot of headaches, and there are two ways that I know of to get around the problem. The first is to put only valid Unicode characters into your XML. The other is to specify a character encoding at the top of your XML document. Setting the encoding to a Windows-friendly character set can alleviate some problems.

To do this, just replace the top-most line in your XML:


<?xml version="1.0" ?>

with this line:


<?xml version="1.0" encoding="windows-1252" ?>
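
The important point is that the declared encoding must match the bytes actually written to the file. A minimal Python sketch of that idea follows; encode_feed is a hypothetical helper, and it assumes the feed body has already been built as a string. Characters the chosen character set cannot represent are written as numeric character references, which any XML parser can read.

```python
def encode_feed(body: str, encoding: str = "windows-1252") -> bytes:
    """Prefix the XML declaration and encode the feed to match it."""
    declaration = f'<?xml version="1.0" encoding="{encoding}" ?>\n'
    # Unrepresentable characters become references such as &#937;
    return (declaration + body).encode(encoding, errors="xmlcharrefreplace")


data = encode_feed("<title>DevX\u2122</title>")
print(data)
```

Write the returned bytes to the file in binary mode so that nothing re-encodes them on the way to disk.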

There are many character encoding settings you can use. This article has a good explanation of character encoding.

Register Your Feed
In order for other sites to find and consume your content feed you have to list your feed with a feed aggregator, which maintains lists of categorized RSS feeds. There are many such sites on the Web. Some aggregators have up to 10,000 valid feeds already listed and this number is growing daily. Still, registering your feed with one or more of these sites is the best way to get the word out to the world that you are syndicating your content.

On Syndic8.com, to add a feed you simply create a login and then suggest a valid feed URL. After suggesting a feed, it gets submitted for review by a human. Once a human has approved it, it gets added to the valid feed master list.

It’s a completely open and free process. When you sign up you will be asked to take on one or more of the roles “Reviewer,” “Evangelist,” “Scraper,” or “Fixer” to help out with the community.

Categorization
Categorizing your data feeds helps make them more useful to your syndication hosts. By categorizing, you can expose the subject matter of each piece of content. As the sheer volume of syndicated information on the Internet grows, it will become more and more important to provide this metadata, and it greatly increases the probability that other sites will link to you. You can add categorization in the <channel> tags and <item> tags, and each can have as many categories as you like. Here’s an example of categories at the channel level and the item level:


<channel>
  <title>DevX Featured Content</title>
  <link>http://www.devx.com</link>
  <description>Latest DevX Content</description>
  <category>Technical Articles</category>
  <category>Computer Programming</category>
  <category>ColdFusion</category>

<item>
  <title> </title>
  <description>  </description>
  <link> </link>
  <category>Java</category>
  <category>software engineering</category>
</item>

You can create your own categories or you can use public taxonomies, where other folks have created standard categories. A couple of examples of public taxonomy sites are http://dmoz.org/ and http://www.superopendirectory.com/. To use a public taxonomy, identify it in the domain attribute of the <category> tag, like this:


  <category domain="http://www.dmoz.org">Java</category>

Creating an RSS content feed is one of those projects that pays rich rewards for a very small upfront investment of time. For any Web site that wants to find painless ways to increase traffic while still controlling content, RSS is the right choice.

Ladd Angelius is a Software Engineer at DevX. He can be reached at ladd@devx.com.
