Registering and Discovering RSS Feeds in UDDI

By Karsten Januszewski

Table of Contents

Introduction
RSS to UDDI Mapping
Simple Sample
Two Canonical RSS tModels
Classification of RSS Feeds
Complete Publication and Inquiry Samples
Additional Resources

Introduction

The use of Universal Description, Discovery and Integration (UDDI) to catalog and discover RSS news feeds is a logical application of UDDI in its mission of description and discovery of Web services.  RSS is one of the most frequently used applications of XML on the Web today.  It provides a standard way for organizations and individuals to distribute news on the Internet.  For more on RSS, see Additional Resources below.

One question that arises with RSS is how to discover the location of different RSS Feeds.  Discovery and aggregation of RSS Feeds carry the following requirements:

  1. Programmatically publish an RSS Feed 
  2. Associate metadata (classification, geography, ownership, etc.) with that RSS Feed in an extensible manner
  3. Query for RSS Feeds based on a number of parameters
  4. Perform requirements 1, 2, and 3 in an interoperable, programming language independent way

It is in meeting these requirements that UDDI serves as a solution.  UDDI provides a mechanism to register RSS Feeds in a UDDI registry.  UDDI has a flexible classification system that can be employed to attribute those feeds with a range of different metadata in an extensible way.  Once RSS Feeds are registered in UDDI, users can query for those feeds deterministically across different metadata.  Client RSS readers can query UDDI and aggregate different RSS Feeds based on classification information.  And, lastly, UDDI is an interoperable, programming language independent service with a comprehensive XML SOAP API for both publication and inquiry.  For more on UDDI, see Additional Resources below.

The remainder of this paper outlines exactly how to register and discover RSS Feeds in UDDI.  For a sample RSS publisher and aggregator application using the Microsoft .NET UDDI SDK, see the sample called RSS 0.9x UDDI Feed Aggregator.

RSS to UDDI Mapping

The mapping between RSS and UDDI is relatively straightforward.  The following list maps the elements of one data model onto the other.

  1. Each RSS <channel> corresponds to a UDDI <bindingTemplate>.
  2. The RSS Feed URL corresponds to the UDDI <accessPoint> of that <bindingTemplate>.
  3. For RSS 0.9x Feeds, the UDDI <bindingTemplate> will contain a UDDI <tModelInstanceInfo> containing a reference to the RSS 0.9x canonical tModel (uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48) and the RSS version attribute will be identified in the UDDI <instanceParms>.
  4. For RSS 1.0 Feeds, the UDDI <bindingTemplate> will contain a UDDI <tModelInstanceInfo> containing a reference to the RSS 1.0 canonical tModel (uuid:bf3d12a4-a6e8-4ef2-918c-18c60a04edfc).
  5. The RSS <link> corresponds to the UDDI <overviewURL> of the tModelInstanceInfo.
  6. The RSS <description> corresponds to the UDDI <description> of that <bindingTemplate>.
  7. The RSS <language> corresponds to the xml:lang attribute of the UDDI <description> of that <bindingTemplate>.
  8. The RSS <title> may correspond to the <name> of the UDDI <businessService>.
  9. RSS contact information may correspond to the <contact> of the UDDI <businessEntity>.
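
The mapping above can be sketched in a few lines of Python using only the standard library.  This is an illustrative sketch for an RSS 0.9x feed, not part of any UDDI SDK: the function name rss_to_business_service is hypothetical, the empty serviceKey and bindingKey values are an assumption about how a new entry would be submitted, and the RSS <language> value is passed through to xml:lang unchanged.

```python
import xml.etree.ElementTree as ET

# Canonical RSS 0.9x tModel key, taken from the mapping list above.
RSS_09X_TMODEL = "uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48"
# ElementTree's spelling of the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def rss_to_business_service(rss_xml: str, feed_url: str) -> ET.Element:
    """Map an RSS 0.9x document onto a UDDI businessService element."""
    rss = ET.fromstring(rss_xml)
    channel = rss.find("channel")

    # Empty keys are an assumption: they stand for "not yet assigned".
    service = ET.Element("businessService", serviceKey="")
    ET.SubElement(service, "name").text = channel.findtext("title")        # item 8

    templates = ET.SubElement(service, "bindingTemplates")
    binding = ET.SubElement(templates, "bindingTemplate", bindingKey="")   # item 1

    description = ET.SubElement(binding, "description")                    # item 6
    description.text = channel.findtext("description")
    language = channel.findtext("language")
    if language:
        description.set(XML_LANG, language)                                # item 7

    access_point = ET.SubElement(binding, "accessPoint", URLType="http")   # item 2
    access_point.text = feed_url

    details = ET.SubElement(binding, "tModelInstanceDetails")
    info = ET.SubElement(details, "tModelInstanceInfo",
                         tModelKey=RSS_09X_TMODEL)                         # item 3
    instance = ET.SubElement(info, "instanceDetails")
    overview = ET.SubElement(instance, "overviewDoc")
    ET.SubElement(overview, "overviewURL").text = channel.findtext("link") # item 5
    ET.SubElement(instance, "instanceParms").text = rss.get("version")     # item 3
    return service
```

Calling ET.tostring(service, encoding="unicode") on the result yields a businessService fragment that could be placed inside a businessEntity.  Contact information (item 9) is left out because it belongs to the businessEntity rather than to the service.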

Simple Sample

RSS Feed - located at http://www.example.com/sample/rss.xml

 
<?xml version="1.0"?>
<rss version="0.91"> 
	<channel> 
		<title>Sample Feed</title> 
		<link>http://www.example.com/sample/rss.aspx</link> 
		<description>Sample Description</description> 
		<language>en-us</language> 
		<item> 
		...
		</item>
	</channel>
</rss>
		
		

UDDI Entry

    
<businessEntity businessKey="...">
	<name>Fictitious News Company</name>
	<businessServices>
		<businessService serviceKey="..." businessKey="...">
		<name>Sample Feed</name>
		<bindingTemplates>
			<bindingTemplate bindingKey="..." serviceKey="...">
			<description xml:lang="en">Sample Description</description>
			<accessPoint URLType="http">http://www.example.com/sample/rss.xml</accessPoint>
			<tModelInstanceDetails>
				<tModelInstanceInfo tModelKey="uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48">
					<instanceDetails>
						<overviewDoc>
							<overviewURL>http://www.example.com/sample/rss.aspx</overviewURL>
						</overviewDoc>
						<instanceParms>0.91</instanceParms>
					</instanceDetails>
				</tModelInstanceInfo>
			</tModelInstanceDetails>
			</bindingTemplate>
		</bindingTemplates>
		</businessService>
	</businessServices>
</businessEntity>
				

Two Canonical RSS tModels

UDDI uses tModels to represent schemas and technical metadata.  Two canonical RSS tModels have been created in the UDDI Business Registry (UBR) to be used when modeling RSS Feeds: one for all 0.9x RSS Feeds and one for 1.0 RSS Feeds.  Anyone registering an RSS Feed in UDDI SHOULD reference the tModel that matches the feed's version.  By employing the exact tModelKey in the tModelInstanceInfo of each RSS Feed entry, all registered RSS Feeds become discoverable through a query on that key.

RSS Canonical tModel - Version 0.9x

Name: RSS - Version 0.9x
tModelKey: uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48
Description: Use this tModel when registering 0.9x RSS Feeds.  Place the RSS Version information in the instanceParms element of the tModelInstanceInfo.

RSS Canonical tModel - Version 1.0

Name: RSS - Version 1.0
tModelKey: uuid:bf3d12a4-a6e8-4ef2-918c-18c60a04edfc
Description: Use this tModel when registering 1.0 RSS Feeds. 

Modeling RSS Version Information

When considering how to model RSS version information in UDDI, there were several options.  A version-specific tModel could have been created for each RSS version (0.91, 0.92, etc.).  However, that approach was not followed, for two reasons.  First, many RSS clients can gracefully consume feeds of different versions.  Second, a tModel per RSS version adds complexity to the model, both when registering a feed and when querying for feeds.  In the spirit of the simplicity of RSS, the paper has chosen the less complicated path.

However, it did not seem appropriate to lump RSS Version 1.0 in with the 0.9x RSS tModel.  RSS Version 1.0 represents a significant change for RSS and warranted a separate tModel.  Especially given the extensible nature of RSS 1.0 through the adoption of RDF and XML namespaces, the creation of a canonical RSS 1.0 tModel is appropriate.  One can imagine future tModels being created to represent additional RSS modules to support RSS modularization.
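
The version rules described above can be expressed as a small lookup.  The following is a minimal Python sketch; the names RssBindingInfo and binding_info_for_version are hypothetical, and rejecting any version other than 0.9x or 1.0 is an assumption made here because the paper defines tModels only for those two families.

```python
from typing import NamedTuple, Optional

RSS_09X_TMODEL = "uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48"
RSS_10_TMODEL = "uuid:bf3d12a4-a6e8-4ef2-918c-18c60a04edfc"


class RssBindingInfo(NamedTuple):
    tmodel_key: str
    instance_parms: Optional[str]  # None means "omit the instanceParms element"


def binding_info_for_version(version: str) -> RssBindingInfo:
    """Choose the canonical tModel and instanceParms value for an RSS version."""
    if version == "1.0":
        # RSS 1.0 has its own tModel, so no version string is recorded.
        return RssBindingInfo(RSS_10_TMODEL, None)
    if version.startswith("0.9"):
        # All 0.9x feeds share one tModel; the exact version goes in instanceParms.
        return RssBindingInfo(RSS_09X_TMODEL, version)
    # Assumption: other versions are outside the scope of the two tModels.
    raise ValueError("no canonical RSS tModel for version " + repr(version))
```

For example, binding_info_for_version("0.92") selects the 0.9x tModel and carries "0.92" as the instanceParms value, which is what lets a single tModel cover every 0.9x feed.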

Classification of RSS Feeds

In order to attribute metadata to an RSS Feed, the businessService representing that feed should be adorned with a categoryBag element containing the appropriate keyedReference elements.  A keyedReference has three attributes: tModelKey, keyName and keyValue.  The required tModelKey refers to the tModel that represents the categorization system, and the required keyValue contains the actual categorization within this system.  The optional keyName can be used to provide a descriptive name for the categorization.

There are three main options for adding keyedReferences to a UDDI entry, which will be discussed below.  Using a combination of these different options would be the recommended approach to classifying RSS Feeds in UDDI.

Using Existing Checked Taxonomies

The classification and reification of data within UDDI using structured taxonomies is key to its purpose of discovery.  Within the UBR, several taxonomies are made available through the specification itself, including North American Industry Classification System (NAICS) 1997 Release, Universal Standard Products and Services Classification (UNSPSC) Version 7.3 and ISO 3166 Geographic Taxonomy.  Each of these taxonomies is checked: when publishing a UDDI entry that attempts to use a value from one of these taxonomies, the operating UDDI node will perform validation on that value to ensure the value is valid within the taxonomy.

For example, a feed about Hollywood movies might be classified under the NAICS category "Motion Picture and Video Industries" with the value 5121.  The XML would look as follows:

    <categoryBag>
      <keyedReference 
      	tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2" 
      	keyName="Motion Picture and Video Industries" 
      	keyValue="5121" 
      />
    </categoryBag>
			

Creating and Using New Checked and Unchecked Taxonomies

The taxonomies already provided by the UBR have merits within the context of classifying RSS Feeds, but they have shortfalls as well. While comprehensive, NAICS and UNSPSC are more oriented to industry and business, which isn't always appropriate in the context of a news feed.   

UDDI was designed to support multiple classification systems so that as new and alternative taxonomies emerge, they can be incorporated into UDDI.  It is quite possible to create a new taxonomy, establish a set of values and promote it amongst a community for usage.  The process of creating a new taxonomy differs depending on whether that taxonomy is to be checked or not. 

Creating a new checked taxonomy within the UBR is a more complicated process than creating an unchecked taxonomy, as it requires coordination with the operators of the UBR itself.  Providing an unchecked taxonomy is relatively trivial.  One need only create a new tModel, establish the value set, and broadcast its usage.  For more information, see Providing a Taxonomy for Use in UDDI Version 2.
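
As a rough illustration of the "create a new tModel" step for an unchecked taxonomy, the Python sketch below serializes a UDDI Version 2 save_tModel request.  Several details are assumptions rather than statements from this paper: the function name is hypothetical, the uddi-org:types tModelKey and the "categorization" and "unchecked" values follow the UDDI Version 2 core taxonomy conventions as best understood here, and the authInfo token is presumed to have been obtained separately.  Consult the technical note referenced above before relying on any of it.

```python
import xml.etree.ElementTree as ET

UDDI_V2_NS = "urn:uddi-org:api_v2"
# Assumed key of the uddi-org:types tModel used to classify tModels.
UDDI_TYPES_TMODEL = "uuid:c1acf26d-9672-4404-9d70-39b756e62ab4"


def build_unchecked_taxonomy_request(auth_token: str, name: str,
                                     description: str) -> str:
    """Serialize a save_tModel request describing an unchecked taxonomy."""
    # The xmlns attribute is written literally so that the serialized text
    # carries the UDDI v2 default namespace.
    request = ET.Element("save_tModel", {"generic": "2.0", "xmlns": UDDI_V2_NS})
    ET.SubElement(request, "authInfo").text = auth_token

    # Assumption: an empty tModelKey asks the registry to assign a new key.
    tmodel = ET.SubElement(request, "tModel", tModelKey="")
    ET.SubElement(tmodel, "name").text = name
    ET.SubElement(tmodel, "description").text = description

    # Mark the tModel as a categorization system whose values are not validated.
    bag = ET.SubElement(tmodel, "categoryBag")
    for value in ("categorization", "unchecked"):
        ET.SubElement(bag, "keyedReference", tModelKey=UDDI_TYPES_TMODEL,
                      keyName="uddi-org:types", keyValue=value)
    return ET.tostring(request, encoding="unicode")
```

The tModelKey returned by the registry would then be used as the tModelKey of keyedReferences that classify feeds under the new taxonomy.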

Using General Keywords

Because taxonomies are hierarchical in nature, they do not afford any flexibility to accommodate topics that do not fit into their scheme.  Human knowledge often does not fit within such rigidity.  An alternative to using taxonomies is to use the general_keywords feature in UDDI (also known as the misc-taxonomy), which allows free-form words to be used as a classification mechanism.  This allows someone publishing an RSS Feed to simply associate that feed with appropriate keywords.

The general keywords taxonomy is a unique taxonomy within UDDI, in that per the specification, queries can be issued that match on the keyName value as well as the keyValue.  As such, a keyName must be provided in a keyedReference if its tModelKey refers to the general_keywords category system.  For the purposes of keywords used to describe RSS Feeds, the string "RSS" should be used as the keyName. 

For example, a feed about movies might be classified with keywords such as "hollywood", "film" and "movie":

    <categoryBag>
      <keyedReference 
      	tModelKey="uuid:a035a07c-f362-44dd-8f95-e2b134bf43b4" 
      	keyName="RSS" 
      	keyValue="hollywood" 
      />
      <keyedReference 
      	tModelKey="uuid:a035a07c-f362-44dd-8f95-e2b134bf43b4" 
      	keyName="RSS" 
      	keyValue="film" 
      />
      <keyedReference 
      	tModelKey="uuid:a035a07c-f362-44dd-8f95-e2b134bf43b4" 
      	keyName="RSS" 
      	keyValue="movie" 
      />
    </categoryBag>
			

See http://www.uddi.org/taxonomies/Core_Taxonomy_OverviewDoc.htm#GenKW for more information.
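
Combining the options above, a categoryBag can be assembled programmatically.  The sketch below is illustrative only: build_category_bag is a hypothetical helper, and it simply applies the conventions already described, using "RSS" as the keyName for general keywords and a descriptive label as the keyName for NAICS values.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Mapping, Optional

GENERAL_KEYWORDS_TMODEL = "uuid:a035a07c-f362-44dd-8f95-e2b134bf43b4"
NAICS_TMODEL = "uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2"


def build_category_bag(keywords: Iterable[str],
                       naics: Optional[Mapping[str, str]] = None) -> ET.Element:
    """Build a categoryBag from free-form keywords and NAICS code/label pairs."""
    bag = ET.Element("categoryBag")
    for word in keywords:
        # general_keywords matches on keyName too, so it is always "RSS".
        ET.SubElement(bag, "keyedReference", tModelKey=GENERAL_KEYWORDS_TMODEL,
                      keyName="RSS", keyValue=word)
    for code, label in (naics or {}).items():
        # For a checked taxonomy the keyValue carries the classification.
        ET.SubElement(bag, "keyedReference", tModelKey=NAICS_TMODEL,
                      keyName=label, keyValue=code)
    return bag


bag = build_category_bag(["hollywood", "film", "movie"],
                         {"5121": "Motion Picture and Video Industries"})
print(ET.tostring(bag, encoding="unicode"))
```

The resulting element holds four keyedReferences, three for the keywords and one for the NAICS classification.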

Complete Publication and Inquiry Samples

Publication

Below is a sample UDDI businessEntity.  It is for a company that makes films and provides a news feed about the film industry.  Because there are several different RSS Versions, it provides two different feeds: RSS 0.91 and 1.0.

<businessEntity businessKey="...">
	<name>Fictitious Film Company</name>
	<businessServices>
		<businessService serviceKey="..." businessKey="...">
			<name>Fictitious News Feed</name>
			<bindingTemplates> 
				<bindingTemplate bindingKey="..." serviceKey="...">
					<accessPoint URLType="http">http://example.com/rss91.aspx</accessPoint>
					<tModelInstanceDetails> 
						<tModelInstanceInfo tModelKey="uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48"> 
							<instanceDetails>
								<overviewDoc>
									<overviewURL>http://www.example.com/sample/rss.aspx</overviewURL>
								</overviewDoc>
								<instanceParms>0.91</instanceParms>
							</instanceDetails>
						</tModelInstanceInfo>
					</tModelInstanceDetails>
				</bindingTemplate>
				<bindingTemplate bindingKey="..." serviceKey="...">
					<accessPoint URLType="http">http://example.com/rss10.aspx</accessPoint>
					<tModelInstanceDetails>
						<tModelInstanceInfo tModelKey="uuid:bf3d12a4-a6e8-4ef2-918c-18c60a04edfc">
							<instanceDetails>
								<overviewDoc>
									<overviewURL>http://www.example.com/sample/rss.aspx</overviewURL>
								</overviewDoc>
							</instanceDetails>
						</tModelInstanceInfo>
					</tModelInstanceDetails>
				</bindingTemplate>
			</bindingTemplates>
			<categoryBag>
				<keyedReference tModelKey="uuid:a035a07c-f362-44dd-8f95-e2b134bf43b4" keyName="RSS" keyValue="hollywood" />
				<keyedReference tModelKey="uuid:a035a07c-f362-44dd-8f95-e2b134bf43b4" keyName="RSS" keyValue="film" />
				<keyedReference tModelKey="uuid:a035a07c-f362-44dd-8f95-e2b134bf43b4" keyName="RSS" keyValue="movie" />
				<keyedReference tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2" keyName="Motion Picture and Video Industries" keyValue="5121" />
			</categoryBag>
		</businessService>
	</businessServices>
</businessEntity>
				

Discussion:

First, note that there are two bindingTemplates, one for each RSS Feed, each with its own accessPoint.  Each bindingTemplate references the canonical RSS tModel for that feed's version.  For the 0.91 feed, the version number is placed in the <instanceParms> element.  This number should be identical to the version attribute of the feed's <rss> element.

Second, note that the businessService has been categorized with four keyedReferences.  The first three keyedReferences reference the general_keywords taxonomy.  The keyValue in each case is the relevant keyword and the keyName is always RSS.  Again, by standardizing on the convention of using the token "RSS" in the keyName of any general_keywords reference, consistent searches can be performed.  The last keyedReference uses the NAICS taxonomy.  In this case, the keyName is not important, but the keyValue refers to a value from that taxonomy.
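
The requirement that the registered version match the feed itself lends itself to a simple pre-publication check.  The Python sketch below assumes the feed document and an un-namespaced bindingTemplate fragment, shaped like the samples in this paper, are both available as strings; instance_parms_match is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def instance_parms_match(rss_xml: str, binding_xml: str) -> bool:
    """Return True when instanceParms equals the feed's rss version attribute."""
    declared = ET.fromstring(rss_xml).get("version")
    registered = ET.fromstring(binding_xml).findtext(".//instanceParms")
    # A missing version attribute or a missing instanceParms counts as a mismatch.
    return declared is not None and declared == (registered or "").strip()
```

A publisher could run this check before saving a 0.9x binding, so that clients reading instanceParms are not misled about the feed's format.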

Inquiry

The feeds presented in the example are discovered using a two step process. 

Step One: A find_service is issued, parameterized depending on the situation:

  1. Find all RSS 0.9x Feeds. This is the most basic query and will return every service registered against the RSS 0.9x tModel.
    <find_service generic="2.0" xmlns="urn:uddi-org:api_v2" businessKey="">
    	<tModelBag>
    		<tModelKey>uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48</tModelKey>
    	</tModelBag>
    </find_service>
    								
  2. Find all RSS 0.9x Feeds categorized with the general keyword "movie" and the NAICS classification for "movie".  Note that because no findQualifier was specified, there is a logical AND between all the categoryBag keyedReferences. 
    <find_service generic="2.0" xmlns="urn:uddi-org:api_v2" businessKey="">
        <categoryBag>
            <keyedReference tModelKey="uuid:a035a07c-f362-44dd-8f95-e2b134bf43b4" keyName="RSS" keyValue="movie" />
            <keyedReference tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2" keyName="Motion Picture and Video Industries" keyValue="5121" />
        </categoryBag>
    	<tModelBag>
    		<tModelKey>uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48</tModelKey>
    	</tModelBag>
    </find_service>
    
    				
  3. Find all RSS 0.9x Feeds categorized with the general keyword "movie", or the NAICS classification for "movie".  Note that the orAllKeys findQualifier was specified and there is a logical OR between all the categoryBag keyedReferences. 
    <find_service generic="2.0" xmlns="urn:uddi-org:api_v2" businessKey="">
        <findQualifiers>
    		<findQualifier>orAllKeys</findQualifier>
        </findQualifiers>
        <categoryBag>
            <keyedReference tModelKey="uuid:a035a07c-f362-44dd-8f95-e2b134bf43b4" keyName="RSS" keyValue="movie" />
            <keyedReference tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2" keyName="Motion Picture and Video Industries" keyValue="5121" />
        </categoryBag>
    	<tModelBag>
    		<tModelKey>uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48</tModelKey>
    	</tModelBag>
    </find_service>

Each of these queries will return a serviceInfos collection, which contains the service name and the service key.  This list can be displayed to the user and/or used to retrieve one or more of the actual feed locations from UDDI.
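
The three queries above differ only in their categoryBag and findQualifiers, so a client might build them with one helper and read the result with another.  The following Python sketch is illustrative: both function names are hypothetical, the optional businessKey attribute shown in the samples is omitted, and the response is assumed to be a UDDI Version 2 serviceList (possibly still wrapped in its SOAP envelope).  Sending the request over SOAP is left out.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Tuple

UDDI_V2_NS = "urn:uddi-org:api_v2"
RSS_09X_TMODEL = "uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48"


def build_find_service(tmodel_key: str,
                       references: Iterable[Tuple[str, str, str]] = (),
                       match_any: bool = False) -> str:
    """Serialize a find_service query; references are (tModelKey, keyName, keyValue)."""
    # The xmlns attribute is written literally so the output is namespaced.
    query = ET.Element("find_service", {"generic": "2.0", "xmlns": UDDI_V2_NS})
    if match_any:
        # orAllKeys turns the default AND between keyedReferences into an OR.
        qualifiers = ET.SubElement(query, "findQualifiers")
        ET.SubElement(qualifiers, "findQualifier").text = "orAllKeys"
    references = list(references)
    if references:
        bag = ET.SubElement(query, "categoryBag")
        for ref_tmodel, key_name, key_value in references:
            ET.SubElement(bag, "keyedReference", tModelKey=ref_tmodel,
                          keyName=key_name, keyValue=key_value)
    tmodel_bag = ET.SubElement(query, "tModelBag")
    ET.SubElement(tmodel_bag, "tModelKey").text = tmodel_key
    return ET.tostring(query, encoding="unicode")


def parse_service_list(response_xml: str) -> List[Tuple[str, str]]:
    """Extract (serviceKey, name) pairs from a serviceList response."""
    ns = {"uddi": UDDI_V2_NS}
    root = ET.fromstring(response_xml)
    return [(info.get("serviceKey"), info.findtext("uddi:name", namespaces=ns))
            for info in root.iterfind(".//uddi:serviceInfo", ns)]
```

Each serviceKey returned by parse_service_list can then be used in the second step to retrieve the bindings for that service.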

Step Two: The next step is to actually get the URL for the feed as well as any other information about the feed, such as the version of RSS it is using.  Use a find_binding to do this.  This API will return all the contents of the binding.  The tModelKey will need to be passed again to differentiate any bindings that may exist under that service which are not RSS Feeds.  This second retrieval query would look as follows:

<find_binding generic="2.0" serviceKey="..." xmlns="urn:uddi-org:api_v2">
    <tModelBag>
		<tModelKey>uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48</tModelKey>
    </tModelBag>
</find_binding>
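
Once the find_binding response arrives, the feed URL and version can be read from it.  The Python sketch below assumes the response is a UDDI Version 2 bindingDetail message available as a string; parse_feed_bindings is a hypothetical helper, and comparing tModelKeys case-insensitively is a defensive assumption rather than something this paper specifies.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

UDDI_V2_NS = "urn:uddi-org:api_v2"
RSS_09X_TMODEL = "uuid:8a056b70-bfe8-4fac-90cd-820c26dc2e48"


def parse_feed_bindings(response_xml: str,
                        tmodel_key: str = RSS_09X_TMODEL
                        ) -> List[Tuple[Optional[str], Optional[str]]]:
    """Return (feed URL, RSS version) pairs for bindings that use tmodel_key."""
    ns = {"uddi": UDDI_V2_NS}
    root = ET.fromstring(response_xml)
    feeds = []
    for binding in root.iterfind(".//uddi:bindingTemplate", ns):
        url = binding.findtext("uddi:accessPoint", namespaces=ns)
        path = "uddi:tModelInstanceDetails/uddi:tModelInstanceInfo"
        for info in binding.iterfind(path, ns):
            # Only the tModelInstanceInfo for the RSS tModel carries the version.
            if info.get("tModelKey", "").lower() == tmodel_key.lower():
                version = info.findtext(
                    "uddi:instanceDetails/uddi:instanceParms", namespaces=ns)
                feeds.append((url, version))
    return feeds
```

An aggregator could pass each returned URL to its RSS parser and use the version, where present, to choose how to interpret the feed.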
		

Additional Resources

RSS 1.0 Specification. http://purl.org/rss/1.0/spec

UDDI on MSDN. http://msdn.microsoft.com/uddi

UDDI Specifications (all versions). http://uddi.org/specfications.html

Registering and Discovering RSS Feeds in Microsoft Windows Server 2003 UDDI Services. http://winfx.members.winisp.net/karstenj/docs/rss_in_uddi_services.aspx