Topic Map Applications

Note

This is the text of a paper written in 2001. It is here primarily for historical interest (hello future historians!). You may find that some of the links and projects referred to in this paper are now defunct - that is just the nature of the Web. If you own or are still involved in any of the projects in this paper and would like to provide an update, I'd be interested to hear from you!

Introduction

Topic maps provide a standardized way of representing structured information as a set of resources grouped around topics, and as relationships between those topics. The topic map concept is described in detail in the ISO standard ISO/IEC 13250:2000 [ISO13250], which also specifies a standard interchange syntax based on SGML and HyTime. The XML Topic Maps (XTM) 1.0 specification developed by TopicMaps.Org defines an interchange syntax based on XML and XLink. Despite the relatively new status of these standards (the ISO specification was published in January 2000, with the XTM specification being completed in February 2001), commercial and non-commercial implementations of so-called 'topic map engines' are already being offered. These 'engines' have certain features in common - all provide the means to persistently store data structured as topic maps and to retrieve that data programmatically via an API. Other commercial and non-commercial applications of those engines are also available, including applications to present topic map data in a user-friendly browser interface and applications to manually or automatically create topic maps from data sources.

This paper does not review the current crop of topic map software, but instead is intended as a review of real-life applications of the topic map standard to business problems. All of the examples in this paper are extant and are either in production or in trial for production. The goal of this paper is to demonstrate that topic maps can be, and are being, used to effectively solve problems in the business world and to try to reveal some of the reasons for this approach being adopted by the developers of the solutions.

This paper was written from research carried out amongst a number of consultants, vendors and users of topic map technology.

Publishing Solutions

One of the stated goals of the XTM specification is "to improve the findability and manageability of information", and one of the earliest and most obvious applications of topic maps has been to the organisation and navigation of large quantities of published information. In the solutions which fall into this category, the topic map typically acts as a highly structured index of statically published information, relating each piece of information to the topic or topics to which it is relevant and then interrelating those topics. For an end-user, the result is a highly intuitive set of cross-linked resources which make it easier to browse and search for information contained within the corpus.

The IRS Tax Publications

The United States Internal Revenue Service (IRS) produce a large quantity of documentation on tax requirements and tax legislation which are important to individuals and organisations both large and small. However, each publication is produced and indexed separately. When gathering a collection of publications on a single CD-ROM, the IRS found that they had the problem not only of unifying the indexes but also of providing access to the unified index from within the content of each of the publications. The "2001 Tax Products" CD-ROM produced by the IRS makes use of topic maps to create a navigation layer that spans all of the publications on the CD-ROM. The "entry" to the collection is a traditional contents list, showing the publications on the CD-ROM (see Figure 1).

/images/papers/03-05-01-fig01.jpg

Figure 1: IRS 2001 Tax Products Contents

Each publication then has its own contents page which includes a link to an index generated from the names of the topics found by processing the source of that document (see Figure 2). Entry to the index is also provided from within the document and is presented as links to the left of the content (see Figure 3). These links are generated when an occurrence of a topic within the content is located. The links then take the user to a page generated for the topic itself. This page aggregates all of the occurrences of the topic across the entire CD-ROM, allowing the user quick access to a whole range of information related to the index term. In the example shown below (Figure 4) we can see how browsing the term "Estimated Tax" can be used to jump to information for small business, the self-employed or for business partners - the publication titles and the content fragments displayed in the index enable the user to quickly choose the most relevant documents.

/images/papers/03-05-01-fig02.jpg

Figure 2: Contents Of A Single Publication

/images/papers/03-05-01-fig03.jpg

Figure 3: Index Terms In Publication Content

/images/papers/03-05-01-fig04.jpg

Figure 4: Links To Occurrences Of The Index Term (Topic)

The flexibility of the topic map structure enables the index information to carry more than just the locations of occurrences of each index term. Additional relationships to other index terms can also be included within the topic map, allowing both a hierarchical organisation of the index terms as well as facilitating cross-references between terms (see Figure 5).

/images/papers/03-05-01-fig05.jpg

Figure 5: Index Term Hierarchy

The topic map and the rendering to HTML were created by consultant Michel Biezunski, using his tool Topic Map Loom. The starting point for generating the topic map was the original source documents, which are in SGML. Michel uses Topic Map Loom to allow the automatic extraction of topics and occurrences of the topics from the source documents. The topic map generated by doing this is then used to generate the navigational aspects of the rendered HTML. In general, topic maps are particularly well suited to situations where a corpus is composed of a set of separately indexed pieces. Topic maps were designed from the start to be mergeable - the individual topic maps for each separate publication can be combined using the merging principles of the topic map paradigm to create a single global topic map.
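The merging of per-publication indexes into a single global index can be illustrated with a minimal sketch. It is not how Topic Map Loom works internally; the `Topic` class, the `merge_topic_maps` function and the file names are hypothetical, and the sketch assumes that two topics are the same subject whenever their names match after normalisation, which is only the simplest of the merging rules.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    """A topic with a display name and the locations where it occurs (hypothetical model)."""
    name: str
    occurrences: set[str] = field(default_factory=set)


def merge_topic_maps(*topic_maps: dict[str, Topic]) -> dict[str, Topic]:
    """Merge several topic maps into one, treating equal normalised names as one subject."""
    merged: dict[str, Topic] = {}
    for topic_map in topic_maps:
        for topic in topic_map.values():
            key = topic.name.strip().lower()
            # Create the merged topic on first sight, then pool the occurrences.
            target = merged.setdefault(key, Topic(topic.name))
            target.occurrences |= topic.occurrences
    return merged


# Two separately indexed publications (file names are made up for illustration).
pub_a = {"estimated tax": Topic("Estimated Tax", {"pub-a.html#ch2"})}
pub_b = {
    "estimated tax": Topic("Estimated Tax", {"pub-b.html#ch4"}),
    "depreciation": Topic("Depreciation", {"pub-b.html#ch9"}),
}

global_map = merge_topic_maps(pub_a, pub_b)
for topic in global_map.values():
    print(topic.name, sorted(topic.occurrences))
```

After merging, the "Estimated Tax" topic carries occurrences from both publications, which is the information needed to build an aggregated index page of the kind shown in Figure 4.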

The IRS were looking for a technology to enable them to smoothly upgrade their existing, single-document indexes into a collection index suitable for publishing on a CD-ROM. They were also looking for a way to represent not only the index terms and their occurrences but also to model more complex relationships between index terms to help guide a user through the documentation. Both of these needs could be met with a solution based on representing these structures with a topic map. The use of the topic map also provides the publishers the flexibility to later enrich the navigation with further cross-references and external references without the need to alter the underlying technology and processes used to generate that navigation.

Cogitech - Topic Maps And XSLT

Cogitech specialize in the production of low-cost intranet and internet web-sites using a combination of XML technologies in which topic maps are used extensively to create the structure of the web-site. As with the Quid web-site, the web pages are generated from topics in the web-site's topic map. However, on the Cogitech web-site (http://www.cogx.com), not only is each page represented by a topic in the topic map, but the pages themselves are constructed by traversing associations from the 'page' topic to other topics, each of which becomes an item on the page. Another difference here is the use of XSLT to transform the structural information in the topic map into navigable links in the published web-site. Topic maps were chosen by the site's designer and implementer, Nikita Ogievetsky, because of the ability to not only represent the web-site as a non-directed network of display elements but also to encode style and behaviour information within the same information structure. Additionally, he found the XML interchange syntax for topic maps to be sufficiently simple to be readily handled with XSLT.

In addition to using XSLT to generate web-site content from topic maps, Nikita has also developed reusable style-sheets for harvesting topic map information from other meta data forms. The use of XSLT for producing the hyper-linked pages reduces the cost of production by enabling the use of free tools.

Before applying topic maps to this problem, Nikita investigated the use of XLink but found that it was not sufficiently expressive to represent the styling and behavioural aspects of the site. He also found that RDF and RDF Schema caused different problems - although both are sufficiently expressive, each different site organisation would require a different RDF schema with a different set of elements - thus requiring more custom XSLT code.

Web Application Development

Publishing solutions such as those described in the previous section might be applied both to paper and to on-line distribution of information. However, web applications are not necessarily limited to the publication of relatively static information sets and may encompass far more dynamic complexity, such as the dynamic creation of information objects such as documents, users, and groups; and of new information containers such as projects or subject areas which may be used to organise some or all of the information objects. As the list of types of information objects and information containers grows, so does the complexity of the web application. In addition, a web environment is never static for very long - what may have at first seemed to be the correct profile of attributes for a given information object or container may over time become restrictive. Traditional web publishing systems which make use of some form of fixed back-end schema (e.g. a relational or object-oriented database) make it costly to alter the attributes of the information objects used by the web application. However, a system using a topic map to represent the key information objects can make use of a single back-end schema which is concerned only with representing the topics, the data occurrences of those topics and the associations between them. The precise profile of any given object can be modified simply by altering the topic map. This ability to change the application schema in the data, without the need to modify the underlying database, makes web application development and update far simpler.
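The idea of a single back-end schema can be made concrete with a small sketch using Python's built-in `sqlite3` module. The three tables and the helper functions are assumptions made for illustration (associations are simplified to binary links), not the schema of any product described in this paper.

```python
import sqlite3

# One fixed schema for every kind of information object (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE topic (id TEXT PRIMARY KEY, type TEXT, name TEXT);
    CREATE TABLE occurrence (topic_id TEXT, type TEXT, value TEXT);
    CREATE TABLE association (type TEXT, from_id TEXT, to_id TEXT);
    """
)


def add_occurrence(topic_id, occurrence_type, value):
    """Attach a typed piece of data to a topic."""
    conn.execute(
        "INSERT INTO occurrence VALUES (?, ?, ?)",
        (topic_id, occurrence_type, value),
    )


def profile(topic_id):
    """Return the current attribute profile of a topic as a dict."""
    rows = conn.execute(
        "SELECT type, value FROM occurrence WHERE topic_id = ? ORDER BY type",
        (topic_id,),
    )
    return dict(rows.fetchall())


conn.execute(
    "INSERT INTO topic VALUES (?, ?, ?)", ("doc-1", "document", "Annual Report")
)
add_occurrence("doc-1", "author", "J. Smith")

# Later the application needs a new attribute: no ALTER TABLE, just more data.
add_occurrence("doc-1", "review-date", "2001-11-30")

print(profile("doc-1"))
```

The point of the sketch is that the new "review-date" attribute is introduced purely by inserting a row; the table definitions, and any code written against them, stay unchanged.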

ITU

The Network for IT Research and Competence in Education (ITU) is a part of the Faculty of Education at the University of Oslo which is funded by the Norwegian government with the goal of promoting the application of information technology at all levels of education, from primary school through to teacher training. The remit of the organisation is to act as a coordinator for many different and diverse projects which bring more information technology into education. The ITU web-site, designed and implemented by the Scandinavian consultancy Creuna, presents information about the projects and about the people and organizations involved in those projects, as well as maintaining abstracts of and links to various publications produced by those people and organizations.

The collective information of the ITU web-site is stored in an object-oriented database and accessed using the Zope web-publishing system. Creuna cooperated with Ontopia to develop a tool-kit called Zope Topic Maps (ZTM). ZTM consists of a database schema for Zope which allows topic map objects to be persistently stored in the Zope database, and a Python interface and Zope web-publishing system integration which enables those topic map objects to be included in dynamically generated web pages just like any other object in the Zope system. This tool-kit is being donated to the Open Source community by Creuna and Ontopia (as OpenZTM), who will also continue to sponsor further collaborative development of it.

The ITU web application was then developed around a combination of the Zope Content Management Framework (CMF) and a topic map schema which represents the relationships between projects, organisations, people and publications. On the web-site itself, nearly every page is dynamically generated from a topic in the ZTM database. The page display consists mainly of topic characteristics such as names, occurrences and association roles. The occurrence role and association role types are used to name the headings on the page. For example, from a page describing an individual, links are presented titled "Works for", "Project leader for", "Author of" and so on - these links are not taken from fixed fields in the database but are found by traversing the associations from the topic and using the information about each association to derive the correct title for the link. This means that if at a later date new links are required between the Person topic and some other topics, they can be simply added to the database while the application code itself remains unchanged. Examples of the pages displayed by the web-site are shown below.
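As a rough illustration of how page headings can be derived from association data rather than from fixed database fields, consider the following sketch. It does not use the ZTM API; the data layout, the function names and all of the sample topics are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: (role type label, source topic, target topic).
ASSOCIATIONS = [
    ("Works for", "person-1", "Example Organisation"),
    ("Project leader for", "person-1", "Example Project"),
    ("Author of", "person-1", "Example Report A"),
    ("Author of", "person-1", "Example Report B"),
]


def page_sections(topic_id, associations):
    """Group a topic's associated topics under headings taken from the role type."""
    sections = defaultdict(list)
    for role_label, source, target in associations:
        if source == topic_id:
            sections[role_label].append(target)
    return dict(sections)


def render(topic_id, associations):
    """Render the sections as plain text, one heading per role type."""
    lines = []
    for heading, targets in page_sections(topic_id, associations).items():
        lines.append(heading)
        lines.extend(f"  - {target}" for target in targets)
    return "\n".join(lines)


print(render("person-1", ASSOCIATIONS))

# A new kind of link is added as data only; render() is not modified.
ASSOCIATIONS.append(("Member of", "person-1", "Example Committee"))
print(render("person-1", ASSOCIATIONS))
```

The second call prints an additional "Member of" section even though the rendering code never mentions that heading, which mirrors the behaviour described for the ITU site.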

/images/papers/03-05-01-fig06.jpg

Figure 6 - ITU Web-site: PLUTO project page

/images/papers/03-05-01-fig07.jpg

Figure 7 - ITU Web-site: Page for Morten Søby

Also of interest on the ITU site is the alternative means of navigating between topics - the topics are displayed in the upper-right hand corner as text of differing sizes "floating" in a rectangle, implemented using Macromedia Flash. The text displayed is the names of the set of topics "most closely linked" to the topic displayed on the page - the system uses a mechanism of weighted traversal of associations to determine the topics to be displayed. Mousing over a text string displays the type of the topic (so it is possible to distinguish between a document name and a project name, for example).
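The details of the weighted traversal used on the ITU site are not given here, so the following is only one plausible reading of the idea, offered as a sketch. It assumes that each association type carries a numeric weight, that the strength of a path is the product of the weights along it, and that a topic's score is the sum of the strengths of the paths that reach it within a fixed number of steps. All names, weights and data are hypothetical.

```python
from collections import defaultdict

# Hypothetical weights per association type (higher means "closer").
TYPE_WEIGHTS = {"author-of": 1.0, "part-of": 0.8, "works-for": 0.5}

# Hypothetical associations, treated as undirected: (topic, topic, type).
EDGES = [
    ("person-1", "report-a", "author-of"),
    ("report-a", "project-x", "part-of"),
    ("person-1", "org-1", "works-for"),
    ("org-1", "project-y", "part-of"),
]


def closest_topics(start, edges, weights, max_depth=2, limit=3):
    """Rank topics by accumulated path strength within max_depth steps of start."""
    neighbours = defaultdict(list)
    for a, b, assoc_type in edges:
        weight = weights.get(assoc_type, 0.1)  # unknown types get a low weight
        neighbours[a].append((b, weight))
        neighbours[b].append((a, weight))

    scores = defaultdict(float)
    visited = {start}
    frontier = [(start, 1.0)]
    for _ in range(max_depth):
        next_frontier = []
        for topic, strength in frontier:
            for other, weight in neighbours[topic]:
                if other in visited:
                    continue
                scores[other] += strength * weight
                next_frontier.append((other, strength * weight))
        # Topics reached at this depth are not revisited at deeper levels.
        visited.update(topic for topic, _ in next_frontier)
        frontier = next_frontier

    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


for name, score in closest_topics("person-1", EDGES, TYPE_WEIGHTS):
    print(name, round(score, 2))
```

In a display like the one described, the score could drive the text size of each topic name, with higher-scoring topics drawn larger.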

Stian Danenbarger of Creuna notes that although ITU was the first site developed with the ZTM framework, their initial investment is quickly paying off - "Being our first Zope project, and our first Topic Map project, we still delivered on time and on budget. The customer can maintain and rearrange content, information structure and template layout through the web. The customer is very satisfied." A second topic-map-driven web-site is due to go live in November 2001, with a third, more complex site in the pipeline. In fact, the existing topic map framework meant that, once the design and static HTML were done, the second site was built in just one and a half days. Danenbarger attributes part of this reduced delivery time to the conceptual element of topic maps - "We couldn't come up with a better concept in the couple of weeks we could spend on analysis and design," he says.

When discussing the problems of adopting a new technology such as topic maps, Danenbarger comments that the relatively immature state of the existing tools is still a hurdle to overcome, as is the lack of "best practices" on topic map design. However, by developing their web-site development framework on an open, standards-based technology, Creuna also hope to benefit from synergy with future topic map tools; competence; and from being able to interchange web-site information easily.

CSIR iWorks Ideabank

The Council for Scientific and Industrial Research (CSIR) is South Africa's largest independent research institution. As a research organisation, funded by business and government, CSIR's ability to serve the needs of their stake-holders relies heavily on their ability to innovate and thus on the ability to communicate ideas between people. The CSIR iWorks team were tasked with the creation of an environment within which all research and communication efforts of the information technology division of CSIR could be aggregated, and the Ideabank application has been designed as a collaborative web application to facilitate the sharing of knowledge and ideas within that environment.

The iWorks team started the system design with a set of domain objects that they wished to model with the Ideabank. These domain objects included entities such as Person, Idea and Category. This initial design was done without reference to topic maps or any other form of representation of the domain objects. However, the iWorks team found that the topic map paradigm could be used to easily express the kinds of properties and relationships that they had envisioned for the domain objects and that the possibilities of the topic map structure led to the expression of new relationships at the domain object level. Richard Watson, one of the developers responsible for the system, says:

The process was very iterative. We would think about what we required from our domain objects. We would then look at Topic Maps and see how they could be used to solve each problem, and looked for things we had not thought of. This led to the immediate surfacing of both potential problems and opportunities to mature our design.

However, the domain objects are not only elements of the Ideabank "ontology" but are also manifested as programming objects which "wrap" an underlying topic map implementation. So properties of the entities are mapped to topic map constructs (for example the surname property of a Person object is mapped as a base name of the topic which represents the person).
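The wrapping of a topic by a domain object can be sketched as follows. The Ideabank itself is built on TM4J in Java; this Python sketch uses a hypothetical minimal `Topic` class rather than any real engine API, and the `email` property stored as an occurrence is an added assumption used only to show a second kind of mapping.

```python
class Topic:
    """Minimal stand-in for an engine's topic object (hypothetical)."""

    def __init__(self, topic_id):
        self.id = topic_id
        self.base_names = []   # ordered list of name strings
        self.occurrences = {}  # occurrence type -> value


class Person:
    """Domain object that stores its state in an underlying topic."""

    def __init__(self, topic):
        self._topic = topic

    @property
    def surname(self):
        # The surname is mapped to the first base name of the topic.
        names = self._topic.base_names
        return names[0] if names else None

    @surname.setter
    def surname(self, value):
        # Replace the first base name, or insert it if none exists yet.
        self._topic.base_names[:1] = [value]

    @property
    def email(self):
        # Assumption for illustration: other properties map to occurrences.
        return self._topic.occurrences.get("email")

    @email.setter
    def email(self, value):
        self._topic.occurrences["email"] = value


topic = Topic("person-42")
person = Person(topic)
person.surname = "Example"
person.email = "example@example.org"

# Application code sees a Person; the data lives in the topic.
print(person.surname, topic.base_names, topic.occurrences)
```

Application code works with ordinary properties, while everything that is persisted or interchanged remains plain topic map data.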

As with other on-line publishing applications described earlier in this paper, the topic map structure provided the developers at CSIR with a solid foundation upon which to build their application structures. "Using topic maps gave us a great platform to work on, enabling us to concentrate on higher-level issues." says Watson, "We knew that the TM standards designers knew much more than we did about structure, so we built an extremely light layer to leverage that".

The Ideabank application was built upon a framework of open-source software, using TM4J (described later in this paper) as the topic map processing engine, while maintaining the topic map information itself in a collection of XTM files. The development team have found that their initial efforts have paid off very quickly, with end users quickly grasping the advantage of the application and getting started with it easily. "Consider that our initial project was related to idea generation," says Watson. "In terms of demonstrating a new way of thinking, topic maps have already paid off. Almost every demonstration has resulted in someone saying 'I can use this', and at least a few very good ideas have resulted."

Patrimoine - Financial Documentation

The project described here has been undertaken on behalf of a large financial institution by Patrimoine, a provider of professional and legal documentation in collaboration with Mondeca, making use of the Intelligent Topic Manager (ITM) product developed by the latter. The client is a major financial institution which provides a wide range of financial products and services. The project itself is an intranet site for company employees which enables them to quickly access a wide range of information including:

  • Professional and legal documentation provided by Patrimoine.
  • Information about the products and services offered by the institution, and of those offered by their competitors.
  • Business process documentation, such as the procedures for opening and closing accounts.
  • Financial simulators - Java applets used to model financial scenarios.

The documentation itself is held in a content management system (in this case, Documentum). The ITM system defines several different organisational "views" of the data, including a thesaurus, a hierarchical classification scheme and an index of business subjects (such as products, corporate entities and people). The business subjects are further organised into small knowledge bases, recording the relationships between them - such as the modules which comprise a larger financial product; the organisation of business processes and the physical arrangement of documentation (in books, chapters and sections). Much of the classification and thesaurus information is derived directly from the documentation sources. In the prototype system, the business knowledge links are entered by hand, but in the final system, these links will come from a separate database.

The ITM keeps track of all of the topics and the relationships between topics using an internal representation of the topic map graph which represents all of these indexes. Each topic is also stored within the content management system in its XML representation, enabling the client to use standard XML editing tools, or the standard programming interface to the content management system to edit and update the information resources referenced from each topic.

To the end user, all of this information is made accessible via a portal. This portal makes use of a text-based user-interface to the internal topic map managed by ITM which is very similar to the other text interfaces presented in this section (see Figure 8 and Figure 9).

/images/papers/03-05-01-fig08.jpg

Figure 8 - Topics Associated With "Personal Insurance"

/images/papers/03-05-01-fig09.jpg

Figure 9 - Third-party Legal Documentation

In addition to being able to browse the topic map through the portal, the user can also edit the topic being browsed, either to add some new information to it or to make an association link to one or more other topics. The system is implemented to restrict the different kinds of associative links that can be made between topics, but those restrictions are simply configuration options - the underlying topic map-based system can support any number of different types of relationships.
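The distinction between configured restrictions and the unrestricted underlying model can be sketched as below. This is not the ITM configuration format; the dictionary layout, the topic types and the function names are all hypothetical.

```python
# Hypothetical configuration: association type -> (type of first topic, type of second).
ALLOWED_ASSOCIATIONS = {
    "composed-of": ("product", "module"),
    "documented-in": ("product", "document"),
}

# Hypothetical topics and their types.
TOPIC_TYPES = {
    "life-plan": "product",
    "savings-module": "module",
    "terms-2001": "document",
}


class AssociationNotAllowed(ValueError):
    """Raised when the configuration does not permit an association."""


def add_association(store, config, topic_types, assoc_type, first, second):
    """Append the association to the store if the configuration allows it."""
    allowed = config.get(assoc_type)
    actual = (topic_types[first], topic_types[second])
    if allowed != actual:
        raise AssociationNotAllowed(f"{assoc_type} not allowed between {actual}")
    store.append((assoc_type, first, second))


associations = []
add_association(associations, ALLOWED_ASSOCIATIONS, TOPIC_TYPES,
                "composed-of", "life-plan", "savings-module")

try:
    add_association(associations, ALLOWED_ASSOCIATIONS, TOPIC_TYPES,
                    "composed-of", "life-plan", "terms-2001")
except AssociationNotAllowed as error:
    print("rejected:", error)

print(associations)
```

Permitting a new kind of link is then a matter of adding an entry to the configuration; the list that holds the associations imposes no restriction of its own.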

For the end users in the financial institution, this portal system brings all of the most recent and most relevant data quickly to hand - improving productivity in sales and consulting. For Patrimoine, the benefit comes from the ability to easily integrate all of their existing documentation indexes into a single database and to have that integration done in such a way that the system can be quickly tailored to the specific needs of each client.

Application Development

Just as web applications can benefit from the fact that a topic map stores the application schema as data, so may any other application which must deal with structured information. Again, for the developer the advantages are:

  • A data model which typically matches the decomposition of application design into a set of interacting objects.
  • A data model which can be modified simply by altering the data which provides the application schema, and without the need to re-compile or re-populate database tables.
  • A single API for accessing the data.

These advantages can be applied in two ways. Firstly, even when the complete data model of the application is known at design time, the use of topic maps makes it easier to modify the design as development progresses and to later refine the design without requiring users to upgrade their database. However, the flexibility of the topic map approach also enables applications to be developed in which the complete application data model is not known at design time but is configured or created by the end user.

Bravo - Knowledge Management with Topic Maps

Bravo, developed by GlobalWisdom Inc., is a knowledge management tool which combines both explicit and automated categorization of documents with user-feedback to determine the relevance of documents to any given query. The operation of Bravo is more complex than an automated categorisation of documents, as the system learns what documents are relevant to a particular query by both explicit feedback from the users (where a user rates a result for its relevance) and by implicit feedback (where the system determines the document's relevance to the user's query based upon what actions they do or do not perform on the document). Additionally, the Bravo system enables individual users to be recognised as experts in particular subject areas and other users to modify their result sets to promote the documents that the 'expert' claims to be most relevant. What is different in Bravo is that this expert may never have explicitly marked documents as relevant; instead, the user is filtering the result set according to the expert's profile, which, in turn, is built through the expert's interactions with the software. This would include many documents that the expert may never have seen, but that Bravo has, using the profile, identified as relevant.

The information about documents; the concepts related to the documents; the 'experts' in the different conceptual areas; and the relationship between concepts are all stored by Bravo using a topic map. Bryan Thompson, CEO of GlobalWisdom, says "We use a topic map engine to provide a sophisticated and scalable information architecture. This serves as a differentiator for us and helps us to address more upscale markets." GlobalWisdom chose not to implement their own topic map engine but instead to make use of the K42 engine from Empolis. "You could consider us a value-add for a topic map engine, and vice versa," comments Thompson. Naturally, other means of representing the information structures required by Bravo are available and were considered - including directory servers and database systems, but topic maps were chosen because of the richness of the basic information architecture that they provide.

For GlobalWisdom, using topic maps as their underlying architecture and choosing to use a third-party engine has enabled them to bring a complex, scalable solution to market more rapidly than they would otherwise have been able to. For customers of GlobalWisdom, although they need never be aware of the topic map technology that hides beneath the Bravo application, the topic or concept-centric organisation provides faster, more accurate access to relevant information, reducing the amount of time that people need to spend searching for the information that enables them to do their work.

Application Integration

One of the major challenges facing an application developer when confronted with the requirement to integrate two or more information systems is the diversity of means of representing meta data that different systems use. Even though two systems may refer to the same entity - they may each provide different meta data about that entity in different forms, possibly with some overlaps and possibly not. With just two systems to integrate, this can be a hard problem. With three, four or more systems, it rapidly becomes infeasible unless the meta data is first mapped into some common meta data set. Topic maps provide just the kind of flexible data structure that is required for representing this meta data and can supplement the meta data with links back into the data sources themselves.
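Mapping differently shaped meta data into a common set can be sketched as follows. The two source systems, their record layouts and the locator strings are invented for illustration; the only assumption carried over from the text is that both systems refer to the same entity and that the common form keeps links back to the sources.

```python
def connect_defect_tracker(row):
    """Map a record from a hypothetical defect tracker into the common topic form."""
    return {
        "subject": row["component"].strip().lower(),
        "name": row["component"],
        "occurrences": [("defect", f"defect-tracker:{row['bug_id']}")],
    }


def connect_requirements(item):
    """Map a record from a hypothetical requirements system into the common topic form."""
    return {
        "subject": item["module"].strip().lower(),
        "name": item["module"],
        "occurrences": [("requirement", f"requirements:{item['req']}")],
    }


def integrate(topics):
    """Merge topics from all connectors that describe the same subject."""
    merged = {}
    for topic in topics:
        entry = merged.setdefault(
            topic["subject"], {"name": topic["name"], "occurrences": []}
        )
        entry["occurrences"].extend(topic["occurrences"])
    return merged


# The same entity described differently by the two systems.
defect_rows = [{"bug_id": 101, "component": "Billing", "summary": "Rounding error"}]
requirement_items = [{"req": "R-7", "module": "billing", "text": "Support VAT"}]

topic_map = integrate(
    [connect_defect_tracker(row) for row in defect_rows]
    + [connect_requirements(item) for item in requirement_items]
)
print(topic_map)
```

Each additional system needs only its own mapping function; code that reads the integrated `topic_map` does not need to know how many sources contributed to it.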

When applying topic maps to this particular problem, an effective design pattern is to create a common meta data definition as a topic map schema and then write one 'connector' application for each information system. The connector application simply maps the meta data from the information system into the appropriate topic map constructs. The integrator can now treat each information system as a provider of topic map data using a well-known topic map schema and can develop applications against a single API which accesses this topic-mapped information.

Starbase

Starbase Corporation is a leading provider of collaboration solutions for business application management. Starbase offers a family of user-friendly software products that enable teams of people to collaborate in the development and management of Web sites, e-commerce and business critical applications. The flagship product of Starbase is the StarTeam repository, which provides source code control, defect tracking, change management, task management and workflow functionality. Starbase is using topic map technology as a central element in a new product architecture which focuses on extracting value from information. As a repository, Starbase's software can hold a huge amount of information - some customers have StarTeam repositories in excess of 30GB for a single server. Exponential growth in hardware capabilities, such as processing power, storage capacity, and network connectivity, will offer still greater capacity to create information. All of this available information is described by Starbase as an "Information Tsunami".

Starbase's new information architecture creates a unified view into a "technical collaboration space", providing access to all information resources, regardless of the product and project repository where they physically reside, be it a StarTeam repository, CaliberRM (a web-based system for managing project requirement information), Microsoft Exchange, custom information systems, and so on. The technical implementation is based upon topic map technology delivered by Ontopia. Within the system architecture, topic map technology is used as a core infrastructure technology to enable collaboration between diverse information systems. The use of topic maps enables Starbase to develop functionality such as:

  • finding information based on semantic queries
  • guiding traditional text search engines
  • providing intuitive navigation of information spaces
  • providing a framework for "knowledge events"

For Starbase, the choice to use topic map technology was made after careful evaluation of many different technologies and standards. Their decision was influenced by a number of factors. Most importantly, topic maps are a standards-based technology making use of XML, thus providing advantages in interoperability and the ability to leverage existing implementations. In addition, Starbase believes that the flexibility and simplicity of the topic map architecture permits applicability of this technology far beyond its original design objectives - thus ensuring its status as a long-lived technology.

Of course, being on the "bleeding edge" created challenges in getting corporate buy-in - especially as topic maps, being a new technology, have yet to provide many demonstrable cases of significant business value being derived from their application. The team responsible for developing the new system architecture worked hard to produce a proof-of-concept system in order to demonstrate value to their management.

Being a supplier of mission-critical software development support systems, the primary aim of Starbase is to deliver benefits to their customers in the form of more efficient knowledge-working. The new, topic map based, information architecture should provide their customers with a return on investment right away, as users benefit from being able to find existing information with much less effort than is required today. The topic map system also provides a framework for supporting "knowledge event" detection, which allows relevant information to be delivered to users without requiring an explicit search. Future developments will further integrate other development tools, allowing these tools to make better use of the domain knowledge held in the system. Although the primary objective of this approach is to increase the personal and team efficiency for software development processes, the number of dependencies between those individuals and teams and the rest of the organization means that the benefits will spill over into many parts of the enterprise.

Open Source Efforts

The previous sections of this paper have concentrated specifically on commercial and, for the most part, closed-source applications. However, this author believes that it is important for a newcomer to topic maps not to be led to think that there is no application of topic maps in the open-source world. As with many open source projects, commercial reasons alone are not the justification for the production of open source topic map software or open source software using topic maps - in many cases, these projects "scratch an itch" felt by the author or authors. This section briefly presents a selection of open source and free software projects which make use of topic map technology.

TM4J

TM4J is a topic map processing tool-kit written in Java. The basic TM4J tool-kit provides a programming interface to parse and load topic maps into memory or into persistent storage in an object-oriented database; to manipulate the topic map, including searches against a variety of different indexes of the topic map structure; and finally to serialise the topic map or parts of the topic map in XTM syntax.
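
The load, index and serialise cycle described above can be sketched in a few lines. The following is an illustrative Python sketch only and is not the TM4J API: the TopicMap class, its method names and the sample topics are all hypothetical, and only the element names and namespaces are taken from XTM 1.0.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"


class TopicMap:
    """Hypothetical in-memory topic map (NOT the TM4J API)."""

    def __init__(self):
        self.topics = {}                 # topic id -> {"name": str, "types": [ids]}
        self.by_type = defaultdict(set)  # index: type id -> ids of typed topics

    def add_topic(self, topic_id, name, types=()):
        self.topics[topic_id] = {"name": name, "types": list(types)}
        for type_id in types:
            self.by_type[type_id].add(topic_id)

    def topics_of_type(self, type_id):
        """Answer a query from the type index instead of scanning every topic."""
        return sorted(self.by_type.get(type_id, ()))

    def to_xtm(self):
        """Serialise the whole map as an XTM 1.0 <topicMap> document string."""
        ET.register_namespace("", XTM_NS)
        ET.register_namespace("xlink", XLINK_NS)
        root = ET.Element(f"{{{XTM_NS}}}topicMap")
        for topic_id, data in self.topics.items():
            topic = ET.SubElement(root, f"{{{XTM_NS}}}topic", {"id": topic_id})
            for type_id in data["types"]:
                instance_of = ET.SubElement(topic, f"{{{XTM_NS}}}instanceOf")
                ET.SubElement(
                    instance_of,
                    f"{{{XTM_NS}}}topicRef",
                    {f"{{{XLINK_NS}}}href": "#" + type_id},
                )
            base_name = ET.SubElement(topic, f"{{{XTM_NS}}}baseName")
            name = ET.SubElement(base_name, f"{{{XTM_NS}}}baseNameString")
            name.text = data["name"]
        return ET.tostring(root, encoding="unicode")


# Sample data, invented for illustration.
tm = TopicMap()
tm.add_topic("software", "Software")
tm.add_topic("tm4j", "TM4J", types=["software"])
print(tm.topics_of_type("software"))
print(tm.to_xtm())
```

A real engine would add associations, occurrences, scopes and persistent storage; the sketch keeps only typed topics with a single base name so that the index and the serialisation step stay visible.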

The major focus of the TM4J project has been on developing the back-end processing systems for higher-level applications. However, as the project matures and expands, plans are being made for developing tool-kits for producing web-sites from topic maps with little or no coding effort, and for developing a full-featured topic map editing environment.

TM4J is developed and maintained by the members of the TM4J project and is distributed under the Apache Software Foundation license. For more information, see http://www.tm4j.org/.

The GooseWorks Topic Map Toolkit

The GooseWorks topic map tool-kit provides a topic map processing "engine" and API in C, with a wrapper API also available in Python. The tool-kit implements the graph-based topic map processing model proposed by Newcomb and Biezunski [PMTM4], and supports persistent storage of the processed topic map in a variety of relational databases. The tool-kit is designed not only to import XTM files, but to be capable of processing any kind of markup which contains inherent topic map information, such as NewsML, DocBook or RDF with Dublin Core, if the appropriate processing model is provided. Additionally, the tool-kit provides a suite of command line tools for basic topic map manipulations such as merging and filtering.

The tool-kit has been developed by Jan Algermissen (algermissen@acm.org) and Sam Hunting (gw@etopicality.com). The project home page is http://www.goose-works.org/. An HTML topic map browser based on GWTK is available online at http://www.topicmapping.com/v/V.cgi.

TMTab

TMTab is a plug-in for Protégé-2000, an ontology creation tool, which enables an ontology created with Protégé to be exported using XTM syntax. By following some simple rules in the creation of the ontology, the ontology designer can control the way in which instances of the classes of the ontology and relationships between the instances are mapped into the generated topic map. Additionally, Protégé provides the user with a simple form-based interface for each class in the ontology - making the task of manually creating topic maps far simpler.

TMTab is developed and maintained by Kal Ahmed (kal@techquila.com) and is freely downloadable from http://www.techquila.com/tmtab.html

Nexist

The Nexist project is an experimental test-bed for the development of an effective API for client-server architectures within the vision of Douglas Engelbart's Open Hyperdocument System [OHS]. Nexist is experimenting with an API that is based on the XML Topic Maps (XTM) specification. Along the way, Nexist includes projects which explore a constructivist approach to education, one in which learning occurs while exploring propositions made to answer focus questions, and presenting arguments both supporting and countering the proposed ideas.

Nexist is developed and maintained by Jack Park (jackpark@thinkalong.com). For more information, find the project at http://nexist.sourceforge.net/.

SemanText

SemanText is an open-source semantic network application developed in Python. The application uses the topic map structure to hold the knowledge managed and processed by the system - topics represent nodes in the semantic network and associations represent links between the nodes. The topic types are also used to construct a class-hierarchy which enables the application to apply inference rules to the semantic network and so derive further knowledge which can be added to the knowledge base simply by creating more topics and associations.
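
To make the idea of type-driven inference concrete, the sketch below applies one simple rule: a topic that is an instance of a type is also an instance of every superclass of that type. This is an assumed rule chosen for illustration, not a description of SemanText's actual rule set; the function name and the sample topics are hypothetical.

```python
def infer_types(instance_of, subclass_of):
    """Return the (topic, type) pairs implied by the class hierarchy
    that are not already asserted.

    instance_of -- set of (topic, type) pairs
    subclass_of -- set of (subclass, superclass) pairs
    """
    derived = set()
    for topic, direct_type in instance_of:
        # Walk upwards from the asserted type; 'seen' guards against cycles.
        pending = [direct_type]
        seen = set()
        while pending:
            current = pending.pop()
            if current in seen:
                continue
            seen.add(current)
            for sub, sup in subclass_of:
                if sub == current:
                    pending.append(sup)
        for reached in seen:
            if (topic, reached) not in instance_of:
                derived.add((topic, reached))
    return derived


# Sample knowledge base, invented for illustration.
asserted = {("tm4j", "java-toolkit")}
hierarchy = {("java-toolkit", "toolkit"), ("toolkit", "software")}
new_facts = infer_types(asserted, hierarchy)
print(sorted(new_facts))
```

Each derived pair could then be written back to the knowledge base as an ordinary association, which is what allows the inferred knowledge to live in the same structure as the asserted knowledge.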

SemanText includes the ability to harvest topics and associations from structured text such as XML and RDF files. A natural language processing component allows flowing text to be processed in a similar fashion.
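
Harvesting from structured text amounts to mapping elements and attributes onto topics and associations. The sketch below shows the idea for a small invented XML vocabulary; the element names, the "written-in" association type and the harvest function are all assumptions made for illustration and do not reflect SemanText's own harvesting code.

```python
import xml.etree.ElementTree as ET

# Invented sample input; not a format used by any of the projects described.
SAMPLE = """
<projects>
  <project id="tm4j" language="java"><name>TM4J</name></project>
  <project id="semantext" language="python"><name>SemanText</name></project>
</projects>
"""


def harvest(xml_text):
    """Turn each <project> into a topic, each language into a topic,
    and each project/language pairing into a 'written-in' association."""
    topics = {}        # topic id -> display name
    associations = []  # (association type, project id, language id)
    for project in ET.fromstring(xml_text).iter("project"):
        project_id = project.get("id")
        topics[project_id] = project.findtext("name")
        language = project.get("language")
        topics.setdefault(language, language)
        associations.append(("written-in", project_id, language))
    return topics, associations


topics, associations = harvest(SAMPLE)
print(topics)
print(associations)
```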

SemanText is developed and maintained by Eric Freese and the project home-page is http://www.semantext.com/.

Summary

Although a relatively new technology, topic maps are already finding a number of practical uses in both information publication, and application development and integration. Some of the benefits of using topic maps, as expressed by respondents to the research for this paper, include:

  • Expressive data structure - makes it easy to represent the structure of the information in the problem domain.
  • Intuitive model - Both developers and users find the concept-centric approach to structuring information to be a natural one. For developers this makes the data design portion of the application simpler. For users it can make navigation and searching operations more effective.
  • Data-driven, flexible schema - Both the data and the schema information for the data are kept in the same information structure. Because the schema is data driven and not the result of a compilation process (as is the case with RDBMS tables), the schema can be refined or radically altered at run-time.
  • Simple serialized format - XTM is sufficiently simple that it can be readily processed with XSLT, either for publishing a topic map to a set of HTML pages for browsing; or for generating topic maps from other sources of XML-based meta data.
  • Explicit rules for combination - Topic maps are designed to be merged - enabling sharing of information structures not only between businesses but between information systems. Thus, topic maps can be used as an effective information systems integration technology.
  • Implementations available - engine, creation, editing and publishing applications are all available as either commercial or non-commercial applications. For a systems integrator or application developer, this provides the ability to begin working at the "high level" with the information structures themselves rather than having to get involved with the implementation of serialization or parsing for example.
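
The "explicit rules for combination" point above can be illustrated with a minimal sketch of subject-based merging: topics from different maps that share a subject identifier are treated as the same topic, and their names and occurrences are combined. This is a simplification under stated assumptions. The dictionary representation and the merge function are hypothetical, the sample names are invented, and name-based merging and scope are ignored entirely.

```python
def merge(*topic_maps):
    """Merge topic maps by subject identity.

    Each topic map is a list of topics; each topic is a dict holding
    sets under the keys 'subjects', 'names' and 'occurrences'.
    """
    keys = ("subjects", "names", "occurrences")
    merged = []
    for topic_map in topic_maps:
        for topic in topic_map:
            combined = {key: set(topic[key]) for key in keys}
            remaining = []
            for existing in merged:
                if existing["subjects"] & combined["subjects"]:
                    # Same subject: fold the existing topic into the new one.
                    for key in keys:
                        combined[key] |= existing[key]
                else:
                    remaining.append(existing)
            merged = remaining + [combined]
    return merged


# Two small maps from different sources, invented for illustration.
map_a = [
    {"subjects": {"http://www.tm4j.org/"}, "names": {"TM4J"},
     "occurrences": {"http://www.tm4j.org/"}},
]
map_b = [
    {"subjects": {"http://www.tm4j.org/"}, "names": {"TM4J tool-kit"},
     "occurrences": set()},
    {"subjects": {"http://www.semantext.com/"}, "names": {"SemanText"},
     "occurrences": set()},
]
result = merge(map_a, map_b)
print(len(result))
```

Because the merge is driven purely by the data, neither source system needs to know about the other's schema in advance, which is the property that makes the approach attractive for systems integration.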

As with any new technology, it can be difficult to make the initial business case for applying topic maps. It is to be hoped that some of the case-studies provided here show that an initial investment can quickly deliver a return and, more importantly, open up new technical avenues for the creation of additional business value.

Contact Information

Each of the applications described in this paper is a 'real-world' application. Many of the people responsible for those applications are present at this conference. For future reference, however, there follows an alphabetical listing of the companies involved in the solutions described above.

Company | Web-site | Contact Person | Contact Email
Cogitech, Inc. | http://www.cogx.com | Nikita Ogievetsky | nogievet@cogx.com
Creuna | http://www.creuna.com | Stian Danenbarger | stian.danenbarger@creuna.no
CSIR | http://www.csir.co.za | Richard Watson, Johan Eksteen | rwatson@csir.co.za, jeksteen@csir.co.za
Empolis | http://www.empolis.co.uk | Graham Moore | gdm@empolis.com
GlobalWisdom Inc. | http://www.globalwisdom.org | | sales@globalwisdom.org
Michel Biezunski | http://www.coolheads.com | Michel Biezunski | mb@coolheads.com
Ontopia | http://www.ontopia.net | | info@ontopia.net
Starbase | http://www.starbase.com | | elmer@starbase.com