Site Mapping

Site Mapper Category Description

Summary
Site mapping tools examine the structure of a Web site and create a map depicting each page in the site and the links that connect the pages. For the purpose of VRC, site mapping utilities can potentially be useful in each of the risk management modules, although they are primarily suited to generating site characterizations (Analysis module) and detecting structural-level changes to a site (Detection module).

Introduction
As its name implies, the Site Mapper category is concerned with utilities that create a “map” of a Web site. Much like a typical map from which the metaphor is taken, a Web site map provides a representation (graphical and/or text) of each page that resides in the site and depicts the links that a user may take to navigate from page to page, and out to the Web.

The origin of site mapping tools can be found in information visualization research. With the advent of the World Wide Web, site maps became a popular tool for navigation. A few mentions of using site maps for Web site navigation can be found in Usenet archives from 1995, several instances of site maps dating back to at least 1996 were uncovered using the Internet Archive Wayback Machine, and shareware programs for creating site maps were developed at least as far back as 1998. As sites grew in size and complexity, site mapping software evolved to streamline the process. Early site maps were intended to index a Web site for its visitors, but their popularity decreased as usability studies criticized them and the cost of maintaining them became too high. Instead of only creating a simple user-oriented site map, software began to offer additional features, such as the ability to check links and spelling.

Most site mapping tools are intended for use by Web site developers. Indeed, the ability to create site maps is a common component of modern HTML editors, such as Dreamweaver, GoLive and FrontPage. Mapping tools and services are useful in the planning phase of site design and redesign in order to visualize and optimize the flow of information (e.g. information architecture), especially for very large sites. A site map (a.k.a. index, guide, or table of contents) is also a common feature within an existing site, giving users a navigational overview of the site's content. Some mapping utilities can also produce navigational aids for use in off-line distribution, such as on CD-ROM.

Other specialized mapping tools create site maps in the context of knowledge mapping for visual learning (e.g. The Brain, www.thebrain.com; Inspiration, www.inspiration.com; Visio, www.microsoft.com/office/visio). Web visualization tools, complex kin to site mappers, use many graphic styles to capture hierarchical sets of link relationships, often packing in more data by using color (e.g. to distinguish link directionality) or size (e.g. to represent the amount of traffic along a link). These tools may be of mainly experimental use, as many are not particularly polished or dependable for everyday work. Many visualization tools are designed to map an entire region of the Web and show connections between Web sites. This is an ambitious goal and is difficult to put into practical application.

Site mappers are closely related to other VRC tools. Generally speaking, a site mapper is a specialized Web crawler. A typical site mapper shows links between pages, acting indirectly as a link checker. If a site is mapped periodically, the mapper may function as a crude Web site monitor capable of detecting change at the page structure level. Site map capabilities are sometimes included as a component of Web management applications.
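The relationship between a site mapper, a crawler, and a link checker can be illustrated with a minimal Python sketch. It is not taken from any tool reviewed here. The names LinkExtractor and map_site, and the fetch callable that returns a page's HTML (or None when the page cannot be retrieved), are assumptions made for illustration. Supplying fetch from the caller lets the sketch run against an in-memory copy of a site as well as a live one.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in one page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def map_site(start_url, fetch):
    """Breadth-first crawl of one host, starting at start_url.

    fetch(url) is a caller-supplied (hypothetical) function returning
    the page's HTML as a string, or None if it cannot be retrieved.
    Returns {page_url: sorted list of link targets}; a page that could
    not be retrieved maps to None, which is how broken links show up.
    """
    host = urlparse(start_url).netloc
    site_map = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            site_map[url] = None  # linked to, but not retrievable
            continue
        parser = LinkExtractor()
        parser.feed(html)
        targets = set()
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" suffixes.
            target = urldefrag(urljoin(url, href)).url
            targets.add(target)
            # Record off-site links, but only crawl pages on the same host.
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
        site_map[url] = sorted(targets)
    return site_map
```

Calling map_site("http://example.org/", pages.get) with a dictionary pages that maps URLs to HTML strings yields a map of that in-memory site. A real deployment would also need to honor robots.txt, rate limits, and the terms of use of the remote site, none of which are handled here.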

Site Mapping for VRC
For the purposes of remote control, site mappers and site visualization tools may be used to view and understand the full content and structure of a Web site. Some mappers can diagram any Web site with a valid URL without violating the end user license agreement. This capability is essential for remote control.

Site mapping and visualization utilities that do allow crawling of a remote site can potentially be useful in each of the risk management modules, although these tools are primarily suitable for generating site characterizations (Analysis module) and detecting structural-level changes to a site (Detection module).

Identification
Examining the links into and out of a site could be beneficial when determining which Web sites need to be monitored.

Analysis
It can be difficult and time-intensive to explore even a moderately sized site by hand, following links from page to page. A site map may prove most valuable by providing a less cluttered and more succinct image of the extent and layout of the site.

Appraisal
Just as a Web site developer might employ site mapping to design an effective flow of information, a preservationist can use a map to gauge the extent to which the site was established and maintained with sound architectural principles. Also, the depth and breadth of a site may in some instances be a factor in assessing the value of the site from a preservation standpoint.

Strategy
Data about the size and complexity of a site may be necessary to develop effective management strategies.
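As one illustration of what data about size and complexity might look like, the following Python sketch derives a few simple measures from a site map. It assumes, hypothetically, that the map has already been captured as a dictionary linking each page to the list of pages it points to. The function name site_metrics and the choice of measures are illustrative and not drawn from any particular tool.

```python
from collections import deque


def site_metrics(site_map, start):
    """Summarize a site map given as {page: [link targets]}.

    Depth is the fewest clicks needed to reach a page from start;
    breadth is the number of pages found at a given depth.
    """
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in site_map.get(page) or []:
            # Only pages that belong to the map count toward depth.
            if target in site_map and target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)

    # Count links whose target is itself a page in the map.
    internal_links = sum(
        1
        for targets in site_map.values()
        for target in (targets or [])
        if target in site_map
    )

    # Number of pages at each depth level.
    pages_per_level = {}
    for d in depth.values():
        pages_per_level[d] = pages_per_level.get(d, 0) + 1

    return {
        "pages": len(site_map),
        "internal_links": internal_links,
        "max_depth": max(depth.values()),
        "max_breadth": max(pages_per_level.values()),
    }
```

Measures such as these are only a rough proxy for complexity, but they are cheap to compute and easy to compare across sites or across successive visits to the same site.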

Detection
Site mapping utilities can be used periodically to detect changes at the site structure level. Excessive and frequent changes in structure might flag a potential risk of information loss. Infrequent changes to structure may be an indication that a site is at risk of obsolescence. While most site mapping applications cannot detect a change to the content of a page, the page level of resolution may be sufficient and even more instructive in some contexts (for example, detecting a change in the number of pages of a site that maintains a large number of PDF files).
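Structure-level change detection of this kind amounts to comparing two maps of the same site taken at different times. The Python sketch below assumes, for illustration only, that each map is stored as a dictionary from a page to the list of pages it links to. The function name diff_site_maps and the shape of its report are hypothetical.

```python
def diff_site_maps(old, new):
    """Compare two site maps given as {page: [link targets]}.

    Reports pages and links that appear in only one of the two maps.
    Changes to the content of a page are not detected.
    """
    old_pages, new_pages = set(old), set(new)
    old_links = {(src, dst) for src, targets in old.items() for dst in targets or []}
    new_links = {(src, dst) for src, targets in new.items() for dst in targets or []}
    return {
        "pages_added": sorted(new_pages - old_pages),
        "pages_removed": sorted(old_pages - new_pages),
        "links_added": sorted(new_links - old_links),
        "links_removed": sorted(old_links - new_links),
    }
```

The size of the reported differences, tracked over successive visits, is one possible way to quantify how often and how extensively a site's structure changes.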

Response
The visual display of a site map can be a useful way to document and present a Web site.

Site Mapper Features
Since site mappers are a specific type of Web crawler, they share many of a crawler's potential uses for VRC, at least to the extent that the mapping tool generates valuable data about the remote files. However, many common core features of site mapping software are less useful for VRC (e.g., drawing tools, ability to upload maps to the Web site). Currently, there are two main types of site mapping tools: those that primarily create maps to be used on Web sites, and those that create maps to be used by developers. In the context of remote control, we will concentrate on mapping tools aimed at developers, since these tools are more fully functional.

Certain features seem to be common among all site mappers and site visualization tools, including the ability to crawl a website or a local directory, the ability to display all HTML files linked within a site, and the ability to display the links between HTML files. The following additional features were determined to be most important for VRC:

Site mapping and visualization tools are most commonly stand-alone applications, or distributed within other software packages. These range from freeware to high-end (> $1000) commercial products. Most, though not all, appear to be well-maintained and supported. Most fall into the $30–150 price range.

Other modes include Perl scripts and Java applets primarily used to generate a server-side (or even client side) map or index of the Web site as the browser downloads it. Perl-based offerings are generally covered by the GNU General Public License. Java applets are typically commercial products.

Site mapping may also be a component of larger site management services (site maintenance) or information architecture (site development) services. These service providers generally market to large e-commerce sites; features are mostly oriented to local sites. Other subscription—and even free—services are available to create site maps and search capability, but these are also restricted to local sites.

Overall, a fair number of types of site mapping tools are available, although not all are suited for remote monitoring. Nearly all offerings are Windows compatible or OS independent; few products are available for the Mac platform, with the exception of more graphics-oriented software (e.g. ConceptDraw, Inspiration).

Tool Evaluation Selection Process
A list of site mapping and visualization tools and services was generated through:

The list was first narrowed to tools that were currently available. Tools that did not enable remote mapping were then eliminated. Of the remaining possibilities, tools were selected based on features that would best suit them for VRC (see ideal features list above). One difficulty in tool selection was that site mapping functionality for the purpose of remote control may overlap or be combined with tools in other categories.