Quick Tutorial: Transform AnoHAT HXS Files into JavaHelp Topics
Converting legacy help formats into modern, maintainable help systems is a common task for documentation teams. This tutorial walks through converting AnoHAT HXS files (a compiled Microsoft help format found in older Windows help systems) into JavaHelp topics. You’ll learn how to extract content, prepare assets, map structure, and generate JavaHelp-compatible HTML and XML so the resulting help set integrates cleanly with Java applications.
What you’ll need
- A Windows machine (or VM) with tools to extract HXS/HHC content, or access to the original source HTML.
- Decompiler/extraction tools (examples: 7-Zip, Microsoft HTML Help Workshop, HxS decompiler tools).
- A text editor or IDE (VS Code, IntelliJ, or similar).
- JavaHelp SDK (if you plan to build and test a JavaHelp set locally).
- Basic knowledge of HTML, CSS, and XML.
Background: HXS vs JavaHelp
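- HXS: A compiled help file format from Microsoft, introduced with Microsoft Help 2 as the successor to HTML Help (.chm). A compiled file bundles HTML pages, images, navigation data for the contents and index (.hhc and .hhk in HTML Help projects; Help 2 projects typically use .HxT and .HxK instead), and compiled metadata.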
- JavaHelp: An Oracle/Sun help system for Java applications; uses a contents map (map.xml), TOC (toc.xml), search indices, and topic HTML/XHTML pages. JavaHelp expects a specific directory structure and XML configuration (helpset file).
Step 1 — Extract HXS contents
- If you have the original source files (HTML, images, CSS), start with those.
- If only the compiled HXS is available:
  - Try simple extraction with 7‑Zip (right-click → Open archive). Some HXS files can be opened like archives.
  - If that fails, use specialized HXS extraction tools or an HxS decompiler. Search for “HxS decompiler” or “extract HXS contents” to find community tools.
- Result: a folder containing topic HTML pages, images, style sheets, and navigation files (for example, .hhc for table of contents, .hhk for index).
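Before moving on, it can help to confirm that the extraction actually produced topics and navigation files. The following is a minimal sketch, not part of any HXS tool: the function names `inventory` and `missing_navigation` are made up for this tutorial, and the check assumes the navigation files use the .hhc and .hhk extensions mentioned above.

```python
from collections import Counter
from pathlib import Path


def inventory(extracted_dir: str) -> Counter:
    """Count the files under extracted_dir by lowercase extension."""
    counts = Counter()
    for path in Path(extracted_dir).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def missing_navigation(counts: Counter) -> list:
    """Return the navigation extensions (.hhc, .hhk) that were not found."""
    return [ext for ext in (".hhc", ".hhk") if counts[ext] == 0]
```

If `missing_navigation(inventory("extracted"))` returns a non-empty list, the TOC or index will have to be rebuilt by hand or recovered with a different extraction tool.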
Step 2 — Audit and cleanup extracted HTML
- Open several topic files to check HTML quality:
  - Look for non‑standard HTML (framesets, ActiveX, script-heavy navigation).
  - Check encoding and character set declarations (convert to UTF‑8 if necessary).
- Normalize doctype and structure:
  - Convert legacy markup into well-formed, simple HTML or XHTML. JavaHelp's default viewer is built on Swing's HTML renderer, which targets HTML 3.2 with limited CSS support, so keep markup conservative and test pages in the viewer.
  - Remove or adapt deprecated tags (font, center) and inline event handlers that may break in JavaHelp. Check each change in the viewer, since a CSS replacement does not always render as expected.
- Consolidate CSS:
  - Collect all stylesheets and move them into a single styles folder. Adjust relative paths.
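The encoding conversion in this step is easy to script. The sketch below is one possible approach under stated assumptions: it tries strict UTF‑8 first, then the charset the page declares, then Windows-1252 (an assumed fallback for legacy Windows help; change it if your content uses another code page). The names `to_utf8` and `convert_file` are illustrative.

```python
import re
from pathlib import Path

# Finds a charset declaration such as charset=windows-1252 in raw bytes.
CHARSET_RE = re.compile(rb"charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)


def to_utf8(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode raw HTML bytes and rewrite the first charset declaration to utf-8.

    Pages without a charset declaration are decoded but otherwise unchanged.
    """
    match = CHARSET_RE.search(raw[:2048])
    declared = match.group(1).decode("ascii") if match else None
    for encoding in ("utf-8", declared, fallback):
        if not encoding:
            continue
        try:
            text = raw.decode(encoding)
            break
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown encoding, try the next candidate
    else:
        # Nothing decoded cleanly: keep going with replacement characters.
        text = raw.decode(fallback, errors="replace")
    return re.sub(r"(charset\s*=\s*[\"']?)[\w-]+", r"\g<1>utf-8",
                  text, count=1, flags=re.IGNORECASE)


def convert_file(path: Path) -> None:
    """Rewrite one topic file in place as UTF-8."""
    path.write_text(to_utf8(path.read_bytes()), encoding="utf-8")
```

Run it on a copy of the extracted folder first, since `convert_file` overwrites the original file.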
Step 3 — Plan the JavaHelp structure
JavaHelp requires a particular file layout and several XML files:
- helpset (.hs)
- map file (map.xml) — maps topic IDs to files
- table of contents (toc.xml)
- index (index.xml) — optional, but useful
- search index files — generated by JavaHelp tools
- /docs — folder with topic HTML/XHTML and assets
Decide how the existing HXS TOC (often .hhc) will map into JavaHelp’s TOC. Keep topic IDs consistent and meaningful (for example, using original HXS anchor names).
Step 4 — Convert TOC and Index
- Extract structure from .hhc (TOC):
  - The .hhc file is an HTML-like file with nested <ul> and <li> elements. Each entry is an <object type="text/sitemap"> element whose <param> children carry the title (Name) and the topic file (Local).
- Create toc.xml for JavaHelp:
  - For each TOC entry create a <tocitem> with a title (text) and a target. The target is a map ID defined in map.xml (see Step 5), not a file path.
  - Example snippet:
        <toc version="1.0">
          <tocitem text="Introduction" target="intro"/>
          <tocitem text="Getting Started" target="getting_started">
            <tocitem text="Installation" target="install"/>
          </tocitem>
        </toc>
- For index (.hhk), extract keywords and target pages and convert into JavaHelp index format (index.xml). Each <indexitem> should include the term (text) and the target topic ID.
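The TOC conversion can be sketched with Python's standard library alone. This is a minimal illustration, not a complete converter: it assumes the usual .hhc layout of nested <ul> lists holding <object type="text/sitemap"> entries with Name and Local params, and it assumes a naming convention in which each topic's map ID is the file name without its extension. JavaHelp resolves a tocitem target through the map, so the sketch emits IDs rather than file paths. The names `topic_id`, `HhcToToc`, and `hhc_to_toc` are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
import xml.etree.ElementTree as ET


def topic_id(local: str) -> str:
    """Derive a map ID from a topic path (assumed convention: the file stem)."""
    path = local.split("#", 1)[0].replace("\\", "/")
    return PurePosixPath(path).stem


class HhcToToc(HTMLParser):
    """Turn nested <ul>/<object> sitemap entries into nested <tocitem> elements."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("toc", version="1.0")
        self.stack = [self.root]  # one parent element per open <ul>
        self.last_item = None     # latest <tocitem>; parent of the next <ul>
        self.params = None        # params of the sitemap <object> being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul":
            parent = self.last_item
            if parent is None:
                parent = self.stack[-1]
            self.stack.append(parent)
            self.last_item = None
        elif tag == "object" and attrs.get("type") == "text/sitemap":
            self.params = {}
        elif tag == "param" and self.params is not None:
            key = (attrs.get("name") or "").lower()
            self.params[key] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "ul" and len(self.stack) > 1:
            self.stack.pop()
            self.last_item = None
        elif tag == "object" and self.params is not None:
            item = ET.SubElement(self.stack[-1], "tocitem",
                                 text=self.params.get("name", ""))
            if self.params.get("local"):
                item.set("target", topic_id(self.params["local"]))
            self.last_item = item
            self.params = None


def hhc_to_toc(hhc_text: str) -> ET.Element:
    """Parse .hhc text and return the root <toc> element."""
    parser = HhcToToc()
    parser.feed(hhc_text)
    parser.close()
    return parser.root
```

To save the result, something like `ET.ElementTree(hhc_to_toc(text)).write("toc.xml", encoding="UTF-8", xml_declaration=True)` should work. An .hhk index uses the same sitemap objects, so the same parser can be adapted to emit <indexitem> elements instead.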
Step 5 — Create map.xml (topic ID mapping)
JavaHelp’s map.xml links IDs used by URLs or the TOC to actual files:
- Choose unique IDs, preferably derived from original anchors or filenames (e.g., intro, install).
- Example (in each mapID, target holds the ID and url holds the file path, relative to map.xml):
      <?xml version="1.0" encoding="UTF-8"?>
      <map version="1.0">
        <mapID target="intro" url="docs/intro.html"/>
        <mapID target="install" url="docs/install.html"/>
      </map>
- Ensure the TOC and index reference these IDs where appropriate.
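For a large help set, the map can be generated from the topic folder instead of typed by hand. In a JavaHelp map, each mapID pairs a target (the ID) with a url (the file). The sketch below assumes the same convention as the examples here: topics live under a docs folder and each ID is the file name without its extension. The function name `build_map` is illustrative. It stops on duplicate IDs, because two files with the same stem would otherwise silently overwrite each other in the map.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def build_map(help_root: str, docs_dir: str = "docs") -> ET.Element:
    """Build a <map> element with one <mapID> per .htm/.html file under docs_dir."""
    base = Path(help_root)
    root = ET.Element("map", version="1.0")
    seen = {}  # ID -> url, used to detect clashes
    for page in sorted((base / docs_dir).rglob("*.htm*")):
        topic = page.stem
        url = page.relative_to(base).as_posix()  # path relative to map.xml
        if topic in seen:
            raise ValueError(f"duplicate ID {topic!r}: {seen[topic]} and {url}")
        seen[topic] = url
        ET.SubElement(root, "mapID", target=topic, url=url)
    return root
```

Writing the file is then one call: `ET.ElementTree(build_map("help")).write("help/map.xml", encoding="UTF-8", xml_declaration=True)`, where "help" stands for your help set root.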
Step 6 — Adapt topic pages for JavaHelp
- Titles and anchors:
  - Ensure each topic has a clear <title> and, if linking by anchor, include stable id attributes on heading elements.
- Relative links:
  - Convert absolute or HXS-specific links to relative links inside the /docs folder.
  - Replace unsupported scripting navigation with plain HTML anchors or JavaHelp-specific linking (using map ID references).
- CSS and assets:
  - Update the href of each <link rel="stylesheet"> element (adjust relative paths).
  - Copy images and other media into the help set’s assets folder.
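After links and asset paths have been rewritten, a quick scan can list what still needs attention. The following sketch collects every href and src in a topic page and sorts the problem cases into two groups: references that carry a scheme or host (absolute URLs, or help-specific protocols left over from the old system) and relative references whose file does not exist. The names `RefCollector` and `check_refs` are made up for this example. Legitimate external links also land in the first group, so treat the output as a review list rather than an error list.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class RefCollector(HTMLParser):
    """Collect href and src attribute values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.refs.append(value)


def check_refs(page: Path):
    """Return (non_relative, missing) references found in page."""
    collector = RefCollector()
    collector.feed(page.read_text(encoding="utf-8", errors="replace"))
    non_relative, missing = [], []
    for ref in collector.refs:
        parts = urlsplit(ref)
        if parts.scheme or parts.netloc:
            non_relative.append(ref)  # absolute or protocol-specific link
        elif parts.path and not (page.parent / unquote(parts.path)).exists():
            missing.append(ref)       # relative link to a file that is absent
    return non_relative, missing
```

Looping over `Path("docs").rglob("*.html")` and printing the non-empty results gives a to-do list for the whole help set.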
Step 7 — Build the helpset (.hs) file
Create a helpset file (XML) that references your TOC, map, and index:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE helpset PUBLIC "-//Sun Microsystems Inc.//DTD JavaHelp Helpset Version 2.0//EN" "http://java.sun.com/products/javahelp/helpset_2_0.dtd">
    <helpset version="2.0">
      <title>Your Product Help</title>
      <maps>
        <homeID>intro</homeID>
        <mapref location="map.xml"/>
      </maps>
      <view>
        <name>TOC</name>
        <label>Contents</label>
        <type>javax.help.TOCView</type>
        <data>toc.xml</data>
      </view>
      <view>
        <name>Index</name>
        <label>Index</label>
        <type>javax.help.IndexView</type>
        <data>index.xml</data>
      </view>
    </helpset>
Place this helpset file at the root of the help set folder.
Step 8 — Generate search indices and test
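- If you have the JavaHelp SDK, run its jhindexer tool on the topic folder to create the full‑text search database (written to a JavaHelpSearch folder by default). For a search tab to appear, the helpset also needs a view of type javax.help.SearchView whose data element points at that folder.
- Load the helpset in a Java application or a simple JavaHelp viewer:
  - Use the javax.help.HelpSet class to load the .hs file by URL, then call createHelpBroker() on it and display the broker.
  - Navigate the TOC, search, and ensure links open correct topics and assets load.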
Troubleshooting common issues
- Broken images or CSS: Fix relative paths; ensure assets copied into help set.
- Encoding problems: Convert files to UTF‑8; declare correct charset in meta tags.
- Missing anchors: Add id attributes to headings or create HTML anchor elements.
- Scripted navigation failing: Replace with static links or JavaHelp map‑based linking.
Automation and scripting tips
- Use small scripts (Python, Node.js) to:
  - Parse .hhc/.hhk and generate toc.xml and index.xml.
  - Batch-convert HTML encoding and fix links.
  - Copy and reorganize assets into the JavaHelp folder structure.
- Example Python approach: use BeautifulSoup to parse .hhc and create XML templates.
Final checklist before deployment
- All topics present, titles correct, and metadata accurate.
- map.xml IDs match TOC and index references.
- Assets (CSS, images) load correctly.
- Search index generated and working.
- Helpset tested inside target Java application.
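The ID check in this list is mechanical, so it is a good candidate for a script. The sketch below compares the targets used in toc.xml and index.xml against the IDs that map.xml defines and returns whatever is missing. It assumes that mapID elements carry the ID in their target attribute, as JavaHelp map files do; the function name `undefined_targets` is illustrative.

```python
import xml.etree.ElementTree as ET


def undefined_targets(map_xml, *nav_xml) -> set:
    """Return target IDs used in TOC/index XML that map.xml does not define.

    Each argument is the XML content of one file, as str or bytes.
    """
    defined = {m.get("target") for m in ET.fromstring(map_xml).iter("mapID")}
    used = set()
    for text in nav_xml:
        for item in ET.fromstring(text).iter():
            # Entries without a target are plain grouping nodes; skip them.
            if item.tag in ("tocitem", "indexitem") and item.get("target"):
                used.add(item.get("target"))
    return used - defined
```

A typical call reads the three files as bytes, for example `undefined_targets(Path("map.xml").read_bytes(), Path("toc.xml").read_bytes(), Path("index.xml").read_bytes())` with `Path` imported from pathlib. An empty set means every TOC and index entry resolves.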
Converting AnoHAT HXS to JavaHelp requires careful extraction, cleanup, and mapping of structure. With a reproducible process (scripts for parsing .hhc/.hhk and fixing links) you can convert large help sets reliably and preserve navigation and indexing for Java applications.