BaseX: Enabling Search and Link Management
The BaseX system is an XQuery-based XML database that provides powerful search features for XML.
The DFST model uses an XQuery database to provide link management and DITA-aware search features. Any XQuery database can be used, but the DFST project provides materials for using the BaseX database. Note that in the DFST model the XQuery database is used only as a read-only database to enable search and link management; it is not used to manage the source documents. Source document management is done in a separate source code repository (e.g., git) that provides the required version control and distributed access features.
BaseX is a light-weight, pure-Java XQuery database that is easy to install and use, making it appropriate for use on individual Authors' workstations. It provides a built-in Web server as well as a general database server that can be accessed using special BaseX clients.
DITA content is loaded into BaseX by git commit hooks (provided by the DITA for Small Teams project) that run whenever content is committed to the git repository. These commit hooks keep the BaseX repository in sync with the main content repository.
The DFST uses the BaseX HTTP server, which provides both a standard BaseX server and an HTTP server that enables Web access to the database. The overall setup process is:
- Install the BaseX HTTP server package
- Set the BaseX configuration file so DITA documents are parsed correctly against the DITA DTDs managed by a DITA Open Toolkit.
- Start the BaseX HTTP server
- Set up the BaseX git hooks to automatically copy documents to the BaseX database
- Test your setup to make sure everything is hooked up correctly
Once you have BaseX running you can test the configuration using a temporary database. It's easy to create a new database and add documents to it using the BaseX command-line client or the BaseX Web administration client.
Install BaseX
Installers are available for Windows (.exe download) and Mac (via Homebrew). You can also download the Zip package and simply unzip it somewhere and add the bin/ directory to your PATH environment variable.
The built-in BaseX user is "admin" with a password of "admin". If security is a concern you should at least change the administrator password, and ideally create a separate user account for use by the git commit hooks.
One-Time BaseX Setup
In order to manage DITA documents properly, BaseX must be configured to use an XML catalog and to turn on DTD parsing. For DITA use you will normally use the catalog-dita.xml file maintained by the DITA Open Toolkit.
To configure BaseX to parse DITA documents, update the BaseX configuration file .basex in the BaseX installation directory:
- Find the location of the DITA Open Toolkit you will use to manage the master XML catalog file for your DITA documents.
If this is the Open Toolkit integrated with oXygenXML, the toolkit will be in the frameworks/dita/DITA-OT directory under the oXygenXML installation directory.
You will need the absolute path of the Open Toolkit directory, e.g., /Applications/oxygen/frameworks/dita/DITA-OT or C:\Program Files\Oxygen XML Editor 16\frameworks\dita\DITA-OT.
- Find the BaseX installation directory.
The exact location will depend on your operating system and how you installed it. The default location on Windows is C:\Program Files (x86)\BaseX. For OS X and Linux the command which basexhttp should show the location (the value returned will include the bin/ directory—you want the parent of the bin/ directory).
- Edit the file .basex in a text editor and add the following lines at the end of the file:
    CATFILE = OT directory/catalog-dita.xml
    DTD = true
    CHOP = false
Where OT directory is the Open Toolkit directory you got in Step 1.
On OS X, my .basex file looks like this:
    # General Options
    DEBUG = false
    DBPATH = /Users/ekimber/apps/basex/data
    REPOPATH = /Users/ekimber/apps/basex/repo
    LANG = English
    LANGKEYS = false
    GLOBALLOCK = false
    # Client/Server Architecture
    HOST = localhost
    PORT = 1984
    SERVERPORT = 1984
    EVENTPORT = 1985
    USER =
    PASSWORD =
    SERVERHOST =
    PROXYHOST =
    PROXYPORT = 0
    NONPROXYHOSTS =
    TIMEOUT = 30
    KEEPALIVE = 600
    PARALLEL = 8
    LOG = true
    LOGMSGMAXLEN = 1000
    # HTTP Services
    WEBPATH = /Users/ekimber/apps/basex/webapp
    RESTXQPATH =
    HTTPLOCAL = false
    STOPPORT = 8985
    AUTHMETHOD = Basic
    # Local Options
    CATFILE = /Applications/oxygen/frameworks/dita/DITA-OT/catalog-dita.xml
    DTD = true
    CHOP = false
On Windows, the CATFILE option looks like this:
    CATFILE = C:\Program Files\Oxygen XML Editor 16\frameworks\dita\DITA-OT\catalog-dita.xml
- Save the file.
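If you would rather script the edit than make it by hand, the following is a minimal Python sketch of the same change. The function name set_basex_options is hypothetical, and the sketch assumes the .basex file is a plain list of KEY = value lines with # comments, as in the listing above. Back up the file before trying it.

```python
from pathlib import Path


def set_basex_options(config_path, options):
    """Set KEY = value lines in a .basex-style file.

    Existing keys are replaced in place; missing keys are appended
    at the end of the file. Comment lines (starting with #) are kept.
    """
    path = Path(config_path)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    remaining = dict(options)  # keys that have not been written yet
    updated = []
    for line in lines:
        key = None
        if "=" in line and not line.lstrip().startswith("#"):
            key = line.split("=", 1)[0].strip()
        if key in remaining:
            updated.append(f"{key} = {remaining.pop(key)}")
        else:
            updated.append(line)
    # Append any options that were not already present.
    for key, value in remaining.items():
        updated.append(f"{key} = {value}")
    path.write_text("\n".join(updated) + "\n", encoding="utf-8")
```

For example, set_basex_options("/path/to/basex/.basex", {"CATFILE": "/Applications/oxygen/frameworks/dita/DITA-OT/catalog-dita.xml", "DTD": "true", "CHOP": "false"}) would apply the three settings from the step above; both paths are placeholders for your own locations.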
To test this configuration you'll need to have a database, add a DITA document to it, and verify that all the default attributes were expanded on load. One indication that the configuration is correct is if it takes noticeable time to load DITA documents: BaseX has to fetch the DTDs and parse the documents with respect to them, which is much slower than just loading the XML without validating first.
There is one remaining setup task for which you need a running BaseX server: installing the DFST XQuery modules.
Start the BaseX HTTP Server
The DFST setup uses BaseX in two ways: through git commit hooks that update XML documents in the BaseX database and through a Web application that enables search and provides DITA link management services.
To support this dual use of BaseX you must run the BaseX HTTP server, which provides both the standard BaseX database server (accessed through BaseX clients, such as the BaseX command-line client) and HTTP access to the database. You can run the server as a background service.
See the BaseX command-line options documentation for details.
- OS X or Linux: Run the command basexhttp -S to start the server as a background service.
- Windows: Use the "Start BaseX server" item in the BaseX start menu or run the command basexhttp -S
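After starting the server, a quick way to confirm that it is listening is to try a TCP connection to its ports. The sketch below is only an illustration: the helper name port_is_open is hypothetical, port 1984 is the PORT value from the .basex listing earlier, and 8984 is assumed to be the HTTP port (the BaseX default as far as I know; this document does not state it, so check your own configuration).

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def report_basex_ports(host="localhost", ports=(1984, 8984)):
    """Print whether each assumed BaseX port accepts connections."""
    for port in ports:
        state = "open" if port_is_open(host, port) else "closed"
        print(f"{host}:{port} is {state}")
```

Calling report_basex_ports() after basexhttp -S should report both ports as open if the server started with default settings. An open port only shows that something is listening there, not that it is BaseX.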
Once you have the HTTP server running you can connect to it in several ways, including using WebDAV, either from OxygenXML or from another WebDAV client. You should be able to set up the database as a WebDAV shared drive under all operating systems. See the BaseX documentation for details.
The DFST git commit hooks use the BaseX command-line client to connect to the BaseX server. The DFST DITA link management Web application is accessed via normal HTTP through a Web browser.
Create a BaseX Database and Load Some DITA Content
The BaseX server can manage any number of databases. A database is simply a named collection of XML documents.
- Open a command window and run the command basexclient.
You should be prompted to enter the BaseX user ID and password (e.g., "admin", "admin").
If the command does not work, check your PATH environment variable or change directories to the BaseX bin/ directory.
Once you have logged in you should see the BaseX command prompt:
    c:\>basexclient
    Username: admin
    Password:
    BaseX 8.0.3 [Client]
    Try help to get more information.
    >
- Create a new database to hold the sample documents:
    > create database samples
    Database 'samples' created in 23.95 ms.
    >
- Use the list command to list the databases available:
    > list
    Name     Resources  Size  Input Path
    ------------------------------------
    samples  0          4532
    1 database(s).
    >
- Open the "samples" database to make it the current database:
    > open samples
    Database 'samples' was opened in 0.02 ms.
    >
- Add the samples documents to the database:
    > add to samples c:\Program Files\Oxygen XML Editor 16\frameworks\dita\DITA-OT\samples
    Resource(s) added in 2092.7 ms.
    >
Set the directory to wherever your Open Toolkit actually is.
The "to samples" part of the command adds the documents in the samples folder to a directory named "samples" in the database.
It should take a few seconds to load the documents. If the load was instantaneous, then the catalog and DTD parsing are not set up correctly.
- List the files in the database:
    > list samples
    Input Path                                    Type  Content-Type     Size
    -------------------------------------------------------------------------
    samples/ant_sample/sample_all.xml             xml   application/xml  57
    samples/ant_sample/sample_docbook.xml         xml   application/xml  42
    samples/ant_sample/sample_eclipsehelp.xml     xml   application/xml  42
    samples/ant_sample/sample_htmlhelp.xml        xml   application/xml  42
    samples/ant_sample/sample_javahelp.xml        xml   application/xml  42
    samples/ant_sample/sample_odt.xml             xml   application/xml  42
    samples/ant_sample/sample_pdf.xml             xml   application/xml  42
    samples/ant_sample/sample_tocjs.xml           xml   application/xml  42
    samples/ant_sample/sample_troff.xml           xml   application/xml  42
    samples/ant_sample/sample_wordrtf.xml         xml   application/xml  42
    samples/ant_sample/sample_xhtml.xml           xml   application/xml  54
    samples/ant_sample/sample_xhtml_plus_css.xml  xml   application/xml  73
    samples/ant_sample/template_docbook.xml       xml   application/xml  38
    samples/ant_sample/template_eclipsehelp.xml   xml   application/xml  38
    samples/ant_sample/template_htmlhelp.xml      xml   application/xml  38
    samples/ant_sample/template_javahelp.xml      xml   application/xml  38
    samples/ant_sample/template_odt.xml           xml   application/xml  38
    samples/ant_sample/template_pdf.xml           xml   application/xml  38
    samples/ant_sample/template_wordrtf.xml       xml   application/xml  38
    samples/ant_sample/template_xhtml.xml         xml   application/xml  37
    samples/concepts/garageconceptsoverview.xml   xml   application/xml  17
    samples/concepts/lawnmower.xml                xml   application/xml  20
    samples/concepts/oil.xml                      xml   application/xml  27
    samples/concepts/paint.xml                    xml   application/xml  27
    samples/concepts/shelving.xml                 xml   application/xml  27
    samples/concepts/snowshovel.xml               xml   application/xml  27
    samples/concepts/toolbox.xml                  xml   application/xml  32
    samples/concepts/tools.xml                    xml   application/xml  67
    samples/concepts/waterhose.xml                xml   application/xml  27
    samples/concepts/wheelbarrow.xml              xml   application/xml  17
    samples/concepts/workbench.xml                xml   application/xml  24
    samples/concepts/wwfluid.xml                  xml   application/xml  20
    samples/tasks/changingtheoil.xml              xml   application/xml  69
    samples/tasks/garagetaskoverview.xml          xml   application/xml  17
    samples/tasks/organizing.xml                  xml   application/xml  17
    samples/tasks/shovellingsnow.xml              xml   application/xml  47
    samples/tasks/spraypainting.xml               xml   application/xml  56
    samples/tasks/takinggarbage.xml               xml   application/xml  39
    samples/tasks/washingthecar.xml               xml   application/xml  68
    Resources.
    >
- Verify that the @class attributes were correctly expanded:
    > xquery /*/@class
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic concept/concept "
    class="- topic/topic task/task "
    class="- topic/topic concept/concept "
    class="- topic/topic task/task "
    class="- topic/topic task/task "
    class="- topic/topic task/task "
    class="- topic/topic task/task "
    class="- topic/topic task/task "
    Query executed in 2.57 ms.
    >
This XQuery simply returns all the @class attributes of all the root elements in the database. If those elements have @class attributes then all the elements will.
You have now verified that your BaseX server is correctly configured to manage DITA documents and is ready to get updates from your git repository.
When you are done testing, you can drop the temporary samples database:
    > drop database samples
    Database 'samples' was dropped.
    > list
    Name  Resources  Size  Input Path
    ---------------------------------
    0 database(s).
    >
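The interactive session above can also be run non-interactively, which is handy for repeating the check after a configuration change. The following Python sketch is an illustration under stated assumptions, not part of the DFST project. The function names are hypothetical. It assumes that basexclient is on your PATH and accepts the -U (user), -P (password), and -c (commands) options with the value attached to the flag, and that commands in a -c string can be separated by semicolons. On Windows you may need to invoke basexclient.bat instead.

```python
import subprocess


def build_basexclient_args(user, password, commands):
    """Build an argument list for a non-interactive basexclient run.

    Assumption: -U, -P and -c take their value attached to the flag,
    and the -c string may hold several commands separated by ';'.
    """
    script = "; ".join(commands)
    return ["basexclient", f"-U{user}", f"-P{password}", f"-c{script}"]


def run_sample_check(samples_dir, user="admin", password="admin"):
    """Create the samples database, load samples_dir, and query @class.

    The XQuery command is placed last so that nothing follows the query
    text. Returns whatever basexclient wrote to standard output.
    """
    commands = [
        "create database samples",
        "open samples",
        f"add to samples {samples_dir}",
        "xquery /*/@class",
    ]
    result = subprocess.run(
        build_basexclient_args(user, password, commands),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if basexclient fails
    )
    return result.stdout
```

If the returned text contains no class= lines, the catalog and DTD parsing settings are probably not in effect. Note that passing the password on the command line exposes it to other users who can list processes, so this is only suitable for a local test setup.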
Install DFST XQuery Modules
The DFST XQuery modules provide the DITA-specific link management and DITA-aware searching features you need.
- Open a command window and navigate to the basex/scripts/ directory:
c:\>cd c:\projects\dita-for-small-teams\basex\scripts
Or
c:\projects\dita-for-small-teams\basex\scripts>
- Run the install-modules script:
    c:\projects\dita-for-small-teams\basex\scripts>install-modules.bat admin admin
    Installing DFST XQuery packages:
    Name                                                   Version  Type      Path
    -----------------------------------------------------------------------------------------------------------------------------------
    org.dita-for-small-teams.xquery.modules.dita-utils     -        Internal  org/dita-for-small-teams/xquery/modules/dita-utils.xqm
    org.dita-for-small-teams.xquery.modules.relpath-utils  -        Internal  org/dita-for-small-teams/xquery/modules/relpath-utils.xqm
    2 package(s).
    c:\projects\dita-for-small-teams\basex\scripts>
Set Up Git Commit Hooks for BaseX Update
The DFST git commit hooks for BaseX keep BaseX in sync with your git repository. They have the following requirements:
- The BaseX bin/ directory must be in your PATH or Path environment variable. You should be able to type "basex" or "basexclient" on the command line. This allows the git hook scripts to run the required BaseX commands. You should have set this up when you installed BaseX.
- The BaseX server connection properties must be set in the .basex file. This file can be in the main BaseX installation directory or in your user home directory. This allows the basexclient command to connect and authenticate to the BaseX server without the need for a separate configuration file.
- The DFST git hooks must be copied to or linked from the .git/hooks/ directory in the git repositories you want to manage. These hooks keep the BaseX databases in sync with your git repositories.
- Edit the .basex configuration file, either in the main BaseX installation directory or in your home directory, and add the following settings:
    USER = admin
    PASSWORD = admin
    HOST = localhost
    PORT = 1984
These values are the BaseX defaults. Your configuration must reflect any changes you made to the defaults after you installed BaseX.
For OS X: If you want to put the .basex configuration file in your home directory, you must delete the .basexhome file from the BaseX installation directory; otherwise BaseX will not read the .basex file in your home directory.
- Copy the files from the DFST project commit-hooks/git/ directory to the .git/hooks directory under the root directory of your git repository.
For OS X and Linux: Instead of copying the files you can link to the files in the DFST project. This makes it easier to keep the hooks updated. To do this, use the ln -s command like so:
    ekimber:project-01$ cd .git/hooks
    ekimber:hooks$ ln -s ~/dita-for-small-teams/commit-hooks/git/basexLoadOrUpdateBranch
    ekimber:hooks$ ln -s ~/dita-for-small-teams/commit-hooks/git/post-checkout
    ekimber:hooks$ ln -s ~/dita-for-small-teams/commit-hooks/git/post-commit
    ekimber:hooks$ ln -s ~/dita-for-small-teams/commit-hooks/git/post-merge
    ekimber:hooks$ ln -s ~/dita-for-small-teams/commit-hooks/git/recordGitStateDetails
    ekimber:hooks$ ln -s ~/dita-for-small-teams/commit-hooks/git/updateBaseXForCOmmitOrMerge
For OS X and Linux: Make sure all the commit hook files are executable. They should be as they come from the DFST git repository, but they may not be. To make them executable, apply this command to the directory that contains the scripts:
    ekimber:dita-for-small-teams$ chmod a+x commit-hooks/git/*
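If you manage several repositories, the linking and chmod steps can be wrapped in a small script. The following Python sketch shows one way to do it on OS X or Linux. The function name link_hooks is hypothetical; it assumes the hook scripts are the regular files directly inside the DFST commit-hooks/git/ directory and that the repository's .git/hooks directory already exists. It leaves any existing non-link hook file untouched rather than overwriting it.

```python
import stat
from pathlib import Path


def link_hooks(hooks_source, repo_root):
    """Symlink every hook script in hooks_source into repo_root/.git/hooks.

    Each script is also made executable (the equivalent of chmod a+x).
    Returns the names of the links that were created or refreshed.
    """
    source = Path(hooks_source).resolve()
    target_dir = Path(repo_root) / ".git" / "hooks"
    linked = []
    for script in sorted(source.iterdir()):
        if not script.is_file():
            continue
        # Add execute permission for user, group, and others.
        mode = script.stat().st_mode
        script.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        link = target_dir / script.name
        if link.is_symlink():
            link.unlink()  # refresh a link from an earlier run
        elif link.exists():
            continue  # a real file is already there; do not overwrite it
        link.symlink_to(script)
        linked.append(link.name)
    return linked
```

For example, link_hooks("~/dita-for-small-teams/commit-hooks/git", "project-01") would not expand the ~ for you; pass an absolute path or wrap the argument in os.path.expanduser first.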