System Maintenance Document
$Id: SystemMaintenance.html,v 1.41 2002/05/16 05:13:27 jt96 Exp $
CS501 Project of LII PDF conversion

Table of Contents

  1. Introduction
  2. Installation
  3. Server Configuration
    1. Directory Structure
    2. Properties and File Configuration
  4. Running the Programs
    1. Scripts
    2. Crontab
  5. Troubleshooting
    1. Order Fulfillment Engine
    2. PDG Catalog Generation
    3. XSL
    4. Permission Problem
    5. JVM
    6. PDG Error Code -3
  6. Continuous Development
    1. Extensibility
      1. On-the-fly PDF Generation
      2. Generation and Pricing Policies
      3. Logging to Database
    2. Development
      1. Build Script and CVS
      2. Problems with LDMS
  7. Security Issues

Introduction

This document describes issues involved in the maintenance and continued development of the PDF versions of US Code system.

The first section deals with how to install and deploy the system using set configurations and outlines which programs to run. The next talks about troubleshooting the system, followed by issues involved with extensibility of the system and its continued development.

If you are an administrator,
read the Running the Programs section to learn how to use the system. Then proceed to the Server Configuration section.
If you are a developer,
read the whole document.

Installation

  1. Obtain a copy of the workspace. See the Continuous Development section of this document for how to do this.
  2. Run the "build.sh" script with the package target. This compiles the whole program and packages it into a set of jar files.
    $ build.sh package
    
  3. If you are installing on (or upgrading) lula, run the deploy target. This copies all the necessary files into the appropriate directories on lula so that you are ready to run the programs.
  4. Otherwise, first set up a directory structure by following the Directory Structure section. We will call it the stage directory.
  5. The packages directory in your workspace should contain jar files that are produced by build.sh package. Copy all of them into the binary directory of the stage directory.
  6. Copy everything in the yourworkspace/bin directory to the binary directory of the stage directory, where you just put the jar files. These shell scripts let you quickly launch the LIIPDF programs. See the Running the Programs section for details about how to run them.
  7. Set up the PDG shopping cart (see its manuals for details). Also see PDG Catalog Generation in the Troubleshooting section.
  8. Copy the yourworkspace/website directory to some directory where an HTTP server can serve them. Those are mainly images and some other static files.
  9. Schedule OFE to run periodically. See the Crontab section for details. Depending on the configuration of the target environment, you can use other approaches, such as procmail.
  10. OFE needs a mailbox. Create an e-mail account for it, then configure the PDG shopping cart so that vendor notification e-mail is sent to this address.
  11. To further reduce human intervention, a batch process can be set up to fetch updates of the U.S. Code and generate contents automatically. See /usr/local/uscode/scripts/transformRange.pl for an example of such a configuration.
  12. Finally, configure the liipdf.properties file in the binary directory of the stage directory. See the server configuration section for details.
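
As a rough illustration of steps 5 and 6, the copy operations could be wrapped in a small shell function like the one below. This is only a sketch: the function name stage_install and both example paths are assumptions made for illustration, and it assumes the workspace bin directory contains only plain files.

```shell
#!/bin/sh
# Sketch of installation steps 5 and 6: copy the jar files and the launcher
# scripts from a workspace into the binary directory of a stage directory.
# stage_install is a hypothetical helper, not part of the LIIPDF scripts.
stage_install() {
    workspace=$1   # root of the checked-out workspace
    stage_bin=$2   # binary directory of the stage directory
    mkdir -p "$stage_bin" || return 1
    # Step 5: jar files produced by "build.sh package".
    cp "$workspace"/packages/*.jar "$stage_bin"/ || return 1
    # Step 6: shell scripts that launch the LIIPDF programs.
    cp "$workspace"/bin/* "$stage_bin"/ || return 1
}

# Example call (both paths are placeholders):
# stage_install "$HOME/liipdf-workspace" /usr/local/uscode/scripts/pdf
```

The function stops at the first failed copy, so a missing packages directory (for example, because build.sh package was not run) is reported before any scripts are copied.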

Server Configuration

Directory Structure

The directories described in the table below are the essential components of the system, such as output targets, script and source file directories, and the PDG shopping cart directory. These directories are used in configuring the properties used by the build.sh script mentioned above.

Directory
  Information

/usr/local/uscode
  Main directory of the US Code System. All directories in uscode were inherited from the LDMS Project except /pdf.
/usr/local/uscode/xml
  XML generated by the LDMS Project and used by the PDF versions of US Code Project.
/usr/local/uscode/pdf
  Main directory of the PDF versions of US Code Project. Contains logfile.log, which records all orders received by the OFE and the corresponding PDF request information. The contents of this directory are just snapshots of CVS, so do not edit them directly.
/usr/local/uscode/pdf/docs
  Documents for the project, including presentations. Contains the feasibility study, requirements, system design, Javadoc, PDG shopping cart documentation, and this document. Do not modify these documents directly; modify them in CVS and then redeploy them using build.sh deploy.
/usr/local/uscode/pdf/xml
/usr/local/uscode/pdf/html (/var/www/html/uscode/pdf/)
/usr/local/uscode/pdf/pdf
  Storage directories for the respective xml, html, and pdf files. More information is provided in the Properties and File Configuration section.
/usr/local/uscode/pdf/scripts (/usr/local/uscode/scripts/pdf)
  Contains scripts and java package files.
/usr/local/uscode/html/walkthrough
  Static web pages outlining help for purchasing PDFs.
/usr/local/uscode/pdf/cgi-bin (/var/www/pdf-cgi/)
  Directory containing CGI files.
/usr/local/uscode/pdf/cgi-bin/pdg
  Directory containing PDG Shopping Cart config files.
/usr/local/uscode/pdf/cgi-bin/pdg/PDG_Cart
  Main directory containing PDG Shopping Cart catalog files.

Properties and File Configuration

The configurable variables are set in the liipdf.properties file, which is kept in the binary directory in CVS. The installed file is at /usr/local/uscode/scripts/pdf/liipdf.properties.

The content of this file is shown below. Each variable is preceded by a comment that explains it.


# LIIPDF configuration file
#
# All of the LIIPDF related programs read this configuration file.
#
# For details about this file, see the system design document.



# Path name of the directory where division XML files are stored.
# A sub-directory will be created for each title under this directory,
# and actual division XML files are stored inside that directory.
# The path name must end with the path separator '/'.
LIIPDF.common.path.xml = /usr/local/uscode/pdf/xml/


# Path name of the directory where HTML files are stored. The same
# rule and constraint applies as the "LIIPDF.common.path.xml" property.
# Different path parameters can have the same path name (in that case
# different file formats are stored in the same directory.)
LIIPDF.common.path.html= /usr/local/uscode/pdf/html/


# Path name of the directory where PDF files are stored. The same rule
# and constraint applies as the "LIIPDF.common.path.xml" property
LIIPDF.common.path.pdf = /usr/local/uscode/pdf/pdf/

# Path name of the directory where PDG catalog files are stored.
# Usually, this should be the PDG_Cart sub-directory where you
# deployed PDG shopping cart.
LIIPDF.xml2catalog.catalogpath=/usr/local/uscode/pdf/cgi-bin/pdg/PDG_Cart


# This key will be used to scramble PDF URLs. It is a hexadecimal
# representation of 64-bit DES key. An administrator can always generate
# a fresh key by using a supplementary DESKeyGen tool or any other publicly
# available tool that can generate 64-bit DES key.
LIIPDF.common.DESKey = 04B915BA43FEB5B6


# Fully qualified name of the Java class that implements the Logger
# interface. This class will be used to write log messages. We don't
# expect a system administrator to casually modify this parameter.
LIIPDF.util.logger = edu.cornell.law.liipdf.util.FileLogger


# Used by FileLogger, the default logger. This property specifies
# the full path name of the log file to which log messages are sent.
LIIPDF.util.FileLogger.logfile = /usr/local/uscode/pdf/logfile.log



# Used by OrderEmailReader. This property specifies
# the email server and addresses needed for ofe.
#=======================================================

# The name of the pop3 server from which OFE reads order verification
# e-mail.
LIIPDF.ofe.pop3server=pop.kohsuke.org

# Username and password of the mailbox. PDG shopping cart must be
# configured so that this mailbox will receive order notification e-mail.
LIIPDF.ofe.pop3user=ofe
LIIPDF.ofe.pop3password=ofe

# The name of the SMTP server which OFE uses to send e-mail. This server
# needs to allow e-mail to be sent from the following support e-mail
# address. (Depending on the domain name of the mail address,
# your mail server might reject the relay operation.)
LIIPDF.ofe.smtpServer=localhost
LIIPDF.ofe.supportEMail=cs501@yahoogroups.com

# The name of the class that implements PDFGenerationPolicy.
# This object controls whether PDF files are generated for a given division.
LIIPDF.xml2pdf.genPolicy=edu.cornell.law.liipdf.xml2pdf.policies.LowerMostTwoPolicy

# The name of the class that implements PDFPricingPolicy.
# This object determines the price for PDF files.
LIIPDF.xml2catalog.prcPolicy=edu.cornell.law.liipdf.xml2catalog.policies.FixedPricePolicy


# EOF
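
Two constraints stated in the comments above are easy to get wrong when editing the file by hand: the three path properties must end with '/', and the DES key is the hexadecimal form of a 64-bit key, that is, 16 hexadecimal digits. The following shell sketch checks both. The function names are assumptions made for illustration and are not part of LIIPDF, and the parser handles only the simple "key = value" lines seen in this file, not the full Java properties syntax.

```shell
#!/bin/sh
# Sanity checks for liipdf.properties (illustrative helpers only).

# Print the value of property $2 from file $1; the last definition wins.
get_prop() {
    # Escape the dots in the key so they match literally in the sed pattern.
    pattern=$(printf '%s' "$2" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*${pattern}[[:space:]]*=[[:space:]]*//p" "$1" |
        sed 's/[[:space:]]*$//' | tail -n 1
}

# Succeed only if every path property ends with the path separator '/'.
check_paths() {
    bad=0
    for key in LIIPDF.common.path.xml LIIPDF.common.path.html LIIPDF.common.path.pdf; do
        value=$(get_prop "$1" "$key")
        case $value in
            */) ;;
            *) echo "$key must end with '/': '$value'" >&2; bad=1 ;;
        esac
    done
    return "$bad"
}

# Succeed only if the DES key is exactly 16 hexadecimal digits (64 bits).
check_des_key() {
    value=$(get_prop "$1" LIIPDF.common.DESKey)
    if printf '%s\n' "$value" | grep -Eq '^[0-9A-Fa-f]{16}$'; then
        return 0
    fi
    echo "LIIPDF.common.DESKey must be 16 hexadecimal digits" >&2
    return 1
}

# Example call:
# check_paths /usr/local/uscode/scripts/pdf/liipdf.properties
```

Running both checks after every edit of the file catches a missing trailing slash or a truncated key before any of the programs reads the configuration.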

Running the Programs

Scripts

This section outlines the typical procedure for using the scripts (in /usr/local/uscode/pdf/scripts) and the data flow involved. Once the system is deployed, the following programs are used to generate contents and deploy them to the web server. The diagram shows how the scripts (the content generation subsystem) fit into the overall data flow. For more information on the other elements, see the system design document.

Procedure

LDMS: Generate XML
Once changes are made to the ASCII version of the US Code, run LDMS to generate a new XML file for each title.
XML Splitting: xmlsplt.sh
Splits the title XML files into smaller division XML files for manageability.
PDF Generation: xml2pdf.sh
Converts the division XML files of the US Code to PDF.
HTML Generation: xml2html.sh
Generates the HTML from the XML, with modifications to the LDMS project output to incorporate PDF links and the shopping cart.
PDF Shopping Cart Catalog Generation: xml2catalog.sh
Generates the catalog for the shopping cart.

All the programs run on individual titles. For example, to process titles 5 and 39, you need to run each program twice, once per title. You cannot run the programs for just a particular section of a title; everything has to be done for a whole title.

If you invoke a program without any options, it displays a help screen that describes the syntax of its command line parameters. Consult this help screen for details about usage. Most of the programs read their configuration from the liipdf.properties config file.

The order in which you run these programs is crucial, because the output of one program is often used as the input to another. The following table summarizes the dependencies.

Program          Input                         Output
LDMS             US Code in the ASCII format   title XML
xmlsplt.sh       title XML                     division XML
xml2pdf.sh       division XML                  PDFs
xml2html.sh      division XML and PDF          HTMLs
xml2catalog.sh   division XML and PDF          PDG catalog files

xml2html.sh depends on the PDF files because it checks whether a PDF file exists to decide whether to put a PDF icon on the web page. This allows the admin to delete problematic PDF files before running xml2html.sh so that users cannot order them.

In addition, to activate the newly created catalog files in the PDG shopping cart, the admin needs to go to the PDG shopping cart config screen and click the "make changes live" button at the bottom of the screen. Consult the PDG shopping cart manual for details.
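
The ordering rules in this section can be summarized as a dry-run plan. The sketch below only prints the commands in dependency order for the given titles and runs nothing. It assumes, purely for illustration, that each script takes the title number as its only argument; the real syntax is whatever the help screen of each script reports. The function name plan_titles is also hypothetical.

```shell
#!/bin/sh
# Print, in dependency order, one command per script for each given title.
# ASSUMPTION: a single title-number argument per script; check each script's
# help screen for the actual command line syntax before running anything.
plan_titles() {
    for title in "$@"; do
        # xmlsplt.sh must come first, and xml2pdf.sh must run before
        # xml2html.sh and xml2catalog.sh, which both need the PDF files.
        for script in xmlsplt.sh xml2pdf.sh xml2html.sh xml2catalog.sh; do
            echo "$script $title"
        done
    done
}

# Example: list the eight steps needed for titles 5 and 39.
# plan_titles 5 39
```

LDMS is left out because it is a separate system, and the "make changes live" step is also absent because it is performed by hand in the PDG configuration screen. The table does not constrain the relative order of xml2html.sh and xml2catalog.sh; the order printed here simply follows the Procedure list.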

Crontab

The OFE program needs to be invoked periodically to process orders. One easy way to do this is through crontab. To run OFE through crontab, launch ofe.cron.sh instead of ofe.sh so that the environment is configured appropriately.

The following is a sample crontab configuration. It runs OFE once every hour, at 22 minutes past the hour.

: crontab -l
# DO NOT EDIT THIS FILE - edit the master and reinstall.
# (crontab.file installed on Fri Apr 26 13:19:19 2002)
# (Cron version -- Id: cronofe,v 1.16 2002/04/28 02:36:31 jt96 Exp )
22 * * * * /usr/local/uscode/scripts/pdf/ofe.cron.sh

Troubleshooting

This section lists some common problems and how to address them.

Order Fulfillment Engine

PDG Catalog Generation

Catalog file names must be exactly four characters long, so each product category in PDG is four characters as well. The project therefore prefixes each two-digit title number, 01 through 50, with PD, so all catalog files have the form PDXX.pdg.

The catalog files do not accept the characters < > / : | ;. These characters are escaped in the source code, but keep them in mind if you edit a catalog file by hand.
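
When catalog files are inspected or edited by hand, two small shell helpers can make these rules concrete. This is only a sketch: the helper names are assumptions made for illustration, and the second helper merely detects the problem characters; how the source code actually escapes them is not shown here.

```shell
#!/bin/sh
# Build the catalog file name for a title number, e.g. 5 -> PD05.pdg.
# Pass the number without a leading zero (printf may read "08" as octal).
catalog_file_name() {
    printf 'PD%02d.pdg\n' "$1"
}

# Succeed if the text contains a character that catalog files do not
# accept: < > / : | ;
has_bad_chars() {
    case $1 in
        *'<'*|*'>'*|*/*|*:*|*'|'*|*';'*) return 0 ;;
    esac
    return 1
}

# Example: warn before pasting a product description into a catalog file.
# has_bad_chars "$description" && echo "description needs escaping" >&2
```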

Once the catalog files are generated, to insert them from the back end of PDG you must modify the shopper.conf file in the /usr/local/uscode/pdf/cgi-bin/pdg/PDG_Cart directory as follows.

ProductCategory=A000:Gadgets & Widgets:|Templates/A000_storebuilder.html|
ProductCategory=B000:Sample Items:|Templates/B000_storebuilder.html|
ProductCategory=PD01:USCode Title 01:||
ProductCategory=PD02:USCode Title 02:||
ProductCategory=PD03:USCode Title 03:||
ProductCategory=PD04:USCode Title 04:||

...

ProductCategory=PD47:USCode Title 47:||
ProductCategory=PD48:USCode Title 48:||
ProductCategory=PD49:USCode Title 49:||
ProductCategory=PD50:USCode Title 50:||

This step adds product categories to PDG so that the GUI recognizes the catalog files. Once that is done, the changes can be made live in the GUI.
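
Typing the fifty PDXX lines by hand is error-prone. A loop such as the following sketch prints them in the format shown above, ready to be pasted into shopper.conf. The function name product_categories is an assumption made for illustration; the output format is copied from the sample lines in this section.

```shell
#!/bin/sh
# Print one ProductCategory line for each title from 01 to 50, in the
# same format as the shopper.conf sample above.
product_categories() {
    n=1
    while [ "$n" -le 50 ]; do
        printf 'ProductCategory=PD%02d:USCode Title %02d:||\n' "$n" "$n"
        n=$((n + 1))
    done
}

# Example: review the lines before adding them to shopper.conf by hand.
# product_categories | less
```

The function writes to standard output only, so shopper.conf is never modified automatically and the existing sample categories stay untouched.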

XSL

XSL programs are located in the xml2pdf and xml2html Java packages.