Feasibility Study

CS 501 - Software Engineering
Feasibility Report

Project for the Cornell Legal Information Institute
PDF versions of the United States Code

Version 1.1 (revised)
November 3, 2003
Project Homesite

Key stakeholders/organizations:
Developer Team: Tsung-Yueh Chiou, Tsee Yuan Lee, Kohsuke Kawaguchi, Soyeon Kim, Justin Tung
Project sponsors: Thomas Bruce, Professor William Arms


Table of Contents
I. Introduction

  1. Project Outline
  2. Scope
  3. Project Relations
  4. Benefits

II. Software Development Plan

  1. Constraints
  2. Obstacles and Risk
  3. Deliverables
  4. Project Process
  5. Resources


I. Introduction

1. Project Outline

The US House of Representatives releases the United States Code to the general public on its website at http://uscode.house.gov/download.htm. That release is a plain ASCII version, to which the Legal Information Institute (http://lii.law.cornell.edu/) adds value. The LII's enhanced version is available at:
http://www4.law.cornell.edu/uscode/.

An earlier CS 501 team (Legal Data Markup Software Project) and a later student project developed programs for the Legal Information Institute (LII) that convert the raw ASCII output of the House of Representatives to XML for subsequent reuse in various settings. XSLT scripts then convert this XML to HTML, producing the current website. The existing web user interface supports a search engine, HTML legal code, cross-referencing, and notes on each section of the code. The US Code pages currently receive about half a million hits daily.

The new project, overseen by Thomas Bruce, Co-Director of the Legal Information Institute, is to create PDF versions of the code from the existing XML. On the current website, users navigate through HTML links to reach the leaves of the legal code tree, which are called 'sections'. Our goal is to implement a system in which users can pick sections and larger divisions (called 'chapters' or 'parts') via a shopping cart. The PDF format adds printability and portability to the legal code, both of which were hard to achieve with the HTML version. The cart functionality will be integrated into the existing website and will be present on each HTML page of legal code. The user may then be charged for the service, after which the selected XML is converted to PDF on the fly using XSL. Finally, the generated PDF documents, containing the requested sections of code along with additional information (e.g. table of contents, notes, cross-references), will be delivered to the user.
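To make the cart idea concrete, here is a minimal sketch of how a cart that mixes sections and larger divisions could be expanded into the list of leaf sections to convert. The tree contents, the identifiers, and the function name `expand_cart` are hypothetical; the real structure would come from the LII XML.

```python
# Hypothetical miniature of the code tree: each division maps to its children.
# Any identifier that is not a key in this mapping is treated as a leaf section.
CODE_TREE = {
    "title-A": ["title-A/chapter-1", "title-A/chapter-2"],
    "title-A/chapter-1": ["title-A/section-1", "title-A/section-2"],
    "title-A/chapter-2": ["title-A/section-3"],
}


def expand_cart(cart, tree=CODE_TREE):
    """Return the leaf sections covered by the cart, in first-seen order, without duplicates."""
    sections = []
    seen = set()

    def visit(node):
        children = tree.get(node)
        if children is None:
            # Leaf section: record it once, even if several cart items cover it.
            if node not in seen:
                seen.add(node)
                sections.append(node)
            return
        for child in children:
            visit(child)

    for item in cart:
        visit(item)
    return sections
```

For example, a cart holding a chapter plus one of that chapter's own sections yields each section only once, so the user would not receive (or be charged for) the same section twice.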

2. Scope

Required

  • Pre-generate PDFs for the leaf elements (sections) of the legal code tree
  • Pre-generate PDFs for larger divisions
  • Modify the existing XSLT scripts to incorporate PDF functionality into the existing website
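As an illustration of the intermediate document an XSL-FO processor would consume, the sketch below builds a minimal FO tree for one section. The project plans to produce FO with XSLT; this example builds the tree directly in Python instead, only to show its shape. The input element names (`section`, `heading`, `text`) and the page dimensions are assumptions, not the actual LII DTD or layout.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"  # standard XSL-FO namespace
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Qualify a tag name with the XSL-FO namespace."""
    return f"{{{FO_NS}}}{tag}"


def section_to_fo(section_xml):
    """Build a minimal XSL-FO document from a section with a hypothetical layout."""
    section = ET.fromstring(section_xml)

    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(
        masters,
        fo("simple-page-master"),
        {"master-name": "page", "page-height": "11in", "page-width": "8.5in", "margin": "1in"},
    )
    ET.SubElement(page, fo("region-body"))

    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "page"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})

    # One bold block for the heading, then one block per paragraph of text.
    heading = ET.SubElement(flow, fo("block"), {"font-weight": "bold"})
    heading.text = section.findtext("heading", default="")
    for paragraph in section.findall("text"):
        block = ET.SubElement(flow, fo("block"))
        block.text = paragraph.text

    return ET.tostring(root, encoding="unicode")
```

The returned string would then be handed to whichever XSL-FO processor is chosen to render the PDF; pre-generating a larger division would amount to emitting more blocks into the same flow.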

Desired

  • PDF generation of US legal code through either on-the-fly PDF generation or an extensible caching-queuing framework
  • Shopping cart payment system and charging scheme for PDFs
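One way to picture a caching-queuing framework is sketched below: a PDF that has already been generated is served from the cache, and a section that has not been generated is queued exactly once for later rendering. The class name, its methods, and the `render` callable are hypothetical, and the sketch keeps everything in memory and in a single thread for brevity.

```python
from collections import deque


class PdfCacheQueue:
    """Serve cached PDFs immediately; queue each uncached section once."""

    def __init__(self, render):
        self._render = render    # hypothetical callable: section id -> PDF bytes
        self._cache = {}         # section id -> PDF bytes
        self._pending = deque()  # section ids waiting to be rendered (FIFO)
        self._queued = set()     # mirrors _pending for fast duplicate checks

    def request(self, section_id):
        """Return cached bytes, or None after queuing the section for rendering."""
        if section_id in self._cache:
            return self._cache[section_id]
        if section_id not in self._queued:
            self._queued.add(section_id)
            self._pending.append(section_id)
        return None

    def process_one(self):
        """Render the oldest queued section; return its id, or None if idle."""
        if not self._pending:
            return None
        section_id = self._pending.popleft()
        self._queued.discard(section_id)
        self._cache[section_id] = self._render(section_id)
        return section_id
```

The design choice here is that repeated requests for the same section never trigger more than one conversion, which matters if conversion is expensive relative to serving a stored file.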

Optional

  • Database to support queuing and to track user statistics

3. Project Relations

Another team at the LII is working on improving the XML format of the US Code, but its work does not directly affect our project.

4. Benefits

  • Convenience of having the legal code available to the public in PDF format, as opposed to only the website format
  • Ability to provide an organized PDF layout of the legal code for printing
  • Shopping cart payment system for LII


II. Software Development Plan

1. Constraints

The project is funded by Red Hat, so the client has expressed a preference for relying solely on open-source software. Given the number of visitors to the website (approx. 9 million per week), the system must be able to handle a large volume of requests. In addition, the project team must learn new programming languages and environments.

2. Obstacles and Risk

i. Scalability
At this moment we are not sure whether the currently deployed shopping cart system can handle the large number of code sections we will need to offer. It is also unclear how many users will use the PDF service. Given the number of visitors to the website, the load on the system could be very large.
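To put the load in perspective, the weekly visitor figure quoted under Constraints (approx. 9 million) can be turned into an average request rate. The fraction of visitors who would request a PDF is unknown, so the `pdf_fraction` parameter below is purely an assumption for illustration, and peak rates would be higher than this average.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def average_rate(visitors_per_week, pdf_fraction=1.0):
    """Average requests per second if pdf_fraction of weekly visitors each request one PDF."""
    return visitors_per_week * pdf_fraction / SECONDS_PER_WEEK


# Upper bound: every one of the ~9 million weekly visitors requests one PDF.
print(round(average_rate(9_000_000), 2))        # 14.88
# Hypothetical case: 1% of visitors request a PDF.
print(round(average_rate(9_000_000, 0.01), 3))  # 0.149
```

Even the illustrative 1% case implies a PDF request roughly every seven seconds on average, which is why pre-generation or caching is attractive compared with converting every request from scratch.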

ii. Lack of Experience
The developers need to learn several systems and technologies. The production system runs on Linux, but none of the team members has a strong Linux background. The project will use various XML-related technologies and CVS fairly extensively, and most of the team members still need to learn them. In addition, none of the team members has experience with the shopping cart and payment systems currently deployed. Finally, the existing legacy system that produces HTML from the US Code has to be changed, so the team must spend some resources becoming familiar with it.

iii. Integration with existing systems
The new system might need to coordinate with the currently deployed shopping cart system. However, at this moment we are not sure whether that system provides a mechanism for such coordination.

iv. US Legal Code Complexities
The code contains many irregularities in structure and data organization. Also, the DTD does not capture the precise semantics of the code.

3. Deliverables

  • Feasibility Report
  • Requirements Analysis
  • Prototype
  • System and Program Design
  • Modified website
  • PDF Conversion Engine

4. Project Process

Following the feasibility study and requirements analysis, we plan to produce an initial prototype of the system to show to the client and confirm the developers' understanding of the project. After the prototype, we will move on to the system and program design and coding phases. Unfortunately, due to the nature of the legal data, there is no comprehensive test plan. At the end, we will complete documentation and debugging and conduct acceptance testing.

5. Resources

Software:

  • XSLT Engine
  • PDG Shopping cart
  • XML
  • VeriSign Payflow
  • XSL-FO Processor (PDF Generation)
  • Apache server
  • CVS
  • Linux

Hardware:

  • 1 server (lula.law.cornell.edu), used for project development and as the server for the Concurrent Versions System (CVS) repository.

Facilities and Tools

  • Regular meetings on Mondays, 5-7, in Upson Hall or the Law School
  • Yahoo group cs501 (http://groups.yahoo.com/group/cs501/) for a message board and limited file sharing
  • CVS repository on lula.law.cornell.edu, used for development

