Obligatory Joke.
The presenter describes
code development practices
that have minimized conversions for Year 2000 compliance. The
single most important step was the development and disciplined
use of the sub-routine SYSDATE. All date conversions and formatting
are performed via SYSDATE, thus eliminating date logic from application
programs. The speaker describes several date formats used by his
clients, and describes how they were "modified" to support
the Year 2000. The presenter concludes with suggestions to
help you assess your own environment and "retrofit" it
to use similar techniques to minimize your own conversion.
The speaker has been developing COBOL code since 1968. Other "languages" he has used include: Univac 1050 assembler, IBM 360 assembler, CDC Cyber assembler, AutoCoder, Algol, PL/I, Fortran, Pascal, C, C++, Java, and HTML!
Given that YP has been in almost universal
use by my current clients since 1985, most of their YEAR 2000
conversion issues have been neatly sidestepped. Dates sort correctly,
and years 2000 and beyond are easily accommodated with no changes
to field length or sort positions. Since all date outputs have
used GEMDATE for formatting since 1975, data output has required
no changes except where the client wants to see the century.
Even in those cases, it is simply a matter of changing the output
length from 8 to 10. GEMDATE has supported this conversion automatically
since 1990.
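As a rough illustration of why widening the output from 8 to 10 characters is a small change when all formatting lives in one routine, here is a minimal Python sketch. GEMDATE's actual interface is not shown in this paper; the function name `format_date`, its parameters, and the mm/dd/yy layout are assumptions made only for this example.

```python
def format_date(year: int, month: int, day: int, length: int = 8) -> str:
    """Format a date for output (hypothetical stand-in for a routine like GEMDATE).

    length=8  gives mm/dd/yy   (no century shown)
    length=10 gives mm/dd/yyyy (century shown)
    The slash-separated layout is an assumption, not taken from the paper.
    """
    if length == 8:
        return f"{month:02d}/{day:02d}/{year % 100:02d}"
    if length == 10:
        return f"{month:02d}/{day:02d}/{year:04d}"
    raise ValueError("output length must be 8 or 10")
```

Because every program asks the same routine for its output, a caller that wants the century only changes the requested length; no date logic in the application itself has to change.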
The only work we have left to do is to apply
windowing logic to two-digit input dates. This has not yet been
done, but the effort is estimated to be trivial.
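Windowing of this kind can be sketched in a few lines of Python. The pivot value of 50 and the name `window_year` are assumptions for illustration; the paper does not say which window the clients will use.

```python
def window_year(yy: int, pivot: int = 50) -> int:
    """Expand a two-digit input year to four digits using a fixed window.

    Years at or above the pivot are treated as 19yy, years below it as 20yy.
    The pivot of 50 is an assumed value; each site must choose its own.
    """
    if not 0 <= yy <= 99:
        raise ValueError("two-digit year expected")
    return 1900 + yy if yy >= pivot else 2000 + yy
```

With the assumed pivot, an input year of 99 is read as 1999 and an input year of 03 is read as 2003.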
The X(14) date format has been adopted for
new files, and for any file as it is converted from VSAM to SQL
format. We currently have several files that live on both the
mainframe, and in client/server databases. Since client/server
architectures do not support packed decimal, the X(14) format
has been adopted.
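One convenient property of the X(14) layout (yyyymmddhhmmss, all external decimal) is that plain character comparison gives chronological order on any platform. The Python sketch below shows the idea; the helper names `to_x14` and `from_x14` are hypothetical.

```python
from datetime import datetime

X14_LAYOUT = "%Y%m%d%H%M%S"  # yyyymmddhhmmss, 14 decimal characters

def to_x14(ts: datetime) -> str:
    """Render a timestamp as a 14-character X(14) string."""
    return ts.strftime(X14_LAYOUT)

def from_x14(text: str) -> datetime:
    """Parse an X(14) string, rejecting anything that is not 14 digits."""
    if len(text) != 14 or not text.isdigit():
        raise ValueError("expected 14 decimal digits")
    return datetime.strptime(text, X14_LAYOUT)
```

Sorting a list of such strings as ordinary text puts 19991231235959 ahead of 20000101000000, with no packed fields to translate.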
The concept of performing all date formatting
through only two sub-programs was a logical fallout of a programming
philosophy adopted by the Texas Education Agency in 1972, and
fully implemented in 1975. In those ancient times this practice
was called modular programming. In 1972 TEA adopted the following
principles for program development:
In 1975, the TEA completed the development
of a report writer and data extract system called Generalized
Extract System (the report writer part), and Generalized Extract
and Maintenance System (the data update part), known as GES/GEMS.
Over time, the GES part has had the most pay-back in terms of
reduced time to develop reports and data extracts. The GEMS part
was intended to eliminate update programming, but that promise
was never fulfilled.
Handwritten report programs virtually disappeared.
The GES tool, developed at a time when programs were coded with
a key punch, is still in use by several of my clients twenty-two
years later. In fact, in 1997 the GES report writer and dictionary
maintenance support were converted from COBOL 68 standard to COBOL/LE
(Language Environment), IBM's most recent COBOL compiler.
In 1985, the concept of a log file was added.
This log file was not for data recovery (it contains only before
images), but its use has allowed one of my clients to be able
to produce any report AS OF any date after 1985, without having
to reload any historical files. This was primarily done by adding
AS OF logic to our standard I/O routines. The I/O routines read
the master file and the log file until the AS OF date is reached.
The record returned is the record that existed at the AS OF date
and time, even records which have been deleted.
An interactive tool was developed which generates
an I/O routine which supports log file processing. The only input
required is the key length, the offset to the key, the file ddname,
and the record length.
The 1985 design uses a single log file to store
before images from all production files. The key of this file
is a file ID, the data key, and a two-byte version number. The date/time
the master record was created or updated and the date/time the
log record was created are also stored in the record. From this
information, the record as it existed at any point in time can
be easily recovered. As far as I know, this design is unique;
I am certain that none of my current clients' competitors have
it.
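The AS OF lookup over a master file and a log of before images can be sketched as follows. This is a simplified in-memory model in Python, not the production I/O routine: the names `Image` and `read_as_of` are hypothetical, and timestamps are held as 14-digit strings for readability rather than in the packed form used by the 1985 files.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class Image:
    file_id: str
    key: str
    version: int
    updated_at: str      # when this image became the current record
    logged_at: str = ""  # when it was superseded or deleted ("" for a live master record)
    data: str = ""

def read_as_of(master: Dict[Tuple[str, str], Image], log: List[Image],
               file_id: str, key: str, as_of: str) -> Optional[Image]:
    """Return the record as it existed at as_of, or None if it did not exist then."""
    current = master.get((file_id, key))
    if current is not None and current.updated_at <= as_of:
        return current  # the master record was already in place at as_of
    # Otherwise look for the before image that was current at as_of.
    candidates = [img for img in log
                  if img.file_id == file_id and img.key == key
                  and img.updated_at <= as_of < img.logged_at]
    return max(candidates, key=lambda img: img.version, default=None)
```

A deleted record has no master entry, but its last before image still satisfies the range test for any AS OF time before the deletion, which is how deleted records can be returned.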
In 1997 this design was extended to store the
log records in the same physical file as the "master"
records, and to adopt the X(14) date format. This format has
the advantage that the file can easily be converted to a SQL database
format, standard SQL logic can be used to obtain records AS OF
a certain time, and update locking issues are simplified.
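To illustrate what "standard SQL logic" for an AS OF read might look like once master and log records share one table, here is a small Python sketch using the standard `sqlite3` module. The table name `acct`, the column names, and the use of a high-value `superseded_at` for the current row are all assumptions for this example; the paper does not give the actual schema.

```python
import sqlite3

def open_demo_db() -> sqlite3.Connection:
    """Build an in-memory table holding current and superseded rows together."""
    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE acct (
                      rec_key       TEXT NOT NULL,
                      valid_from    TEXT NOT NULL,  -- X(14) yyyymmddhhmmss
                      superseded_at TEXT NOT NULL,  -- X(14); high value = still current
                      payload       TEXT NOT NULL)""")
    db.executemany("INSERT INTO acct VALUES (?, ?, ?, ?)", [
        ("A001", "19970105090000", "19970620143000", "balance=100"),
        ("A001", "19970620143000", "99991231235959", "balance=250"),
    ])
    return db

def read_as_of(db: sqlite3.Connection, key: str, as_of: str):
    """Return the payload that was current at as_of, or None."""
    row = db.execute(
        "SELECT payload FROM acct "
        "WHERE rec_key = ? AND valid_from <= ? AND superseded_at > ?",
        (key, as_of, as_of)).fetchone()
    return row[0] if row else None
```

Because X(14) values compare correctly as text, the range test needs no date functions and works the same way in any SQL database.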
Code | Format | Comments | Conversion Difficulty |
CA | mmddyy | Inconvenient for sorting | Moderate, must pack yy part. |
CY | yymmdd | Convenient for sort | Moderate, convert to YP + filler. |
JL | yyjjj | Year and Julian date, not packed. | Moderate, convert to YP + filler. |
JP | yyjjj | Packed decimal (three bytes) | Difficult, must shift right into sign nibble, or convert to binary. |
YP | 0yymmdd | Packed decimal (four bytes) | Very easy, let 0 = 1900, 1 = 2000, etc. |
YL | yyyymmddhhmmss | All external decimal | None |
TP | hhmmss | Note: YP+TP make log timestamp | None |
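The YP row is worth a closer look, since the leading digit is what lets the field cross the century without growing. The Python sketch below packs and unpacks a YP value as seven digits plus a sign nibble in four bytes. The function names are hypothetical, and the positive sign nibble 0xC is an assumption (0xF is also common for unsigned packed fields).

```python
def pack_yp(year: int, month: int, day: int) -> bytes:
    """Pack a date as 0yymmdd packed decimal: 7 digits + sign nibble = 4 bytes."""
    if not 1900 <= year <= 2899:
        raise ValueError("year outside the range one century digit can hold")
    century = (year - 1900) // 100          # 0 = 19xx, 1 = 20xx, etc.
    digits = f"{century}{year % 100:02d}{month:02d}{day:02d}"
    nibbles = [int(d) for d in digits] + [0xC]   # assumed positive sign nibble
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, 8, 2))

def unpack_yp(raw: bytes) -> tuple:
    """Recover (year, month, day) from a four-byte YP field."""
    if len(raw) != 4:
        raise ValueError("YP field is four bytes")
    digits = "".join(f"{b >> 4}{b & 0x0F}" for b in raw[:3]) + str(raw[3] >> 4)
    century, yy = int(digits[0]), int(digits[1:3])
    return 1900 + 100 * century + yy, int(digits[3:5]), int(digits[5:7])
```

Under these assumptions, December 31, 1999 packs to hex 0991231C and January 1, 2000 packs to hex 1000101C, so a byte-wise sort still puts the 1999 date first and the field length is unchanged.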
The basic philosophy which I suggest, based
on years of experience, is to locate date field references, then
replace the date field references with calls to subroutines.
The subroutine will detect the date format, make any necessary
conversions, and return the same logical result as the original
code.
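As a sketch of what such a subroutine might do, the Python example below replaces an in-line date comparison with a call that detects the format, converts as needed, and returns the usual low/equal/high result. Detecting the format by field length, the pivot of 50 for two-digit years, and the names `normalize` and `compare_dates` are all assumptions for illustration.

```python
def normalize(date_text: str, pivot: int = 50) -> str:
    """Return a date as yyyymmdd, whichever supported format it arrived in."""
    if not date_text.isdigit():
        raise ValueError("date must be all digits")
    if len(date_text) == 8:          # new format: yyyymmdd, nothing to do
        return date_text
    if len(date_text) == 6:          # old format: yymmdd, infer the century
        century = "19" if int(date_text[:2]) >= pivot else "20"
        return century + date_text
    raise ValueError("unrecognized date format")

def compare_dates(a: str, b: str) -> int:
    """Return -1, 0, or 1 as a is earlier than, equal to, or later than b."""
    na, nb = normalize(a), normalize(b)
    return (na > nb) - (na < nb)
```

A program that previously compared two yymmdd fields directly would call `compare_dates` instead; the call gives the same answer as the original code for dates within one century and the correct answer across the century boundary, even when one field has already been converted.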
This approach has the advantage that even if
source code is not available, your legacy code can be modified.
See www.computerworld.com, August 11, 1997 issue "COBOL
pioneer pitches year 2000 fix". The product described is
Vertex 2000, from BMR Systems, Dallas, Texas. "Vertex 2000
examines a mainframe program's object code, finds every possible
date instance, and patches the code to run a separate subroutine
for handling the date. The subroutine uses extra bits in the date
field to indicate the century".
This suggested approach does not completely eliminate testing, but it greatly reduces the scope of testing. The only side effects are invalid date computations, "garbage" on date output, or data exceptions from code that was not modified. The data exception may sound bad, but is actually the friendliest thing that can happen, as it immediately (from the dump address) identifies the location of the object code that needs to be modified. Garbage on date output should be fairly easy to spot.
The most difficult omission to detect would
be invalid date computations. In my experience, date computations
are rare; most date references either print or display
a date, or compare a date to another date for high or low.
Of course, your industry may be different; for example, any industry
that sets rates based on some period of time would have many date
computations.
Task | Source - least difficult to most difficult. |
Identify date fields | |
Determine format conversions | |
Develop subroutines | Ideally the subroutines should "recognize" both old and new formats. |
Convert the data | This will be necessary only if any sorts are done with the date field as part of the sort. |
Convert the programs | See notes for identify date fields. |
Run tests | |
Put into production | |