SHARE Session 1514, Feb 23, 1998
Year 2000 Conversion Simplified through Code Design Practices

Abstract:

Obligatory Joke.

The presenter describes code development practices that have minimized conversions for Year 2000 compliance. The single most important step was the development and disciplined use of the sub-routine SYSDATE. All date conversions and formatting are performed via SYSDATE, thus eliminating date logic from application programs. The speaker describes several date formats used by his clients, and describes how they were "modified" to support the Year 2000. The presenter concludes with suggestions which will help you assess your own environment, and how you might "retrofit" to use similar techniques to minimize your own conversion.

Steve Ryder, Senior Director
JSR Systems (a.k.a TXA State of Texas)
2 Margranita Crescent
Austin, Texas 78703-1717

sryder@jsrsys.com
www.jsrsys.com
512.917.0823

The speaker has been developing COBOL code since 1968. Other "languages" he has used include: Univac 1050 assembler, IBM 360 assembler, CDC Cyber assembler, AutoCoder, Algol, PL/I, Fortran, Pascal, C, C++, Java, and HTML!

Clients who have benefited from these code practices include:

	The University of Texas at Austin	1965-1966
	Texas Education Agency	1966-1995
	Houston Ind. School District	1975-1981
	Conroe Ind. School District	1980-1993
	United States Navy	1966-1994
	Capitol Appraisal Group, Inc.	1981-1998

Significant Historical Events:

1975 - Create SYSDATE for all date conversions, and GEMDATE to format internal dates. All date processing is done by SYSDATE and/or GEMDATE. Support for dates from 1900 through 1999.
1985 - Adopted YP (4 byte packed format 0yymmdd) as "standard" date.
1990 - Adopted year convention yyy where yyy+1900 = year. Thus 0yy is 19yy, 1yy is 20yy.
1997 - Adopted a "windowing" strategy for input: two-digit input is converted to yyy based on knowledge of the field (past = -90 to +10 years, future = -10 to +90 years).
1997 - LOG9SYSD gets current system date in new installation defined fourteen byte format (to be used in SQL databases and future files) yyyymmddhhmmss.
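
The yyy convention and the windowing rule can be illustrated with a short sketch. This is not the SYSDATE code; the function name, the "past"/"future" keywords, and the treatment of the window as exactly 100 years wide (lower bound included, upper bound excluded) are assumptions made for illustration.

```python
def window_to_yyy(yy: int, current_year: int, kind: str) -> int:
    """Convert a two-digit year to the yyy convention (yyy + 1900 = year).

    Assumed windows, each exactly 100 years wide:
      kind == "past":   current_year - 90 .. current_year + 9
      kind == "future": current_year - 10 .. current_year + 89
    """
    if not 0 <= yy <= 99:
        raise ValueError("yy must be a two-digit year")
    if kind == "past":
        low = current_year - 90
    elif kind == "future":
        low = current_year - 10
    else:
        raise ValueError("kind must be 'past' or 'future'")
    # Exactly one year in [low, low + 99] ends in the digits yy.
    year = low + (yy - low) % 100
    return year - 1900
```

With a current year of 1998, a "past" field maps input 50 to 050 (1950) and input 05 to 105 (2005), while a "future" field maps input 50 to 150 (2050).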

Given that YP has been in almost universal use by my current clients since 1985, most of their Year 2000 conversion issues have been nicely sidestepped. Dates sort nicely, and years 2000 and beyond are easily accommodated with no changes to field length or sort positions. Since all date outputs have used GEMDATE for formatting since 1975, data output has required no changes except where the client wants to see the century. Even in those cases, it is simply a matter of changing the output length from 8 to 10. GEMDATE has supported this conversion automatically since 1990.
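
To make the YP layout concrete, here is a minimal sketch of a four-byte packed date holding the seven digits yyymmdd plus a sign nibble. The function names, the choice of X'C' as the sign nibble, and the mm/dd/yy and mm/dd/yyyy output layouts are assumptions; the actual SYSDATE and GEMDATE interfaces are not shown in this paper.

```python
from typing import Tuple


def pack_yp(year: int, month: int, day: int) -> bytes:
    """Pack a date as 4-byte packed decimal: digits yyymmdd plus a sign nibble."""
    yyy = year - 1900                      # 098 = 1998, 100 = 2000
    if not 0 <= yyy <= 999:
        raise ValueError("year outside the range the yyy convention can hold")
    digits = f"{yyy:03d}{month:02d}{day:02d}"
    nibbles = [int(d) for d in digits] + [0x0C]   # assumed positive sign nibble
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, 8, 2))


def unpack_yp(packed: bytes) -> Tuple[int, int, int]:
    """Return (year, month, day) from a 4-byte YP field."""
    nibbles = []
    for b in packed:
        nibbles += [b >> 4, b & 0x0F]
    d = nibbles[:7]                        # drop the sign nibble
    yyy = d[0] * 100 + d[1] * 10 + d[2]
    return 1900 + yyy, d[3] * 10 + d[4], d[5] * 10 + d[6]


def format_yp(packed: bytes, length: int) -> str:
    """Format for output; length 8 omits the century, length 10 shows it."""
    year, month, day = unpack_yp(packed)
    if length == 10:
        return f"{month:02d}/{day:02d}/{year:04d}"
    if length == 8:
        return f"{month:02d}/{day:02d}/{year % 100:02d}"
    raise ValueError("length must be 8 or 10")
```

Under these assumptions, February 23, 1998 packs to X'0980223C' and January 1, 2000 packs to X'1000101C', so a plain byte comparison still sorts the 2000 date after every 19yy date without any change to field length.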

The only work we have left to do is to apply windowing logic to two digit input dates. This has not yet been done, but the effort is estimated to be trivial.

The X(14) date format has been adopted for new files, and for any file as it is converted from VSAM to SQL format. We currently have several files that live on both the mainframe, and in client/server databases. Since client/server architectures do not support packed decimal, the X(14) format has been adopted.
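
As a small illustration of the X(14) format, the sketch below converts between a timestamp and the fourteen-character yyyymmddhhmmss string. The function names are hypothetical and this is not the LOG9SYSD routine.

```python
from datetime import datetime


def to_x14(moment: datetime) -> str:
    """Render a timestamp as the 14-character string yyyymmddhhmmss."""
    return moment.strftime("%Y%m%d%H%M%S")


def from_x14(text: str) -> datetime:
    """Parse a yyyymmddhhmmss string, rejecting anything not 14 digits long."""
    if len(text) != 14 or not (text.isascii() and text.isdigit()):
        raise ValueError("expected 14 decimal digits")
    return datetime.strptime(text, "%Y%m%d%H%M%S")
```

Because every position is a plain decimal digit, these strings compare and sort in chronological order as ordinary character data, on the mainframe or in a client/server database.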

1975 GES/GEMS Development

The concept of performing all date formatting through only two sub-programs was a logical fallout of a programming philosophy adopted by the Texas Education Agency in 1972, and fully implemented in 1975. In those ancient times this practice was called modular programming. In 1972 TEA adopted the following principles for program development:

All non-temporary files will be defined to a central dictionary.
All COBOL copy text will be generated from the dictionary.
All update I/O will be done thru "standard" I/O routine architecture.
I/O routines will not perform any application program logic. Their interface (read API) will be standard. Programs will be bound thru GDTLINK via Record-ID, only READ/WRITE/DELETE operations will be supported. File keys may not contain embedded blanks. The type of read depends on the input key: all spaces = read next, no spaces = direct read, partial key (trailing spaces) defines a logical group, and subsequent read nexts will return EOF after last record of the logical group.
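
The key convention in the last principle can be sketched as a small classifier. The function name and the returned labels are assumptions for illustration; the real I/O routines and GDTLINK are not reproduced here.

```python
def classify_key(key: str) -> str:
    """Decide which kind of read an input key requests.

    all spaces               -> "read_next"
    no spaces                -> "direct_read"
    trailing spaces only     -> "group_read" (partial key names a logical group)
    blank before a non-blank -> rejected, since keys may not embed blanks
    """
    if key.strip(" ") == "":
        return "read_next"
    significant = key.rstrip(" ")
    if " " in significant:
        raise ValueError("file keys may not contain embedded blanks")
    if len(significant) == len(key):
        return "direct_read"
    return "group_read"
```

The rule that keys may not contain embedded blanks is what makes this scheme unambiguous: a blank can only ever mean "the rest of the key is unspecified".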

In 1975, the TEA completed the development of a report writer and data extract system called Generalized Extract System (the report writer part), and Generalized Extract and Maintenance System (the data update part), known as GES/GEMS. Over time, the GES part has had the most pay-back in terms of reduced time to develop reports and data extracts. The GEMS part was intended to eliminate update programming, but that promise was never fulfilled.

Hand written report programs virtually disappeared. The GES tool, developed at a time when programs were coded with a key punch, is still in use by several of my clients twenty two years later. In fact, in 1997 the GES report writer and dictionary maintenance support were converted from COBOL 68 standard to COBOL/LE (Language Environment), IBM's most recent COBOL compiler.

In 1985, the concept of a log file was added. This log file was not for data recovery (it contains only before images), but its use has allowed one of my clients to be able to produce any report AS OF any date after 1985, without having to reload any historical files. This was primarily done by adding AS OF logic to our standard I/O routines. The I/O routines read the master file and the log file until the AS OF date is reached. The record returned is the record that existed at the AS OF date and time, even records which have been deleted.

An interactive tool was developed which generates an I/O routine that supports log file processing. The only inputs required are the key length, the offset to the key, the file ddname, and the record length.

The 1985 design uses a single log file to store before images from all production files. The key of this file is a file ID, the data key, and a two-byte version number. The date/time the master record was created or updated and the date/time the log record was created are also stored in the record. From this information, the record as it existed at any point in time can be easily recovered. As far as I know, this design is unique; I am certain that none of my current clients' competitors have it.
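
The AS OF lookup can be sketched as follows. This is a simplified model, not the production I/O routine: the record classes and field names are hypothetical, timestamps are modeled as sortable yyyymmddhhmmss strings rather than the packed fields used in 1985, and the log list is assumed to be already restricted to one file ID and data key.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MasterRecord:
    image: str
    stamp: str          # when this image was created or last updated


@dataclass
class LogRecord:
    version: int
    image: str          # before image
    master_stamp: str   # when this image was created or updated on the master
    log_stamp: str      # when the log record was written (image replaced or deleted)


def read_as_of(master: Optional[MasterRecord],
               log: List[LogRecord],
               as_of: str) -> Optional[str]:
    """Return the record image that existed at as_of, or None if none did."""
    # The current master record is the answer if it already existed at as_of.
    if master is not None and master.stamp <= as_of:
        return master.image
    # Otherwise look for the before image that was current at as_of.
    for rec in log:
        if rec.master_stamp <= as_of < rec.log_stamp:
            return rec.image
    return None
```

Each before image is treated as valid from its master date/time up to, but not including, the date/time its log record was written. A deleted record has no master entry, yet its earlier images remain reachable through the log.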

In 1997 this design was extended to store the log records in the same physical file as the "master" records, and to adopt the X(14) date format. This format has the advantage that the file can easily be converted to a SQL database format, standard SQL logic can be used to obtain records AS OF a certain time, and update locking issues are simplified.
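
A rough idea of what "standard SQL logic" for an AS OF read might look like is sketched below using an in-memory SQLite database. The table name, column names, and the use of a high date (99991231235959) to mark the current version are all assumptions, not the layout of any client file.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create a hypothetical table holding current and prior versions together."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " rec_key TEXT, version INTEGER,"
        " valid_from TEXT, valid_to TEXT, data TEXT)"
    )
    rows = [
        ("A1", 1, "19850101000000", "19900601120000", "first"),
        ("A1", 2, "19900601120000", "99991231235959", "second"),  # current row
    ]
    conn.executemany("INSERT INTO account VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def select_as_of(conn: sqlite3.Connection, rec_key: str, as_of: str):
    """Return the data that was current for rec_key at the X(14) time as_of."""
    row = conn.execute(
        "SELECT data FROM account"
        " WHERE rec_key = ? AND valid_from <= ? AND ? < valid_to",
        (rec_key, as_of, as_of),
    ).fetchone()
    return row[0] if row else None
```

Since X(14) values are plain character strings that sort chronologically, the comparison in the WHERE clause needs no date functions at all.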

Some popular date formats

Code	Format	Comments	Conversion Difficulty
CA	mmddyy	Inconvenient for sorting	Moderate, must pack yy part.
CY	yymmdd	Convenient for sort	Moderate, convert to YP + filler.
JL	yyjjj	Year and Julian date, not packed.	Moderate, convert to YP + filler.
JP	yyjjj	Packed decimal (three bytes)	Difficult, must shift right into sign nibble, or convert to binary.
YP	0yymmdd	Packed decimal (four bytes)	Very easy, let 0 = 1900, 1=2000, etc.
YL	yyyymmddhhmmss	All external decimal	None
TP	hhmmss	Note: YP+TP make log timestamp	None

Avoid creating any unnecessary side effects:

changes to record length,
changes to field length,
changes to field offsets.
Prefer to change format so existing sorts continue to work (i.e., don't change CA to YP, instead convert YY part to packed yyys, so sorts will still work).
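
The CA example above can be sketched in a few lines. The two character positions that held yy are reused for a two-byte packed field (three year digits plus a sign nibble), so the record length, field length, and offsets are unchanged. The function name is hypothetical, ASCII digits stand in for EBCDIC, the sign nibble X'C' is assumed, and the caller is assumed to supply the century digit (for example from a windowing rule).

```python
def ca_to_packed_year(ca: bytes, century_digit: int) -> bytes:
    """Turn mmddyy (6 digit characters) into mmdd + 2-byte packed yyys."""
    if len(ca) != 6 or not ca.isdigit():
        raise ValueError("expected six decimal digit characters (mmddyy)")
    if not 0 <= century_digit <= 9:
        raise ValueError("century digit must be 0-9 (0 = 19yy, 1 = 20yy)")
    y1 = ca[4] - 0x30                      # tens digit of yy
    y2 = ca[5] - 0x30                      # units digit of yy
    packed_year = bytes([(century_digit << 4) | y1, (y2 << 4) | 0x0C])
    return ca[:4] + packed_year            # same six bytes, same offsets
```

Under these assumptions the year bytes for 1998 become X'098C' and those for 2000 become X'100C', so an existing sort on the year positions still puts 2000 after 1998.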

What does this all mean to your organization?

What do you need to look for to assess your own situation?

I hope I have convinced you that:

Adding two digits to all year fields is not a practical solution.
Developing subroutines to handle all date processing is a practical method to handle Year 2000 support. One code change to two sub-routines was all that was needed to provide century support for YP dates. Similar changes can be made for CY and CA dates. Once the conversion is made to call subroutines for date support, similar extensions can be added when necessary without having to resort to code inspection.

The basic philosophy which I suggest, based on years of experience, is to locate date field references, then replace the date field references with calls to subroutines. The subroutine will detect the date format, make any necessary conversions, and return the same logical result as the original code.
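
As an illustration of such a subroutine, the sketch below compares two six-byte date fields, each of which may be in the old CY format (yymmdd in external decimal) or the converted format (four-byte packed YP plus two bytes of filler). The names are hypothetical, ASCII digits stand in for EBCDIC, and detecting the format from the sign nibble is one possible technique, not necessarily the one used in SYSDATE.

```python
def normalize(field: bytes) -> int:
    """Return the date as the integer yyymmdd, whichever format the field is in."""
    if len(field) != 6:
        raise ValueError("expected a six-byte date field")
    if field.isdigit():
        # Old CY format: six digit characters, always a 19yy date.
        return int(field)
    packed = field[:4]                     # new format: YP + 2 bytes of filler
    if packed[3] & 0x0F not in (0x0C, 0x0F):
        raise ValueError("unrecognized date format")
    value = 0
    for b in packed[:3]:                   # six digits from the first three bytes
        value = value * 100 + (b >> 4) * 10 + (b & 0x0F)
    return value * 10 + (packed[3] >> 4)   # seventh digit shares a byte with the sign


def compare_dates(a: bytes, b: bytes) -> int:
    """Return -1, 0, or 1 as a is earlier than, equal to, or later than b."""
    x, y = normalize(a), normalize(b)
    return (x > y) - (x < y)
```

The application program only asks "high, low, or equal"; it never needs to know which format either field is in. The detection is safe under these assumptions because the last byte of a packed field ends in a sign nibble, which a digit character never does.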

This approach has the advantage that even if source code is not available, your legacy code can be modified. See www.computerworld.com, August 11, 1997 issue "COBOL pioneer pitches year 2000 fix". The product described is Vertex 2000, from BMR Systems, Dallas, Texas. "Vertex 2000 examines a mainframe program's object code,… finds every possible date instance,… and patches the code to run a separate subroutine for handling the date… The subroutine uses extra bits in the date field to indicate the century".

This suggested approach does not completely eliminate testing, but it greatly reduces the scope of testing. The only side effects are invalid date computations, "garbage" on date output, or data exceptions from code that was not modified. The data exception may sound bad, but is actually the friendliest thing that can happen, as it immediately (from the dump address) identifies the location of the object code that needs to be modified. Garbage on date output should be fairly easy to spot.

The most difficult omission to detect would be invalid date computations. In my experience, date computations are rare; most date references are either to print or display a date, or to compare a date to another date for high or low. Of course, your industry may be different. For example, any industry that sets rates based on some period of time would have many date computations.

Tasks to be accomplished

Task	Source - least difficult to most difficult.
Identify date fields	Data dictionary; source code library; multiple source code libraries; object code library; multiple object code libraries.
Determine format conversions	YP: use leading "0" nibble for century. CY: convert to YP + filler. JL: convert to YP + filler. CA: convert yy (external) to 0yys (packed). JP: shift right into sign nibble and use leading "0" for century, or convert to binary. Dates that are part of a key present a special difficulty, depending on how the key is used and referenced. The worst example I can think of is a JP date that is part of a larger packed field. This is the one case where I would recommend a conversion that changes the record length and field length; I would do anything to avoid packed decimal keys! "Heartache by the number!" If you have a different format, let's discuss it.
Develop subroutines	Ideally the subroutines should "recognize" both old and new formats.
Convert the data	This will be necessary only if any sorts are done with the date field as part of the sort.
Convert the programs	See notes for identify date fields.
Run tests	Best case, no record or field length changes, year at same offset location (for sorts). Worst case, some of the above had to be changed.
Put into production	Make backups! Save all input if possible. Be vigilant! You will not find everything, no matter how thorough or complete you think you were!

For more information please contact: Steve Ryder Telephone: 512.917.0823