By default, numeric values in COBOL files are stored in display, or character, format. That is, the value is stored as a base-ten number, with each digit represented by the corresponding EBCDIC (or ASCII) character. For example, the value 1234 is stored in four bytes which contain "1", "2", "3", and "4" (F1, F2, F3, F4 hex in EBCDIC; 31, 32, 33, 34 hex in ASCII).
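As a quick illustration of display format, the following Python sketch encodes the digits of 1234 as characters. It assumes Python's built-in cp037 codec as a stand-in for EBCDIC (it is only one of several EBCDIC code pages), and the variable names are illustrative.

```python
# Display format: one character per decimal digit.
value = 1234
digits = str(value)

ebcdic = digits.encode("cp037")   # cp037: one common EBCDIC code page (an assumption here)
ascii_ = digits.encode("ascii")

print(ebcdic.hex().upper())  # F1F2F3F4
print(ascii_.hex().upper())  # 31323334
```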
But because computers perform computations with binary numbers, it is more efficient to store values in their native binary form than to store them in human readable base ten. If the number is stored in its native binary format it can be input from the file and used directly. If it's stored in a base ten format it needs to be converted to binary before performing calculations on it, then converted back to base ten for storage. Binary is faster -- typically about 8 times -- and usually requires less storage space.
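The round trip described above can be sketched in Python. This is only an illustration under stated assumptions: the nine-digit field and its contents are made up, ASCII is used instead of EBCDIC for readability, and the binary form is assumed to be a 4-byte big-endian integer.

```python
import struct

stored = b"000012345"             # nine display digits, as they might sit in a file

# Display data has to be converted to binary before arithmetic...
number = int(stored.decode("ascii"))
number += 1
# ...and converted back to digit characters for storage.
updated = str(number).zfill(len(stored)).encode("ascii")

# The same value held directly as a 4-byte big-endian binary integer.
binary = struct.pack(">i", number)

print(updated, len(updated))      # b'000012346' 9
print(binary.hex(), len(binary))  # 0000303a 4
```

Here the binary form takes 4 bytes where the display form takes 9, and it needs no digit-by-digit conversion before use.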
Some of the differences between platforms that affect how binary values are stored are:
The register size of the CPU is typically a multiple of 8 bits: 8, 16, 32, or 64 bits. The computer is more efficient when working in its native register size.
Some CPUs store the most-significant byte of the value first (big endian), while some store the least-significant byte of the value first (little endian).
The minimum unit of computation on some machines is 16 bits, so the smallest size of a binary value on many machines is 2 bytes. But others can use 1 byte. Sometimes this is a compiler option.
Most binary values are stored in either 2, 4, or 8 bytes. But some COBOL compilers permit one-byte increments: 1, 2, 3, 4, 5, 6, 7, or 8 bytes.
Many vendors use the IEEE floating point standards, but others, notably IBM, don't.
Most pure binary integers use 2's-complement, but each vendor is free to choose its own method for all types.
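Two of these differences, byte order and 2's-complement signs, are easy to see with Python's standard struct module. This is a minimal sketch assuming a 2-byte signed integer; the values are arbitrary.

```python
import struct

big = struct.pack(">h", 1234)        # most-significant byte first
little = struct.pack("<h", 1234)     # least-significant byte first
negative = struct.pack(">h", -1234)  # 2's-complement, big endian

print(big.hex())       # 04d2
print(little.hex())    # d204
print(negative.hex())  # fb2e

# Reading big-endian bytes with the wrong byte order yields a different number.
misread = struct.unpack("<h", big)[0]
print(misread)         # -11772
```

The same two bytes produce 1234 or -11772 depending on the byte order assumed, which is why the source platform has to be known before binary fields can be converted.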
The following common COBOL data types are discussed below:
For example, the field definition
05 BALANCE-DUE PIC S9(6)V99 USAGE IS COMPUTATIONAL-3.
says to store the field in the computational-3 format. The "USAGE IS" part is optional and generally left off, and "COMPUTATIONAL" can be abbreviated "COMP", so you will more commonly see this written:
05 BALANCE-DUE PIC S9(6)V99 COMP-3.
The number of bits, bytes, or words that are stored for any given field usually depends on the number of digits given in the COBOL PIC. For binary numbers, 8 bits, or 1 byte, will store unsigned values from 0 to 255 or signed values from -128 to +127. This is enough to store values up to two digits (99), but not up to three digits (999). So a PIC 9 or PIC 99 would require 1 byte, but a PIC 999 would require 2 bytes.
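The arithmetic behind these sizes can be sketched as a small Python function. The name min_binary_bytes is hypothetical, and the result is only the arithmetic minimum; a real compiler may round the size up.

```python
def min_binary_bytes(digits, signed=False):
    """Smallest whole number of bytes whose range covers `digits` decimal digits."""
    largest = 10 ** digits - 1            # e.g. 999 for PIC 999
    size = 1
    while True:
        if signed:
            limit = 2 ** (8 * size - 1) - 1   # e.g. +127 for 1 byte
        else:
            limit = 2 ** (8 * size) - 1       # e.g. 255 for 1 byte
        if largest <= limit:
            return size
        size += 1

for n in (1, 2, 3, 4, 5, 9, 10, 18):
    print(n, min_binary_bytes(n), min_binary_bytes(n, signed=True))
```

For two digits the function returns 1 byte and for three digits it returns 2 bytes, matching the PIC 99 and PIC 999 cases above.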
In addition, most compilers have some minimum requirements for comp storage. For example, the smallest unit of storage may be 2 bytes, so even if you specify PIC 9 (only 1 digit), the compiler will reserve two bytes. Also see Synchronization and Alignment below.
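One common allocation rule, used by IBM mainframe COBOL among others, rounds binary fields up to 2, 4, or 8 bytes. The Python sketch below assumes that rule; the function name is hypothetical, and other compilers or compiler options may allocate differently.

```python
def comp_bytes_rounded(digits):
    """Bytes reserved for a binary field under an assumed 2/4/8-byte rule."""
    if not 1 <= digits <= 18:
        raise ValueError("expected 1 to 18 digits")
    if digits <= 4:
        return 2      # halfword, even for PIC 9
    if digits <= 9:
        return 4      # fullword
    return 8          # doubleword

print(comp_bytes_rounded(1))   # 2
print(comp_bytes_rounded(5))   # 4
print(comp_bytes_rounded(10))  # 8
```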
Floating point numbers, however, follow standard binary formats and as such their sizes are not determined by a PIC, and no PIC is used in the field definition.
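To show what a fixed-size floating point format looks like, this Python sketch packs the value 1.0 as IEEE 754 single and double precision using the standard struct module. It covers only the IEEE layout; IBM's hexadecimal floating point uses a different bit layout that struct does not produce.

```python
import struct

single = struct.pack(">f", 1.0)   # IEEE 754 single precision, big endian
double = struct.pack(">d", 1.0)   # IEEE 754 double precision, big endian

print(len(single), single.hex())  # 4 3f800000
print(len(double), double.hex())  # 8 3ff0000000000000
```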
Comp-3 stores two digits per byte, in BCD (binary coded decimal) form.
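A minimal Python sketch of packed decimal follows. It assumes the common convention in which the last half-byte holds the sign (hex C for positive, D for negative) and the implied decimal point from the PIC is not stored. The function names are hypothetical.

```python
def pack_comp3(value, total_digits):
    """Encode an integer as packed decimal: two BCD digits per byte, sign in the last nibble."""
    sign = 0x0D if value < 0 else 0x0C        # assumed convention: C = +, D = -
    digits = str(abs(value)).zfill(total_digits)
    if len(digits) > total_digits:
        raise ValueError("value does not fit in the field")
    nibbles = [int(d) for d in digits] + [sign]
    if len(nibbles) % 2:                      # pad on the left to fill whole bytes
        nibbles.insert(0, 0)
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

def unpack_comp3(data):
    """Decode packed decimal bytes back to an integer (scaling is left to the caller)."""
    nibbles = []
    for byte in data:
        nibbles += [byte >> 4, byte & 0x0F]
    sign = nibbles.pop()
    number = int("".join(str(n) for n in nibbles))
    return -number if sign == 0x0D else number

# BALANCE-DUE PIC S9(6)V99 holding 1234.56: eight digits plus a sign nibble, 5 bytes.
packed = pack_comp3(123456, 8)
print(packed.hex())                # 000123456c
print(unpack_comp3(packed) / 100)  # 1234.56
```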
If you specify a comp field in the middle of a record, and it doesn't happen to begin on a 32-bit (4-byte) boundary, the compiler may "align" it to a 32-bit boundary to "synchronize" it, leaving unused slack bytes ahead of the field. When that happens, what's actually stored in the file is not the same as the PICs on the layout. This is not a very common problem, partly because binary and comp fields are not very common in files, but you should be aware of it.
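The effect of alignment on field offsets can be sketched in Python. The record layout below is hypothetical, and the sketch assumes every synchronized field is aligned to a 4-byte boundary; actual compilers may use a different boundary depending on the field size and options.

```python
def aligned_offset(offset, boundary=4):
    """Round `offset` up to the next multiple of `boundary`."""
    return (offset + boundary - 1) // boundary * boundary

# Hypothetical layout: (name, size in bytes, aligned?)
fields = [("CUST-NAME", 10, False), ("ITEM-COUNT", 4, True), ("STATUS-CODE", 1, False)]

offset = 0
layout = []
for name, size, sync in fields:
    start = aligned_offset(offset) if sync else offset
    layout.append((name, start, start - offset))  # name, start offset, slack bytes
    offset = start + size

print(layout)  # [('CUST-NAME', 0, 0), ('ITEM-COUNT', 12, 2), ('STATUS-CODE', 16, 0)]
print(offset)  # 17
```

The PICs add up to 15 bytes, but the record occupies 17 because two slack bytes precede the aligned field.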
For more articles on data conversion, see our TechTalk Index.
Our COBOL Conversion Services
DISC can convert most COBOL numeric data types, including all the IBM mainframe EBCDIC data types. Our library of conversion routines permits us to handle those difficult jobs that standard COBOL compilers can't convert. With over 24 years of experience with thousands of files, we have the knowledge to catch problems with the data before they cause you grief.
Disc Interchange Service
Company, Inc.
Media Conversion Specialists
15 Stony Brook Road
Westford, MA 01886
(978) 692-0050