This tutorial on how to read
a COBOL layout was written specifically for our customers who have had a conversion
performed at Disc Interchange and have received a COBOL layout with the data.
It is intended to give you enough information to read most
simple layouts. It does not cover all topics or everything you would
find in a complex layout, and it is intended to explain COBOL layouts only
so you can use your converted data, not so you can write COBOL programs.
This article begins here: Reading COBOL Layouts where you will also find a topic index.
The layout specifies at least the name of each field, its type, size, and position in the record. A layout may give a detailed description of the use of each field and the values found in it, but that information is often contained in the data dictionary. A COBOL layout usually pertains to a single disk or tape file, as opposed to a table within a database.
A COBOL layout consists of a line for each field or group. A COBOL field definition gives the level (discussed later), the field name, and a "picture", or PIC clause, which tells you the data type or data category of the field and its size. The three data types you are likely to see are:
"A" for an alphabetic field (letters and space only).
"9" for a numeric field (digits 0-9, but no letters).
"X" for any character, including numbers, punctuation, and binary codes.
A five-digit numeric field such as a ZIP code could be written:
05 ZIP-CODE PIC 99999.
This could also be written:
05 ZIP-CODE PIC 9(5).
Here the 9 means the field type is numeric, as in the first example, and the (5) says there are five digits. The 9(5) and the 99999 are identical field specifications. The parentheses are usually used when they make the definition shorter or clearer, as in 9(11) vs. 99999999999. The period at the end separates this field definition from the next one.
A character field such as last name could be written:
05 LAST-NAME PIC A(15).
This means it's a 15-character alphabetic field. But it's actually more common to see character fields specified as PIC X, like:
05 LAST-NAME PIC X(15).
PIC X allows any character, including numbers, punctuation, and binary codes.
Like the numeric example above, a PIC X field specification could be written as either multiple Xs or a count in parentheses, like these two identical field specifications:
05 LAST-NAME PIC X(15).
05 LAST-NAME PIC XXXXXXXXXXXXXXX.
Although not commonly seen in COBOL files, you can mix types in a field. For example,
05 ZIP-PLUS-9 PIC 99999X9999.
permits a dash (or anything) between ZIP and ZIP+4, like 01886-2001.
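As a rough illustration of what the picture symbols allow, the sketch below turns a simple PIC string into a regular expression and checks sample values against it. The function name `pic_to_regex` and the table `PIC_CLASSES` are invented for this example, and it handles only the symbols 9, A, and X with optional repeat counts; any other symbol is silently ignored.

```python
import re

# Character classes for the three picture symbols covered so far.
# "A" is taken to mean letters and space; "X" matches any single character.
PIC_CLASSES = {"9": "[0-9]", "A": "[A-Za-z ]", "X": "."}

def pic_to_regex(pic):
    """Build a regex for a simple PIC made of 9, A, and X with optional (n) counts."""
    parts = []
    for symbol, count in re.findall(r"([9AX])(?:\((\d+)\))?", pic):
        repeat = int(count) if count else 1
        parts.append(f"{PIC_CLASSES[symbol]}{{{repeat}}}")
    # DOTALL lets "X" match every character, including a newline byte.
    return re.compile("".join(parts), re.DOTALL)

zip_plus = pic_to_regex("99999X9999")
print(bool(zip_plus.fullmatch("01886-2001")))  # True
print(bool(zip_plus.fullmatch("01886 2001")))  # True: X accepts any character
print(bool(zip_plus.fullmatch("0188A-2001")))  # False: a letter where a digit is required
```

A check like this can be handy for spotting records that do not match the layout you were given.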
A decimal point in a PIC, like "PIC 999.99" separates the integer portion from the decimal portion. This is discussed in more detail later, along with implied decimal.
Let's practice one more, just to get the point across. The following are different ways of specifying the same thing:
05 AMOUNT PIC 999.99.
05 AMOUNT PIC 9(3).9(2).
05 AMOUNT PIC 9(3).99.
05 AMOUNT PIC 999.9(2).
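To see that these forms really are the same, here is a minimal sketch that expands the parenthesized counts into the long form. The helper name `expand_pic` is made up for this example. For simple pictures like the ones shown so far, the length of the expanded string is also the width of the field in bytes.

```python
import re

def expand_pic(pic):
    """Expand repeat counts such as 9(3) into the long form 999."""
    return re.sub(r"(.)\((\d+)\)", lambda m: m.group(1) * int(m.group(2)), pic)

forms = ["999.99", "9(3).9(2)", "9(3).99", "999.9(2)"]
print({expand_pic(f) for f in forms})  # {'999.99'}
print(len(expand_pic("X(15)")))        # 15
```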
Filler can also be used to
create a field, or place holder, that you will never need to refer to by
name, so you might find it contains actual data, not just blank space.
It's also common for a vendor to use fields for some internal purpose,
for example as a key field, but to mark those fields as FILLER when the data
is sent outside the company. So FILLER fields can contain anything, including
binary data. You should not expect them to be neatly filled with spaces.
1. A literal in a field causes that character to appear in that location. For example,
05 ZIP-PLUS-9 PIC 99999-9999.
specifies a field with five digits, a dash, and four more digits. The dash is not part of the variable data -- it is a literal character.
2. A decimal point in a numeric field does two things: it places an actual decimal point into the file, and determines the location of the decimal for calculations. The following field is six bytes wide and has a "real decimal" in the file:
05 AMOUNT PIC 999.99.
If you view a record containing the value 123.45 in this field, you will see "123.45".
3. A "V" in the PIC clause specifies the location of an implied decimal. This is discussed later, in the section on numeric fields. The following field is five bytes wide and has an "implied decimal" at the location of the V:
05 AMOUNT PIC 999V99.
If you view a record containing the value 123.45 in this field, you will see "12345".
4. A minus sign, "-", reserves a byte in the record for an actual sign, and puts a "-" in negative values, and a space in positive values.
5. Similarly, a "+" in the PIC puts a "-" in negative values and a "+" in positive values. See the section below on "signed fields" for the representation of PIC S9 fields.
6. A "P" in a PIC clause scales the value. This is seldom seen, so we will be brief, with two examples:
PIC 999PPP: The three 9s cause this field to be three bytes in size, and the three Ps scale it UP by 1000. If the field contains the digits 123, the actual value represented is 123,000.
PIC PPP999: This scales the value DOWN. If the field contains the digits 123, the actual value is 0.000123.
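The implied decimal and the P scaling described above both come down to shifting the decimal point of the stored digits. The sketch below shows one way to do that when reading converted data; the function names and parameters are invented for illustration, and it assumes the field has already been read as a string of digits with no sign.

```python
from decimal import Decimal

def implied_decimal(digits, decimals):
    """Value of a PIC 9...V9... field with `decimals` digits after the V."""
    return Decimal(digits) * Decimal(10) ** -decimals

def p_scaled(digits, p_count, p_on_right):
    """Value of a field whose PIC has `p_count` P symbols on the right or left."""
    if p_on_right:
        # Ps after the 9s: the stored digits are followed by implied zeros.
        return Decimal(digits) * Decimal(10) ** p_count
    # Ps before the 9s: implied zeros sit between the decimal point and the digits.
    return Decimal(digits) * Decimal(10) ** -(p_count + len(digits))

print(implied_decimal("12345", 2))  # 123.45
print(p_scaled("123", 3, True))     # 123000
print(p_scaled("123", 3, False))    # 0.000123
```

Decimal is used rather than float so that amounts such as 123.45 are kept exactly.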
COBOL layouts are divided into "areas", and there are many rules for what data may be found in which area, but one you should remember is that an asterisk, *, in column 7, the "indicator area", turns the entire line into a comment, which is ignored by the COBOL compiler. Even if that line contains a field specification, it will be ignored if there is an * in column 7.
There are variations on COBOL layouts that discard columns 1-6, shifting the entire layout left. And, some printed documentation may not show these columns. You can usually find your place in the layout from the 01 level, which normally starts in column 8. All other levels should start in column 12 or above.
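If you process a layout file with a script, the column 7 rule can be applied as in the sketch below. It assumes the standard format with columns 1-6 present; for a layout that has been shifted left, the index to test would differ. The record and field names in the sample lines are hypothetical.

```python
def is_comment(line):
    """True when column 7 (index 6) of a standard-format line holds an asterisk."""
    return len(line) > 6 and line[6] == "*"

layout = [
    "000100 01  CUSTOMER-RECORD.",
    "000200*    05  OLD-CODE     PIC X(4).",
    "000300     05  ZIP-CODE     PIC 9(5).",
]

# Drop comment lines, then discard columns 1-6 from the lines that remain.
active = [line[6:].strip() for line in layout if not is_comment(line)]
print(active)
# ['01  CUSTOMER-RECORD.', '05  ZIP-CODE     PIC 9(5).']
```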
A COBOL field definition does not need to be entirely on one line. A line-ending has no significance to the compiler; it's the period at the end that's the COBOL separator, not the carriage-return.
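Because the period, not the line ending, ends a definition, a script that reads a layout can join the lines first and then split on periods. The sketch below is one simple way to do that; `split_definitions` is a name invented here. It treats a period as a separator only when it is followed by white space or the end of the text, so the decimal point in PIC 999.99 is left alone. A period inside a quoted VALUE literal could still fool it.

```python
import re

def split_definitions(text):
    """Split layout text into definitions at each terminating period."""
    joined = " ".join(text.split())  # collapse line breaks and runs of spaces
    pieces = re.split(r"\.(?:\s+|$)", joined)
    return [p for p in pieces if p]

text = """05 AMOUNT
       PIC 999.99.
   05 LAST-NAME PIC X(15)."""
print(split_definitions(text))
# ['05 AMOUNT PIC 999.99', '05 LAST-NAME PIC X(15)']
```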
05 CUSTOMER-NAME.
   10 LAST-NAME PIC X(15).
   10 FIRST-NAME PIC X(8).
Notice that CUSTOMER-NAME does not have a PIC, since it's a group, not a field. Also notice that the two fields within the group are at a lower level in the hierarchy than the 05 group, which is shown by their higher level number, 10. Lower levels are normally indented further for clarity, but this is not required; the compiler doesn't care.
For the rest of this tutorial we will use levels 05, 10, and 15 to be consistent. Just remember these choices are arbitrary; we could have used 02, 03, and 04, or any other numbers between 02 and 49.
There can be many levels. Here is a brief example of a record with three levels:
01 MAILING-RECORD.
   05 COMPANY-NAME PIC X(30).
   05 CONTACTS.
      10 PRESIDENT.
         15 LAST-NAME PIC X(15).
         15 FIRST-NAME PIC X(8).
      10 VP-MARKETING.
         15 LAST-NAME PIC X(15).
         15 FIRST-NAME PIC X(8).
      10 ALTERNATE-CONTACT.
         15 TITLE PIC X(10).
         15 LAST-NAME PIC X(15).
         15 FIRST-NAME PIC X(8).
   05 ADDRESS PIC X(15).
   05 CITY PIC X(15).
   05 STATE PIC XX.
   05 ZIP PIC 9(5).
Most of the fields in this record (company, address, city, state, zip) are simple fields that need no comment. But there are some interesting things about the contact fields:
There is a group called CONTACTS at the 05 level. Within this group are three 10 level groups. The first one is PRESIDENT, and within this group are the LAST-NAME and FIRST-NAME fields for the president. So far this is similar to the previous example, with one more level. This group is 23 bytes (15 + 8).
Next we have a group to contain the name of the VP of Marketing. This group is also 23 bytes. Notice it uses the same field names, LAST-NAME and FIRST-NAME, as used in the president's group. Although this isn't commonly seen, it is permitted in COBOL. They are considered different fields because they are within different groups. In COBOL you distinguish them by referring to "LAST-NAME OF PRESIDENT" for the president's name, and "LAST-NAME OF VP-MARKETING" for the name of the VP of Marketing.
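Outside COBOL, the same qualified names can serve as keys when you slice a record into fields. The sketch below lists every field of MAILING-RECORD with its width and cuts a record at the matching offsets. It assumes the record has already been converted to plain text; the sample values are made up, and `FIELDS` and `parse_record` are names invented for this example.

```python
# (qualified name, width in bytes), in record order, taken from MAILING-RECORD
FIELDS = [
    ("COMPANY-NAME", 30),
    ("LAST-NAME OF PRESIDENT", 15),
    ("FIRST-NAME OF PRESIDENT", 8),
    ("LAST-NAME OF VP-MARKETING", 15),
    ("FIRST-NAME OF VP-MARKETING", 8),
    ("TITLE OF ALTERNATE-CONTACT", 10),
    ("LAST-NAME OF ALTERNATE-CONTACT", 15),
    ("FIRST-NAME OF ALTERNATE-CONTACT", 8),
    ("ADDRESS", 15),
    ("CITY", 15),
    ("STATE", 2),
    ("ZIP", 5),
]
RECORD_LENGTH = sum(width for _, width in FIELDS)  # 146 bytes

def parse_record(record):
    """Slice a fixed-width record into a dict keyed by qualified field name."""
    values, offset = {}, 0
    for name, width in FIELDS:
        values[name] = record[offset:offset + width].rstrip()
        offset += width
    return values

# Made-up sample record, padded with spaces to the widths in the layout.
sample = ("Acme Widgets".ljust(30) + "Smith".ljust(15) + "Ann".ljust(8)
          + "Jones".ljust(15) + "Bob".ljust(8) + "CEO".ljust(10)
          + "Lee".ljust(15) + "Kim".ljust(8) + "1 Main Street".ljust(15)
          + "Springfield".ljust(15) + "MA" + "01234")

fields = parse_record(sample)
print(fields["LAST-NAME OF PRESIDENT"])     # Smith
print(fields["LAST-NAME OF VP-MARKETING"])  # Jones
```

The two LAST-NAME fields come out as separate entries because each occupies its own byte range within the record.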
The last group in the CONTACTS
group is for "alternate contacts". This one contains a field called
TITLE which contains the title of the alternate contact (e.g., CEO). Like
the others, it contains LAST-NAME and FIRST-NAME fields. This group is
33 bytes (10 + 15 + 8).
You are likely to see the 88 level, though. The 88 level simply equates a value with a name. Here's a simple example:
05 SEX PIC X.
   88 MALE VALUE "M".
   88 FEMALE VALUE "F".
This equates the value "M" with "MALE", and the value "F" with "FEMALE" for the field SEX. (This allows your COBOL program to, for example, test IF MALE rather than having to say IF SEX IS EQUAL TO "M"). Since we are not teaching COBOL programming, this is incidental to us, but here's what is important to know about the 88 level:
88 ODD-NUMBERS VALUE 1, 3, 5, 7, 9.
88 PRE-SCHOOL VALUE 0 THROUGH 4.
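When you work with the converted data outside COBOL, the 88 levels are best read as documentation of the values a field can hold. A rough sketch of how the examples above might be expressed follows; the dictionary and function names are invented here, and THROUGH is treated as an inclusive range.

```python
# Condition names for the SEX field: each 88 level pairs a name with a value.
SEX_CONDITIONS = {"M": "MALE", "F": "FEMALE"}

def is_odd_number(value):
    """88 ODD-NUMBERS VALUE 1, 3, 5, 7, 9: a list of discrete values."""
    return value in (1, 3, 5, 7, 9)

def is_pre_school(value):
    """88 PRE-SCHOOL VALUE 0 THROUGH 4: an inclusive range."""
    return 0 <= value <= 4

print(SEX_CONDITIONS.get("M"))                               # MALE
print(is_odd_number(7), is_pre_school(4), is_pre_school(5))  # True True False
```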
Disc Interchange Service
Media Conversion Specialists
15 Stony Brook Road
Westford, MA 01886