Reading COBOL Layouts

This tutorial on how to read a COBOL layout was written specifically for our customers who have had a conversion performed at Disc Interchange and have received a COBOL layout with the data.  It is intended to give you enough information to read most simple layouts.  It does not cover all topics or everything you would find in a complex layout, and it is intended to explain COBOL layouts only so you can use your converted data, not so you can write COBOL programs. 

This article begins at Reading COBOL Layouts, where you will also find a topic index.

Part 1: COBOL Basics

We will look at some basic COBOL rules, then at some examples, expanding on the concepts as we go.  Newly introduced terms are listed in bold.

Contents of this section:

   Record Layouts
   Fields and the PIC Clause
   Special Formatting Characters
   Columns, Line Numbers, and Comments
   Levels and Groups
   COBOL's 66 and 88 Levels

Record Layouts

A record layout is a description of all the individual fields that comprise each record in the data file. COBOL layouts follow specific rules. Since we are not teaching you how to program in COBOL, we will only discuss the rules you need to know to read layouts.

The layout specifies at least the name of each field, its type, size, and position in the record.  A layout may give a detailed description of the use of each field and the values found in it, but that information is often contained in the data dictionary.  A COBOL layout usually pertains to a single disk or tape file, as opposed to a table within a database.

Fields and the PIC Clause

The lowest level data item in a COBOL layout is a field, also called an elementary item.  Several fields can be associated to form a group.  All the fields together form a record.

A COBOL layout consists of a line for each field or group.  A COBOL field definition gives the level (discussed later), the field name, and a "picture", or PIC clause, which tells you the data type or data category of the field, and its size.  The three data types you are likely to see are:

  1. "A" for alpha (A-Z, a-z, and space only).
  2. "9" for a numeric field (numbers 0-9, but no letters).
  3. "X" for any character (including binary).
For example, the following field (elementary item) is called ZIP-CODE and is 5 digits wide, as specified by the five 9s; that is, the "picture" of the field is "99999".
  05  ZIP-CODE      PIC 99999.
This could also be written:
  05  ZIP-CODE      PIC 9(5).
Here the 9 means the field type is numeric, as in the first example, and the (5) says there are five digits.  The 9(5) and the 99999 are identical field specifications.  The parentheses are usually used when they make the definition shorter or clearer, as in 9(11) vs. 99999999999.  The period at the end separates this field definition from the next one.
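
If you process layouts with a script, it can help to expand the parenthesized counts so that every PIC is in the long form.  The following is a minimal Python sketch of that idea; the function name expand_pic is our own, not part of COBOL or any library.

```python
import re

def expand_pic(pic: str) -> str:
    """Expand repeat counts such as 9(5) into the long form 99999."""
    # Each "symbol(count)" is replaced by the symbol repeated count times.
    return re.sub(r"(.)\((\d+)\)", lambda m: m.group(1) * int(m.group(2)), pic)

print(expand_pic("9(5)"))    # 99999
print(expand_pic("9(11)"))   # 99999999999
print(expand_pic("99999"))   # already in long form, returned unchanged
```

The length of the expanded string gives the number of character positions for simple pictures like these.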

A character field such as last name could be written:

   05  LAST-NAME     PIC A(15).
Meaning it's a 15 character alphabetic field.  But it's actually more common to see character fields specified as PIC X, like:
   05  LAST-NAME     PIC X(15).
PIC X allows any character, including numbers, punctuation, and binary codes.

Like the numeric example above, a PIC X field specification could be written as either multiple Xs or a count in parentheses, like these two identical field specifications:

   05  LAST-NAME     PIC XXXXXXXXXXXXXXX.
   05  LAST-NAME     PIC X(15).

Although not commonly seen in COBOL files, you can mix types in a field.  For example,

   05  ZIP-PLUS-9      PIC 99999X9999.
permits a dash (or any other character) between the five-digit ZIP and the four-digit extension, as in 01886-2001.
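
One way to check converted data against a picture like this is to turn the PIC into a regular expression.  The sketch below is an illustration only: the names SYMBOL_CLASSES and pic_to_regex are ours, it handles just the A, 9, and X symbols described above, and it assumes any other character should match itself literally, which is a simplification.

```python
import re

# Character classes for the three picture symbols described above.
SYMBOL_CLASSES = {
    "A": "[A-Za-z ]",   # letters and space only
    "9": "[0-9]",       # digits only
    "X": ".",           # any character
}

def pic_to_regex(pic: str) -> re.Pattern:
    """Build a regular expression that accepts values fitting a simple PIC."""
    expanded = re.sub(r"(.)\((\d+)\)", lambda m: m.group(1) * int(m.group(2)), pic)
    # Assumption: symbols other than A, 9, and X are matched literally.
    parts = [SYMBOL_CLASSES.get(ch, re.escape(ch)) for ch in expanded]
    return re.compile("".join(parts), re.DOTALL)

zip_plus_4 = pic_to_regex("99999X9999")
print(bool(zip_plus_4.fullmatch("01886-2001")))   # True
print(bool(zip_plus_4.fullmatch("0188A-2001")))   # False: fifth position is not a digit
```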

A decimal point in a PIC, like "PIC 999.99", separates the integer portion from the decimal portion.  This is discussed in more detail later, along with implied decimal.

Let's practice one more, just to get the point across.  The following are different ways of specifying the same thing:

   05  AMOUNT          PIC 999.99.
   05  AMOUNT          PIC 9(3).9(2).
   05  AMOUNT          PIC 9(3).99.
   05  AMOUNT          PIC 999.9(2).


There is a special type of COBOL field called FILLER.  This reserves space in a COBOL record, commonly for future expansion or to fill a gap created by a redefined field.  FILLER is a reserved word, and you can have as many FILLER fields in a record as you want -- the name does not have to be unique as field names generally must be.

Filler can also be used to create a field, or place holder, that you will never need to refer to by name, so you might find it contains actual data, not just blank space.

It's also common for a vendor to use fields for some internal purpose, for example as a key field, but to mark those fields as FILLER when the data is sent outside the company. So FILLER fields can contain anything, including binary data. You should not expect them to be neatly filled with spaces.
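
When you split a record into fields yourself, the practical consequences are that FILLER still occupies space and that its name cannot serve as a unique key.  The following Python sketch shows one way to handle this; the four-field layout, the sample record, and the function name split_record are hypothetical and are not taken from any real file.

```python
# Hypothetical layout: (field name, width in bytes), in record order.
FIELDS = [
    ("CUST-ID", 4),
    ("FILLER", 2),
    ("STATE", 2),
    ("FILLER", 3),
]

def split_record(record: str, fields=FIELDS) -> dict:
    """Slice a fixed-width record into named fields, skipping FILLER."""
    values = {}
    offset = 0
    for name, width in fields:
        if name != "FILLER":
            values[name] = record[offset:offset + width]
        offset += width   # FILLER is skipped, but it still takes up space
    return values

print(split_record("0042??MA###"))
```

Because the FILLER contents are simply skipped, it does not matter whether they hold spaces, old key values, or binary data.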

Special Formatting Characters

There are a number of special characters that cause specific actions on the data, such as leading zeros or spaces, floating signs, leading or trailing signs, decimal points, etc.  We will mention only a few common ones:

1. A literal in a field causes that character to appear in that location. For example,

   05  ZIP-PLUS-9     PIC 99999-9999.
specifies a field with five digits, a dash, and four more digits.  The dash is not part of the variable data -- it is a literal character.

2. A decimal point in a numeric field does two things: it places an actual decimal point into the file, and determines the location of the decimal for calculations.  The following field is six bytes wide and has a "real decimal" in the file:

   05  AMOUNT      PIC 999.99.
If you view a record containing the value 123.45 in this field, you will see "123.45"

3. A "V" in the PIC clause specifies the location of an implied decimal.  This is discussed later, in the section on numeric fields.  The following field is five bytes wide and has an "implied decimal" at the location of the V:

   05  AMOUNT      PIC 999V99.
If you view a record containing the value 123.45 in this field, you will see "12345" .

4. A minus sign, "-", reserves a byte in the record for an actual sign, and puts a "-" in negative values, and a space in positive values.

5. Similarly, a "+" in the PIC puts a "-" in negative values and a "+" in positive values.  See the section below on "signed fields" for the representation of PIC S9 fields.

6. A "P" in a PIC clause scales the value.  This is seldom seen, so we will be brief, via two examples:
     PIC 999PPP.
The three 9s cause this field to be three bytes in size, and the three Ps scale it UP by 1000. If the field contains the digits 123, the actual value represented is 123,000. 
     PIC PPP999.
This scales the value DOWN.  If the field contains the digits 123, the actual value is 0.000123.
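
The decimal point, V, and P rules above can be summarized in a short Python sketch.  This is an illustration under simplifying assumptions, not a complete PIC interpreter: it handles unsigned pictures made only of 9s plus a real decimal point, a V, or Ps, and the names expand_pic, field_width, and decode_number are our own.

```python
import re
from decimal import Decimal

def expand_pic(pic: str) -> str:
    """Expand counts such as 9(3) into 999."""
    return re.sub(r"(.)\((\d+)\)", lambda m: m.group(1) * int(m.group(2)), pic.upper())

def field_width(pic: str) -> int:
    """Bytes used in the record: V and P take no space, a real '.' takes one."""
    return len(expand_pic(pic).replace("V", "").replace("P", ""))

def decode_number(text: str, pic: str) -> Decimal:
    """Return the value represented by the characters found in the record."""
    expanded = expand_pic(pic)
    value = Decimal(text)
    if "V" in expanded:
        # Implied decimal: shift by the number of 9s after the V.
        return value.scaleb(-expanded.split("V", 1)[1].count("9"))
    trailing = len(expanded) - len(expanded.rstrip("P"))
    leading = len(expanded) - len(expanded.lstrip("P"))
    if trailing:
        return value.scaleb(trailing)                       # scale up
    if leading:
        return value.scaleb(-(leading + expanded.count("9")))  # scale down
    return value   # a real decimal point, if any, is already in the data

print(field_width("999.99"), decode_number("123.45", "999.99"))
print(field_width("999V99"), decode_number("12345", "999V99"))
print(field_width("999PPP"), format(decode_number("123", "999PPP"), "f"))
print(field_width("PPP999"), format(decode_number("123", "PPP999"), "f"))
```

Decimal is used rather than float so that amounts such as 123.45 are represented exactly.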

Columns, Line Numbers, and Comments

Columns 1-6 in most COBOL layouts are ignored by the compiler, as is everything after column 72.  You will often find line numbers or other comments (such as when a field was added or changed, or where it originated) in these columns.  These may be useful to you in finding your way around a large layout; just be aware they are ignored by the compiler.

COBOL layouts are divided into "areas", and there are many rules for what data may be found in which area.  The one to remember is that an asterisk, *, in column 7 (the "indicator area") turns the entire line into a comment, which is ignored by the COBOL compiler.  Even if that line contains a field specification, it will be ignored if there is an * in column 7.
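
If you want to clean up a layout file before reading it, the column rules can be applied mechanically.  The Python sketch below assumes the layout is in the standard fixed format with columns 1-6 present; the function name strip_fixed_format and the sample lines (including the record name CUSTOMER-RECORD) are hypothetical.

```python
def strip_fixed_format(lines):
    """Yield only the code portion (columns 8-72) of each non-comment line."""
    for line in lines:
        line = line.rstrip("\n")
        if len(line) > 6 and line[6] == "*":
            continue            # asterisk in column 7: the whole line is a comment
        code = line[7:72]       # drop columns 1-7 and everything after column 72
        if code.strip():
            yield code.rstrip()

sample = [
    "000100 01  CUSTOMER-RECORD.".ljust(72) + "CHG0001",   # note after column 72
    "000200*    05  OLD-FIELD        PIC X(4).",           # commented-out field
    "000300     05  ZIP-CODE         PIC 9(5).",
]

for code in strip_fixed_format(sample):
    print(code)
```

Only the 01 line and the ZIP-CODE line survive; the commented-out field and the note beyond column 72 are dropped.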

There are variations on COBOL layouts that discard columns 1-6, shifting the entire layout left.  And, some printed documentation may not show these columns.  You can usually find your place in the layout from the 01 level, which normally starts in column 8.  All other levels should start in column 12 or above.

A COBOL field definition does not need to be entirely on one line.  A line-ending has no significance to the compiler; it's the period at the end that's the COBOL separator, not the carriage-return.
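
A script that reads a layout should therefore split on the terminating period, not on line ends.  Here is a minimal Python sketch; the function name split_definitions is ours, and it assumes a period ends a definition only when it is followed by whitespace or the end of the text, so that the decimal point inside "PIC 999.99" is left alone.  It would be fooled by a quoted literal that contains a period followed by a space.

```python
import re

def split_definitions(text: str):
    """Split layout text into one string per field or group definition."""
    # A period followed by whitespace (or end of text) ends a definition.
    pieces = re.split(r"\.(?=\s|$)", text)
    # Collapse line breaks and runs of spaces inside each definition.
    return [" ".join(piece.split()) for piece in pieces if piece.strip()]

layout = """
    05  AMOUNT
            PIC 999.99.
    05  ZIP-CODE      PIC 9(5).
"""

print(split_definitions(layout))
# ['05 AMOUNT PIC 999.99', '05 ZIP-CODE PIC 9(5)']
```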

Levels and Groups

COBOL layouts have levels, from level 01 to level 49.  These levels tell the COBOL compiler how to associate, or group, fields in the record.  Level 01 is a special case, and is reserved for the record level;  the 01 level is the name of the record. Levels from 02 to 49 are all "equal" (level 2 is no more significant than level 3), but there is a hierarchy to the structure.  Any field listed in a lower level (higher number) is subordinate to a field or group in a higher level (lower number).  For example, LAST-NAME and FIRST-NAME in the example below are part of, or belong to, the group CUSTOMER-NAME, as can be seen by the level numbers of 05 and 10.
   05  CUSTOMER-NAME.
       10  LAST-NAME           PIC X(15).
       10  FIRST-NAME          PIC X(8).
Notice that CUSTOMER-NAME does not have a PIC, since it's a group, not a field.  Also notice that the two fields within the group are at a lower level, level 10, than the 05 group.  Lower levels are normally indented further for clarity, but this is not required, and in fact the compiler doesn't care.

For the rest of this tutorial we will use levels 05, 10, and 15 to be consistent.  Just remember these choices are arbitrary; we could have used 02, 03, and 04, or any other numbers between 02 and 49.

There can be many levels.  Here is a brief example of a record with three levels:

       05  COMPANY-NAME            PIC X(30).
       05  CONTACTS.
           10  PRESIDENT.
               15  LAST-NAME       PIC X(15).
               15  FIRST-NAME      PIC X(8).
           10  VP-MARKETING.
               15  LAST-NAME       PIC X(15).
               15  FIRST-NAME      PIC X(8).
           10  ALTERNATE-CONTACT.
               15  TITLE           PIC X(10).
               15  LAST-NAME       PIC X(15).
               15  FIRST-NAME      PIC X(8).
       05  ADDRESS                 PIC X(15).
       05  CITY                    PIC X(15).
       05  STATE                   PIC XX.
       05  ZIP                     PIC 9(5).
Most of the fields in this record (company, address, city, state, zip) are simple fields that need no comment. But there are some interesting things about the contact fields:

There is a group called CONTACTS at the 05 level.  Within this group are three 10 level groups.  The first one is PRESIDENT, and within this group are the LAST-NAME and FIRST-NAME fields for the president.  So far this is similar to the previous example, with one more level.  This group is 23 bytes (15 + 8).

Next we have a group to contain the name of the VP of Marketing. This group is also 23 bytes.  Notice it uses the same field names, LAST-NAME and FIRST-NAME, as used in the president's group.  Although this isn't commonly seen, it is permitted in COBOL. They are considered different fields because they are within different groups.  In COBOL you distinguish them by referring to "LAST-NAME OF PRESIDENT" for the president's name, and "LAST-NAME OF VP-MARKETING" for the name of the VP of Marketing.

The last group in the CONTACTS group is for "alternate contacts".  This one contains a field called TITLE which contains the title of the alternate contact (e.g., CEO). Like the others, it contains LAST-NAME and FIRST-NAME fields. This group is 33 bytes.
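
The group sizes quoted above can be checked by walking the layout and adding each field's size to every group that encloses it.  The following Python sketch does that for the example record.  It is an illustration only: the names pic_width and field_table are ours, offsets are counted from 0 at the first field shown, each definition is assumed to sit on one line, and only simple PIC X and PIC 9 pictures are handled.

```python
import re

LAYOUT = """
05  COMPANY-NAME            PIC X(30).
05  CONTACTS.
    10  PRESIDENT.
        15  LAST-NAME       PIC X(15).
        15  FIRST-NAME      PIC X(8).
    10  VP-MARKETING.
        15  LAST-NAME       PIC X(15).
        15  FIRST-NAME      PIC X(8).
    10  ALTERNATE-CONTACT.
        15  TITLE           PIC X(10).
        15  LAST-NAME       PIC X(15).
        15  FIRST-NAME      PIC X(8).
05  ADDRESS                 PIC X(15).
05  CITY                    PIC X(15).
05  STATE                   PIC XX.
05  ZIP                     PIC 9(5).
"""

def pic_width(pic: str) -> int:
    """Width in bytes of a simple picture such as X(15) or 99999."""
    return len(re.sub(r"(.)\((\d+)\)", lambda m: m.group(1) * int(m.group(2)), pic))

def field_table(layout: str):
    """Return [qualified name, offset, size] for every field and group."""
    rows = []     # one entry per layout line, in order
    stack = []    # (level, name, row index) of the groups currently open
    offset = 0
    for line in layout.strip().splitlines():
        parts = line.strip().rstrip(".").split()
        level, name = int(parts[0]), parts[1]
        # A new item closes every open group at the same or a deeper level.
        while stack and stack[-1][0] >= level:
            stack.pop()
        qualified = " OF ".join([name] + [n for _, n, _ in reversed(stack)])
        if len(parts) > 2:                    # has a PIC: elementary field
            size = pic_width(parts[3])
            rows.append([qualified, offset, size])
            for _, _, index in stack:
                rows[index][2] += size        # grow every enclosing group
            offset += size
        else:                                 # no PIC: a group
            rows.append([qualified, offset, 0])
            stack.append((level, name, len(rows) - 1))
    return rows

for name, offset, size in field_table(LAYOUT):
    print(f"{offset:4d} {size:4d}  {name}")
```

The qualified names in the output, such as "LAST-NAME OF PRESIDENT OF CONTACTS", follow the same idea as the COBOL "OF" qualification described above, which is how the repeated LAST-NAME fields stay distinct.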

COBOL's 66 and 88 Levels

These two levels have special meaning.  The 66 level assigns an alternate name to a field or group. It doesn't add a new field to the record, it just assigns an alternate name to an existing field.  You are not likely to see level 66.

You are likely to see the 88 level, though.  The 88 level simply equates a value with a name.  Here's a simple example:

   05  SEX                   PIC X.
       88  MALE     VALUE "M".
       88  FEMALE   VALUE "F".
This equates the value "M" with "MALE", and the value "F" with "FEMALE" for the field SEX.  (This allows your COBOL program to, for example, test IF MALE rather than having to say IF SEX IS EQUAL TO "M").  Since we are not teaching COBOL programming, this is incidental to us, but here's what is important to know about the 88 level:
  1. The 88 level does not define a field, and does not take space in the record; it is merely a value definition.
  2. The 88 level does not limit the possible codes to only those listed.  There could be other values used in that field;  M and F are not the only values you might find.  (Although a good layout will list them all.)   In this case there might be a "U" (unknown), or a blank.
  3. If the layout is complete, this is a handy list of the values you can expect to find in this field.  Sometimes it's all you have to go on.
88 levels may specify multiple values, or a range of values, such as:
       88  ODD-NUMBERS  VALUE 1, 3, 5, 7, 9.
       88  PRE-SCHOOL   VALUE 0 THROUGH 4.
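
When you load converted data into another system, 88 levels translate naturally into lookup tables and range tests.  The Python sketch below mirrors the examples above; the names SEX_CODES, condition_name, is_odd_number, and is_pre_school are ours.  Note that the lookup returns None for a code the layout does not list, since other values may well be present in the data.

```python
# 88 MALE VALUE "M".  88 FEMALE VALUE "F".
SEX_CODES = {"M": "MALE", "F": "FEMALE"}

def condition_name(code: str, names=SEX_CODES):
    """Return the 88-level name for a code, or None if the layout does not list it."""
    return names.get(code)

def is_odd_number(digit: int) -> bool:
    """88 ODD-NUMBERS VALUE 1, 3, 5, 7, 9."""
    return digit in (1, 3, 5, 7, 9)

def is_pre_school(age: int) -> bool:
    """88 PRE-SCHOOL VALUE 0 THROUGH 4 (both ends included)."""
    return 0 <= age <= 4

print(condition_name("M"))   # MALE
print(condition_name("U"))   # None: a value the layout does not list
print(is_pre_school(4))      # True
```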

  Next: Part 2 Simple COBOL Layouts

Additional Information

For more articles on data conversion, see our TechTalk Index.

Our COBOL Conversion Services

Disc Interchange Service Company's primary business is converting mainframe COBOL files.  From the simplest mailing list to the most complex financial data, we have the tools to properly convert and Q.C. your files efficiently and accurately.  With over 32 years of experience with thousands of files, we have the knowledge to catch problems with the data before they cause you grief.

We can read nearly all IBM mainframe tapes and convert the IBM EBCDIC files. DISC also has extensive support for VMS, UNIX, and PC tapes, and can convert most COBOL files from those systems.


Disc Interchange Service Company, Inc.
Media Conversion Specialists
15 Stony Brook Road
Westford, MA 01886

Copyright © 1997 - 2015 by Disc Interchange
All rights reserved. See our copyright page.