ACUCOBOL-GT includes a sorting utility called AcuSort. See
Debugger and Utilities in the
ACUCOBOL-GT User's Guide for details on this utility. There is also a runtime configuration variable that instructs the runtime to use the system's
Quicksort algorithm (if present) instead of the built-in algorithm specified by the SORT statement. See
USE_SYSTEM_QSORT in
Appendix H for more information.
Note: This manual entry includes code examples and highlights for first-time users following the
General Rules section. In the highlights list, item four discusses ways to improve SORT performance.
General Format
SORT sort-file
{ KEY AREA IS key-table }
{ ON {ASCENDING } KEY {key-name} } ...
{DESCENDING}
[ WITH DUPLICATES IN ORDER ]
[ COLLATING SEQUENCE IS alpha-name ]
{ INPUT PROCEDURE IS proc-name }
{ USING {in-file} ... }
{ OUTPUT PROCEDURE IS proc-name }
{ GIVING {out-file} ... }
Note that proc-name has the following format:
start-proc [ {THRU } end-proc ]
{THROUGH}
Syntax Rules
- sort-file names a sort file described by an SD entry in the Data Division.
- key-table must name a data item that is not located in the record for
sort-file.
key-table may not be subordinate to an OCCURS clause, nor may it be reference modified.
- key-table must reference a data item whose size is an even multiple of 7.
key-table is processed as if it had the following structure:
01 KEY-TABLE.
03 SORT-KEY OCCURS N TIMES.
05 KEY-ASCENDING PIC X COMP-X.
05 KEY-TYPE PIC X COMP-X.
05 KEY-OFFSET PIC XX COMP-X.
05 KEY-SIZE PIC XX COMP-X.
05 KEY-DIGITS PIC X COMP-X.
Typically, programs will declare
key-table with a similar format.
- key-name is a data item in the record description associated with
sort-file. It may not be subordinate to an OCCURS clause, nor may it be a group item containing variable occurrence data items. It
may not be reference modified. The maximum number of keys allowed is 23.
- alpha-name is an alphabet-name defined in the SPECIAL-NAMES paragraph of the Environment Division.
- in-file and
out-file are files described by FD entries in the Data Division. They may not be sort files. The maximum number of input and ouput
files is 25.
- start-proc and
end-proc are paragraph or section names in the Procedure Division.
- A SORT statement may not appear in Declaratives or in the input or output procedure of a SORT or MERGE statement.
- If
sort-file contains variable length records,
in-file records must not be smaller than the smallest record in
sort-file nor larger than the largest. If
sort-file contains fixed length records,
in-file records may not be larger than the size of
sort-file’s records.
- If
out-file contains variable length records, sort-file records must not be smaller than the smallest record in
out-file nor larger than the largest. If
out-file contains fixed length records,
sort-file records may not be larger than the size of
out-file's records.
- If
sort-file contains more than one record description,
key-name need appear in only one of them. The character positions referenced by
key-name are used as the key for all the file’s records.
- If
out-file is an indexed file, the first
key-name must be ASCENDING and must specify the same character positions in its record as the primary record key for
out-file.
- THRU is an abbreviation for THROUGH.
General Rules
- The SORT statement sorts records received from the INPUT PROCEDURE or found in the in-files. It then either makes these sorted
records available to the OUTPUT PROCEDURE or writes them to each
out-file.
- Sort records must be at least six bytes in size.
- If
sort-file contains fixed length records, any shorter
in-file records are space-filled on the right to match the record size.
- If
out-file contains fixed length records, any shorter
sort-file records are space-filled on the right to match the record size.
- The first
key-name is the major key, and the next
key-name is the next most significant key. This pattern continues for each
key-name specified.
- The ASCENDING phrase specifies that key values are to be ordered from lowest to highest. The DESCENDING phrase specifies the
reverse ordering. Once ASCENDING or DESCENDING is specified, it applies to each
key-name until another ASCENDING or DESCENDING adjective is encountered.
- Use the KEY AREA option when you do not know the specifics of the sort key until the program is run. You can use this to allow
users to enter sort key specifications, typically in conjunction with some form of data dictionary.
- Your program must fill in a table of information that describes the sort keys. This table,
key-table, should have the format described by Syntax Rule 3 above. The number of sort keys is determined by the number of occurrences
in the table. The keys are listed in order of precedence: table entry 1 describes the highest precedence key, table entry
2 the second highest, and so on. If you need to process a variable number of keys, use a variable-size table (by using OCCURS
DEPENDING ON).
- For each key, you must specify the following information:
KEY-ASCENDING:
|
This should be 0 or 1. Enter 1 to have an ascending sort sequence, 0 for descending.
|
KEY-TYPE:
|
Describes the underlying data format. The allowed values are listed in the next rule.
|
KEY-OFFSET:
|
Describes the distance (in standard character positions) from the beginning of the sort record to the beginning of the key
field. The first field in a sort record is at offset 0.
|
KEY-SIZE:
|
Describes the size of the key field in standard character positions.
|
KEY-DIGITS:
|
This is used only for numeric keys. It describes the number of digits contained in the key (counting digits on both sides
of the decimal point).
|
- The KEY-TYPE field uses a code to describe the type and internal storage format of the data item. Select from the following
values:
0
|
Numeric edited
|
1
|
Unsigned numeric (DISPLAY)
|
2
|
Signed numeric (DISPLAY, trailing separate)
|
3
|
Signed numeric (DISPLAY, trailing combined)
|
4
|
Signed numeric (DISPLAY, leading separate)
|
5
|
Signed numeric (DISPLAY, leading combined)
|
6
|
Signed COMP-2
|
7
|
Unsigned COMP-2
|
8
|
Unsigned COMP-3
|
9
|
Signed COMP-3
|
10
|
COMP-6
|
11
|
Signed binary (COMP-1, COMP-4, COMP-X)
|
12
|
Unsigned binary (COMP-1, COMP-4, COMP-X)
|
13
|
Signed native (COMP-5, COMP-N)
|
14
|
Unsigned native (COMP-5, COMP-N)
|
15
|
Floating point (FLOAT, DOUBLE)
|
16
|
Alphanumeric
|
17
|
Alphanumeric (justified)
|
18
|
Alphabetic
|
19
|
Alphabetic (justified)
|
20
|
Alphanumeric edited
|
21
|
Not used
|
22
|
Group
|
This coding is the same one used by the C interface, and is also used by Acu4GL to interface to relational DBMSs. When specifying
the key type, you may safely use "alphanumeric" for all nonnumeric keys. (The sort rules are the same for each of these types).
For numeric data, however, you must specify the correct type or you may get sorting errors.
- The results are undefined if you provide invalid data in the key-table. If you fail to specify any keys (by specifying a table
whose size is zero), you receive a file error on
sort-file. Under the default file status codes, this is file error 94 with a secondary status of 63.
- For nonnumeric keys, the COLLATING SEQUENCE phrase establishes the ordering. If this phrase is omitted, the NATIVE collating
sequence is used. For numeric keys, the ordering is specified by the algebraic value of the key.
- The DUPLICATES phrase affects the return order for records whose key-name values are equal.
- When there is a USING phrase, the return order is the same as the order of appearance of
in-file names in the SORT statement. Within a given
in-file, the order is that in which the records are accessed from that file.
- When there is an INPUT PROCEDURE, the return order is the same as the order in which records were released. If the DUPLICATES
phrase is not used, the return order for records with equal key values is unpredictable.
- The execution of a SORT statement consists of three distinct phases. These are:
- Records are made available to the
sort-file. This is achieved either by executing RELEASE statements in the input procedure or by implicit execution of READ statements
for each
in-file. When this phase starts, in-file must not be open. When it finishes,
in-file will not be open.
- The sort-file is sequenced according to the KEY phrase and the DUPLICATES clause. No processing of
in-files or
out-files takes place during this phase.
- The records in
sort-file are made available in sorted order. The sorted records are either written to the out-files or are made available to an output
routine through execution of a RETURN statement. When this phase starts, out-file must not be open. When it finishes,
out-file will be closed.
- If the INPUT PROCEDURE phrase is used, the named procedure is executed by the SORT statement according to the rules for the
PERFORM verb. This procedure must make records available to the input phase of the sort operation by executing RELEASE statements.
When this procedure returns, the sort operation proceeds to the sequencing phase. The range of the input procedure may not
cause the execution of a MERGE, RETURN, or SORT statement.
- If the USING phrase is specified, all records in each
in-file are transferred to
sort-file. For each in-file, the following actions occur:
- The file is opened as if it were the object of an OPEN INPUT statement with no options.
- The records are obtained and released to the sort operation. Each record is obtained as if a READ statement with the NEXT
and AT END phrases had been executed. For relative files, the RELATIVE KEY data item is undefined at the end of this phase.
- The file is closed as if it were the object of a CLOSE statement with no options. This occurs prior to the sequencing of
sort-file.
These implicit functions are performed such that any associated USE procedures are executed. These USE procedures must not
access
in-file or its record area.
- If an output procedure is specified, control passes to it after the sort-file has been sequenced. Control passes to the output
procedure according to the rules of the PERFORM statement. The output procedure must execute RETURN statements to retrieve
the sorted records. When the output procedure returns, the SORT statement terminates and control passes to the next executable
statement. The range of the output procedure must not execute any MERGE, RELEASE, or SORT statements.
- If the GIVING phrase is used, all the sorted records are written to each
out-file. For each of these files, the following steps occur:
- out-file is opened as if it were the object of an OPEN OUTPUT statement with no options.
- The sorted records are returned and written to the file. The records are written as if a WRITE statement without any options
had been executed. For a relative file, the value of the RELATIVE KEY data item is updated to reflect the record number written.
- The file is closed as if it were the object of a CLOSE statement without any options.
These implicit functions are performed such that any associated USE procedures are executed. Such a USE procedure may not
refer to out-file or its record area. On the first attempt to write beyond the externally defined boundaries of the file,
any applicable USE procedure is executed. If control is returned from that USE procedure, or no USE procedure is applicable,
the processing of that
out-file is terminated.
- If the SORT statement is in a fixed segment, the range of any input and output procedures must be contained completely in
the fixed segments and no more than one independent segment. If the MERGE statement is in an independent segment, the range
must be completely contained in the fixed segments and the same independent segment.
- The SORT statement updates the value of the
sort-file's FILE STATUS data item.
- Only one SORT may be active at a time. See also
CANCEL SORT.
- If a SORT statement is executed in a wrong context, the runtime displays the error
Illegal SORT. This error belongs to the class of
intermediate runtime errors that, upon occurrence, call installed error procedures. See
CBL_ERROR_PROC for details.
- For compatibility with other COBOLs, ACUCOBOL-GT includes special registers known as SORT-RETURN and SORT-MESSAGE. SORT-RETURN
can be used for two purposes.
- To determine the status of a SORT that's just finished. You can determine the success or failure of a SORT by examining this
variable after the SORT returns. A value of
0 indicates success, and a non-zero value indicates failure.
- To interrupt a SORT that is currently running. By setting this variable in an input or output procedure, you stop SORT processing
immediately after the next RELEASE or RETURN statement is performed. By setting this variable in a DECLARATIVES paragraph
(if you are not using input or output procedures), you stop SORT processing immediately after the next implicit RELEASE or
RETURN is performed.
The special register SORT-RETURN is of type SIGNED-INT. Please note that this register is primarily for compatibility purposes,
and there are better ways to perform these functions in ACUCOBOL-GT. For instance, to get status on a SORT, use the FILE STATUS
variable of the SORT file. This gives more information than just success or failure. And if you are using input procedures,
you can halt a SORT more simply by returning from the procedure as if you had reached the end of file.
When compiling for IBM compatibility (-Cv), the SORT-MESSAGE behaves just like it is declared in every program, for example:
01 SORT-MESSAGE PIC X(8) EXTERNAL.
This variable is used in mainframe environments to help control the SORT operation. In ACUCOBOL-GT, the variable has no particular
effect.
- The SORT statement can be used to sort elements of a working-storage table. The syntax is:
SORT data-name-2 [ ON ASCENDING/DESCENDING KEY data-name-1 ... ]
[ WITH DUPLICATES IN ORDER ]
[ COLLATING SEQUENCE IS alphabet-name ]
Code Examples
SORT PRODUCT-SORT-FILE |temporary SD file
ON ASCENDING KEY MODEL-TYPE, |major sort key
MODEL-NUMBER |minor sort key
USING ATHLETIC-SHOES-LIST, |input data file
DRESS-SHOES-LIST |input data file
GIVING PRODUCT-LIST. |permanent output data file
(An extended version of this example appears after the
Highlights for First-Time Users section.):
SORT PRODUCT-SORT-FILE |temporary SD file
*duplicates sorted in the order acquired
ON ASCENDING KEY MODEL-TYPE |major sort key
ON DESCENDING KEY MODEL-NUMBER |minor sort key
WITH DUPLICATES IN ORDER
INPUT PROCEDURE IS WEED-PRODUCT-LIST
OUTPUT PROCEDURE IS UPDATE-PRODUCT-LIST.
Highlights for first-time users
- SORT is used to order records according to a set of key fields (sort keys). Records may be stored in sequential, relative,
or indexed files, or records may be acquired by use of an INPUT PROCEDURE. Once ordered, the record set may be further processed
by use of an OUTPUT PROCEDURE, or the records may be written directly to the named output file(s).
- SORT is most often used to order records stored in disk files. However, by using an INPUT PROCEDURE you can acquire records
from other input sources such as output from batch processes, internal application data structures, or screen input.
- SORT creates a special temporary disk file (the sort file) as a work space for collecting, sorting, and holding ordered records.
The sort file is defined by an SD entry in the DATA DIVISION. The sort file record definition must immediately follow the
SD entry and must include definitions for each sort key used, except when the KEY AREA phrase is used. You can place temporary
files used by the SORT verb in a specified directory. The sort file is removed when the SORT statement completes. See
SORT_DIR for more information.
-
Runtime performance: Most SORT procedures involve the reading, sorting, and writing of records stored in disk files. These disk I/O processes
can be relatively slow and, therefore, the SORT process can take a lot of time. However, you can tune performance. To get
the best runtime performance, give the process as much memory as possible, without adversely affecting other processes running
on the system. Ideally, you should allocate enough memory, using the
SORT_MEMORY configuration variable, so that all records fit into the allocated memory; this way, the records are sorted and moved to
the output phase without requiring the use of temporary files. Determining the optimal value depends on the number and size
of the records being sorted, the amount of available memory, and the needs of other processes on the system. Some experimentation
may be necessary.
For larger SORT operations, you will more than likely not be able to allocate enough memory in which all records can fit,
and so the SORT operation will use temporary sort files. Use the
A_SORT_REGIONS,
A_SORT_REGIONS_FINAL and
A_SORT_FILE_MEMORY configuration variables to fine-tune the use of temporary files.
By default, the SORT routine uses a built-in sort function. Alternatively, if your system has a qsort() function, you can
specify its use by setting the runtime configuration variable called
USE_SYSTEM_QSORT to the value of
1. Some systems have qsort() functions that perform better than the built-in function. Consider experimenting with this variable
to determine if this option yields better performance on your system. Pay particular attention to the number of comparisons
done during the sort, which can be seen in the runtime trace output.
- The three basic steps of the SORT procedure are:
- Acquiring and placing the records to be sorted in the sort file:
When the USING phrase is used, SORT opens each named input file, reads the data records, one at a time, into the sort file
and closes the input file. Input files must not be open before the SORT statement begins.
When an INPUT PROCEDURE is used the RELEASE verb is used to pass records to the sort file. If records are acquired from disk
files, it is the responsibility of the input procedure to open, read, process, RELEASE each individual record, and close the
files. For more information, see the
RELEASE StatementRELEASE Statement .
- Sorting the records:
Using the sort file, and a set of temporary files, records are sorted according to the key phrase, the DUPLICATES clause,
and the COLLATING clause. See
SORT_DIR and
SORT_MEMORY for more information.
- Disposition of the sorted records:
When the GIVING phrase is used, the sorted records are written to the named permanent output file(s).
When an OUTPUT PROCEDURE is used, the sorted records are made available to the output procedure for processing and writing
to a permanent file(s). The output procedure uses the verb RETURN to acquire the ordered records from the sort file. It is
the responsibility of the output procedure to open, write, and close the output file(s). For more information see the description
of the
RETURN Statement.
- A SORT statement may not appear in a DECLARATIVES section or in an INPUT or OUTPUT PROCEDURE that is part of a SORT statement
(nesting of SORT statements is not permitted).
- The KEY AREA phrase is a means for defining the sort keys at runtime, as the application is running. When you use KEY AREA,
it is not required that the sort file record descriptor contain entries for potential sort keys. Definition of the sort key(s)
in the sort file is handled internally by the SORT routine using the key table. See syntax rules 2 and 3 and general rules
6 through 10.
- If the KEY AREA phrase is not used, the sort keys must be defined in the record description of the sort file.
- Use of INPUT PROCEDURE or OUTPUT PROCEDURE requires that all file I/O operations and record disposition be handled by the
input or output procedure. This means that the input and output procedures must explicitly perform the OPEN, READ, RELEASE
(input), RETURN (output), WRITE, and CLOSE actions. As with any I/O management, the procedure should consider and account
for the handling of all I/O related errors.
- Use the DUPLICATES phrase when you want duplicate records to be sequenced in the same order that they are read in or RELEASEd.
Duplicate records are those that have identical key values. In the absence of the DUPLICATES phrase sequencing of duplicate
records is not predictable (see General Rule 12).
- Use the COLLATING SEQUENCE phrase to alter the ordering of nonnumeric keys. The named collating sequence must be defined in
the SPECIAL-NAMES paragraph of the ENVIRONMENT DIVISION. In the SPECIAL-NAMES paragraph the user may define a unique character
order, or the user may select one of the four predefined character sequences: STANDARD-1, STANDARD-2, NATIVE, and EBCDIC.
See
Special-Names Paragraph for more information.
If no COLLATING SEQUENCE phrase is used, the default collating sequence is used. The default collating sequence is whatever
is native to the operating system (usually the same as the predefined type NATIVE).
- Use the STATUS variable to hold the execution status of the SORT operation. The status variable is named in the SELECT/ASSIGN
phrase of the FILE-CONTROL paragraph of the INPUT-OUTPUT SECTION. See
Special-Names Paragraph for more information.
For a complete list and description of file status codes, see
Appendix E, in
ACUCOBOL-GT Appendices.
- To specify the disk directory in which SORT will place any temporary files, set the
SORT_DIR runtime configuration variable, located in the runtime configuration file.
For simplicity, only one input file will be used.
IDENTIFICATION DIVISION.
PROGRAM-ID. SAMPLE-FILE-SORT.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
*SORT input file
SELECT ATHLETIC-SHOES-LIST
ASSIGN TO ....
*SORT output file
SELECT PRODUCT-LIST
ASSIGN TO ....
*sort file (SD)
SELECT PRODUCT-SORT-FILE
ASSIGN TO ....
DATA DIVISION.
FILE SECTION.
FD ATHLETIC-SHOES-LIST.
01 A-SHOE-RECORD PIC X(38).
FD PRODUCT-LIST.
01 B-SHOE-RECORD PIC X(38).
SD PRODUCT-SORT-FILE.
01 SORT-DATA.
05 MODEL-NAME PIC X(10).
05 MODEL-TYPE PIC X(3).
05 MODEL-NUMBER PIC X(3).
05 STOCK-NUMBER PIC X(7).
05 DESIGN-YEAR PIC 99.
05 UNIT-COST PIC 999V99.
05 UNIT-PRICE PIC 999V99.
05 FACTORY-NUM PIC 999.
WORKING-STORAGE SECTION.
01 FLAGS.
05 SHOE-LIST-EMPTY PIC X VALUE "N".
88 NO-MORE-SHOE-RECORDS VALUE "Y".
05 SORT-FILE-EMPTY PIC X VALUE "N".
88 NO-MORE-SORT-RECORDS VALUE "Y".
01 HONG-KONG-NUMBER PIC 99.
01 TAIWAN-NUMBER PIC 99.
...
PROCEDURE DIVISION.
PRODUCT-LIST-SORT.
*temporary SD file used by sort
SORT PRODUCT-SORT-FILE
*major sort key
ON ASCENDING KEY MODEL-TYPE
*minor sort key
ON DESCENDING KEY MODEL-NUMBER
*duplicates sorted in the order acquired
WITH DUPLICATES IN ORDER
INPUT PROCEDURE IS WEED-PRODUCT-LIST
OUTPUT PROCEDURE IS UPDATE-PRODUCT-LIST.
WEED-PRODUCT-LIST SECTION.
OPEN-LIST-FILE.
OPEN INPUT ATHLETIC-SHOES-LIST.
PERFORM WEED-LIST
UNTIL NO-MORE-SHOE-RECORDS.
CLOSE ATHLETIC-SHOES-LIST.
GO TO EXIT-WEED-PRODUCT-LIST.
WEED-LIST.
READ ATHLETIC-SHOES-LIST NEXT
AT END MOVE "Y" TO SHOE-LIST-EMPTY
NOT AT END
*stock numbers beginning with "X" are obsolete
*do not RELEASE
IF STOCK-NUMBER(1:1) = "X" THEN
NEXT SENTENCE
ELSE
*otherwise release the record to SORT
RELEASE SORT-DATA
END-IF.
EXIT-WEED-PRODUCT-LIST.
EXIT.
UPDATE-PRODUCT-LIST SECTION.
CREATE-PRODUCT-LIST.
OPEN OUTPUT PRODUCT-LIST.
PERFORM UPDATE-RECORD
UNTIL NO-MORE-SORT-RECORDS.
CLOSE PRODUCT-LIST.
GO TO EXIT-UPDATE-PRODUCT-LIST.
UPDATE-RECORD.
RETURN PRODUCT-SORT-FILE INTO SORT-DATA
AT END MOVE "Y" TO SORT-FILE-EMPTY
NOT AT END
IF FACTORY-NUM = HONG-KONG-NUMBER THEN
MOVE TAIWAN-NUMBER TO FACTORY-NUM
END-IF
WRITE B-SHOE-RECORD FROM SORT-DATA.
EXIT-UPDATE-PRODUCT-LIST.
EXIT.