The UNSTRING statement separates a data item into one or more receiving fields. Delimiters may be used to specify the ends
of fields. Substring values are assigned to unique destination data items.
Note: This manual entry includes code examples and highlights for first-time users following the
General Rules section.
General Format
UNSTRING source
[ DELIMITED BY [ALL] delim
[ OR [ALL] delim ] ... ]
INTO { dest [ DELIMITER IN delim-dest ]
[ COUNT IN counter ] } ...
[ WITH POINTER ptr-var ]
[ TALLYING IN tally-var ]
[ ON OVERFLOW statement-1 ]
[ NOT ON OVERFLOW statement-2 ]
[ END-UNSTRING ]
Syntax Rules
- source is an alphanumeric data item. source may be reference modified.
- dest is a USAGE DISPLAY data item. It may not be edited.
- delim is a nonnumeric literal or an alphanumeric data item. The ALL literal construct may not be used.
- The compiler allows source and delim to be numeric literals, in which case it treats them as string literals and displays the following warning at compile time:
Warning: Literal is numeric - treated as alphanumeric
In such cases, leading zeros are stripped from the numeric literal to form the string literal.
- delim-dest is an alphanumeric data item.
- counter, ptr-var, and tally-var are integer numeric data items.
- statement-1 and statement-2 are imperative statements.
- ptr-var must be large enough to contain a value one greater than the size of source.
- The DELIMITER IN and COUNT IN phrases can appear only if there is a DELIMITED BY phrase.
General Rules
- UNSTRING breaks up source into the various dest fields. source is the sending field and dest is the receiving field. Up to 50 dest items are allowed.
- counter receives the number of characters within source isolated by the delimiters for the move to dest. This count does not include the delimiter characters.
- ptr-var represents the relative character position within source to move from. The leftmost position is position 1. If no POINTER phrase is specified, examination begins with the leftmost character position.
- tally-var is a counter which is incremented by 1 for each dest item accessed during the UNSTRING operation.
- Neither ptr-var nor tally-var is initialized by the UNSTRING statement.
- Each delim represents one delimiter. When a delimiter contains two or more characters, all the characters must be present in contiguous positions in source to be recognized as a delimiter. When delim is a figurative constant, it stands for a single nonnumeric literal.
- When the ALL phrase is specified, one or more contiguous occurrences of delim in source are treated as if they were only one occurrence for the remaining General Rules. Only one occurrence of delim is moved to delim-dest in this case.
- When two or more delimiters are specified, an OR condition exists between them. Each delimiter is compared to the sending
field in the order written. If a match occurs, the characters in the sending field are considered to be a single delimiter.
No characters in source can be considered a part of more than one delimiter.
- When an examination encounters two contiguous delimiters, the current receiving area is space-filled if it is alphabetic or
alphanumeric, or zero-filled if it is numeric.
- When the UNSTRING statement initiates, the current receiving area is the first dest item. Data is transferred from source
to the receiving area according to the following rules:
- (a) Examination starts at the character position indicated by ptr-var, or the leftmost position if ptr-var is not specified.
- (b) If the DELIMITED BY phrase is specified, the examination proceeds left-to-right until a delimiter is encountered. If the DELIMITED BY phrase is not specified, the number of characters examined is equal to the size of the receiving area. The sign character of the receiving item (if any) is not included in the size. If the end of source is encountered before the delimiting condition is met, the examination stops with the last character of source.
- (c) The characters examined (excluding the delimiting characters, if any) are treated as an elementary alphanumeric item. These characters are moved to the current receiving field according to the rules for the MOVE statement, including space filling.
- (d) If the DELIMITER IN phrase is specified, the delimiting characters are moved to delim-dest as if they were the alphanumeric source of a MOVE statement. If the delimiting condition is the end of source, then delim-dest is space-filled.
- (e) If the COUNT IN phrase is specified, the number of characters examined (excluding the delimiter) is moved to counter as if the count were the numeric source of a MOVE statement.
- (f) If the DELIMITED BY phrase is specified, the source item is further examined beginning with the first character to the right of the delimiter found. If the DELIMITED BY phrase is not specified, the source item is further examined beginning with the character to the right of the last character examined.
- (g) The current receiving area is then set to the next dest item and the cycle specified in steps (b) through (g) is repeated until either all the characters in source are examined or there are no more dest items.
- The ptr-var (if any) is incremented by 1 for each character in source examined.
- An overflow condition occurs in either of the following situations:
- The value of ptr-var is less than one or greater than the size of source when the UNSTRING statement starts.
- During execution, all dest items have been acted upon and source contains unexamined characters.
- When the overflow condition exists, statement-1 (if any) executes and the UNSTRING statement terminates.
- If statement-2 is specified, it executes after the UNSTRING statement has finished if the overflow condition has not occurred.
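The rules for contiguous delimiters, the ALL phrase, and TALLYING can be sketched with a small program. This is a minimal illustration, not part of the original manual: the program name and all data items (SRC-TEXT, PART-1 through PART-3, FIELD-TALLY) are hypothetical, and the expected results in the comments follow from the General Rules above. The receiving fields are pre-filled with asterisks so that an untouched field is easy to recognize.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTRALL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical data items, used only for this illustration.
       01  SRC-TEXT     PIC X(6) VALUE "AB,,CD".
       01  PART-1       PIC X(4).
       01  PART-2       PIC X(4).
       01  PART-3       PIC X(4).
       01  FIELD-TALLY  PIC 9.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Without ALL, each comma is a delimiter in its own right, so
      * the empty field between them space-fills PART-2.
           MOVE ALL "*" TO PART-1 PART-2 PART-3.
           MOVE 0 TO FIELD-TALLY.
           UNSTRING SRC-TEXT
               DELIMITED BY ","
               INTO PART-1, PART-2, PART-3
               TALLYING IN FIELD-TALLY
           END-UNSTRING.
           DISPLAY "[" PART-1 "][" PART-2 "][" PART-3 "] "
               FIELD-TALLY.
      * Expected: [AB  ][    ][CD  ] 3
      * With ALL, the two commas are treated as one delimiter, so
      * PART-3 is never reached and keeps its asterisks.
           MOVE ALL "*" TO PART-1 PART-2 PART-3.
           MOVE 0 TO FIELD-TALLY.
           UNSTRING SRC-TEXT
               DELIMITED BY ALL ","
               INTO PART-1, PART-2, PART-3
               TALLYING IN FIELD-TALLY
           END-UNSTRING.
           DISPLAY "[" PART-1 "][" PART-2 "][" PART-3 "] "
               FIELD-TALLY.
      * Expected: [AB  ][CD  ][****] 2
           STOP RUN.
```

Note that FIELD-TALLY is set to zero before each statement, because UNSTRING does not initialize it.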
Code Examples
Use UNSTRING to decompose strings containing multiple data elements. For example, a string data item might contain a person's name, using commas to separate the name fields: last-name,first-name,middle-initial. Using UNSTRING and specifying "," (comma) as the delimiter, you could separate the name string into three data items, each containing an element of the full name.
Example 1
Assume the following data items:
01 CUSTOMER-NAME PIC X(40) VALUE ALL SPACES.
01 LAST-NAME PIC X(25) VALUE ALL SPACES.
01 FIRST-NAME PIC X(14) VALUE ALL SPACES.
01 MIDDLE-I PIC X VALUE ALL SPACES.
{ . . . }
PROCEDURE DIVISION.
{ . . . }
DISPLAY 'Enter name: LAST,FIRST,MIDDLE-INITIAL'.
DISPLAY 'Use a comma to separate each name entry'.
ACCEPT CUSTOMER-NAME.
{ . . . }
UNSTRING CUSTOMER-NAME
DELIMITED BY ","
INTO LAST-NAME, |characters to first comma
FIRST-NAME, |characters to second comma
MIDDLE-I |gets only the first character
|of the remaining string. No
|overflow is raised.
|See general rule 12.
ON OVERFLOW
DISPLAY 'OVERFLOW on UNSTRING'
END-UNSTRING.
For code examples 2 and 3 assume the following data items:
01 COLOR-LIST PIC X(22) VALUE "RED:BLUE/GREEN YELLOW".
01 COLOR-1 PIC X(6) VALUE ALL SPACES.
01 COLOR-2 PIC X(6) VALUE ALL SPACES.
01 COLOR-3 PIC X(6) VALUE ALL SPACES.
01 COLOR-4 PIC X(6) VALUE ALL SPACES.
01 DELIMIT-1 PIC X(3) VALUE ALL SPACES.
01 COUNT-1 PIC 9 VALUE 0.
Example 2
UNSTRING COLOR-LIST
DELIMITED BY ":" OR "/" OR ALL SPACE
*ALL SPACE treats contiguous spaces
*as one delimiter.
INTO COLOR-1,
COLOR-2,
COLOR-3,
COLOR-4
END-UNSTRING.
*COLOR-1 = "RED   "
*COLOR-2 = "BLUE  "
*COLOR-3 = "GREEN "
*COLOR-4 = "YELLOW"
Example 3
MOVE 0 TO COUNT-1.
UNSTRING COLOR-LIST
DELIMITED BY ":" OR "/" OR ALL SPACE
*DELIMIT-1 and COUNT-1 will hold only
*the values associated with COLOR-1.
INTO COLOR-1
DELIMITER IN DELIMIT-1
COUNT IN COUNT-1,
COLOR-2,
COLOR-3,
COLOR-4
ON OVERFLOW
DISPLAY "overflow: unstring colors"
NOT ON OVERFLOW
*do when UNSTRING succeeds.
PERFORM SORT-COLORS
END-UNSTRING.
*COLOR-1 = "RED   "
*COLOR-2 = "BLUE  "
*COLOR-3 = "GREEN "
*COLOR-4 = "YELLOW"
*DELIMIT-1 = ":  "
*COUNT-1 = 3 (the number of characters in RED)
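DELIMITER IN and COUNT IN apply only to the dest item they follow, so they may be repeated for each receiving field. The following sketch is an illustration under that reading of the General Format; the program name and every data item in it are hypothetical. It also shows the rule that delim-dest is space-filled when a field ends at the end of source rather than at a delimiter (DELIM-C starts out as "?" to make the change visible).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTRDLM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical data items, used only for this illustration.
       01  COLOR-TEXT  PIC X(14) VALUE "RED:BLUE/GREEN".
       01  COLOR-A     PIC X(6).
       01  COLOR-B     PIC X(6).
       01  COLOR-C     PIC X(6).
       01  DELIM-A     PIC X.
       01  DELIM-B     PIC X.
       01  DELIM-C     PIC X VALUE "?".
       01  COUNT-A     PIC 9.
       01  COUNT-B     PIC 9.
       01  COUNT-C     PIC 9.
       PROCEDURE DIVISION.
       MAIN-PARA.
           UNSTRING COLOR-TEXT
               DELIMITED BY ":" OR "/"
               INTO COLOR-A DELIMITER IN DELIM-A COUNT IN COUNT-A,
                    COLOR-B DELIMITER IN DELIM-B COUNT IN COUNT-B,
                    COLOR-C DELIMITER IN DELIM-C COUNT IN COUNT-C
           END-UNSTRING.
      * Expected: RED    [:] 3
           DISPLAY COLOR-A " [" DELIM-A "] " COUNT-A.
      * Expected: BLUE   [/] 4
           DISPLAY COLOR-B " [" DELIM-B "] " COUNT-B.
      * The third field ends at the end of the source, so DELIM-C
      * is space-filled. Expected: GREEN  [ ] 5
           DISPLAY COLOR-C " [" DELIM-C "] " COUNT-C.
           STOP RUN.
```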
Example 4
When the string does not contain delimiters between the data elements, but the size and position of each string data element
is known, the string can be deconstructed without a DELIMITED BY phrase.
Assume the following data items:
01 COLOR-LIST PIC X(7) VALUE "REDBLUE".
01 COLOR-1 PIC X(3) VALUE ALL SPACES.
01 COLOR-2 PIC X(4) VALUE ALL SPACES.
{ . . . }
PROCEDURE DIVISION.
{ . . . }
UNSTRING COLOR-LIST
INTO COLOR-1,
*first substring must be three characters.
COLOR-2
*second substring must be four characters.
END-UNSTRING.
*COLOR-1 = "RED"
*COLOR-2 = "BLUE"
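When DELIMITED BY is omitted, the sizes of the receiving fields must account for every character of the source, or the overflow condition is raised. The sketch below illustrates this with a source item that is two characters longer than its contents, and then uses reference modification to restrict the source. The program name and data items are hypothetical, and the expected messages follow from the overflow rule in the General Rules.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTROVF.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical data items, used only for this illustration.
      * PADDED-LIST is two characters longer than its contents.
       01  PADDED-LIST  PIC X(9) VALUE "REDBLUE".
       01  COLOR-A      PIC X(3).
       01  COLOR-B      PIC X(4).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Positions 8 and 9 are never examined, so overflow is raised.
      * Expected: whole item: overflow
           UNSTRING PADDED-LIST
               INTO COLOR-A, COLOR-B
               ON OVERFLOW
                   DISPLAY "whole item: overflow"
               NOT ON OVERFLOW
                   DISPLAY "whole item: no overflow"
           END-UNSTRING.
      * Reference modification limits the source to 7 characters.
      * Expected: first 7 chars: no overflow
           UNSTRING PADDED-LIST(1:7)
               INTO COLOR-A, COLOR-B
               ON OVERFLOW
                   DISPLAY "first 7 chars: overflow"
               NOT ON OVERFLOW
                   DISPLAY "first 7 chars: no overflow"
           END-UNSTRING.
      * Expected: RED BLUE
           DISPLAY COLOR-A " " COLOR-B.
           STOP RUN.
```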
Example 5
Use POINTER and a PERFORM loop to extract and process string elements.
Assume the following data items:
01 COLOR-LIST PIC X(21) VALUE "RED BLUE GREEN YELLOW".
01 COLOR-LIST-SIZE PIC 999.
01 COLOR-1 PIC X(6) VALUE SPACES.
01 STRING-PTR PIC 99.
01 FLAGS.
05 COLOR-STRING-EMPTY PIC X VALUE "N".
88 NO-MORE-COLORS VALUE "Y".
{ . . . }
PROCEDURE DIVISION.
{ . . . }
*string pointer must be initialized
MOVE 1 TO STRING-PTR.
SET COLOR-LIST-SIZE TO SIZE OF COLOR-LIST.
PERFORM PROCESS-COLOR UNTIL NO-MORE-COLORS.
{ . . . }
PROCESS-COLOR.
UNSTRING COLOR-LIST
DELIMITED BY ALL SPACE
INTO COLOR-1
POINTER STRING-PTR
ON OVERFLOW
*An OVERFLOW condition will be raised every time
*through the loop, except when extracting the last
*substring. When the overflow is the result of
*having unexamined characters at the end of the
*input string, take no action. When the overflow
*is due to the pointer value exceeding the length
*of the string, set COLOR-STRING-EMPTY.
IF STRING-PTR > COLOR-LIST-SIZE THEN
MOVE "Y" TO COLOR-STRING-EMPTY
END-IF
*process the value
PERFORM STORE-COLOR-1
*initialize COLOR-1 before fetching the next color
MOVE SPACES TO COLOR-1
END-UNSTRING.
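An alternative way to structure such a loop is to test the pointer in the PERFORM condition and omit the ON OVERFLOW phrase altogether. The sketch below is one possible arrangement, not the manual's method: ONE-COLOR, COLOR-TOTAL, and the program name are hypothetical, and it assumes the intrinsic function LENGTH is available in your compiler. Overflow still occurs on the first three passes, but with no ON OVERFLOW phrase control simply continues after END-UNSTRING.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTRPTR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * ONE-COLOR and COLOR-TOTAL are hypothetical names.
       01  COLOR-LIST   PIC X(21) VALUE "RED BLUE GREEN YELLOW".
       01  ONE-COLOR    PIC X(6).
       01  STRING-PTR   PIC 99.
       01  COLOR-TOTAL  PIC 9.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * UNSTRING initializes neither the pointer nor the tally.
           MOVE 1 TO STRING-PTR.
           MOVE 0 TO COLOR-TOTAL.
      * After the last color the pointer is 22, one past the end.
           PERFORM UNTIL STRING-PTR > FUNCTION LENGTH(COLOR-LIST)
               MOVE SPACES TO ONE-COLOR
               UNSTRING COLOR-LIST
                   DELIMITED BY ALL SPACE
                   INTO ONE-COLOR
                   WITH POINTER STRING-PTR
                   TALLYING IN COLOR-TOTAL
               END-UNSTRING
               DISPLAY "color: " ONE-COLOR
           END-PERFORM.
      * Expected: four "color:" lines, then "colors found: 4"
           DISPLAY "colors found: " COLOR-TOTAL.
           STOP RUN.
```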
Highlights for first-time users
- UNSTRING is best suited for separating string components that share a common delimiter. The delimiter must not appear as an
element of the components' values.
- DELIMITED BY is optional. If it's omitted, each destination data item is completely filled. In effect, the size of each destination data item acts as its delimiter.
- Assignment to the destination data item is done with an implied MOVE. The MOVE operation will truncate the substring or space
fill the destination data item, as required. Truncation of the substring, or space filling of the destination data item resulting
from the implicit MOVE, does not raise an OVERFLOW condition.
- The OVERFLOW condition is raised if: (a) all destination data items are used and characters still remain in the source data
item; or (b) POINTER is used and the value of the pointer variable is less than 1 or greater than the length of the source
data item.
- Use the ALL option to treat contiguous occurrences of a delimiter, such as spaces, as a single occurrence.
- Use DELIMITER IN to place the delimiting character(s) of the current substring into the named data item.
- Use the COUNT IN option to save the length of the current substring into the named data item.
- Use TALLYING to tally the number of destination data items assigned by the UNSTRING statement.
- Use the POINTER option to specify a numeric holder (ptr-var) for the current position in the source data item. By pre-assigning a value to the pointer variable you can start the examination of the source data item at any position in the string. ptr-var is incremented by one for each character in the source data item that is examined. POINTER allows the programmer to use multiple UNSTRING statements to process the source data item. Note, however, that an overflow condition is raised if unexamined characters remain in the source data item when the UNSTRING statement terminates, that is, if the value of ptr-var is not greater than the length of the string.
You must initialize the tallying and pointer variables or results are unpredictable.
- Use the OVERFLOW option to do special processing when the UNSTRING process does not examine every character in the source
data item, or when the pointer variable has a value of less than one or more than the length of the source data item. When
the overflow condition exists, the associated imperative statement (if any) executes and program execution continues immediately
after the UNSTRING statement.
- Use the NOT ON OVERFLOW option to do special processing when the UNSTRING statement processes the entire source data item.
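Because truncation by the implied MOVE does not raise overflow, COUNT IN is one way to detect it: compare the saved count with the size of the destination. The following sketch illustrates the idea; the program name and data items are hypothetical, and the size 4 in the IF statement is simply the declared length of SHORT-LAST.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTRTRN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical data items, used only for this illustration.
       01  FULL-NAME    PIC X(17) VALUE "WASHINGTON,GEORGE".
       01  SHORT-LAST   PIC X(4).
       01  LAST-COUNT   PIC 99.
       01  GIVEN-NAME   PIC X(10).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Every source character is examined, so no overflow occurs
      * even though SHORT-LAST is too small for "WASHINGTON".
      * Expected: no overflow
           UNSTRING FULL-NAME
               DELIMITED BY ","
               INTO SHORT-LAST COUNT IN LAST-COUNT,
                    GIVEN-NAME
               ON OVERFLOW
                   DISPLAY "overflow"
               NOT ON OVERFLOW
                   DISPLAY "no overflow"
           END-UNSTRING.
      * Expected: [WASH] [GEORGE    ]
           DISPLAY "[" SHORT-LAST "] [" GIVEN-NAME "]".
      * LAST-COUNT holds 10, the length before truncation.
      * Expected: last name truncated from 10 characters
           IF LAST-COUNT > 4
               DISPLAY "last name truncated from " LAST-COUNT
                   " characters"
           END-IF.
           STOP RUN.
```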