

Draft MDC Standard

7.2.3 Pattern match pattern

The pattern match operator ? tests the form of the string which is its left-hand operand. S ? P is true if and only if S is a member of the class of strings specified by the pattern P.

A pattern is a concatenated list of pattern atoms.

    pattern::= | patatom ...          |
               | @ expratom V pattern |

Assume that pattern has n patatoms. S ? pattern is true if and only if there exists a partition of S into n substrings

   S = S1 S2 ... Sn

such that there is a one-to-one order-preserving correspondence between the Si and the pattern atoms, and each Si satisfies its respective pattern atom. Note that some of the Si may be empty.
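The partition rule can be illustrated with a few small expressions. This is only a sketch: the pattern codes A (alphabetic) and N (numeric) are not defined in this section and are assumed here from the M character set profile.

```mumps
 ; Assumption: A = alphabetic, N = numeric (from the M character set profile).
 WRITE "AB12"?2A2N,!   ; 1: S1="AB" satisfies 2A, S2="12" satisfies 2N
 WRITE "12"?.A2N,!     ; 1: S1 is empty (zero letters), S2="12"
 WRITE "AB12X"?2A2N,!  ; 0: no partition accounts for the trailing "X"
```

The last line shows that the partition must cover all of S; a pattern that matches only a prefix of the string does not make S ? pattern true.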

Each pattern atom consists of a repeat count repcount, followed by either a pattern code patcode, an alternation or a string literal strlit. A substring Si of S satisfies a pattern atom if it, in turn, can be decomposed into a number of concatenated substrings, each of which satisfies the associated patcode, alternation or strlit.

    patatom::= repcount | patcode     | [ patsetdest ]
                        | patstr      |
                        | alternation |

    repcount::= | intlit                    |
                | [ intlit1 ] . [ intlit2 ] |

    patcode::= [ ' ] | Y patnonY Y    | ...
                     | Z patnonZ Z    |
                     | patnonYZ       |
                     | OB charspec CB |
    patnonY::= any of the characters in ident except Y
    patnonZ::= any of the characters in ident except Z
    patnonYZ::= any of the characters in ident except Y and Z
    charspec::= strconst1 [ : strconst2 ]
    strconst::= | $C[HAR] ( L numlit ) |
                | strlit               |
    patstr::= [ ' ] strlit
    alternation::= ( L patgrp )
    patgrp::= patatom ...
    patsetdest::= ( setdestination )

patcodes beginning with the initial letter Y are available for use by M[UMPS] programmers. patcodes beginning with the initial letter Z are available for use by implementors. patcodes are specified in Character Set Profiles.

  1. If a patcode has the form of a charspec, determination of whether a character belongs to the patcode is made as follows: A character belongs to a charspec containing only one strconst if it is contained in the string represented by that strconst. A character belongs to a charspec containing two strconsts if it is (inclusively) between them. Formally, X is a member of S if S [ X, and X is a member of S1:S2 if S1 does not trail X and X does not trail S2. The check against the value of S2 is omitted if the value of S2 is the empty string. If S2 is present, then neither S1 nor S2 may contain more than one character.

    If a strconst is of the form $C[HAR]( ... ), then it has the same value as the result of the function $Char called with the same parameters. Use of upper, lower, or mixed case in the name $Char is permitted.

  2. Otherwise, patcodes differing only in the use of corresponding upper and lower case letters are equivalent. If the apostrophe is not present in a given patcode, the patcode is satisfied by any single character in the union of the classes of characters represented, each class denoted by its own patcode letter. If the apostrophe is present, the patcode is satisfied by any single character which is not in the union of the classes of characters represented. Whether or not a specific character belongs to a patcode class is determined by a process’ Character Set Profile (charset).
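As a minimal sketch of the union and case rules for patcodes, the lines below combine several code letters in one pattern atom. The letters A (alphabetic), N (numeric) and P (punctuation) are assumptions taken from the M character set profile; a different profile may define these classes differently.

```mumps
 ; Assumption: A, N and P are defined as in the M character set profile.
 WRITE "a1B2"?4AN,!   ; 1: every character is alphabetic or numeric
 WRITE "a1B2"?4an,!   ; 1: lower-case code letters are equivalent to upper case
 WRITE "a1-2"?4AN,!   ; 0: "-" is in neither class
 WRITE "a1-2"?4ANP,!  ; 1: adding P admits "-"
```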

An alternation is satisfied if any one of its patgrp components individually matches the corresponding Si.
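The following sketch shows an alternation used as a pattern atom. It assumes an implementation that supports alternation, and it assumes the patcodes A and N from the M character set profile; the sample strings are illustrative only.

```mumps
 ; Each patgrp inside the parentheses is a complete list of pattern atoms.
 WRITE "Mrs Smith"?1(1"Mr",1"Mrs",1"Ms")1" "1.A,!  ; 1: the patgrp 1"Mrs" matches
 WRITE "Dr Smith"?1(1"Mr",1"Mrs",1"Ms")1" "1.A,!   ; 0: no patgrp matches "Dr"
 WRITE "555-1234"?1(3N1"-"4N,7N),!                 ; 1: first patgrp
 WRITE "5551234"?1(3N1"-"4N,7N),!                  ; 1: second patgrp
```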

Each patstr in which an apostrophe is not present is satisfied by, and only by, the value of strlit. Each patstr in which an apostrophe is present is satisfied by any string of the same length as strlit which is not identical to strlit.
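A short sketch of string literals as pattern atoms, using only the form without an apostrophe. The patcode N is assumed from the M character set profile.

```mumps
 WRITE "2024-01-15"?4N1"-"2N1"-"2N,!  ; 1: each "-" is matched literally
 WRITE "2024/01/15"?4N1"-"2N1"-"2N,!  ; 0: "/" is not the literal "-"
 WRITE "ABAB"?2"AB",!                 ; 1: two instances of the literal "AB"
```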

If repcount has the form of an indefinite multiplier ".", patatom is satisfied by a concatenation of any number of Si (including none), each of which meets the specification of patatom.

If repcount has the form of a single intlit, patatom is satisfied by a concatenation of exactly intlit instances of Si, each of which meets the specification of patatom. In particular, if the value of intlit is zero, the corresponding Si is empty.

If repcount has the form of a range, intlit1.intlit2, intlit1 gives the lower bound and intlit2 the upper bound. If the upper bound is less than the lower bound, an error condition occurs with ecode = "M10". If the lower bound is omitted, so that the range has the form .intlit2, the lower bound is taken to be zero. If the upper bound is omitted, so that the range has the form intlit1., the upper bound is taken to be indefinite; that is, the range is at least intlit1 occurrences. Then patatom is satisfied by the concatenation of a number of Si, each of which meets the specification of patatom, where the number must be within the expressed or implied bounds of the specified range, inclusive.
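The three forms of repcount described above can be compared side by side. This is a sketch that assumes the patcode A (alphabetic) from the M character set profile.

```mumps
 WRITE "ABC"?.A,!     ; 1: "." allows any number of letters
 WRITE ""?.A,!        ; 1: including none
 WRITE "ABC"?3A,!     ; 1: exactly three letters
 WRITE "ABC"?2A,!     ; 0: exactly two letters cannot cover the whole string
 WRITE "ABC"?1.3A,!   ; 1: between one and three letters
 WRITE "ABCD"?1.3A,!  ; 0: four letters exceed the upper bound
 WRITE "ABCD"?2.A,!   ; 1: at least two letters
 WRITE "A"?.2A,!      ; 1: lower bound omitted, so zero to two letters
```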

If more than one one-to-one order-preserving correspondence between the Si and the pattern atoms exists, the following rules are used to select the correspondence used in the two paragraphs following the rules. These rules are applied to each patatom in the pattern, from left to right and recursively in the case of alternations.

  A. If the patatom is not an alternation, select the longest matching substring that produces a match in the pattern as a whole.
  B. If the patatom is an alternation, use the rules below and apply rules A and B recursively to each patatom in the selected patgrp(s) from left to right.
    1) Select the correspondence(s) that use(s) the smallest possible value of the alternation’s repcount.
    2) If multiple correspondences satisfy 1), for each sequential application of the alternation (i.e., each value of the repcount) select the patgrp(s) within the alternation that correspond to the longest possible substring.
    3) If multiple correspondences satisfy 1) and 2), select the leftmost patgrp in the alternation.

Each optional patsetdest, if any, is executed only if S?pattern is true, and only if the associated pattern atom is satisfied by one of the Si in the selected correspondence. If these conditions hold, these (and only these) patsetdests are executed from left to right as follows:

    For each of the substrings Si of S satisfying the pattern atom in the selected correspondence, in the order in which they (the Si) appear in the string, perform all the actions of Set setdestination=Si as defined in section 8.2.30.

The multi-character operator '? is defined by:

   A '? B = '(A ? B)
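The equivalence can be checked with a small sketch, assuming the patcode N (numeric) from the M character set profile.

```mumps
 WRITE "ABC"'?3N,!    ; 1: "ABC" is not three digits
 WRITE '("ABC"?3N),!  ; 1: same value, written with explicit negation
 WRITE "123"'?3N,!    ; 0: "123" does match 3N
```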


Copyright © Standard Documents; 1977-2024 MUMPS Development Committee;
Copyright © Examples: 1995-2024 Ed de Moel;
Copyright © Annotations: 2003-2008 Jacquard Systems Research
Copyright © Annotations: 2008-2024 Ed de Moel.

Some specifications are "approved for inclusion in a future standard". Note that the MUMPS Development Committee cannot guarantee that such future standards will indeed be published.

This page most recently updated on 15-Nov-2023, 13:19:37.

For comments, contact Ed de Moel (demoel@jacquardsystems.com)