Pattern Match Operator

M[UMPS] by Example

Relational operator ‘matches pattern’ (?)

Introduced in the 1977 ANSI M[UMPS] language standard.

The pattern-codes to be used in pattern-matching are:
A: the 26 upper and 26 lower-case alphabetic characters
C: the 33 control-characters
E: the 128 characters in the ASCII set
L: the 26 lower-case characters
N: the 10 digits
P: the 33 punctuation-characters
U: the 26 upper-case characters
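
As a cross-check of the counts listed above, the seven codes can be written as ASCII character classes and counted over the 128 ASCII characters. This is a sketch in Python, not part of any M[UMPS] standard. The names ASCII_CLASSES and count_matches are illustrative, and the sketch assumes that the 33 punctuation characters include the space and that the 33 control characters are codes 0 through 31 plus 127.

```python
import re

# Assumed ASCII equivalents of the 1977 pattern codes (illustrative mapping).
ASCII_CLASSES = {
    "A": r"[A-Za-z]",
    "C": r"[\x00-\x1f\x7f]",
    "E": r"[\x00-\x7f]",
    "L": r"[a-z]",
    "N": r"[0-9]",
    "P": r"[ -/:-@\[-`{-~]",
    "U": r"[A-Z]",
}

def count_matches(code):
    """Count how many of the 128 ASCII characters belong to a pattern code."""
    rx = re.compile(ASCII_CLASSES[code])
    return sum(1 for i in range(128) if rx.fullmatch(chr(i)))

for code in sorted(ASCII_CLASSES):
    print(code, count_matches(code))
# prints: A 52, C 33, E 128, L 26, N 10, P 33, U 26 (one pair per line)
```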

Modified for internationalization in the 1995 ANSI M[UMPS] language standard:
A: upper and lower-case characters
C: control-characters
E: all characters in the character set
L: lower-case characters
N: digits
P: punctuation-characters
U: upper-case characters

        X?3N1"."2N.U
1 (true) when the value of variable X matches a pattern that "looks like" 3 digits, one period, 2 digits and any number (including none) of upper-case letters, 0 (false) otherwise.

        X?2U1"-"2N1"-"2U
1 (true) when the value of variable X matches the pattern of a 1985 Dutch license plate, 0 (false) otherwise.

        X?3N1"."1.5N
1 (true) when the value of variable X is a non-negative number less than 1000, written with exactly 3 digits before the decimal point and at least 1 and at most 5 digits following it, 0 (false) otherwise.
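
For readers more familiar with regular expressions, the three patterns above can be approximated in Python. This is a sketch, not part of any M[UMPS] standard. The names VERSION, PLATE, DECIMAL and matches are illustrative, and the character classes assume the ASCII definitions of the 1977 standard. fullmatch is used because the pattern match operator applies to the entire value, not to a substring.

```python
import re

# Assumed ASCII translations of the three M patterns (illustrative names).
VERSION = re.compile(r"[0-9]{3}\.[0-9]{2}[A-Z]*")   # 3N1"."2N.U
PLATE = re.compile(r"[A-Z]{2}-[0-9]{2}-[A-Z]{2}")   # 2U1"-"2N1"-"2U
DECIMAL = re.compile(r"[0-9]{3}\.[0-9]{1,5}")       # 3N1"."1.5N

def matches(rx, value):
    """Return 1 or 0 like the M '?' operator; the whole value must match."""
    return 1 if rx.fullmatch(value) else 0

print(matches(VERSION, "123.45AB"))    # 1
print(matches(VERSION, "123.4"))       # 0: only one digit after the period
print(matches(PLATE, "AB-12-CD"))      # 1
print(matches(PLATE, "ab-12-cd"))      # 0: U accepts upper-case only
print(matches(DECIMAL, "012.5"))       # 1
print(matches(DECIMAL, "12.5"))        # 0: 3N requires exactly three digits
print(matches(DECIMAL, "123.123456"))  # 0: more than five decimals
```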

Addition in the 1995 ANSI M[UMPS] language standard

In order to support the Japanese character sets, two new pattern identifiers are added:

KA for Kanji ($Char(161) - $Char(223))
ZEN for JIS ($Char(8481) - $Char(32382))

Additions in the 1995 ANSI M[UMPS] language standard (and correction in a future M[UMPS] language standard):

The concept of ‘alternation’ is introduced. An ‘alternation’ is a list of alternative patterns, any one of which is a valid match at that position in the pattern.

        X?2N1(3P2A,2P3A).E
is equivalent to
        (X?2N3P2A.E)!(X?2N2P3A.E)
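
The equivalence above can be checked on sample values with a regular-expression approximation in Python. This is a sketch under assumptions: the classes P, A and E use the ASCII definitions of the 1977 standard (with the space counted as punctuation), and all names are illustrative.

```python
import re

# Assumed ASCII classes for the pattern codes P, A and E.
P = r"[ -/:-@\[-`{-~]"
A = r"[A-Za-z]"
E = r"[\x00-\x7f]"

# 2N1(3P2A,2P3A).E written with a regex alternation
combined = re.compile("[0-9]{2}(?:" + P + "{3}" + A + "{2}|" + P + "{2}" + A + "{3})" + E + "*")
# 2N3P2A.E and 2N2P3A.E as two separate patterns
first = re.compile("[0-9]{2}" + P + "{3}" + A + "{2}" + E + "*")
second = re.compile("[0-9]{2}" + P + "{2}" + A + "{3}" + E + "*")

def agree(value):
    """True when the alternation and the OR of both patterns give the same result."""
    one = combined.fullmatch(value) is not None
    two = first.fullmatch(value) is not None or second.fullmatch(value) is not None
    return one == two

samples = ["12...ab", "12..abc", "12..abc!!", "12.abcd", "ab...cd", "12....ab"]
print(all(agree(s) for s in samples))  # True
```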

        X?2N1"-"1(3N1"-"1N,1N1":"4N)
would match "12-345-6" and "12-3:4567".

        X?.1(1"("3N1")".1(1"-",1"_"))3N.1(1"-",1"_")4N
would match any of:
        5551212
        555-1212
        555_1212
        (000)5551212
        (000)555-1212
        (000)555_1212
        (000)-5551212
        (000)-555-1212
        (000)-555_1212
        (000)_5551212
        (000)_555-1212
        (000)_555_1212
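
The two alternation examples above can be approximated with regular expressions to confirm the listed matches. This is a sketch in Python. The names DASHED and PHONE are illustrative, and N is assumed to mean the ASCII digits.

```python
import re

# 2N1"-"1(3N1"-"1N,1N1":"4N)
DASHED = re.compile(r"[0-9]{2}-(?:[0-9]{3}-[0-9]|[0-9]:[0-9]{4})")

# .1(1"("3N1")".1(1"-",1"_"))3N.1(1"-",1"_")4N
# ".1(...)" is an optional group, which maps onto "(?:...)?" in a regex.
PHONE = re.compile(r"(?:\([0-9]{3}\)[-_]?)?[0-9]{3}[-_]?[0-9]{4}")

listed = [
    "5551212", "555-1212", "555_1212",
    "(000)5551212", "(000)555-1212", "(000)555_1212",
    "(000)-5551212", "(000)-555-1212", "(000)-555_1212",
    "(000)_5551212", "(000)_555-1212", "(000)_555_1212",
]

print(all(PHONE.fullmatch(s) for s in listed))           # True
print(bool(PHONE.fullmatch("(000) 555-1212")))           # False: a space is not allowed
print(bool(DASHED.fullmatch("12-345-6")))                # True
print(bool(DASHED.fullmatch("12-3:4567")))               # True
print(bool(DASHED.fullmatch("12-345:6")))                # False: mixes both alternatives
```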

Approved for addition in a future M[UMPS] language standard:

In order to support the character set ISO-8859-1/USA, a new pattern identifier is added:

I: "International" characters (any non-ASCII characters in ISO-8859-1/USA).

It is made possible to exclude certain patterns:

        X?.'C
1 (true) when the value of variable X does not contain any control characters, 0 (false) otherwise.

        X?1"Y".'"Y"1"Y"
1 (true) when the value of variable X starts and ends with the letter "Y", and no other occurrences of that letter are present in that value, 0 (false) otherwise.
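
In regular-expression terms, a negated pattern atom corresponds roughly to a negated character class. The Python sketch below illustrates the two examples above. It assumes that 'C means "any character that is not an ASCII control character" and that '"Y" means "any single character other than Y". The names NO_CONTROL and Y_ENDS are illustrative.

```python
import re

NO_CONTROL = re.compile(r"[^\x00-\x1f\x7f]*")  # .'C
Y_ENDS = re.compile(r"Y[^Y]*Y")                # 1"Y".'"Y"1"Y"

def matches(rx, value):
    """Return 1 or 0 like the M '?' operator; the whole value must match."""
    return 1 if rx.fullmatch(value) else 0

print(matches(NO_CONTROL, "plain text"))  # 1
print(matches(NO_CONTROL, "tab\there"))   # 0: contains a TAB
print(matches(Y_ENDS, "YabcY"))           # 1
print(matches(Y_ENDS, "YaYbY"))           # 0: a third "Y" in the middle
print(matches(Y_ENDS, "Y"))               # 0: two separate "Y" characters are required
```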

The concept of "ranges" is introduced. It is made possible to specify that a pattern is matched when one of a set of specified characters occurs:

Reference                        Value
"word"?.["aeiouAEIOU"]           0 (false)
"ff3a"?.["a":"f"]["A":"F"]N      1 (true)

The first pattern would be matched by strings that contain only vowels; the second pattern would be matched by purely hexadecimal numbers.
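
Ranges correspond closely to character classes in regular expressions. The Python sketch below reproduces the two results in the table. It follows the prose description rather than a formal grammar: the second pattern is read as "any number of characters drawn from a-f, A-F or the digits", which is an assumption. The names VOWELS and HEX are illustrative.

```python
import re

VOWELS = re.compile(r"[aeiouAEIOU]*")  # strings that contain only vowels
HEX = re.compile(r"[a-fA-F0-9]*")      # purely hexadecimal strings (assumed reading)

def matches(rx, value):
    """Return 1 or 0 like the M '?' operator; the whole value must match."""
    return 1 if rx.fullmatch(value) else 0

print(matches(VOWELS, "word"))   # 0
print(matches(VOWELS, "aeiou"))  # 1
print(matches(HEX, "ff3a"))      # 1
print(matches(HEX, "ff3g"))      # 0
```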

As a new feature, it has been made possible to extract the substring that matches a specific sub-pattern from the string that is being "matched". When using this new feature, the name of the variable that is to receive the string-segment in question is given between parentheses following the pattern-atom that it is intended to match.

Assume that the value of local variable X matches the following pattern: X?4N1","1.3N, i.e. 4 numeric digits, one comma and then between one and 3 more digits. The code segment:
If '(X?4N(ITEM)1","1.3N(QUANT(ITEM))) Do ...
would cause the values of local variables ITEM and QUANT(ITEM) to be set to ITEM=$Extract(X,1,4) (the part that matches 4N) and QUANT(ITEM)=$Extract(X,6,$Length(X)) (the part that matches 1.3N).

Note that the assignment occurs as the pattern is being matched (strictly left-to-right), so that the value of local variable ITEM is well defined by the time the pattern-matching processor attempts to assign a value to QUANT(ITEM).
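
The closest regular-expression analogue of this feature is a named capture group. The Python sketch below mimics the ITEM and QUANT(ITEM) example. It is only an approximation: the names ITEM_QUANT and parse are illustrative, the dictionary quant stands in for the local array QUANT, and Python assigns only after the whole match has succeeded, whereas the text above describes assignment while the pattern is being matched.

```python
import re

# 4N(ITEM)1","1.3N(QUANT(ITEM)) with named groups in place of variable names
ITEM_QUANT = re.compile(r"(?P<item>[0-9]{4}),(?P<quant>[0-9]{1,3})")

def parse(x, quant):
    """On a full match, store quant[item] and return item; otherwise return None."""
    m = ITEM_QUANT.fullmatch(x)
    if m is None:
        return None
    item = m.group("item")            # corresponds to $Extract(X,1,4)
    quant[item] = m.group("quant")    # corresponds to $Extract(X,6,$Length(X))
    return item

quant = {}
print(parse("1234,56", quant))    # 1234
print(quant)                      # {'1234': '56'}
print(parse("1234,5678", quant))  # None: more than three digits after the comma
```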

Finally, a special case of indirection is pattern indirection:
>Set string="123-44-5678"
>Write string?3N1"-"2N1"-"4N
1
>Set pattern="3N1""-""2N1""-""4N"
>Write pattern
3N1"-"2N1"-"4N
>Write string?@pattern
1
>
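
Pattern indirection means that the pattern is only known at run time, as a string. The Python sketch below illustrates the idea by translating a small subset of M pattern syntax into a regular expression when it is needed. It is an approximation under assumptions: only repeat counts followed by pattern codes or string literals are handled (no alternation, negation or ranges), the codes use the ASCII definitions of the 1977 standard, and the names CODES, ATOM, to_regex and m_match are illustrative.

```python
import re

# Assumed ASCII contents of each pattern code, written as regex class bodies.
CODES = {
    "A": "A-Za-z", "C": r"\x00-\x1f\x7f", "E": r"\x00-\x7f",
    "L": "a-z", "N": "0-9", "P": r" -/:-@\[-`{-~", "U": "A-Z",
}

# One pattern atom: repeat count (n, n.m, n., .m or .) then a literal or codes.
ATOM = re.compile(r'(\d*)(\.?)(\d*)(?:"((?:[^"]|"")*)"|([A-Za-z]+))')

def to_regex(pattern):
    """Translate a simple M pattern string into a Python regex string."""
    out = []
    pos = 0
    while pos < len(pattern):
        m = ATOM.match(pattern, pos)
        if m is None:
            raise ValueError("unsupported pattern at position %d" % pos)
        low, dot, high, literal, codes = m.groups()
        if dot:
            rep = "{%s,%s}" % (low or "0", high)   # e.g. 1.5 -> {1,5}, . -> {0,}
        elif low:
            rep = "{%s}" % low                     # e.g. 3 -> {3}
        else:
            raise ValueError("missing repeat count at position %d" % pos)
        if literal is not None:
            body = "(?:" + re.escape(literal.replace('""', '"')) + ")"
        else:
            body = "[" + "".join(CODES[c.upper()] for c in codes) + "]"
        out.append(body + rep)
        pos = m.end()
    return "".join(out)

def m_match(value, pattern):
    """Return 1 or 0, like value?@pattern in the session above."""
    return 1 if re.fullmatch(to_regex(pattern), value) else 0

string = "123-44-5678"
pattern = '3N1"-"2N1"-"4N'
print(m_match(string, pattern))  # 1
```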

Copyright © Standard Documents; 1977-2024 MUMPS Development Committee;
Copyright © Examples: 1995-2024 Ed de Moel;
Copyright © Annotations: 2003-2008 Jacquard Systems Research
Copyright © Annotations: 2008-2024 Ed de Moel.

The information on this page is NOT authoritative and is subject to modification at any moment.
Please consult the appropriate (draft) language standard for an authoritative definition.

Some specifications are "approved for inclusion in a future standard". Note that the MUMPS Development Committee cannot guarantee that such future standards will indeed be published.

This page most recently updated on 16-Nov-2023, 11:53:03.

For comments, contact Ed de Moel (demoel@jacquardsystems.com)