smoe.org mailing lists
Introduction to Patterns

Patterns are used by various commands and configuration settings:

  By the archive-sync command, to match archive names.
  By the lists and rekey commands, to match list names.
  By the set-pattern, unregister-pattern, unsubscribe-pattern, which, and who commands, to match e-mail addresses.
  By the access_rules, advertise, bounce_probe_pattern, bounce_rules, delivery_rules, noadvertise, and post_limits settings, to match e-mail addresses.
  By the admin_body and taboo_body settings, to match lines in the body of a posted message.
  By the admin_headers and taboo_headers settings, to match lines in the headers of a posted message.
  By the attachment_filters and attachment_rules settings, to match message content types.
  By the quote_pattern setting, to count the lines in the body of a posted message that are marked as being written by someone else.
  By the signature_separator setting, to match the beginning of an e-mail signature.

There are four supported types of pattern, described below:

  Substring Patterns, like "example"
  Glob Patterns, like %example%
  Regular Expressions, like /example/
  Undelimited Patterns, like example

Several examples of regular expressions are illustrated:

  Example 1 - a list of special characters
  Example 2 - escaping '.' is required
  Example 3 - escaping '@' is required
  Example 4 - matching the beginning and end of string
  Example 5 - matching anything and everything
  Example 6 - escaping '*' is required
  Example 7 - case sensitivity
  Example 8 - overly safe escaping doesn't hurt
  Example 9 - matching (or NOT matching) white space
  Example 10 - negated or inverted matches

Majordomo is written in the Perl programming language. Perl regular expressions are a powerful but complicated tool for pattern matching. To eliminate some of the complexity, three simpler forms of pattern matching are provided, in addition to full Perl regular expressions.

A pattern is usually enclosed in "delimiters," with optional "modifiers" outside the delimiters.
The delimiters indicate where the pattern begins and ends, and the modifiers change how matches are found. For example, in the pattern:

  "example.net"i

the delimiters are quotes, and the 'i' is a modifier.

The most common modifier, the letter 'i', makes the matching case-insensitive, meaning that small and capital letters are considered identical.

The negation modifier, '!', may be used to invert any of the four kinds of pattern. For example, !edu would match any string of characters that does not contain "edu".

The special pattern ALL will match everything.

Substring Patterns
------------------

Examples:

  "example.com"
  "user@somewhere.example.com"i

The delimiter is a double quote. There are no special characters; the pattern matches if the pattern occurs anywhere within the text to be matched. A trailing 'i' specifies that the matching is case-insensitive. For instance,

  "bsc"   would match      unsubscribe
  "bsc"   would not match  unsuBsCribe
  "bsc"i  would match      unsuBsCribe

Glob Patterns
-------------

Examples:

  %user@*example.com%i
  %u-???@*example.com%i

The delimiter is a percent sign. These patterns are reminiscent of file-matching patterns from the DOS and Unix command line interfaces. Special characters include:

  ?   matches any single character
  *   matches any number (including zero) of any character
  []  are used to define character classes. For instance, [abc] will match any one of the letters a, b, or c. This style of grouping has the same effect as in regular expressions.

Regular Expressions
-------------------

What follows is a basic discussion of Perl regular expressions. There is one important difference between Majordomo regular expressions and Perl regular expressions: in Perl version 5 and above, the '@' character should be "escaped" with a backslash, \@. Majordomo will compensate if you forget to add the backslash, but for the sake of correctness you should always include it when you are trying to match a literal '@' symbol.
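
The delimiter and modifier conventions described in the sections above can be sketched in Python. This is an illustration only, not Majordomo's own code (which is written in Perl): the names split_pattern and substring_match are hypothetical, and glob and regular-expression matching are deliberately left out.

```python
# Hypothetical mapping from opening delimiter to the kind of pattern.
DELIMITERS = {'"': "substring", "%": "glob", "/": "regex"}


def split_pattern(raw):
    """Split a pattern string into (kind, body, modifiers, negated)."""
    # A leading '!' is the negation modifier.
    negated = raw.startswith("!")
    if negated:
        raw = raw[1:]
    kind = DELIMITERS.get(raw[:1])
    if kind is None:
        # No recognized delimiter: treat the whole text as undelimited.
        return ("undelimited", raw, "", negated)
    # The closing delimiter is the last occurrence of the opening one;
    # anything after it is taken to be modifiers such as 'i'.
    close = raw.rfind(raw[0])
    if close == 0:
        raise ValueError("missing closing delimiter")
    return (kind, raw[1:close], raw[close + 1:], negated)


def substring_match(raw, text):
    """Apply a substring pattern such as "bsc"i to text."""
    kind, body, modifiers, negated = split_pattern(raw)
    if kind != "substring":
        raise ValueError("not a substring pattern")
    if "i" in modifiers:
        # Case-insensitive: compare lowercased copies.
        body, text = body.lower(), text.lower()
    found = body in text
    # Negation inverts the result.
    return found != negated


print(split_pattern('"example.net"i'))
print(substring_match('"bsc"', "unsubscribe"))
print(substring_match('"bsc"', "unsuBsCribe"))
print(substring_match('"bsc"i', "unsuBsCribe"))
```

The last three calls reproduce the "bsc" cases from the Substring Patterns section.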

Example 1 - a list of special characters

A regular expression is a concise way of expressing a pattern in a series of characters. The full power of regular expressions can make some difficult tasks quite easy, but we will only scratch the surface here.

The character / is used to mark the beginning and end of a regular expression. Letters and numbers stand for themselves. Many of the other characters are symbolic. Some commonly used ones are:

  !    negates what follows, matching when the expression does NOT match
  \@   the `@' found in nearly all addresses; it must be preceded by a backslash to avoid errors
  .    (period) any character
  *    previous character, zero or more times; note especially...
  .*   any character, zero or more times
  +    previous character, one or more times; so for example...
  a+   letter "a", one or more times
  \    next character stands for itself; so for example...
  \.   literally a period, not meaning "any character"
  ^    beginning of the string; so for example...
  ^a   a string beginning with letter "a"
  $    end of the string; so for example...
  a$   a string ending with letter "a"

Example 2 - escaping '.' is required

  /foo\.example\.com/

Notice that the periods are preceded by a backslash so that they are interpreted as periods, rather than wildcards. This matches any string containing:

  foo.example.com

such as:

  foo.example.com
  bar.foo.example.com
  user@bar.foo.example.com
  users%bar.foo.example.com@example.com

Example 3 - escaping '@' is required

  /johndoe\@.*foo\.example\.com/

The `@' has special meaning to Perl and should be prefixed with a backslash to avoid errors. The string ".*" means "any character, zero or more times".
So this matches:

  johndoe@foo.example.com
  johndoe@terminus.foo.example.com
  ajohndoe@terminus.foo.example.com

But it doesn't match:

  johndoe@example.com
  brent@foo.example.com

Example 4 - matching the beginning and end of string

  /^johndoe\@.*foo\.example\.org$/

This is similar to Example 3, and matches the corresponding first two strings:

  johndoe@foo.example.org
  johndoe@terminus.foo.example.org

But it doesn't match:

  ajohndoe@terminus.foo.example.org

...because the ^ and $ symbols require the string to begin with letter "j" and end with letter "g", and this string begins with letter "a". Neither condition is true for a string such as ajohndoe@terminus.foo.example.org@example.com.

Example 5 - matching anything and everything

  /.*/

This is the regular expression that matches anything (any character, zero or more times).

Example 6 - escaping '*' is required

  /.\*johndoe/

Here the * is preceded by a \, so it refers literally to an asterisk character and not the symbolic meaning "zero or more times". The '.' still has its symbolic meaning of "any one character", so it would match:

  a*johndoe
  s*johndoe

Because the . by itself implies one character, it would not match:

  *johndoe

Example 7 - case sensitivity

Normally all matches are case sensitive; you can make any match case insensitive by appending an `i' to the end of the expression.

  /example\.com/i

This would match example.com, EXAMPLE.com, ExAmPlE.cOm, etc. Removing the `i':

  /example\.com/

...would match example.com but not EXAMPLE.com or any other capitalization.

Example 8 - overly safe escaping doesn't hurt

To be on the safe side, put a \ in front of any characters in the regular expression that are not numbers or letters. In order to put a / into the regular expression, the same rule holds: precede it with a \. Thus, with \ in front of the / and = characters, this:

  /\/CO\=US/

...matches /CO=US and may be a useful regular expression to those of you who need to deal with X.400 addresses that contain / characters.
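
Examples 2 through 8 can be tried out with Python's re module, whose syntax agrees with Perl's for the simple constructs used here. This is a sketch rather than Majordomo's own matcher: the helper name matches is hypothetical, the Python flag re.IGNORECASE stands in for the trailing 'i' modifier, and the anchored pattern is an assumed ^...$ variant of the Example 3 pattern.

```python
import re


def matches(pattern, text, flags=0):
    """True if the regular expression matches anywhere in text."""
    return re.search(pattern, text, flags) is not None


# Example 2: an escaped period matches only a literal period.
print(matches(r"foo\.example\.com", "users%bar.foo.example.com@example.com"))  # True
print(matches(r"foo\.example\.com", "fooXexample.com"))  # False
print(matches(r"foo.example.com", "fooXexample.com"))    # True: '.' is a wildcard

# Example 3: ".*" allows anything between the '@' and the host name.
ex3 = r"johndoe\@.*foo\.example\.com"
print(matches(ex3, "ajohndoe@terminus.foo.example.com"))  # True
print(matches(ex3, "brent@foo.example.com"))              # False

# Anchored variant of Example 3 (assumption): ^ and $ pin both ends.
anchored = r"^johndoe\@.*foo\.example\.com$"
print(matches(anchored, "johndoe@terminus.foo.example.com"))   # True
print(matches(anchored, "ajohndoe@terminus.foo.example.com"))  # False

# Example 5: ".*" matches anything, including the empty string.
print(matches(r".*", ""))  # True

# Example 6: "\*" is a literal asterisk; '.' still needs one character.
print(matches(r".\*johndoe", "a*johndoe"))  # True
print(matches(r".\*johndoe", "*johndoe"))   # False

# Example 7: case sensitivity, with and without the 'i' modifier.
print(matches(r"example\.com", "ExAmPlE.cOm", re.IGNORECASE))  # True
print(matches(r"example\.com", "EXAMPLE.com"))                 # False

# Example 8: extra backslashes before '/' and '=' are harmless.
print(matches(r"\/CO\=US", "/CO=US"))  # True
```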

Example 9 - matching (or NOT matching) white space

Normally, all whitespace within a pattern is matched verbatim, but it is sometimes desirable to add some additional space within a pattern to make it more readable. For instance, here is a pattern matching some common quoting characters in email:

  /^(-|:|>|[a-z]+>)/i

This can be a bit difficult to follow, so we can space it out a bit:

  /^( - | : | > | [a-z]+> )/xi

The 'x' modifier specifies that whitespace is to be ignored, and makes the pattern a bit easier to read. If you want to match actual whitespace, use '\s'.

Note that the 'x' modifier provides additional functionality to Perl code relating to comments, but because Majordomo requires patterns to lie all on a single line, this is not significant here.

Example 10 - negated or inverted matches

Negated matches (like !/^sub/) work in places where they have meaning, such as the taboo expression matcher, which has lots of complicated logic to handle them, but not in all places. Majordomo patterns just get sent through a function that turns them into regular expressions... which may or may not make sense in the context you want to use them. For example

  who-regexp listname !/xxx\.com/

will produce a list of subscribers to "listname" that are NOT from the 'xxx.com' domain. Be careful to escape the period, which otherwise will match any character, not just a period.

Undelimited Patterns
--------------------

In the previous sections, all of the patterns were considered to be enclosed in quotes, slashes, or percent signs. It is legitimate to use patterns without enclosing them in those delimiters in some cases. However, the kind of matching done will depend upon where the pattern is used.

  In the archive-sync command, an exact match.
  In the lists and rekey commands, an exact, case-insensitive match.
  In the which and who commands, a case-insensitive substring match.
  In the attachment_filters setting, an exact, case-insensitive match.
  In the attachment_rules setting, an exact, case-insensitive match.
  In the post_limits setting, a case-insensitive substring match.

In all of the other cases mentioned in the first section, pattern delimiters are required. Using a pattern without delimiters will cause an error.

See Also:

  help admin
  help archive
  help configset_access_rules
  help configset_advertise
  help configset_admin_body
  help configset_admin_headers
  help configset_attachment_filters
  help configset_attachment_rules
  help configset_bounce_probe_pattern
  help configset_bounce_rules
  help configset_delivery_rules
  help configset_noadvertise
  help configset_post_limits
  help configset_quote_pattern
  help configset_signature_separator
  help configset_taboo_body
  help configset_taboo_headers
  help lists
  help overview
  help rekey
  help set
  help unregister
  help unsubscribe
  help which
  help who

This is the "patterns" help document for Majordomo 2, version 0.1200401130.

For a list of all help documents, send the following command:

  help topics

in the body of a message to mj2@smoe.org.
For assistance, please contact the smoe.org administrators.