Formal syntax for PAT™ searches
Centre for the New OED and Text Research, University of Waterloo
Edited for HTML access by Frank Tompa
Examples of PAT™ searches
[See also names of regions to be used with docs when accessing the OED from PAT.]
command | matches... |
---|---|
"universit" | words that start as specified |
universit | words that start as specified by non-reserved alphanumeric string |
"to be or not to be" | phrases that start as specified |
"1965".."1974" | strings within the range specified |
war near peace | words near (within 80 characters) another word |
war fby.30 peace | words followed within 30 characters by another word |
war not near.20 peace | words not near (within 20 characters) another word |
signif.4 "university " | the most frequent 4-word phrase starting with a word |
signif "univers" | the most frequent word starting with a prefix |
signif.-5 "to be or" | the fifth most frequent phrase starting with a prefix |
colleg within docs T | words within regions of given type |
(docs LF) not within docs HG | regions not within regions of given type |
docs DEF incl colleg | regions of given type containing specified words |
docs DEF incl.5 colleg | regions of given type containing 5 occurrences of specified words |
docs " | (smallest) regions starting and ending with specified "words" |
(docs Q incl knuth) + (docs Q incl wiederhold) | union of two sets of regions |
"university " + "university<" | looking for a word |
heb - hebrides | difference between two sets of words |
shakespeare = docs EQ incl "shaks" | naming a resulting set |
docs E incl *shakespeare | using a named set |
docs HG within % | using the results of the previous set |
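
The proximity rows in the table above (`near`, `fby`, `not near`) can be illustrated with a small Python sketch. This is not how PAT itself is implemented; it only mimics the described behaviour on a plain string. The function names, the use of `\b` word boundaries, the case-insensitive matching, and the choice to measure distance between the start offsets of the two matches are all assumptions made for illustration. The default of 80 characters comes from the `near` row; applying the same default to `fby` is also an assumption.

```python
import re

def prefix_matches(text, prefix):
    """Return the start offset of every word in `text` that begins with `prefix`."""
    # Assumption: a word starts at a regex word boundary; matching ignores case.
    pattern = re.compile(r"\b" + re.escape(prefix), re.IGNORECASE)
    return [m.start() for m in pattern.finditer(text)]

def near(left, right, distance=80):
    """Offsets in `left` with some offset in `right` at most `distance` away, on either side."""
    return [p for p in left if any(abs(q - p) <= distance for q in right)]

def fby(left, right, distance=80):
    """Offsets in `left` that are followed, within `distance` characters, by an offset in `right`."""
    return [p for p in left if any(0 < q - p <= distance for q in right)]

def not_near(left, right, distance=80):
    """Offsets in `left` that have no offset in `right` within `distance` characters."""
    close = set(near(left, right, distance))
    return [p for p in left if p not in close]

text = "In time of war we hope for peace. Much later, long after, war returned."
war = prefix_matches(text, "war")
peace = prefix_matches(text, "peace")

print(fby(war, peace, 30))       # like: war fby.30 peace
print(not_near(war, peace, 20))  # like: war not near.20 peace
```

Representing each result as a list of offsets keeps the operators composable: the output of one call can be passed as the `left` or `right` argument of another, in the same way that a PAT query combines sets of matches.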
Formal Syntax for the PAT™ Text Searching Language
Originally produced May 10, 1990
Centre for the New OED and Text Research, University of Waterloo
Edited for HTML access by Frank Tompa
PAT™ is a registered trademark of Open Text Corporation
The following grammar describes the formal syntax of a subset of the query language accepted by PAT™. When PAT™ is accessed through the Web, certain language features are meaningless and/or disabled. Thus, this grammar omits language features addressing index control, environment control, stopping, querying history, lrep, and printing.
The notation <string> refers to either an alphanumeric sequence of characters starting with an alphabetic character or an arbitrary character string surrounded by double quotes (within which a double quote character must be preceded by a backslash). <number> refers to a sequence of digits, and <empty> is the empty string.
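
The token classes described above can be expressed as regular expressions. The following Python sketch is one possible reading of that description, not the lexer used by PAT: the names `TOKEN`, `classify`, and `unquote` are hypothetical, "alphabetic" is assumed to mean ASCII letters, and no check is made for reserved words.

```python
import re

TOKEN = re.compile(r'''
    (?P<quoted>"(?:\\.|[^"\\])*")    # quoted string; a backslash escapes the next character
  | (?P<word>[A-Za-z][A-Za-z0-9]*)   # alphanumeric sequence starting with a letter
  | (?P<number>[0-9]+)               # sequence of digits
''', re.VERBOSE)

def classify(token):
    """Return "<string>", "<number>", or None if `token` fits neither class."""
    m = TOKEN.fullmatch(token)
    if m is None:
        return None
    return "<number>" if m.lastgroup == "number" else "<string>"

def unquote(token):
    """Strip the surrounding quotes and backslash escapes from a quoted <string>."""
    if token.startswith('"'):
        return re.sub(r'\\(.)', r'\1', token[1:-1])
    return token

for sample in ['universit', '"to be or not to be"', '1965', '9lives']:
    print(sample, classify(sample))
```

Under this reading a bare `1965` is a <number>, while `"1965"` in double quotes is a <string>, which matches the quoted form used in the range example `"1965".."1974"`.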
    Statement   : Query
                | <string> "=" Query
                | <empty>
    Query       : Boolean
                | OpCode OptParam OptQuery
    Boolean     : Boolean "+" ABoolean
                | ABoolean
    ABoolean    : ABoolean "^" RBoolean
                | NxtBoolean
                | RBoolean "not" "nxt" Proxim RBoolean
                | FbyBoolean
                | RBoolean "not" "fby" Proxim RBoolean
                | DiffBoolean
                | RBoolean
    FbyBoolean  : RBoolean "fby" Proxim FbyBoolean
                | RBoolean "fby" Proxim RBoolean
    NxtBoolean  : RBoolean "nxt" Proxim NxtBoolean
                | RBoolean "nxt" Proxim RBoolean
    DiffBoolean : DiffBoolean "-" RBoolean
                | RBoolean "-" RBoolean
                | RBoolean "including" OptParam RBoolean
                | RBoolean "within" RBoolean
    RBoolean    : PrimBoolean
                | <string> ".." <string>
                | "docs"
                | "docs" <string>
                | "docs" PrimBoolean ".." PrimBoolean
    PrimBoolean : <string>
                | "(" Query ")"
                | <number>
                | "[" <number> "]"
                | "*" <string>
                | "%"
    Proxim      : <empty>
                | "." <number>
    OpCode      : "sample" | "signif" | "shift"
    OptParam    : <empty>
                | "." <number>
                | "." "-" <number>
    OptQuery    : <empty>
                | Query
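
To show how these productions nest, here is a minimal recursive-descent sketch in Python covering only a small part of the grammar: naming with `=`, union with `+`, intersection with `^`, `fby` with an optional `Proxim`, and the `PrimBoolean` forms <string>, `( ... )`, `*name`, and `%`. Everything else (`docs`, `nxt`, `not`, `-`, `including`, `within`, ranges, numbers, and the `OpCode` forms) is deliberately left out and rejected. The class name, the tuple-shaped result, and the treatment of every quoted literal in the grammar as a reserved word are assumptions made for illustration; this is not PAT's own parser.

```python
import re

# Assumption: every keyword quoted in the grammar is reserved as a bare <string>.
KEYWORDS = {"fby", "nxt", "not", "including", "within", "docs", "sample", "signif", "shift"}
TOKEN = re.compile(r'\s*("(?:\\.|[^"\\])*"|[A-Za-z][A-Za-z0-9]*|[0-9]+|[-+^()*%.=])')

def tokenize(query):
    tokens, pos, query = [], 0, query.rstrip()
    while pos < len(query):
        m = TOKEN.match(query, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def is_string(tok):
    return tok is not None and (tok[0] == '"' or (tok[0].isalpha() and tok not in KEYWORDS))

class SubsetParser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self, ahead=0):
        j = self.i + ahead
        return self.tokens[j] if j < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, found {tok!r}")
        self.i += 1
        return tok

    def statement(self):
        # Statement : Query | <string> "=" Query | <empty>
        if self.peek() is None:
            return None
        if is_string(self.peek()) and self.peek(1) == "=":
            name = self.take()
            self.take("=")
            tree = ("name", name, self.boolean())
        else:
            tree = self.boolean()
        if self.peek() is not None:
            raise SyntaxError(f"unsupported or trailing token {self.peek()!r}")
        return tree

    def boolean(self):
        # Boolean : Boolean "+" ABoolean | ABoolean   (left-associative)
        tree = self.aboolean()
        while self.peek() == "+":
            self.take()
            tree = ("+", tree, self.aboolean())
        return tree

    def aboolean(self):
        # ABoolean : ABoolean "^" RBoolean | FbyBoolean | RBoolean
        tree = self.fby_or_prim()
        while self.peek() == "^":
            self.take()
            tree = ("^", tree, self.prim())
        return tree

    def fby_or_prim(self):
        # FbyBoolean : RBoolean "fby" Proxim FbyBoolean | RBoolean "fby" Proxim RBoolean
        left = self.prim()
        if self.peek() != "fby":
            return left
        self.take()
        distance = None  # Proxim : <empty> | "." <number>
        if self.peek() == ".":
            self.take()
            number = self.take()
            if not number.isdigit():
                raise SyntaxError(f"expected <number>, found {number!r}")
            distance = int(number)
        return ("fby", distance, left, self.fby_or_prim())

    def prim(self):
        # PrimBoolean : <string> | "(" Query ")" | "*" <string> | "%"
        tok = self.take()
        if tok == "(":
            tree = self.boolean()
            self.take(")")
            return tree
        if tok == "*":
            name = self.take()
            if not is_string(name):
                raise SyntaxError(f"expected <string>, found {name!r}")
            return ("set", name)
        if tok == "%":
            return ("previous",)
        if is_string(tok):
            return ("str", tok)
        raise SyntaxError(f"unsupported token {tok!r}")

def parse(query):
    return SubsetParser(tokenize(query)).statement()

print(parse('war fby.30 peace'))
print(parse('shakespeare = "shaks" + %'))
```

Each method mirrors one production, so the precedence falls out of the call structure: `+` binds loosest, then `^`, then `fby`. Left-recursive rules such as `Boolean` are written as loops, while the right-recursive `FbyBoolean` is written as a recursive call.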