Formal syntax for PAT™ searches

Examples of PAT™ searches

[See also names of regions to be used with docs when accessing the OED from PAT.]

   "universit"                                     words that start as specified
   universit                                       words that start as specified by non-reserved alphanumeric string
   "to be or not to be"                            phrases that start as specified
   "1965".."1974"                                  strings within the range specified
   war near peace                                  words near (within 80 characters) another word
   war fby.30 peace                                words followed within 30 characters by another word
   war not near.20 peace                           words not followed within 20 characters by another word
   signif.4 "university "                          the most frequent 4-word phrase starting with a word
   signif "univers"                                the most frequent word starting with a prefix
   signif.-5 "to be or"                            the fifth most frequent phrase starting with a prefix
   colleg within docs T                            words within regions of given type
   (docs LF) not within docs HG                    regions not within regions of given type
   docs DEF incl colleg                            regions of given type containing specified words
   docs DEF incl.5 colleg                          regions of given type containing 5 occurrences of specified words
   docs ""..""                                     (smallest) regions starting and ending with specified "words"
   (docs Q incl knuth) + (docs Q incl wiederhold)  union of two sets of regions
   "university " + "university<"                   looking for a word
   heb - hebrides                                  difference between two sets of words
   shakespeare = docs EQ incl "shaks"              naming a resulting set
   docs E incl *shakespeare                        using a named set
   docs HG within %                                using the results of the previous set
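
The proximity examples (near, fby.30) can be illustrated with a small Python sketch over a plain string. This is not how PAT is implemented. It assumes that distance is measured between the starting character offsets of the two matches, that matching is case-sensitive, and that near is symmetric while fby only looks forward; none of these details are stated in this document. The function names and the sample text are hypothetical.

```python
import re

def prefix_offsets(text, prefix):
    """Starting offsets of words in `text` that begin with `prefix`."""
    return [m.start() for m in re.finditer(r'\b' + re.escape(prefix), text)]

def fby(text, first, second, within):
    """Offsets of `first` matches that are followed, at most `within`
    characters later, by a `second` match (assumed start-to-start distance)."""
    seconds = prefix_offsets(text, second)
    return [p for p in prefix_offsets(text, first)
            if any(0 < q - p <= within for q in seconds)]

def near(text, first, second, within=80):
    """Offsets of `first` matches with a `second` match at most `within`
    characters away in either direction (80 is the default in the table)."""
    seconds = prefix_offsets(text, second)
    return [p for p in prefix_offsets(text, first)
            if any(q != p and abs(q - p) <= within for q in seconds)]

# Hypothetical sample text, not taken from any PAT database.
SAMPLE = ("war and peace; later, a long war with no end, "
          "then finally a lasting peace")
```

Calling fby(SAMPLE, "war", "peace", 30) keeps only the first "war", because the second one is more than 30 characters before the next "peace".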

Formal Syntax for the PAT™ Text Searching Language

Originally produced May 10, 1990
Centre for the New OED and Text Research, University of Waterloo
Edited for HTML access by Frank Tompa

PAT™ is a registered trademark of Open Text Corporation

The following grammar describes the formal syntax of a subset of the query language accepted by PAT™. When PAT™ is accessed through the Web, certain language features are meaningless or disabled, so this grammar omits the features for index control, environment control, stopping, querying history, lrep, and printing.

The notation <string> refers to either an alphanumeric sequence of characters starting with an alphabetic character, or an arbitrary character string surrounded by double quotes (within which a double quote character must be preceded by a backslash). <number> refers to a sequence of digits, and <empty> is the empty string.
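
As a rough illustration of these token classes, the following Python sketch classifies a single token with regular expressions. The pattern and function names are hypothetical. Two points are assumptions rather than statements from this document: a backslash inside quotes is treated as escaping whatever character follows it, and reserved words such as docs or fby are not filtered out, although the examples suggest an unquoted string must be non-reserved.

```python
import re

# Unquoted <string>: alphanumeric, starting with an alphabetic character.
BARE_STRING = re.compile(r'[A-Za-z][A-Za-z0-9]*')
# Quoted <string>: any characters between double quotes; an embedded
# double quote must be preceded by a backslash (assumed: backslash
# escapes the character that follows it).
QUOTED_STRING = re.compile(r'"(?:[^"\\]|\\.)*"')
# <number>: a sequence of digits.
NUMBER = re.compile(r'[0-9]+')

def classify(token):
    """Return 'string', 'number', 'empty', or 'other' for one token."""
    if QUOTED_STRING.fullmatch(token) or BARE_STRING.fullmatch(token):
        return "string"
    if NUMBER.fullmatch(token):
        return "number"
    if token == "":
        return "empty"
    return "other"
```

A real lexer would also need to recognize the operators and punctuation used in the grammar; this sketch covers only the three notations defined above.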

   Statement    :  Query
                |  <string> "=" Query
                |  <empty>

   Query        :  Boolean
                |  OpCode OptParam OptQuery

   Boolean      :  Boolean "+" ABoolean
                |  ABoolean

   ABoolean     :  ABoolean  "^"  RBoolean
                |  NxtBoolean
                |  RBoolean "not" "nxt" Proxim RBoolean
                |  FbyBoolean
                |  RBoolean "not" "fby" Proxim RBoolean
                |  DiffBoolean
                |  RBoolean

   FbyBoolean   :  RBoolean "fby" Proxim FbyBoolean
                |  RBoolean "fby" Proxim RBoolean

   NxtBoolean   :  RBoolean "nxt" Proxim NxtBoolean
                |  RBoolean "nxt" Proxim RBoolean

   DiffBoolean  :  DiffBoolean "-" RBoolean
                |  RBoolean "-" RBoolean
                |  RBoolean "including" OptParam RBoolean
                |  RBoolean "within" RBoolean

   RBoolean     :  PrimBoolean
                |  <string> ".." <string>
                |  "docs"
                |  "docs" <string>
                |  "docs" PrimBoolean ".." PrimBoolean

   PrimBoolean  :  <string>
                |  "(" Query ")"
                |  <number>
                |  "[" <number> "]"
                |  "*" <string>
                |  "%"

   Proxim       :  <empty> | "." <number>

   OpCode       :  "sample" | "signif" | "shift"

   OptParam     :  <empty> | "." <number> | "." "-" <number>

   OptQuery     :  <empty> | Query
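
The difference between Proxim and OptParam is easy to miss: both allow "." <number>, but only OptParam allows the "." "-" <number> form. The Python sketch below makes that distinction explicit for a single operator token. It assumes the operator and its parameter are written without intervening spaces, as in fby.30 and signif.-5 in the examples; the grammar itself does not say whether spaces are allowed. The names PARAM_KIND and parse_operator are hypothetical.

```python
import re

# Which parameter production each operator takes, read off the grammar above.
PARAM_KIND = {
    "sample": "OptParam", "signif": "OptParam", "shift": "OptParam",
    "including": "OptParam",
    "fby": "Proxim", "nxt": "Proxim",
}

# Operator name, optionally followed by "." with an optional "-" and digits.
_OPERATOR = re.compile(r'([a-z]+)(?:\.(-?)([0-9]+))?')

def parse_operator(text):
    """Split e.g. 'signif.-5' into ('signif', -5).

    The second element is None when the parameter is <empty>.
    """
    m = _OPERATOR.fullmatch(text)
    if m is None or m.group(1) not in PARAM_KIND:
        raise ValueError(f"not an operator in this grammar: {text!r}")
    name, minus, digits = m.groups()
    if digits is None:
        return name, None
    if minus and PARAM_KIND[name] == "Proxim":
        raise ValueError(f"{name} takes Proxim, which has no '-' form")
    return name, -int(digits) if minus else int(digits)
```

Operators that appear only in the examples, such as near and incl, are rejected here because the grammar above is a subset of the language and does not list them.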