OED Tagging
Tags:
$a
$bl
$cb
$cf
$db
$e
$eq
$et
$etym
$hg
$hm
$hw
$il
$la
$lc
$pqp
$pr
$ps
$q
$qd
$qp
$qt
$rx
$s0
$s1
$s2
$s3
$s4
$s5
$s6
$s7a
$s7n
$sq
$sub
$vd
$vf
$vfl
$w
$x
$xd
$xi
$xr
Tagging Structural Elements in the OED
Donna Lee Berg,
Centre for the New Oxford English Dictionary and Text Research, University of Waterloo
(Edited for HTML access and updated to revised tagging by Frank Tompa)
OED regions are structures that allow you to restrict your search
to specific dictionary components, such as etymologies, definitions,
or quotations. In the database, each such component is defined by
descriptive tags which delimit its text. For example, etymologies are
preceded by a "begin" tag "<etym>" and followed by a matching "end" tag
"</etym>", and the resulting units comprise the region set named "$etym".
A highly simplified outline indicates the prototypical organization of an entry.
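As a rough illustration of how matching "begin" and "end" tags define a region set, the following Python sketch collects every unit delimited by a given tag. The function name `region_set` and the sample fragment are invented for this example; the real search engine is not implemented this way, and nested occurrences of the same tag are not handled.

```python
import re

def region_set(text, tag):
    """Return the text of every <tag>...</tag> unit found in `text`.

    Assumes regions with the same tag do not nest.
    """
    pattern = re.compile(r"<{0}>(.*?)</{0}>".format(re.escape(tag)), re.DOTALL)
    return pattern.findall(text)

# Invented fragment for illustration only; not actual OED data.
sample = (
    "<e><hw>example</hw><etym>from a made-up source</etym>"
    "<q><a>Blake</a></q></e>"
)

etym_regions = region_set(sample, "etym")   # units making up "$etym"
author_regions = region_set(sample, "a")    # units making up "$a"
print(etym_regions)
print(author_regions)
```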
Remember that the OED search engine treats all letters and symbols,
including tags and spaces, as characters, and groups them into units that may be words, tags, series of punctuation marks, etc. Because the search engine also
interprets the left angle bracket of a tag as a character, tags can be included in a query to locate exactly what you type, and nothing
else, within a region. For instance, you may wish to
restrict an "Author" search to "Blake" by using "<a>Blake</a>",
avoiding matches to "F.R. Blake" and "O. Blakeston".
For additional information, see:
- Berg, D.L., The Research Potential of the Electronic OED2 Database at the University of Waterloo: a Guide for Scholars. UW Centre for the New OED, 1989. (Gives examples of searches.)
- Berg, D.L., A User's Guide to the Oxford English Dictionary. Oxford University Press, 1993. (Includes a guide to reading entries and a "Companion" section defining technical terms.)
Author (tag=<a> region=$a)
An author's name normally appears in the printed
text in large and small roman capitals as the second element in a
quotation following the date and preceding the title of the work.
Most authors are cited by initials followed by surname, but surname
only is used for well known authors such as Chaucer and Milton
(note that Shakespeare is usually abbreviated as "Shaks."). A
single author's name may be cited in several forms. Sir Walter
Scott, for instance, appears most frequently as "Scott", but also as
"W. Scott" and "Sir W. Scott". In addition, the OED cites several
W. Scotts. Dates and titles are useful in such instances; for
example, a W. Scott with a publication date of 1635 is obviously
not the Victorian author. A number of other OED author conventions
may affect searches, including: "Bible" (notably the 1611 King
James version) sometimes appears as author, journal quotations
frequently do not give authors' names, and translators are deemed
authors of the English words with the original author's name included
as part of the work, e.g., "Marx's Capital".
Note the effect of name variations and tags (<a>...</a>) on searches.
Querying "W. Scott" within the author structure will locate not
only that form, but "Sir W. Scott" and an "E.W. Scott". Further,
since the search engine considers a hyphen as a token separator, it matches "W. Scott-Taggart".
Hitting the space bar after "Scott" (the usual way of specifying a
word ending) results in no matches because authors' surnames are
followed by the left angle bracket of the "end" tag which the search engine sees
as a character; thus one could specify "W. Scott</a>". Similarly,
to exclude matches such as "E.W. Scott", the "begin" tag <a> would
also be needed.
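The effect of adding tags to an author query can be imitated with plain substring matching. This is only a stand-in: the actual search engine works on tokens, and the helper `matches` is hypothetical. The author strings are the forms mentioned above, wrapped in <a> tags.

```python
# Author forms mentioned in the text, as they would appear between tags.
authors = [
    "<a>Scott</a>",
    "<a>W. Scott</a>",
    "<a>Sir W. Scott</a>",
    "<a>E.W. Scott</a>",
    "<a>W. Scott-Taggart</a>",
]

def matches(query):
    """Return the tagged author strings that contain `query` (substring test)."""
    return [a for a in authors if query in a]

loose = matches("W. Scott")             # also finds Sir W., E.W., Scott-Taggart
trailing_space = matches("W. Scott ")   # nothing: the surname is followed by "<"
end_anchored = matches("W. Scott</a>")  # drops Scott-Taggart
exact = matches("<a>W. Scott</a>")      # only the exact form

for name, result in [("loose", loose), ("trailing_space", trailing_space),
                     ("end_anchored", end_anchored), ("exact", exact)]:
    print(name, result)
```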
Bold Sub-Headword (tag=<bl> region=$bl)
One of two types of subordinate headword
included within entries; so called because they appear in bold type
in the printed text. These word forms are commonly either derivatives
(typically formed by adding a suffix to the headword), or combinations
(separate, hyphenated or single words that combine the headword,
usually as the first element, with another existing word). Note,
however, that many derivatives and combinations have entries of their
own because they have developed meanings and histories distinct from
their main word. Bold Sub-Headwords are usually defined and illustrated
by quotations, sometimes grouped within pseudo quotation paragraphs.
(See Subentry and compare with Italic Sub-Headword).
Cited Form (tag=<cf> region=$cf)
A word form in a foreign language, or in an earlier
or regional form of English, that is referred to in an etymology, usually
in the context of its role in explaining the history or origin of a
headword. These forms appear in italics in the printed text preceded
by the language of origin (often abbreviated as L., Gr., OF, etc.).
Words or phrases in this category that occur in the context of elements
other than etymologies or sub-etymologies are somewhat problematic,
since they represent automatic tagging of italicized forms which do not
necessarily conform to this definition. A few such anomalies will be
found in the etymological texts as well. (See also Language.)
Combinations Block (tag=<cb> region=$cb)
This structure consists of one or more Subentries
containing combinational forms usually together with their definition
text.
(See also Derivations Block.)
Cross-Reference (tag=<xr> region=$xr)
Cross-references are widely used in the OED
to refer readers to other entries or to another part of the same entry.
This structure includes four categories of cross-reference elements:
Cross-Reference Headword, Cross-Reference Italic Headword,
Cross-Reference Date, and Relative Cross-Reference.
Cross-Reference Date (tag=<xd> region=$xd)
Appears within cross-references and
is found primarily in definition texts where users are referred to a
quotation by date in the supporting quotation paragraph which follows.
Such quotations usually supply supplementary information about the
meaning of the word, or sometimes provide the entire explanation of
meaning.
Cross-Reference Headword (tag=<x> region=$x)
Frequent references are found in
OED entries to the headwords of other entries, especially in
etymologies. In the printed text, the cross-referenced headword is
printed in small roman capitals, followed by a homonym number,
if relevant, and sometimes by the specific location within the target
entry (see Sense Number).
Cross-Reference Italic Sub-Headword (tag=<xil> region=$xi)
These forms are primarily
italicized combinations cited in entries which are found in another
entry. They are frequently followed by a Cross-Reference Headword,
indicating the main entry in which they will be found. The abbreviation
"s.v." sometimes precedes the headword reference, meaning the combination
is found "sub voce", or "under the word".
Date (tag=<qd> region=$qd)
Normally the first element in an illustrative quotation.
The date given is usually the year in which the cited work was
first published, although there are some discrepancies, especially
in the dating of texts prepared for the first edition of the OED.
Where precise dates could not be established, the date may be qualified
by "C." ("c" in the printed text meaning "circa" or "about") or "A."
("a" in the printed text meaning "ante" or "before"), or by replacing
the last one or two digits with dots, e.g., "17..". Date of composition
is usual for letters, journals, and diaries, while lectures and speeches
are assigned the date of their first appearance in print. Although most
quotations specify some form of date, there are a few exceptions, the
most notable being the many quotations from the Old English epic poem
"Beowulf".
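The date conventions described above can be sketched as a small parser. The exact spelling of dates in the database (for example "C. 1386" with a space after the qualifier) is an assumption made here, as are the names `QuotationDate` and `parse_date`; the sample dates are arbitrary.

```python
import re
from typing import NamedTuple, Optional

class QuotationDate(NamedTuple):
    qualifier: Optional[str]  # "circa", "ante", or None
    earliest: int             # earliest year the date can denote
    latest: int               # latest year the date can denote

# Optional "C." or "A." qualifier, then digits, then up to two dots.
_DATE = re.compile(r"^(?:(?P<q>[CA])\.\s*)?(?P<digits>\d{1,4})(?P<dots>\.{0,2})$")

def parse_date(raw):
    """Parse a date string; return None if it does not fit the assumed shape."""
    m = _DATE.match(raw.strip())
    if m is None:
        return None
    digits, dots = m.group("digits"), m.group("dots")
    # Dots stand in for missing final digits of a four-digit year.
    if dots and len(digits) + len(dots) != 4:
        return None
    earliest = int(digits + "0" * len(dots))
    latest = int(digits + "9" * len(dots))
    qualifier = {"C": "circa", "A": "ante", None: None}[m.group("q")]
    return QuotationDate(qualifier, earliest, latest)

print(parse_date("17.."))
print(parse_date("C. 1386"))
print(parse_date("A. 1400"))
```

Undated quotations simply fail to parse and yield `None`.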
Definition
Generally, a statement explaining the meaning of a
headword, sense, or sub-sense, although definitions in the OED take
several other forms, including cross-references to another sense within
the same entry or within another entry. In addition, a definition may
simply describe the way the word functions in some grammatical or
syntactical context. Definitions should always be read in conjunction
with supporting quotations, since in a historical dictionary, the
latter play an important role in establishing meaning and context.
In fact, in some cases, the actual explanation of the meaning of a
sense is contained in a quotation (see Cross-Reference Date).
This component is not explicitly available in the current version.
Derivations Block (tag=<db> region=$db)
This structure consists of one or more Subentries
containing derivational forms usually together with their definition
text.
(See also Combinations Block.)
Earliest Quotation (region=$eq)
The first non-subsidiary quotation, and thus usually the one having a date which is chronologically
earliest, in an OED entry. While this facility can be useful, many
words have multiple senses and sub-senses, either in current use or in
their historical development. The first quotation in the
entry must therefore be viewed in the context of the sense which it
supports.
Entry (tag=<e> region=$e)
Entries are the major structural components of most
modern dictionaries. In the printed OED, entries are arranged
alphabetically by their headwords (the "subject" of the entry)
which appear in dark bold type. There are two types of entries in
the OED: main entries ($e) and cross-reference entries ($ve). Main entries
contain comprehensive information about the history and meaning of
"main form" headwords. The primary function of cross-reference
entries is to direct the user from an obsolete or variant spelling
of a word to its relevant main entry (see also Status). Specifying
Entry ($e) as the region in which you wish to search for a word or phrase, or
as a match point for "combining and comparing" two or more sets,
means, therefore, that the search engine
searches the entire Dictionary and identifies in which entries your
results are located.
Etymology (tag=<etym> region=$etym)
Etymologies trace the origin or derivation of
headwords and are enclosed in square brackets in the printed text,
normally following a variant form list, if included. Since
the OED was conceived as a history of the English language, the
original policy was to trace non-native words to the foreign word
or word element from which they were immediately adopted or formed,
and native words to their earliest English form. In practice, however,
OED etymologies sometimes exceed these guidelines.
Some etymologies include as their final element a paragraph in small
print, tagged as "<note>" in the database. These are referred to as
"etymological notes" by OED editors and include supplementary comments
or information of an unsubstantiated nature such as "folk" or popular
theories. ("<note>" tags are also used to identify various editorial
comments in small print in other entry elements.) Etymologies are
sometimes attached to individual senses or sub-senses
(see Sub-Etymology).
Headword (tag=<hw> region=$hw)
The subject of a Dictionary entry which appears
in dark bold type in the printed text. An OED headword can be a word,
combination, derivative, phrase, prefix, suffix, combining form,
abbreviation, acronym, letter of the alphabet, or other lexical entity.
Headwords of main entries are usually the most common form of a word in
current use, or the most typical of the later forms of an obsolete word.
Headwords are sometimes preceded by symbols indicating their status
in the language (see Status).
Note that it cannot be concluded that a word form is not defined or
its use illustrated in the OED if it does not appear as a headword.
Many other forms are defined and/or illustrated within entries for
their "main" words (see also Bold Sub-Headword and
Italic Sub-Headword).
Headword Group (tag=<hg> region=$hg)
Defines the initial group of elements in an entry
and includes headword, pronunciation,
part of speech, and homonym number.
Note that, with the exception of the headword, not all of these
elements necessarily appear in every entry.
Homonym Number (tag=<hm> region=$hm)
Homonym numbers are used to distinguish between
or among headwords with the same spelling and part of speech, but
which warrant separate entries because of their distinct meanings and
histories. The number appears in the text as a superscript attached
to a part-of-speech designation, or in the case of some nouns, to the
headword itself. The number gives each headword a specific "address"
which can be used in Dictionary cross-references
(see Cross-Reference Headword).
Italic Sub-Headword (tag=<il> region=$il)
One of two types of subordinate headwords
which are included within entries, and so called because they appear
in heavy italics in the printed text. This category consists primarily
of minor combinations (separate, hyphenated or single words that combine
the headword of the entry, usually as their first element, with another
word form, but which do not require definition since their meaning is
obvious), although it may also include phrases and idioms. Groups of
combinations are usually listed alphabetically within one or more senses
and are followed by a pseudo quotation paragraph, containing quotations
illustrating their use in the same order.
(Compare with Bold Sub-Headword.)
Label (tag=<la> region=$la)
In the printed OED, labels are italicized designations,
usually abbreviated, which inform Dictionary readers of the boundaries
within which a word or sense is, or was, used. In current OED
terminology, there are five categories of labels: status (obsolete,
rare, colloquial, etc.); regional (indicating a geographical area
of usage, such as the U.S.); grammatical (describing the syntactical
role of the word or sense, such as plural or collective); semantic
(indicating the interpretation given to a word or sense in a
particular context, such as figurative, transferred, specific, etc.);
and subject (specifying the discipline, profession, trade, etc.
in which a word or sense is used).
It is important to note that subject labels in particular are not
consistently used and their specificity may vary, often because of
historical change. For instance, the label "Natural History" (Nat.
Hist.) is found in a number of older entries. Since this discipline
has been largely superseded and sub-divided, labelling of more
current entries reflects these changes.
(For definitions and explanations of terms used in OED labels,
see D.L. Berg, 1993, and for an example of a search for words used
in a particular subject field, see D.L. Berg, 1989.)
Language
This structure contains language references in
etymologies and sub-etymologies. OED lexicographers identified
over 1,000 different language forms (including abbreviations and regional
variations) used in these contexts. While the structure is of
considerable assistance in extracting languages that played a part
in the origin or history of a word, care must be exercised in using this
facility to identify the language from which a word passed directly
into English (for examples of problems and techniques associated
with such searches, see D.L. Berg, 1989.) Also, some further
identification refinement is necessary since automatic tagging of
forms includes instances where language names appear attributively
as adjectives specifying nationality, e.g., Italian wine-makers.
Note that language forms are usually abbreviated, not always
consistently, and full forms can be found in the "List of Abbreviations" which appears at the front of each Dictionary volume.
This component is not explicitly available in the current version.
Latest Quotation
This term refers to the quotation in an entry
for an obsolete word which exemplifies the last located use of the
form. In other words, the criterion used for the category is the
chronologically most recent date in entries preceded by a "dagger"
status symbol indicating that the headword is an obsolete form
(see Status).
This component is not explicitly available in the current version.
Location (tag=<lc> region=$lc)
Refers to the location within the work that was the source for a
quotation. The location usually appears in roman font following the title
of the work and preceding the actual quotation text. It normally
designates the specific chapter, page, act, scene, etc.
where the cited quotation can be found.
Part of Speech (tag=<ps> region=$ps)
A grammatical category (verb, adjective,
adverb, etc.). In print, in the case of headwords, the part of
speech normally appears in abbreviated form following the
pronunciation. A part-of-speech identification may also be used to
describe a sense or subordinate headword (see Subentry). Where
no part of speech is included, the form may be assumed to be a
noun in most cases. Note that in all instances, the OED employs the
term "substantive" (abbreviated "sb.") instead of "noun", in keeping
with the tradition in early grammars of distinguishing between a
"noun substantive" and a "noun adjective". In general, the term
"sb." is only applied when it is necessary to differentiate a noun
entry from an entry for a word of the same spelling, but with a
different part of speech, or sometimes in instances where there are
several noun homonyms, in which case a homonym number is added.
The more usual convention for noun homonyms is to add the number
to the headword itself.
Pronunciation (tag=<pr> region=$pr)
The second edition of the OED employs the
International Phonetic Alphabet for transcribing pronunciation, in
contrast to the first edition which used a system invented by its
primary editor, James Murray. In print, pronunciation, when given, appears in brackets
immediately following the headword. The Dictionary gives the
pronunciation of most current, "main" headwords, with the exception
of some derivatives and combinations, and some single-syllable words,
where pronunciation is self-evident. Stress-marks, indicating
emphasis, are sometimes included for these exceptions as well as for
obsolete words for which pronunciation is not normally supplied.
Pronunciation is, in most cases, in accordance with standard
southern British speech, although alternative British or non-British usages may sometimes be included. A special parallels
symbol precedes some foreign pronunciation alternatives (see
Status).
Pseudonym
Where the author of a quotation used an assumed
or pen name, he or she is usually cited by the pseudonym which
appears in print in the OED within single quotation marks. The
latter are eliminated in the case of certain well-known pseudonymous
authors such as George Eliot. For authors who have used both their
real names and one or more pseudonyms, the name under which the
particular cited work was published is normally given.
This component is not explicitly available in the current version.
Pseudo Quotation Paragraph (tag=<pqp> region=$pqp)
Identifies paragraphs of
quotations that illustrate a number of word forms, rather than a
single word or sense. These forms are usually Bold Sub-Headwords
or Italic Sub-Headwords included within entries and often listed in alphabetical sequence within a single sense, e.g., "television announcer,
audience, broadcast, commercial, crew, critic, discussion, ...".
The accompanying so-called "pseudo" quotation paragraph usually
organizes citations in the same order. As an aid to readers, an
asterisk often precedes the initial, i.e., chronologically first,
quotation in each grouping.
Quotation (tag=<q> region=$q)
The second edition of the OED contains nearly two
and a half million quotations which perform the important function of
illustrating the use, form, history, and meaning of word forms in a
given sense. Normally quotations pertaining to a particular sense
are organized in a quotation paragraph in chronological order by
date of publication or composition. Citations typically
include the following elements: date;
author; work
(i.e., title); location within the work, such as chapter, page,
act, scene, etc.; and the quotation text. Quotations are drawn
from all forms of written and published works, including books,
manuscripts, journals, newspapers, letters, and diaries, and represent
both literary and popular sources.
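The citation elements listed above correspond to the tags <qd>, <a>, <w>, <lc>, and <qt> described elsewhere in this document. The sketch below pulls them out of a tagged quotation. The `Citation` class, the helper names, and the sample quotation are all invented for illustration, and the layout of the sample (element order, spacing) is an assumption rather than the database's actual form.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Citation:
    date: Optional[str]
    author: Optional[str]
    work: Optional[str]
    location: Optional[str]
    text: Optional[str]

def _first(tag, source):
    """Return the contents of the first <tag>...</tag> in `source`, or None."""
    m = re.search(r"<{0}>(.*?)</{0}>".format(tag), source, re.DOTALL)
    return m.group(1).strip() if m else None

def parse_quotation(q):
    """Split one tagged quotation into its citation elements."""
    return Citation(
        date=_first("qd", q),
        author=_first("a", q),
        work=_first("w", q),
        location=_first("lc", q),
        text=_first("qt", q),
    )

# Invented quotation for illustration only; not actual OED data.
sample = (
    "<q><qd>1900</qd> <a>A. Writer</a> <w>Invented Title</w> "
    "<lc>iv. 12</lc> <qt>An invented line of text.</qt></q>"
)
citation = parse_quotation(sample)
print(citation)
```

Elements that are absent, such as the author in many journal quotations, come back as `None`.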
The policy of the first edition, which dealt with most of the "core"
words in the English language, was to include at least one example
of use per century. This ratio, however, was increased considerably
for entries added in the 1972-86 Supplement and the second edition.
Occasionally, in entries compiled for the first edition, no
examples of contemporary usage could be found and illustrations
were "made up". Such quotations are introduced by the abbreviation
"Mod." (for "modern") and usually appear without a date. (See also
Subsidiary Quotation.)
Quotation Paragraph (tag=<qp> region=$qp)
Definitions of words and senses are
generally followed by a paragraph in smaller print which lists
illustrative quotations in chronological sequence (earliest date
first). Occasionally, when a sense covers both the literal and
figurative use of a word, more than one quotation paragraph is
used. (For an exception to these conventions, see Pseudo Quotation
Paragraph.)
Quotation Text (tag=<qt> region=$qt)
This structure contains the actual phrase or
passage extracted from the text, as compared to the full citations
included in the Quotation region, of all the Dictionary's
illustrative quotations. The texts are printed and spelled as they
appear in the source edition used. Occasionally, a portion of a
quotation text is eliminated and the omission is indicated by two
dots (..), or three (...), if the elision includes a period. Sometimes
an explanatory word may be inserted in square brackets, and the insert
may be preceded by the abbreviation "sc." for "scilicet", meaning
"understand" or "supply". In instances where the text quoted is a
song title, advertisement or other unusual source, this information
is usually given in brackets.
Relative Cross-Reference (tag=<rx> region=$rx)
The OED contains a number of
cross-references which use the terms "prec." (preceding) or "next"
to indicate to Dictionary users that they should refer to the
preceding or next entry, or, in some cases, to the preceding or
next sense in the same entry. A frequent use of "prec.", for
example, is found in etymologies of entries for derivatives
or combinations which combine the headword of the previous or
"preceding" entry with a suffix, combining form, or another word.
The Dictionary distinguishes this particular type of reference by
tagging all the occurrences of "prec." and "next" within
cross-references as "relative cross references."
Sense Level 0 (tag=<s0> region=$(s0))
The various senses and sub-senses in the OED are organized
in a hierarchical scheme utilizing numbers and letters to distinguish steps
in a headword's development. Sense development is usually chronological,
starting with the earliest sense, except for some entries which follow
"logical order". The simplest form of identifying senses is linear (1, 2, 3, ...),
but often further subdivisions are required which are ordered a, b, c, ... (with
the letters in bold type). Further subdivisions are made by italicized
series (a), (b), (c), ... or (i), (ii), (iii), ... , or, occasionally, small Greek
letters (alpha, beta, ...).
When a word's development is not straightforwardly linear (for example, when
groups of senses developed simultaneously or diversely), a second level of
numbering and lettering employing upper case roman numerals (I, II, III, ...)
identifies branches. Sometimes two parts of speech, such as noun and
adjective, are included in one entry, and each "fork" is then identified
by the highest level of the scheme, upper case letters (A, B, C, ...). The
two upper levels may be integrated in one entry, and are also occasionally
used for other purposes, such as organizing groups of senses syntactically
or semantically.
Sense levels 1, 2, 4, 6, and 7 identify groups and senses numbered
according to this scheme. Level 1 refers to A, B, ... groups; Level 2 to
the I, II, ... groupings; Level 4 to structures numbered 1, 2, ...; Level 6
to the a, b, ... sub-senses; and Level 7 to the italicized bracketed
sub-division of sub-senses - (a), (i), or Greek letters. The remaining
numbers are used as follows: Level 0 (zero) identifies unnumbered sense
sections, such as initial over-arching text preceding a regular sense
numbering, or unnumbered final paragraphs beginning with the word "hence" that
usually contain one or more derivatives. Levels 3 and 5 contain increasing
numbers of asterisks (*, **, ***, ...) that provide another means of grouping
senses by semantic or syntactical headings in lengthy entries.
The tagging structure consistently places the closing tag before the final quotation
paragraph that logically belongs to that sense (or its last sub-sense). Therefore
when searching for a sense at level i, it is usually preferable to use the region name
$(Sensei) (created for convenience) rather than $(si).
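For quick reference, the numbering scheme above can be written down as a lookup table. This Python sketch only restates the text; the names `SENSE_LEVELS` and `region_names` are hypothetical. Level 7 is left out of `region_names` because its regions are named differently and the text does not give a convenience name for it.

```python
# Marker style used at each sense level, as described in the text.
SENSE_LEVELS = {
    0: "unnumbered sections (introductory text, final 'hence' paragraphs)",
    1: "A, B, C, ... (e.g. parts of speech within one entry)",
    2: "I, II, III, ... (branches)",
    3: "*, **, ***, ... (grouping headings)",
    4: "1, 2, 3, ... (numbered senses)",
    5: "*, **, ***, ... (grouping headings)",
    6: "a, b, c, ... (sub-senses)",
    7: "(a), (i), or Greek letters (sub-divisions of sub-senses)",
}

def region_names(level):
    """Return (tag-delimited region, convenience region) for levels 0-6."""
    if level not in range(0, 7):
        raise ValueError("only levels 0-6 are covered by this sketch")
    return "$(s{})".format(level), "$(Sense{})".format(level)

tag_region, preferred = region_names(4)
print(SENSE_LEVELS[4])
print(tag_region, preferred)
```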
Sense Level 1 (tag=<s1> region=$(s1))
Identifies groups of senses lettered A, B, C, ... and is
primarily used to separate two (or more) parts of speech (e.g., noun and
adjective) when they are included in a single entry.
For a further explanation of sense structure and groupings, see
Sense Level 0.
Sense Level 2 (tag=<s2> region=$(s2))
Used to identify groups of senses numbered I, II, ... ,
representing branches of meanings which developed simultaneously or
diversely.
For a further explanation of sense structure and groupings, see
Sense Level 0.
Sense Level 3 (tag=<s3> region=$(s3))
A structure which takes the form in the printed text
of an increasing number of asterisks (*, **, ***, ...), and is sometimes
used in complex and lengthy entries to group senses under semantic or
syntactical headings. Sense
Level 5 is also sometimes used for the same purpose.
For a further explanation of sense structure and groupings, see
Sense Level 0.
Sense Level 4 (tag=<s4> region=$(s4))
The most common type of sense development
structure in which senses are numbered consecutively 1, 2, 3, ...
For a further explanation of sense structure and groupings, see
Sense Level 0.
Sense Level 5 (tag=<s5> region=$(s5))
A structure which takes the form in the printed
text of an increasing number of asterisks (*, **, ***, ...), and
is sometimes used in complex and lengthy entries to group senses under
semantic or syntactical headings. Sense
Level 3 is also sometimes used for the same purpose.
For a further explanation of sense structure and groupings, see
Sense Level 0.
Sense Level 6 (tag=<s6> region=$(s6))
Identifies the lower-case bold letter structure
(a, b, c, ...) used to subdivide senses. For a further explanation of
sense structure and groupings, see Sense Level 0.
Sense Level 7 (tag=<s7a>,<s7n> regions=$(s7a),$(s7n))
Identifies the structure using italicized and bracketed
letters (a), (b), ..., or numbers (i), (ii), (iii), ..., or, rarely, lower case
Greek letters (alpha, beta, ...) attached to sub-divisions of sub-senses,
and usually found in lengthy and complex entries.
For a further explanation of sense structure and groupings, see Sense Level 0.
Sense Number (attribute sn="val")
A sense is a numbered and/or lettered entry
component which includes as its major elements a definition and
supporting quotation paragraph. The number or letter enclosed
by sense number tags not only serves to organize the senses, it
also provides a unique address for each sense, an important feature
for cross-referencing. Sense identification is especially important
in the OED since some entries contain 100 or more senses; for
example, the verb "run" has 82 main senses and over 350 sub-senses.
(For an explanation of how senses are structured, see Sense Level 0,
and also compare with Cross-Reference Sense Number.)
Status (attribute st="val")
A status attribute specifies several types of symbols that
usually precede a headword or sense and indicate the form's
status in the language. These include the dagger symbol which identifies
an obsolete entry or sense (also usually further identified by a label
"Obs." following the headword); parallels signifying non-naturalized
words or pronunciations; and the so-called "catachrestic" symbol (a reversed
paragraph symbol) identifying a confused or erroneous sense. Within the
status attribute, these symbols are represented by the values "obs"
(for "obsolete"), "ali" (for "alien"), and "err" (for "erroneous").
In addition, status tagging identifies two types of entries:
1. the numerous cross-reference entries, the headwords of which represent
obsolete or variant spellings of main words, and which refer the user
from these forms to the relevant "main" entry. These are identified
by the abbreviation "xref".
2. a small number (387) of "spurious" entries which are entirely enclosed
in square brackets. All of these entries were compiled for the first
edition and consist of words that are erroneous, false, or could not
be authenticated. Their purpose was primarily to correct errors found
in earlier dictionaries resulting from copyists' or translators' errors,
misprints, or misreadings of the text. These are identified by the
abbreviation "spu".
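The five status values named above can be collected in a small lookup. In this Python sketch the attribute form st="val" comes from the heading of this section, but which tags carry the attribute, and the sample tag used below, are assumptions; `STATUS_VALUES` and `status_of` are hypothetical names.

```python
import re

# Status values and their meanings, as described in the text.
STATUS_VALUES = {
    "obs": "obsolete (dagger symbol)",
    "ali": "alien, i.e. non-naturalized (parallels symbol)",
    "err": "erroneous or confused sense (catachrestic symbol)",
    "xref": "cross-reference entry",
    "spu": "spurious entry (enclosed in square brackets)",
}

def status_of(open_tag):
    """Return the meaning of the st="..." attribute in a tag, or None."""
    m = re.search(r'\bst="([^"]*)"', open_tag)
    if m is None:
        return None
    return STATUS_VALUES.get(m.group(1))

# Hypothetical opening tags; the real placement of the attribute may differ.
print(status_of('<e st="obs">'))
print(status_of('<e st="xref">'))
print(status_of('<e>'))
```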
Stressed Form
The full form of main headwords, bold sub-headwords,
and italic sub-headwords. "Full form" means that each form
incorporates diacritics, diphthongs, punctuation, stress marks, etc. as
they appear in the printed Dictionary. In the database, each of these
typographical elements is tagged, although not all headwords contain such
elements, e.g., monosyllabic words and most combinations and derivatives,
for which stress is self-evident.
In the current version, all lemmas are shown in their stressed forms only,
but the search engine ignores stress marks and diacritics, and it expands diphthongs,
such as "&ae;", to the corresponding letter pair, i.e., "ae".
However, there are some combinations included within entries that are less easily
located because the OED sometimes lists minor forms in a style similar
to the following example from the entry for "orange": "orange-bloom,
-grove, -juice, kernel, leaf, -pip..". A computer program inserted the
first element in front of (or, in some cases, following) hyphens. Thus,
"orange-grove" and "orange-juice" will be located, but further refinement
of the program is needed in order to find unhyphenated minor combinations
such as "orange kernel" and "orange leaf". These combinations can often
be located by searching quotation texts.
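The expansion of abbreviated combination lists can be sketched as follows. This is not the program used on the Dictionary, only an approximation of the behaviour described: the first element is inserted in front of leading hyphens, and unhyphenated items are left untouched, which reproduces the limitation noted above. Trailing-hyphen forms are not handled, and the function name `expand_combinations` is invented.

```python
def expand_combinations(listing):
    """Expand a list such as 'orange-bloom, -grove, kernel' into full forms.

    Only items that start with a hyphen receive the first element.
    """
    # Drop the trailing ".." that marks a truncated list, then split on commas.
    items = [part.strip() for part in listing.rstrip(". ").split(",")]
    items = [item for item in items if item]
    first = items[0]
    head = first.split("-", 1)[0]  # first element, e.g. "orange"
    expanded = [first]
    for item in items[1:]:
        if item.startswith("-"):
            expanded.append(head + item)
        else:
            # Unhyphenated minor combination: left as is.
            expanded.append(item)
    return expanded

forms = expand_combinations("orange-bloom, -grove, -juice, kernel, leaf, -pip..")
print(forms)
```

Here "orange-grove" and "orange-juice" become searchable, while "kernel" and "leaf" remain bare.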
Subentry (tag=<sub> region=$sub)
This structure consists mainly of Bold Sub-Headwords
(i.e., defined and illustrated combinations and derivatives included
within the entry for their main word) together with their definition
text. Corresponding quotations can often be found in
pseudo quotation paragraphs.
(See also Headword and Italic Sub-Headword).
Sub-Etymology (tag=<et> region=$et)
An etymology attached to a particular sense of
a headword. These subordinate etymologies appear in square brackets
in the printed text, and normally contain historical information
relating to the sense of a word which does not lend itself to inclusion
in the etymology at the head of the entry.
Subsidiary Quotation (region=$sq)
This structure contains quotations in square
brackets which are occasionally found in quotation paragraphs, usually
as the first citation(s). The convention is used when a quotation does
not actually employ the word in context, but is in some way relevant to
its history. For example, in the case of a word borrowed from another
language, the quotation may document its use in the language of origin.
Superscript (tag=<su> region=$su)
Typographical tagging in this category is attached
to most text in the Dictionary which appears in superscript, with the
exception of homonym numbers. Superscript text includes miscellaneous
typographical conventions used in printing Murray pronunciations (see
Pronunciation), mathematical functions, etc. In addition, it contains
two special superior numbers preceded by a dash (-0 and -1) that sometimes
further define the label "rare". In the first instance, the -0 indicates
the word was found only in an earlier dictionary rather than a contextual
quotation, while -1 means that only one quotation from a text other than
a dictionary was found.
Variant Date (tag=<vd> region=$vd)
Earlier forms of spelling, irregular inflexions,
etc. included in variant forms lists, are assigned century ranges
indicating when their usage was prevalent. Centuries appear in abbreviated
form, for example, "5-6" indicates fifteenth to sixteenth century.
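The abbreviated century notation can be expanded mechanically. The only example given in the text is "5-6"; treating every single digit n as century 10 + n, and taking two-digit values literally, is an extrapolation made for this sketch and should be checked against the Dictionary's own conventions. The name `expand_century_range` is hypothetical.

```python
def expand_century_range(code):
    """Turn a code such as '5-6' into a (first, last) pair of centuries.

    Assumption: a single digit n stands for century 10 + n.
    """
    def century(token):
        n = int(token)
        return n + 10 if n < 10 else n

    first, _, last = code.strip().partition("-")
    start = century(first)
    end = century(last) if last else start
    return start, end

print(expand_century_range("5-6"))
print(expand_century_range("7"))
```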
Variant Form (tag=<vf> region=$vf)
The OED attempts to include all documented
earlier spellings, irregular inflexions, unusual plurals, etc. of
headwords, where appropriate or known. These are contained in a
Variant Forms List preceding the etymology. Regional labels are
sometimes included to indicate the geographic area in which the
particular form prevailed (or prevails). Many of these forms also
appear as headwords in cross-reference entries (see Status).
Variant Forms List (tag=<vfl> region=$vfl)
Lists of documented historical, or sometimes
contemporary, variants of a headword's spellings, irregular inflexions,
and unusual plurals that normally appear in the printed text immediately
before the etymology. Forms are further identified by the century range
in which they prevailed. Lists are arranged in chronological order with
the earliest variant(s) first. In some cases, two or more branches of
forms may have developed simultaneously and these are grouped by
lower case italic Greek letters. Illustrative quotations that follow
are often referenced by the same Greek letters. (See also Variant
Date for conventions used for century ranges.)
Work (tag=<w> region=$w)
Refers to the title of the work which was the source for a
quotation. The title usually appears in italics following the author's
name and preceding the specific location within the work and the actual quotation text.
Titles are frequently abbreviated and the definite article "the", the conjunction "and", and the
preposition "of" are routinely omitted. Abbreviations used for a
single work can vary; for example, Shakespeare's "Comedy of Errors"
appears as "Com. Err.", "C. Err." and "Err." (for an example of a
search by title, see D.L. Berg, 1989). Some works, such as anonymous
early texts like "Beowulf" and "Cursor Mundi" are cited by title only.
The Bible is a special case; for example, books of the Bible are
sometimes tagged <w> with "Bible" as author, especially for the 1611
King James version (for a discussion of the numerous variations in
citing translations of the Bible, see D.L. Berg, 1993).
Identification of early and obscure works is frequently difficult and
can be aided by reference to the Bibliography which appears at the
end of Volume 20 of the printed text, and which includes most, but
not all, of the titles cited. A notable exception, since this is a
bibliography of English works, are the many foreign dictionaries and
other word books often referred to in etymologies. (For a discussion
of problems associated with matching citations in the Dictionary text
to the Bibliography, see G.V.J. Townsend, "Citation Matching in the
Oxford English Dictionary". UW Centre for the New OED, 1989.)