OED Tagging

Tags: $a $bl $cb $cf $db $e $eq $et $etym $hg $hm $hw $il $la $lc $pqp $pr $ps $q $qd $qp $qt $rx $s0 $s1 $s2 $s3 $s4 $s5 $s6 $s7a $s7n $sq $sub $vd $vf $vfl $w $x $xd $xi $xr

Tagging Structural Elements in the OED

Donna Lee Berg,
Centre for the New Oxford English Dictionary and Text Research, University of Waterloo

(Edited for HTML access and updated to revised tagging by Frank Tompa)

OED regions are structures that allow you to restrict your search to specific dictionary components, such as etymologies, definitions, or quotations. In the database, each such component is defined by descriptive tags which delimit its text. For example, etymologies are preceded by a "begin" tag "<etym>" and followed by a matching "end" tag "</etym>", and the resulting units comprise the region set named "$etym". A highly simplified outline indicates the prototypical organization of an entry.
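As a rough illustration of how paired tags delimit a region set, the sketch below collects every unit enclosed by a given tag from a piece of tagged text. The function name region_set and the sample fragment are hypothetical, the text is not real OED data, and the sketch assumes that units of the same tag are never nested; it is not how the OED search engine itself is implemented.

```python
import re


def region_set(tagged_text: str, tag: str) -> list[str]:
    """Return every <tag>...</tag> unit found in tagged_text, tags included.

    Assumption: units of the same tag do not nest. An opening tag may
    carry attributes, e.g. <e st="obs">.
    """
    name = re.escape(tag)
    pattern = re.compile(rf"<{name}(?:\s[^>]*)?>.*?</{name}>", re.DOTALL)
    return pattern.findall(tagged_text)


# Hypothetical, heavily simplified entry fragment (placeholder text only).
sample = (
    "<e><hw>example</hw>"
    "<etym>first etymology text</etym>"
    "<etym>second etymology text</etym></e>"
)

etym_regions = region_set(sample, "etym")
for unit in etym_regions:
    print(unit)
```

Each string returned corresponds to one member of the region set, so the two units printed here play the role of two members of "$etym".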

Remember that the OED search engine considers all letters and symbols, including tags and spaces, as characters, and groups them into units that may be words, tags, series of punctuation marks, etc. These factors, combined with the fact that the search engine interprets the left angle bracket of a tag as a character, have implications if you wish to locate exactly what you type, and nothing else, within a region. For instance, you may wish to restrict an "Author" search to "Blake" by using "<a>Blake</a>", avoiding matches to "F.R. Blake" and "O. Blakeston".

For additional information, see:

  • Berg, D.L., The Research Potential of the Electronic OED2 Database at the University of Waterloo: a Guide for Scholars. UW Centre for the New OED, 1989. (Gives examples of searches.)
  • Berg, D.L., A User's Guide to the Oxford English Dictionary. Oxford University Press, 1993. (Includes a guide to reading entries and a "Companion" section defining technical terms.)

Author (tag=<a> region=$a)

An author's name normally appears in the printed text in large and small roman capitals as the second element in a quotation following the date and preceding the title of the work. Most authors are cited by initials followed by surname, but surname only is used for well-known authors such as Chaucer and Milton (note that Shakespeare is usually abbreviated as "Shaks."). A single author's name may be cited in several forms. Sir Walter Scott, for instance, appears most frequently as "Scott", but also as "W. Scott" and "Sir W. Scott". In addition, the OED cites several W. Scotts. Dates and titles are useful in such instances; for example, a W. Scott with a publication date of 1635 is obviously not the Victorian author. A number of other OED author conventions may affect searches, including: "Bible" (notably the 1611 King James version) sometimes appears as author, journal quotations frequently do not give authors' names, and translators are deemed authors of the English words with the original author's name included as part of the work, e.g., "Marx's Capital".

Note the effect of name variations and tags (<a>...</a>) on searches. Querying "W. Scott" within the author structure will locate not only that form, but "Sir W. Scott" and an "E.W. Scott". Further, since the search engine considers a hyphen as a token separator, it matches "W. Scott-Taggart". Hitting the space bar after "Scott" (the usual way of specifying a word ending) results in no matches because authors' surnames are followed by the left angle bracket of the "end" tag which the search engine sees as a character; thus one could specify "W. Scott</a>". Similarly, to exclude matches such as "E.W. Scott", the "begin" tag <a> would also be needed.
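The effect of including or omitting the tags can be imitated with plain substring matching. The sketch below is only an approximation under that assumption (the real search engine works on tokens, not substrings), and the helper name matches is hypothetical; the author forms are the ones mentioned in this section.

```python
# Author units as they might appear in the database, tags included.
authors = [
    "<a>Scott</a>",
    "<a>W. Scott</a>",
    "<a>Sir W. Scott</a>",
    "<a>E.W. Scott</a>",
    "<a>W. Scott-Taggart</a>",
]


def matches(query: str, units: list[str]) -> list[str]:
    """Return the units that contain query as a substring (approximation)."""
    return [unit for unit in units if query in unit]


loose = matches("W. Scott", authors)            # four forms match
with_space = matches("W. Scott ", authors)      # a trailing space matches nothing
end_tagged = matches("W. Scott</a>", authors)   # drops "W. Scott-Taggart"
exact = matches("<a>W. Scott</a>", authors)     # only the exact form remains

for name, result in [("loose", loose), ("with_space", with_space),
                     ("end_tagged", end_tagged), ("exact", exact)]:
    print(name, result)
```

The trailing-space query fails because the character after each surname is either the left angle bracket of the "end" tag or a hyphen, never a space.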

Bold Sub-Headword (tag=<bl> region=$bl)

One of two types of subordinate headword included within entries; so called because they appear in bold type in the printed text. These word forms are commonly either derivatives (typically formed by adding a suffix to the headword), or combinations (separate, hyphenated or single words that combine the headword, usually as the first element, with another existing word). Note, however, that many derivatives and combinations have entries of their own because they have developed meanings and histories distinct from their main word. Bold Sub-Headwords are usually defined and illustrated by quotations, sometimes grouped within pseudo quotation paragraphs. (See Subentry and compare with Italic Sub-Headword.)

Cited Form (tag=<cf> region=$cf)

A word form in a foreign language, or in an earlier or regional form of English, that is referred to in an etymology, usually in the context of its role in explaining the history or origin of a headword. These forms appear in italics in the printed text preceded by the language of origin (often abbreviated as L., Gr., OF, etc.). Words or phrases in this category that occur in the context of elements other than etymologies or sub-etymologies are somewhat problematic, since they represent automatic tagging of italicized forms which do not necessarily conform to this definition. A few such anomalies will be found in the etymological texts as well. (See also Language.)

Combinations Block (tag=<cb> region=$cb)

This structure consists of one or more Subentries containing combinational forms usually together with their definition text. (See also Derivations Block.)

Cross-Reference (tag=<xr> region=$xr)

Cross-references are widely used in the OED to refer readers to other entries or to another part of the same entry. This structure includes four categories of cross-reference elements: Cross-Reference Headword, Cross-Reference Italic Sub-Headword, Cross-Reference Date, and Relative Cross-Reference.

Cross-Reference Date (tag=<xd> region=$xd)

Appears within cross-references and is found primarily in definition texts where users are referred to a quotation by date in the supporting quotation paragraph which follows. Such quotations usually supply supplementary information about the meaning of the word, or sometimes provide the entire explanation of meaning.

Cross-Reference Headword (tag=<x> region=$x)

Frequent references are found in OED entries to the headwords of other entries, especially in etymologies. In the printed text, the cross-referenced headword is printed in small roman capitals, followed by a homonym number, if relevant, and sometimes by the specific location within the target entry (see Sense Number).

Cross-Reference Italic Sub-Headword (tag=<xil> region=$xi)

These forms are primarily italicized combinations cited in entries which are found in another entry. They are frequently followed by a Cross-Reference Headword, indicating the main entry in which they will be found. The abbreviation "s.v." sometimes precedes the headword reference, meaning the combination is found "sub voce", or "under the word".

Date (tag=<qd> region=$qd)

Normally the first element in an illustrative quotation. The date given is usually the year in which the cited work was first published, although there are some discrepancies, especially in the dating of texts prepared for the first edition of the OED. Where precise dates could not be established, the date may be qualified by "C." ("c" in the printed text meaning "circa" or "about") or "A." ("a" in the printed text meaning "ante" or "before"), or by replacing the last one or two digits with dots, e.g., 17.. Date of composition is usual for letters, journals, and diaries, while lectures and speeches are assigned the date of their first appearance in print. Although most quotations specify some form of date, there are a few exceptions, the most notable being the many quotations from the Old English epic poem "Beowulf".
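The date conventions described above can be turned into a year range for comparison or sorting. The sketch below is illustrative only: the exact way dates are stored in the database is an assumption (an optional "C." or "A." qualifier, with or without the period and a space, followed by up to four characters of digits and trailing dots), and the names QuotationDate and parse_date are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class QuotationDate(NamedTuple):
    qualifier: Optional[str]  # "circa", "ante", or None
    first_year: int           # earliest year the date can denote
    last_year: int            # latest year the date can denote


_QUALIFIERS = {"c": "circa", "a": "ante"}
# Assumed shape: optional C./A. qualifier, digits, then zero to two dots.
_DATE = re.compile(
    r"(?:(?P<q>[CcAa])\.?\s*)?(?P<digits>\d{1,4})(?P<dots>\.{0,2})"
)


def parse_date(text: str) -> Optional[QuotationDate]:
    """Parse a date such as 'C. 1386', 'a1400', or '17..'; None if unparseable."""
    m = _DATE.fullmatch(text.strip())
    if m is None:
        return None
    digits, dots = m.group("digits"), m.group("dots")
    if len(digits) + len(dots) > 4:
        return None  # e.g. "1635." is not a dotted year
    span = 10 ** len(dots)          # each dot hides one digit
    first = int(digits) * span
    q = m.group("q")
    qualifier = _QUALIFIERS[q.lower()] if q else None
    return QuotationDate(qualifier, first, first + span - 1)


print(parse_date("C. 1386"))
print(parse_date("17.."))
```

A quotation with no date at all, such as those from "Beowulf", simply yields None under this scheme.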

Definition

Generally, a statement explaining the meaning of a headword, sense, or sub-sense, although definitions in the OED take several other forms, including cross-references to another sense within the same entry or within another entry. In addition, a definition may simply describe the way the word functions in some grammatical or syntactical context. Definitions should always be read in conjunction with supporting quotations, since in a historical dictionary, the latter play an important role in establishing meaning and context. In fact, in some cases, the actual explanation of the meaning of a sense is contained in a quotation (see Cross-Reference Date).

This component is not explicitly available in the current version.

Derivations Block (tag=<db> region=$db)

This structure consists of one or more Subentries containing derivational forms usually together with their definition text. (See also Combinations Block.)

Earliest Quotation (region=$eq)

The first non-subsidiary quotation, and thus usually the one having a date which is chronologically earliest, in an OED entry. While this facility can be useful, many words have multiple senses and sub-senses, either in current use or in their historical development. The first quotation in the entry must therefore be viewed in the context of the sense which it supports.

Entry (tag=<e> region=$e)

Entries are the major structural components of most modern dictionaries. In the printed OED, entries are arranged alphabetically by their headwords (the "subject" of the entry) which appear in dark bold type. There are two types of entries in the OED: main entries ($e) and cross-reference entries ($ve). Main entries contain comprehensive information about the history and meaning of "main form" headwords. The primary function of cross-reference entries is to direct the user from an obsolete or variant spelling of a word to its relevant main entry (see also Status). Specifying Entry ($e) as the region in which you wish to search for a word or phrase, or as a match point for "combining and comparing" two or more sets, therefore means that the search engine searches the entire Dictionary and identifies the entries in which your results are located.

Etymology (tag=<etym> region=$etym)

Etymologies trace the origin or derivation of headwords and are enclosed in square brackets in the printed text, normally following a variant form list, if included. Since the OED was conceived as a history of the English language, the original policy was to trace non-native words to the foreign word or word element from which they were immediately adopted or formed, and native words to their earliest English form. In practice, however, OED etymologies sometimes exceed these guidelines.

Some etymologies include as their final element a paragraph in small print, tagged as "<note>" in the database. These are referred to as "etymological notes" by OED editors and include supplementary comments or information of an unsubstantiated nature such as "folk" or popular theories. ("<note>" tags are also used to identify various editorial comments in small print in other entry elements.) Etymologies are sometimes attached to individual senses or sub-senses (see Sub-Etymology).

Headword (tag=<hw> region=$hw)

The subject of a Dictionary entry which appears in dark bold type in the printed text. An OED headword can be a word, combination, derivative, phrase, prefix, suffix, combining form, abbreviation, acronym, letter of the alphabet, or other lexical entity. Headwords of main entries are usually the most common form of a word in current use, or the most typical of the later forms of an obsolete word. Headwords are sometimes preceded by symbols indicating their status in the language (see Status).

Note that it cannot be concluded that a word form is not defined or its use illustrated in the OED if it does not appear as a headword. Many other forms are defined and/or illustrated within entries for their "main" words (see also Bold Sub-Headword and Italic Sub-Headword).

Headword Group (tag=<hg> region=$hg)

Defines the initial group of elements in an entry and includes headword, pronunciation, part of speech, and homonym number. Note that, with the exception of the headword, not all of these elements necessarily appear in every entry.

Homonym Number (tag=<hm> region=$hm)

Homonym numbers are used to distinguish between or among headwords with the same spelling and part of speech, but which warrant separate entries because of their distinct meanings and histories. The number appears in the text as a superscript attached to a part-of-speech designation, or in the case of some nouns, to the headword itself. The number gives each headword a specific "address" which can be used in Dictionary cross-references (see Cross-Reference Headword).

Italic Sub-Headword (tag=<il> region=$il)

One of two types of subordinate headwords which are included within entries, and so called because they appear in heavy italics in the printed text. This category consists primarily of minor combinations (separate, hyphenated or single words that combine the headword of the entry, usually as their first element, with another word form, but which do not require definition since their meaning is obvious), although it may also include phrases and idioms. Groups of combinations are usually listed alphabetically within one or more senses and are followed by a pseudo quotation paragraph, containing quotations illustrating their use in the same order. (Compare with Bold Sub-Headword.)

Label (tag=<la> region=$la)

In the printed OED, labels are italicized designations, usually abbreviated, which inform Dictionary readers of the boundaries within which a word or sense is, or was, used. In current OED terminology, there are five categories of labels: status (obsolete, rare, colloquial, etc.); regional (indicating a geographical area of usage, such as the U.S.); grammatical (describing the syntactical role of the word or sense, such as plural or collective); semantic (indicating the interpretation given to a word or sense in a particular context, such as figurative, transferred, specific, etc.); and subject (specifying the discipline, profession, trade, etc. in which a word or sense is used).

It is important to note that subject labels in particular are not consistently used and their specificity may vary, often because of historical change. For instance, the label "Natural History" (Nat. Hist.) is found in a number of older entries. Since this discipline has been largely superseded and sub-divided, labelling of more current entries reflects these changes.

(For definitions and explanations of terms used in OED labels, see D.L. Berg, 1993, and for an example of a search for words used in a particular subject field, see D.L. Berg, 1989.)

Language

This structure contains language references in etymologies and sub-etymologies. OED lexicographers identified over 1,000 different language forms (including abbreviations and regional variations) used in these contexts. While the structure is of considerable assistance in extracting languages that played a part in the origin or history of a word, care must be exercised in using this facility to identify the language from which a word passed directly into English (for examples of problems and techniques associated with such searches, see D.L. Berg, 1989). Also, some further refinement is necessary since automatic tagging of forms includes instances where language names appear attributively as adjectives specifying nationality, e.g., Italian wine-makers.

Note that language forms are usually abbreviated, not always consistently, and full forms can be found in the "List of Abbreviations" which appears at the front of each Dictionary volume. This component is not explicitly available in the current version.

Latest Quotation

This term refers to the quotation in an entry for an obsolete word which exemplifies the last located use of the form. In other words, the criterion used for the category is the chronologically most recent date in entries preceded by a "dagger" status symbol indicating that the headword is an obsolete form (see Status).

This component is not explicitly available in the current version.

Location (tag=<lc> region=$lc)

Refers to the location within the work that was the source for a quotation. The location usually appears in roman font following the title of the work and preceding the actual quotation text. It normally designates the specific chapter, page, act, scene, etc. where the cited quotation can be found.

Part of Speech (tag=<ps> region=$ps)

A grammatical category (verb, adjective, adverb, etc.). In print, in the case of headwords, the part of speech normally appears in abbreviated form following the pronunciation. A part-of-speech identification may also be used to describe a sense or subordinate headword (see Subentry). Where no part of speech is included, the form may be assumed to be a noun in most cases. Note that in all instances, the OED employs the term "substantive" (abbreviated "sb.") instead of "noun", in keeping with the tradition in early grammars of distinguishing between a "noun substantive" and a "noun adjective". In general, the term "sb." is only applied when it is necessary to differentiate a noun entry from an entry for a word of the same spelling, but with a different part of speech, or sometimes in instances where there are several noun homonyms, in which case a homonym number is added. The more usual convention for noun homonyms is to add the number to the headword itself.

Pronunciation (tag=<pr> region=$pr)

The second edition of the OED employs the International Phonetic Alphabet for transcribing pronunciation, in contrast to the first edition which used a system invented by its primary editor, James Murray. In print, pronunciation, when given, appears in brackets immediately following the headword. The Dictionary gives the pronunciation of most current, "main" headwords, with the exception of some derivatives and combinations, and some single-syllable words, where pronunciation is self-evident. Stress-marks, indicating emphasis, are sometimes included for these exceptions as well as for obsolete words for which pronunciation is not normally supplied.

Pronunciation is, in most cases, in accordance with standard southern British speech, although alternative British or non-British usages may sometimes be included. A special parallels symbol precedes some foreign pronunciation alternatives (see Status).

Pseudonym

Where the author of a quotation used an assumed or pen name, he or she is usually cited by the pseudonym which appears in print in the OED within single quotation marks. The latter are eliminated in the case of certain well-known pseudonymous authors such as George Eliot. For authors who have used both their real names and one or more pseudonyms, the name under which the particular cited work was published is normally given.

This component is not explicitly available in the current version.

Pseudo Quotation Paragraph (tag=<pqp> region=$pqp)

Identifies paragraphs of quotations that illustrate a number of word forms, rather than a single word or sense. These forms are usually Bold Sub-Headwords or Italic Sub-Headwords included within entries and often listed in alphabetical sequence within a single sense, e.g., "television announcer, audience, broadcast, commercial, crew, critic, discussion, ...". The accompanying so-called "pseudo" quotation paragraph usually organizes citations in the same order. As an aid to readers, an asterisk often precedes the initial, i.e., chronologically first, quotation in each grouping.

Quotation (tag=<q> region=$q)

The second edition of the OED contains nearly two and a half million quotations which perform the important function of illustrating the use, form, history, and meaning of word forms in a given sense. Normally quotations pertaining to a particular sense are organized in a quotation paragraph in chronological order by date of publication or composition. Citations typically include the following elements: date; author; work (i.e., title); location within the work, such as chapter, page, act, scene, etc.; and the quotation text. Quotations are drawn from all forms of written and published works, including books, manuscripts, journals, newspapers, letters, and diaries, and represent both literary and popular sources.

The policy of the first edition, which dealt with most of the "core" words in the English language, was to include at least one example of use per century. This ratio, however, was increased considerably for entries added in the 1972-86 Supplement and the second edition.

Occasionally, in entries compiled for the first edition, no examples of contemporary usage could be found and illustrations were "made up". Such quotations are introduced by the abbreviation "Mod." (for "modern") and usually appear without a date. (See also Subsidiary Quotation.)

Quotation Paragraph (tag=<qp> region=$qp)

Definitions of words and senses are generally followed by a paragraph in smaller print which lists illustrative quotations in chronological sequence (earliest date first). Occasionally, when a sense covers both the literal and figurative use of a word, more than one quotation paragraph is used. (For an exception to these conventions, see Pseudo Quotation Paragraph.)

Quotation Text (tag=<qt> region=$qt)

This structure contains the actual phrase or passage extracted from the text, as compared to the full citations included in the Quotation region, of all the Dictionary's illustrative quotations. The texts are printed and spelled as they appear in the source edition used. Occasionally, a portion of a quotation text is eliminated and the omission is indicated by two dots (..), or three (...), if the elision includes a period. Sometimes an explanatory word may be inserted in square brackets, and the insert may be preceded by the abbreviation "sc." for "scilicet", meaning "understand" or "supply". In instances where the text quoted is a song title, advertisement or other unusual source, this information is usually given in brackets.

Relative Cross-Reference (tag=<rx> region=$rx)

The OED contains a number of cross-references which use the terms "prec." (preceding) or "next" to indicate to Dictionary users that they should refer to the preceding or next entry, or, in some cases, to the preceding or next sense in the same entry. A frequent use of "prec.", for example, is found in etymologies of entries for derivatives or combinations which combine the headword of the previous or "preceding" entry with a suffix, combining form, or another word. The Dictionary distinguishes this particular type of reference by tagging all the occurrences of "prec." and "next" within cross-references as "relative cross references."

Sense Level 0 (tag=<s0> region=$(s0))

The various senses and sub-senses in the OED are organized in a hierarchical scheme utilizing numbers and letters to distinguish steps in a headword's development. Sense development is usually chronological, starting with the earliest sense, except for some entries which follow "logical order". The simplest form of identifying senses is linear (1, 2, 3, ...), but often further subdivisions are required which are ordered a, b, c, ... (with the letters in bold type). Further subdivisions are made by italicized series (a), (b), (c), ... or (i), (ii), (iii), ... , or, occasionally, small Greek letters (alpha, beta, ...).

When a word's development is not straightforwardly linear (for example, when groups of senses developed simultaneously or diversely), a second level of numbering and lettering employing upper case roman numerals (I, II, III, ...) identifies branches. Sometimes two parts of speech, such as noun and adjective, are included in one entry, and each "fork" is then identified by the highest level of the scheme, upper case letters (A, B, C, ...). The two upper levels may be integrated in one entry, and are also occasionally used for other purposes, such as organizing groups of senses syntactically or semantically.

Sense levels 1, 2, 4, 6, and 7 identify groups and senses numbered according to this scheme. Level 1 refers to A, B, ... groups; Level 2 to the I, II, ... groupings; Level 4 to structures numbered 1, 2, ...; Level 6 to the a, b, ... sub-senses; and Level 7 to the italicized bracketed sub-division of sub-senses - (a), (i), or Greek letters. The remaining numbers are used as follows: Level 0 (zero) identifies unnumbered sense sections, such as initial over-arching text preceding a regular sense numbering, or unnumbered final paragraphs beginning with the word "hence" that usually contain one or more derivatives. Levels 3 and 5 contain increasing numbers of asterisks (*, **, ***, ...) that provide another means of grouping senses by semantic or syntactical headings in lengthy entries.
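The hierarchy of sense levels can be pictured as a running "address" that is extended or truncated as each numbered sense is encountered. The sketch below is a simplified model, not the Dictionary's own procedure: the function name sense_addresses and the input format (a sequence of level and label pairs in document order) are assumptions, and the unnumbered and asterisk levels (0, 3, and 5) are left out of the example input.

```python
def sense_addresses(markers: list[tuple[int, str]]) -> list[str]:
    """Build an address for each sense from (level, label) pairs in order.

    Entering a sense at a given level closes any open senses at deeper
    levels, so its address consists of the labels of its ancestors
    followed by its own label.
    """
    current: dict[int, str] = {}
    addresses = []
    for level, label in markers:
        for deeper in [lvl for lvl in current if lvl > level]:
            del current[deeper]
        current[level] = label
        addresses.append(" ".join(current[lvl] for lvl in sorted(current)))
    return addresses


# Hypothetical entry outline: levels 1 (A, B), 2 (I), 4 (1, 2), 6 (a, b).
outline = [(1, "A"), (2, "I"), (4, "1"), (6, "a"), (6, "b"),
           (4, "2"), (1, "B"), (4, "1")]

addresses = sense_addresses(outline)
for address in addresses:
    print(address)
```

For this outline the addresses run "A", "A I", "A I 1", "A I 1 a", "A I 1 b", "A I 2", "B", "B 1", showing how a new higher-level marker resets everything beneath it.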

The tagging structure consistently places the closing tag before the final quotation paragraph that logically belongs to that sense (or its last sub-sense). Therefore, when searching for a sense at level i, it is usually preferable to use the region name $(Sensei) (created for convenience) rather than $(si), where i stands for the level number; for example, $(Sense4) rather than $(s4) at level 4.

Sense Level 1 (tag=<s1> region=$(s1))

Identifies groups of senses lettered A, B, C, ... and is primarily used to separate two (or more) parts of speech (e.g., noun and adjective) when they are included in a single entry.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 2 (tag=<s2> region=$(s2))

Used to identify groups of senses numbered I, II, ... , representing branches of meanings which developed simultaneously or diversely.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 3 (tag=<s3> region=$(s3))

A structure which takes the form in the printed text of an increasing number of asterisks (*, **, ***, ...), and is sometimes used in complex and lengthy entries to group senses under semantic or syntactical headings. Sense Level 5 is also sometimes used for the same purpose.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 4 (tag=<s4> region=$(s4))

The most common type of sense development structure in which senses are numbered consecutively 1, 2, 3, ... For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 5 (tag=<s5> region=$(s5))

A structure which takes the form in the printed text of an increasing number of asterisks (*, **, ***, ...), and is sometimes used in complex and lengthy entries to group senses under semantic or syntactical headings. Sense Level 3 is also sometimes used for the same purpose.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 6 (tag=<s6> region=$(s6))

Identifies the lower-case bold letter structure (a, b, c, ...) used to subdivide senses. For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 7 (tag=<s7a>,<s7n> regions=$(s7a),$(s7n))

Identifies the structure using italicized and bracketed letters (a), (b), ..., or numbers (i), (ii), (iii), ..., or, rarely, lower case Greek letters (alpha, beta, ...) attached to sub-divisions of sub-senses, and usually found in lengthy and complex entries.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Number (attribute sn="val")

A sense is a numbered and/or lettered entry component which includes as its major elements a definition and supporting quotation paragraph. The number or letter enclosed by sense number tags not only serves to organize the senses, it also provides a unique address for each sense, an important feature for cross-referencing. Sense identification is especially important in the OED since some entries contain 100 or more senses; for example, the verb "run" has 82 main senses and over 350 sub-senses. (For an explanation of how senses are structured, see Sense Level 0, and also compare with Cross-Reference Sense Number.)

Status (attribute st="val")

A status attribute specifies several types of symbols that usually precede a headword or sense and indicate the form's status in the language. These include the dagger symbol which identifies an obsolete entry or sense (also usually further identified by a label "Obs." following the headword); parallels signifying non-naturalized words or pronunciations; and the so-called "catachrestic" symbol (a reversed paragraph symbol) identifying a confused or erroneous sense. Within the status attribute, these symbols are represented by the values "obs" (for "obsolete"), "ali" (for "alien"), and "err" (for "erroneous").

In addition, status tagging identifies two types of entries:

1. the numerous cross-reference entries, the headwords of which represent obsolete or variant spellings of main words, and which refer the user from these forms to the relevant "main" entry. These are identified by the abbreviation "xref".

2. a small number (387) of "spurious" entries which are entirely enclosed in square brackets. All of these entries were compiled for the first edition and consist of words that are erroneous, false, or could not be authenticated. Their purpose was primarily to correct errors found in earlier dictionaries resulting from copyists' or translators' errors, misprints, or misreadings of the text. These are identified by the abbreviation "spu".
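The five status values described in this section can be read off an opening tag with a small lookup. The sketch below assumes the attribute is written exactly as st="val" inside the opening tag; which tags actually carry the attribute, the sample tags, and the name status_of are hypothetical.

```python
import re
from typing import Optional

# Values and meanings as described in this section.
STATUS_VALUES = {
    "obs": "obsolete (dagger symbol)",
    "ali": "alien, i.e. non-naturalized (parallels symbol)",
    "err": "erroneous (catachrestic symbol)",
    "xref": "cross-reference entry",
    "spu": "spurious entry",
}

_STATUS = re.compile(r'\bst="([^"]*)"')


def status_of(open_tag: str) -> Optional[str]:
    """Return the meaning of the st attribute in open_tag, or None."""
    m = _STATUS.search(open_tag)
    if m is None:
        return None
    return STATUS_VALUES.get(m.group(1))


# Hypothetical opening tags, for illustration only.
print(status_of('<e st="xref">'))
print(status_of('<e>'))
```

An unrecognized value also yields None, which keeps unexpected data from being silently mislabelled.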

Stressed Form

The full form of main headwords, bold sub-headwords, and italic sub-headwords. "Full form" means that each form incorporates diacritics, diphthongs, punctuation, stress marks, etc. as they appear in the printed Dictionary. In the database, each of these typographical elements is tagged, although not all headwords contain such elements, e.g., monosyllabic words and most combinations and derivatives, for which stress is self-evident.

In the current version, all lemmas are shown in their stressed forms only, but the search engine ignores stress marks and diacritics, and it expands diphthongs, such as "&ae;", to the corresponding letter pair, i.e., "ae". However, there are some combinations included within entries that are less easily located because the OED sometimes lists minor forms in a style similar to the following example from the entry for "orange": "orange-bloom, -grove, -juice, kernel, leaf, -pip..". A computer program inserted the first element in front of (or, in some cases, following) hyphens. Thus, "orange-grove" and "orange-juice" will be located, but further refinement of the program is needed in order to find unhyphenated minor combinations such as "orange kernel" and "orange leaf". These combinations can often be located by searching quotation texts.
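The expansion of abbreviated minor forms, and the reason unhyphenated items are missed, can be sketched as follows. This is a minimal reconstruction of the behaviour described above, not the program actually used; the name expand_minor_forms is hypothetical, and only the leading-hyphen case is handled (the less common case where the first element follows the hyphen is ignored).

```python
def expand_minor_forms(listing: str) -> list[str]:
    """Expand a listing such as 'orange-bloom, -grove, kernel'.

    The first element of the first form is inserted in front of every
    item that begins with a hyphen. Items without a leading hyphen are
    returned unchanged, which is why they remain hard to find.
    """
    parts = [part.strip() for part in listing.split(",") if part.strip()]
    if not parts:
        return []
    first_element = parts[0].split("-")[0]
    expanded = [parts[0]]
    for part in parts[1:]:
        if part.startswith("-"):
            expanded.append(first_element + part)
        else:
            expanded.append(part)
    return expanded


forms = expand_minor_forms("orange-bloom, -grove, -juice, kernel, leaf, -pip")
print(forms)
```

In the result, "kernel" and "leaf" stay as bare words, so a search for "orange kernel" or "orange leaf" as a lemma finds nothing.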

Subentry (tag=<sub> region=$sub)

This structure consists mainly of Bold Sub-Headwords (i.e., defined and illustrated combinations and derivatives included within the entry for their main word) together with their definition text. Corresponding quotations can often be found in pseudo quotation paragraphs. (See also Headword and Italic Sub-Headword).

Sub-Etymology (tag=<et> region=$et)

An etymology attached to a particular sense of a headword. These subordinate etymologies appear in square brackets in the printed text, and normally contain historical information relating to the sense of a word which does not lend itself to inclusion in the etymology at the head of the entry.

Subsidiary Quotation (region=$sq)

This structure contains quotations in square brackets which are occasionally found in quotation paragraphs, usually as the first citation(s). The convention is used when a quotation does not actually employ the word in context, but is in some way relevant to its history. For example, in the case of a word borrowed from another language, the quotation may document its use in the language of origin.

Superscript (tag=<su> region=$su)

Typographical tagging in this category is attached to most text in the Dictionary which appears in superscript, with the exception of homonym numbers. Superscript text includes miscellaneous typographical conventions used in printing Murray pronunciations (see Pronunciation), mathematical functions, etc. In addition, it contains two special superior numbers preceded by a dash (-0 and -1) that sometimes further define the label "rare". In the first instance, the -0 indicates the word was found only in an earlier dictionary rather than a contextual quotation, while -1 means that only one quotation from a text other than a dictionary was found.

Variant Date (tag=<vd> region=$vd)

Earlier forms of spelling, irregular inflexions, etc. included in variant forms lists, are assigned century ranges indicating when their usage was prevalent. Centuries appear in abbreviated form, for example, "5-6" indicates fifteenth to sixteenth century.
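A century range such as "5-6" can be decoded mechanically. The sketch below extrapolates from the single example given above and assumes that each single digit from 2 to 9 stands for the century numbered ten higher (so 5 means fifteenth); any other code is rejected rather than guessed at. The name century_range is hypothetical.

```python
def century_range(code: str) -> tuple[int, int]:
    """Decode a variant date such as '5-6' into (first, last) centuries.

    Assumption: a single digit d in 2..9 denotes century 10 + d.
    Anything else raises ValueError instead of being interpreted.
    """
    first, _, last = code.strip().partition("-")
    last = last or first  # a single figure covers one century

    def to_century(digit: str) -> int:
        if len(digit) == 1 and digit in "23456789":
            return 10 + int(digit)
        raise ValueError(f"unsupported century code: {digit!r}")

    return to_century(first), to_century(last)


print(century_range("5-6"))
```

Rejecting unfamiliar codes is a deliberate choice here, since the conventions for the earliest and latest periods are not spelled out in this section.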

Variant Form (tag=<vf> region=$vf)

The OED attempts to include all documented earlier spellings, irregular inflexions, unusual plurals, etc. of headwords, where appropriate or known. These are contained in a Variant Forms List preceding the etymology. Regional labels are sometimes included to indicate the geographic area in which the particular form prevailed (or prevails). Many of these forms also appear as headwords in cross-reference entries (see Status).

Variant Forms List (tag=<vfl> region=$vfl)

Lists of documented historical, or sometimes contemporary, variants of a headword's spellings, irregular inflexions, and unusual plurals that normally appear in the printed text immediately before the etymology. Forms are further identified by the century range in which they prevailed. Lists are arranged in chronological order with the earliest variant(s) first. In some cases, two or more branches of forms may have developed simultaneously and these are grouped by lower case italic Greek letters. Illustrative quotations that follow are often referenced by the same Greek letters. (See also Variant Date for conventions used for century ranges.)

Work (tag=<w> region=$w)

Refers to the title of the work which was the source for a quotation. The title usually appears in italics following the author's name and preceding the specific location within the work and the actual quotation text. Titles are frequently abbreviated and the definite article "the" and the conjunction "and", as well as the preposition "of", are routinely omitted. Abbreviations used for a single work can vary; for example, Shakespeare's "Comedy of Errors" appears as "Com. Err.", "C. Err." and "Err." (for an example of a search by title, see D.L. Berg, 1989). Some works, such as anonymous early texts like "Beowulf" and "Cursor Mundi", are cited by title only. The Bible is a special case; for example, books of the Bible are sometimes tagged <w> with "Bible" as author, especially for the 1611 King James version (for a discussion of the numerous variations in citing translations of the Bible, see D.L. Berg, 1993).
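Because one work may be cited under several abbreviations, a search by title usually has to try every known variant. The sketch below shows one way to do that with a hand-built table; the table contains only the variants listed above, the sample list of cited titles is made up for illustration, and the names WORK_VARIANTS and citations_of are hypothetical.

```python
# Known abbreviated forms for each full title (only those given above).
WORK_VARIANTS = {
    "Comedy of Errors": ["Com. Err.", "C. Err.", "Err."],
}


def citations_of(full_title: str, cited_titles: list[str]) -> list[str]:
    """Return the cited titles that are known variants of full_title."""
    variants = set(WORK_VARIANTS.get(full_title, []))
    variants.add(full_title)
    return [title for title in cited_titles if title in variants]


# Illustrative list of <w> contents, not taken from the Dictionary.
cited = ["Com. Err.", "Beowulf", "Err.", "Cursor Mundi", "C. Err."]

found = citations_of("Comedy of Errors", cited)
print(found)
```

A table like this has to be compiled by inspection, since nothing in the tagging itself links one abbreviation to another.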

Identification of early and obscure works is frequently difficult and can be aided by reference to the Bibliography which appears at the end of Volume 20 of the printed text, and which includes most, but not all, of the titles cited. A notable exception, since this is a bibliography of English works, are the many foreign dictionaries and other word books often referred to in etymologies. (For a discussion of problems associated with matching citations in the Dictionary text to the Bibliography, see G.V.J. Townsend, "Citation Matching in the Oxford English Dictionary". UW Centre for the New OED, 1989.)