Index term

An index term is a word, phrase, or alphanumeric code that captures the primary subject of a document. Index terms make up a controlled vocabulary within bibliographic records and serve as keywords for retrieving documents from databases. They can be assigned manually or generated automatically. Web search engines also rely on them, improving search precision by giving extra weight to words in a document’s title, frequently recurring words, and explicitly supplied keywords. Authors often supply index terms for their own articles, although the quality of such terms varies with the expertise of whoever assigns them. Because index terms help distinguish documents from one another, they are an active area of research.

Index term (Wikipedia)

In information retrieval, an index term (also known as subject term, subject heading, descriptor, or keyword) is a term that captures the essence of the topic of a document. Index terms make up a controlled vocabulary for use in bibliographic records. They are an integral part of bibliographic control, which is the function by which libraries collect, organize and disseminate documents. They are used as keywords to retrieve documents in an information system, for instance, a catalog or a search engine. A popular form of keywords on the web are tags, which are directly visible and can be assigned by non-experts. Index terms can consist of a word, phrase, or alphanumerical term. They are created by analyzing the document either manually with subject indexing or automatically with automatic indexing or more sophisticated methods of keyword extraction. Index terms can either come from a controlled vocabulary or be freely assigned.
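As a rough illustration of automatic indexing, the sketch below ranks candidate index terms by how often they occur, giving extra weight to words in the title. It is a minimal example under stated assumptions, not a description of how any particular system works: the function name `extract_index_terms`, the tiny stop-word list, and the `title_weight` parameter are all hypothetical choices made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "of", "in", "to", "is", "are", "for"}


def extract_index_terms(title, body, top_n=3, title_weight=2):
    """Rank candidate index terms by frequency, counting title words extra."""

    def tokenize(text):
        # Lowercase, split on non-alphanumerics, and drop stop words.
        words = re.findall(r"[a-z0-9]+", text.lower())
        return [w for w in words if w not in STOP_WORDS]

    counts = Counter(tokenize(body))
    for word in tokenize(title):
        # Assumption: a word in the title is a stronger signal of the topic.
        counts[word] += title_weight

    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]


terms = extract_index_terms(
    "Subject indexing in libraries",
    "Libraries use subject indexing to describe documents. "
    "Indexing assigns terms to documents.",
)
print(terms)
```

Terms produced this way are freely assigned. Mapping them onto a controlled vocabulary would need an additional lookup step that this sketch does not attempt.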

Keywords are stored in a search index. Common words such as articles (a, an, the) and conjunctions (and, or, but) are often excluded from the index as stop words, because indexing them is inefficient: almost every English-language site on the Internet contains the article "the", so searching for it does little to narrow the results. Google, the most popular search engine, removed stop words such as "the" and "a" from its indexes for several years but later re-introduced them, making certain types of precise search possible again.
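The idea of a search index that skips stop words can be sketched as a small inverted index, which maps each keyword to the documents that contain it. This is an illustrative toy, not how any real search engine is implemented. The names `build_index` and `search`, and the stop-word list itself, are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list taken from the examples in the text.
STOP_WORDS = {"a", "an", "the", "and", "or", "but"}


def build_index(documents, stop_words=STOP_WORDS):
    """Map each non-stop-word term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            if term not in stop_words:
                index[term].add(doc_id)
    return index


def search(index, query, stop_words=STOP_WORDS):
    """Return ids of documents containing every non-stop-word query term."""
    terms = [t for t in re.findall(r"[a-z0-9]+", query.lower())
             if t not in stop_words]
    if not terms:
        # A query made only of stop words cannot be answered from this index.
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {1: "The cat and the dog", 2: "A dog or a bird"}
index = build_index(docs)
print(search(index, "cat dog"))
print(search(index, "the dog"))
```

Because "the" is never indexed, the query "the dog" behaves exactly like "dog", and a query consisting only of stop words matches nothing. That limitation is the kind of precise search an index without stop words cannot support.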

The term "descriptor" was coined by Calvin Mooers in 1948. It is used in particular for a preferred term from a thesaurus.

The Simple Knowledge Organization System language (SKOS) provides a way to express index terms with Resource Description Framework for use in the context of the Semantic Web.
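To make this concrete, the following sketch writes one index term as a SKOS concept in Turtle, a common text syntax for RDF. It uses plain string formatting rather than an RDF library. The function name `concept_to_turtle`, the `http://example.org/terms/` base URI, and the sample labels are hypothetical; the `skos:Concept`, `skos:prefLabel`, and `skos:altLabel` terms come from the SKOS vocabulary.

```python
def concept_to_turtle(concept_id, preferred, alternatives=(), lang="en",
                      base="http://example.org/terms/"):
    """Serialize one index term as a SKOS concept in Turtle syntax."""

    def literal(text):
        # Escape backslashes and double quotes for a Turtle string literal.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"@{lang}'

    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{base}{concept_id}> a skos:Concept ;",
        # The preferred label plays the role of the thesaurus descriptor.
        f"    skos:prefLabel {literal(preferred)}",
    ]
    for alt in alternatives:
        # Non-preferred synonyms become alternative labels.
        lines[-1] += " ;"
        lines.append(f"    skos:altLabel {literal(alt)}")
    lines[-1] += " ."
    return "\n".join(lines)


print(concept_to_turtle("index-term", "index term",
                        ["subject heading", "descriptor"]))
```

A production system would more likely build the graph with an RDF library and let it handle serialization. The string-based version is used here only to keep the example free of dependencies.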
