Category

Information retrieval techniques

A hashtag is a metadata tag operator that is prefaced by the hash symbol, #. On social media, hashtags are used on microblogging and photo-sharing services, especially Twitter and Tumblr, as a form of user-generated tagging that enables cross-referencing of content by topic or theme. For example, a search within Instagram for the hashtag #flowers returns all posts that have been tagged with that term. After the initial hash symbol, a hashtag may include letters, numerals or other

metadata used for classification or for adding information

subject heading

lexical unit of a thesaurus (a word or phrase) that is used for indexing and captures the essence of a document's topic

personalization

Personalization (broadly known as customization) consists of tailoring a service or product to accommodate specific individuals. It is sometimes tied to groups or segments of individuals. Personalization involves collecting data on individuals, including web browsing history, web cookies, and location. Various organizations use personalization (along with the opposite mechanism of popularization) to improve customer satisfaction, digital sales conversion, marketing results, branding, and website metrics, as well as for advertising. Personalization acts as a key element in social media

In linguistic morphology and information retrieval, stemming is the process of reducing inflected (or sometimes derived) words to their word stem, base or root form—generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root. Algorithms for stemming have been studied in computer science since the 1960s. Many search engines treat words with the same stem as synonyms as a kind of query expansion, a process called conflation.
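
The mapping of related words to a shared stem can be illustrated with a minimal sketch. It assumes a naive suffix-stripping rule set; the suffix list, the minimum stem length, and the names `naive_stem` and `conflate` are hypothetical choices made for illustration and are not taken from any published stemming algorithm.

```python
# Hypothetical suffix list, checked in order (longer suffixes first).
# This is an illustrative assumption, not a published rule set.
SUFFIXES = ("ingly", "edly", "ing", "ed", "es", "ly", "s")
MIN_STEM_LEN = 3  # assumed guard against over-stripping short words


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least MIN_STEM_LEN letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    return word


def conflate(words):
    """Group words by stem, so that words sharing a stem can be treated alike."""
    groups = {}
    for w in words:
        groups.setdefault(naive_stem(w), []).append(w)
    return groups


print(conflate(["fishing", "fished", "fishes", "fish"]))
print(naive_stem("argued"), naive_stem("arguing"))
```

In this sketch "argued" and "arguing" both reduce to "argu", which is not a valid word but still serves as a common key. The rule set is deliberately crude: "argue" is left unchanged and so does not join that group, which is the kind of gap that real stemmers handle with more elaborate rules.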

word filtered out during natural language processing

cosine similarity

measure of similarity between vectors of an inner product space
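
As a rough illustration of this measure, the sketch below computes the cosine of the angle between two vectors as their dot product divided by the product of their Euclidean norms. The function name `cosine_similarity` and the choice to raise an error for zero vectors are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (sequences of numbers)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The measure is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1, 0], [0, 1]))        # orthogonal vectors
print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # same direction
```

Because the result depends only on direction and not on magnitude, two term-count vectors for documents of very different lengths can still score as highly similar.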

collaborative filtering

visible and clickable text in an HTML hyperlink

controlled vocabulary

standardized and organized sets of words and phrases for retrieval and disambiguation of information, distinguishing preferred terms from non-preferred terms

The science of webometrics (also referred to as cybermetrics) aims to quantify the World Wide Web to gain knowledge about the number and types of hyperlinks, the structure of the World Wide Web, and usage patterns. According to Björneborn and Ingwersen, the definition of webometrics is "the study of the quantitative aspects of the construction and use of information resources, structures and technologies on the Web drawing on bibliometric and informetric approaches." The term webometrics was coined by Almind and Ingwersen (1997). A second definition of webometrics has also been introduced, "the

latent semantic analysis

technique in natural language processing

document indexing

classifying a document by keywords, index terms or descriptors

controlled vocabulary expanded with relations of broader, narrower and related terms, serving subject indexing and vocabulary control

method of information retrieval by filtering on multiple properties in a data set
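
Filtering on several properties at once, an approach commonly called faceted search, might look like the following sketch. The record fields, the sample data, and the names `facet_filter` and `facet_counts` are hypothetical and exist only to make the example concrete.

```python
# Hypothetical sample records; the fields are illustrative assumptions.
RECORDS = [
    {"title": "A", "language": "en", "year": 2020},
    {"title": "B", "language": "en", "year": 2021},
    {"title": "C", "language": "de", "year": 2020},
]


def facet_filter(records, **facets):
    """Keep only the records whose fields match every requested facet value."""
    return [
        r for r in records
        if all(r.get(field) == value for field, value in facets.items())
    ]


def facet_counts(records, field):
    """Count how many records carry each value of the given field."""
    counts = {}
    for r in records:
        if field in r:
            counts[r[field]] = counts.get(r[field], 0) + 1
    return counts


print(facet_filter(RECORDS, language="en", year=2020))
print(facet_counts(RECORDS, "language"))
```

The per-value counts are what an interface would typically display next to each filter option, so that a user can see how many results remain before narrowing further.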

learning to rank

application of machine learning

standard Boolean model

classical information retrieval model

Search/Retrieve via URL

natural-language user interface

type of human-computer interface

personalized search

type of web search

statistical semantics

subfield of computational linguistics and natural language processing

extended Boolean model

probabilistic relevance model