Category

Computational linguistics

distributionalism

Distributionalism is a general theory of language and a discovery procedure for establishing elements and structures of language based on observed usage. The purpose of distributionalism was to provide a scientific basis for syntax as independent of meaning. Zellig Harris defined 'distribution' as follows. “The DISTRIBUTION of an element is the total of all environments in which it occurs, i.e. the sum of all the (different) positions (or occurrences) of an element relative to the occurrence of other elements[.]” Based on this idea, an analysis of immediate constituents could be based on observi

Georgetown-IBM experiment

1954 demonstration of machine translation

mathematical linguistics

branch of applied mathematics

text simplification

automated process

Architecture of ELMo. It first processes input tokens into embedding vectors by an embedding layer (essentially a lookup table), then applies a pair of forward and backward LSTMs to produce two sequences of hidden vectors, then applies another pair of forward and backward LSTMs, and so on.
How a token is transformed successively over increasing layers of ELMo. At the start, the token is converted to a vector by a linear layer, giving the embedding vector e_0. In the next layer, a forward LSTM produces a hidden vector h_{00}, while a backward LSTM produces another hidden vector h_{00r

Spanish Society for Natural Language Processing

Scientific society for natural language processing

Linguatec Sprachtechnologien GmbH is a language technology provider specializing in machine translation, speech synthesis, and speech recognition. Linguatec was founded in Munich in 1996 and is headquartered in Pasing.

language resource

lexical density

estimated measure of content, given as the proportion of lexical (content) words among all words, functional and lexical; a descriptive parameter which varies with register and genre
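As a rough illustration of the measure described above, the following sketch computes lexical density as the share of tokens that are not function words. The function-word list, the tokenizer, and the name `lexical_density` are all assumptions made for this example; a real analysis would more likely rely on part-of-speech tagging than on a fixed word list.

```python
import re

# Hypothetical, deliberately small list of English function words.
# This list is an assumption for illustration, not a standard inventory.
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "of", "in", "on", "at", "to", "and", "or", "but",
    "is", "are", "was", "were", "be", "it", "that", "this", "with",
    "for", "as", "by",
})


def lexical_density(text, function_words=FUNCTION_WORDS):
    """Return the fraction of tokens that are lexical (non-function) words."""
    # Naive tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    lexical = [t for t in tokens if t not in function_words]
    return len(lexical) / len(tokens)


if __name__ == "__main__":
    print(lexical_density("The cat sat on the mat"))
```

In this sketch, a text made up mostly of content words scores close to 1, while a text dominated by articles, prepositions, and auxiliaries scores close to 0.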

categorial grammar

family of formalisms in natural language syntax motivated by the principle of compositionality and organized according to the view that syntactic constituents should generally combine as functions or according to a function-argument relationship

word-sense induction

Sentence extraction

text summarization technique

Automated Similarity Judgment Program

computational comparative linguistics program

statistical semantics

subfield of computational linguistics and natural language processing

Subvocal recognition

the process of detecting subvocalization and converting the results to digital text-based output

MBROLA is speech synthesis software developed as a worldwide collaborative project. The MBROLA project web page provides diphone databases for many spoken languages.

computational semantics

the study of how to automate the process of constructing and reasoning with meaning representations of natural language

EuroWordNet (EWN) was a European research project to build a multilingual database of wordnets for several European languages. Each language has its own wordnet structured along the same lines as the Princeton WordNet, with synsets linked by semantic relations. The wordnets are interconnected through an Interlingual Index (ILI) that maps language-specific synsets to a shared set of concepts, enabling cross-lingual queries and applications.
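The role of the Interlingual Index can be sketched with a toy lookup table. Everything in the example below is hypothetical: the synset identifiers, the `ili-…` concept ids, and the function names are invented for illustration and do not reflect the actual EuroWordNet data format. The sketch also assumes at most one synset per language for each concept, which is a simplification.

```python
from collections import defaultdict

# Hypothetical toy data: (language, synset id) -> shared ILI concept id.
SYNSET_TO_ILI = {
    ("en", "dog.n.01"): "ili-0001",
    ("es", "perro.n.01"): "ili-0001",
    ("nl", "hond.n.01"): "ili-0001",
    ("en", "cat.n.01"): "ili-0002",
    ("es", "gato.n.01"): "ili-0002",
}


def build_reverse_index(synset_to_ili):
    """Map each ILI concept id to its synset in every language that has one."""
    index = defaultdict(dict)
    for (lang, synset), ili in synset_to_ili.items():
        index[ili][lang] = synset
    return index


ILI_TO_SYNSETS = build_reverse_index(SYNSET_TO_ILI)


def translate_synset(lang, synset, target_lang):
    """Cross-lingual query: go from a synset to the ILI, then to the target language."""
    ili = SYNSET_TO_ILI.get((lang, synset))
    if ili is None:
        return None  # the source synset is not linked to any shared concept
    return ILI_TO_SYNSETS[ili].get(target_lang)


if __name__ == "__main__":
    print(translate_synset("en", "dog.n.01", "es"))
```

Because each wordnet links only to the shared index rather than to every other wordnet, adding a language in this sketch means adding entries to one table instead of one table per language pair.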

Semantic analysis

Computational application of concept approximation

Haitian academic

Kialo is an online structured debate platform with argument maps in the form of debate trees. It is a collaborative reasoning tool for thoughtful discussion, understanding different points of view, and collaborative decision-making, showing arguments for and against claims underneath user-submitted theses or questions.

GloVe, coined from Global Vectors, is a model for distributed word representation. The model is an unsupervised learning algorithm for obtaining vector representations of words. This is achieved by mapping words into a meaningful space where the distance between words is related to semantic similarity. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space. As a log-bilinear regression model for unsupervised learning of word representations, it combines the f