Category

Computational linguistics

distributionalism
Distributionalism is a general theory of language and a discovery procedure for establishing elements and structures of language based on observed usage. The purpose of distributionalism was to provide a scientific basis for syntax as independent of meaning. Zellig Harris defined 'distribution' as follows: “The DISTRIBUTION of an element is the total of all environments in which it occurs, i.e. the sum of all the (different) positions (or occurrences) of an element relative to the occurrence of other elements[.]” Based on this idea, an analysis of immediate constituents could be based on observi
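Harris's definition can be illustrated with a minimal sketch. It assumes, purely for illustration, that an "environment" is the pair of immediately neighbouring tokens; the function name `distributions`, the boundary markers `<s>` and `</s>`, and the toy corpus are hypothetical and not part of the original theory.

```python
from collections import defaultdict

def distributions(sentences):
    """Map each token to the set of (left, right) environments it occurs in."""
    envs = defaultdict(set)
    for sentence in sentences:
        # Pad with boundary markers so sentence-initial and sentence-final
        # tokens also have a left and a right neighbour.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            envs[tokens[i]].add((tokens[i - 1], tokens[i + 1]))
    return dict(envs)

# Toy corpus, invented for illustration.
corpus = ["the cat sleeps", "the dog sleeps", "a cat runs"]
dist = distributions(corpus)

# Environments shared by two elements hint that they belong to the same class.
shared = dist["cat"] & dist["dog"]
```

Here "cat" and "dog" share the environment ("the", "sleeps"), which is the kind of observation a distributional analysis would use to group them together without appealing to meaning.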
Georgetown-IBM experiment
1954 demonstration of machine translation
mathematical linguistics
branch of applied mathematics
text simplification
automated process
ELMo
Figure: Architecture of ELMo. It first processes input tokens into embedding vectors by an embedding layer (essentially a lookup table), then applies a pair of forward and backward LSTMs to produce two sequences of hidden vectors, then applies another pair of forward and backward LSTMs, and so on. Figure: How a token is transformed successively over increasing layers of ELMo. At the start, the token is converted to a vector by a linear layer, giving the embedding vector e_0. In the next layer, a forward LSTM produces a hidden vector h_{00}, while a backward LSTM produces another hidden vector h_{00r
Spanish Society for Natural Language Processing
Scientific society for natural language processing
Linguatec
Linguatec Sprachtechnologien GmbH is a language technology provider specializing in machine translation, speech synthesis, and speech recognition. Linguatec was founded in Munich in 1996 and is headquartered in Pasing.
language resource
lexical density
estimated measure of content per functional and lexical units in total; a descriptive parameter which varies with register and genre
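One common way to estimate lexical density is the share of lexical (content) words among all words. The sketch below assumes that reading; the tiny `FUNCTION_WORDS` list and the function name `lexical_density` are hypothetical, and a real analysis would typically rely on part-of-speech tagging instead of a fixed word list.

```python
# Hypothetical, deliberately tiny list of function words (illustration only).
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "to", "it"}

def lexical_density(text):
    """Return the fraction of tokens that are not function words."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]  # drop tokens that were only punctuation
    if not tokens:
        return 0.0
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(lexical) / len(tokens)

density = lexical_density("The cat sat on the mat.")
```

In this example three of the six tokens ("cat", "sat", "mat") count as lexical, so the estimate is 0.5.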
categorial grammar
family of formalisms in natural language syntax motivated by the principle of compositionality and organized according to the view that syntactic constituents should generally combine as functions or according to a function-argument relationship
word-sense induction
Sentence extraction
text summarization technique
Automated Similarity Judgment Program
computational comparative linguistics program
statistical semantics
subfield of computational linguistics and natural language processing
Subvocal recognition
the art of taking subvocalization and converting the detected results to a digital text-based output
MBROLA
MBROLA is speech synthesis software developed as a worldwide collaborative project. The MBROLA project web page provides diphone databases for many spoken languages.
computational semantics
the study of how to automate the process of constructing and reasoning with meaning representations of natural language
EuroWordNet
EuroWordNet (EWN) was a European research project to build a multilingual database of wordnets for several European languages. Each language has its own wordnet structured along the same lines as the Princeton WordNet, with synsets linked by semantic relations. The wordnets are interconnected through an Interlingual Index (ILI) that maps language-specific synsets to a shared set of concepts, enabling cross-lingual queries and applications.
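The role of the Interlingual Index can be sketched as a lookup table from shared concepts to language-specific synsets. Everything in this sketch is an assumption made for illustration: the record identifiers, the entries, and the `translate` helper are invented and do not reflect the actual EuroWordNet data format.

```python
# Hypothetical ILI records: identifiers and entries are invented for illustration.
ILI = {
    "ili-0001": {"en": ["dog", "domestic dog"], "es": ["perro"], "nl": ["hond"]},
    "ili-0002": {"en": ["cat"], "es": ["gato"], "nl": ["kat"]},
}

def translate(word, source, target):
    """Find target-language synset members that share an ILI concept with `word`."""
    results = []
    for synsets in ILI.values():
        if word in synsets.get(source, []):
            results.extend(synsets.get(target, []))
    return results

spanish = translate("dog", "en", "es")
```

A cross-lingual query goes from a word to its concept and from the concept to another language's synset, so no direct link between any pair of languages is needed.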
Semantic analysis
Computational application of concept approximation
Michel DeGraff
Haitian academic
Kialo
Kialo is an online structured debate platform with argument maps in the form of debate trees. It is a collaborative reasoning tool for thoughtful discussion, understanding different points of view, and collaborative decision-making, showing arguments for and against claims underneath user-submitted theses or questions.
GloVe
GloVe, coined from Global Vectors, is a model for distributed word representation. The model is an unsupervised learning algorithm for obtaining vector representations of words. This is achieved by mapping words into a meaningful space where the distance between words is related to semantic similarity. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations exhibit interesting linear substructures of the word vector space. As a log-bilinear regression model for unsupervised learning of word representations, it combines the f
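The "aggregated global word-word co-occurrence statistics" mentioned above can be gathered roughly as follows. This is a minimal sketch of the counting step only, not of GloVe training itself; the symmetric window of size 2, the unweighted counts, the function name `cooccurrence_counts`, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter

def cooccurrence_counts(sentences, window=2):
    """Count how often each ordered word pair occurs within `window` tokens."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            # Look back over the preceding `window` tokens and record the
            # pair in both directions, so the counts are symmetric.
            for j in range(max(0, i - window), i):
                counts[(word, tokens[j])] += 1
                counts[(tokens[j], word)] += 1
    return counts

# Toy corpus, invented for illustration.
corpus = ["ice is cold", "steam is hot", "ice is solid"]
counts = cooccurrence_counts(corpus)
```

The counts are accumulated over the whole corpus before any vectors are fitted, which is the sense in which the statistics are global.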