Category

Formal languages

Turing machine
abstract computation model; mathematical model of computation that defines an abstract machine which manipulates symbols on a strip of tape according to a table of rules
regular expression
sequence of characters that forms a search pattern
markup language
computer language for annotating documents
formal language
set of strings of symbols that may be constrained by rules that are specific to it; words whose letters are taken from an alphabet and are well-formed according to a specific set of rules
string
data type representing a finite sequence of encoded characters
Chomsky hierarchy
containment hierarchy of classes of formal grammars
Backus–Naur form
metasyntax notation for context-free grammars, developed by John Backus and Peter Naur for the ALGOL 60 report (1960; revised 1963); foundational to formal language specification in computer science
unary numeral system
the simplest numeral system, a non-positional numeral system
formal grammar
structure of a formal language
formal system
any well-defined system of abstract thought based on the model of mathematics
context-free grammar
formal grammar whose production rules can be applied to a nonterminal symbol regardless of its context
alphabet
non-empty set of symbols or letters that make up strings in a formal language
Kleene star
unary operation on sets of strings, used in regular expressions for "zero or more repetitions"
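A minimal sketch of "zero or more repetitions", using Python's standard `re` module. The helper name `in_ab_star` and the choice of the set {"ab"} are assumptions made for illustration only.

```python
import re


def in_ab_star(word: str) -> bool:
    """Return True if `word` belongs to {"ab"}*, the Kleene star of the set {"ab"}."""
    # fullmatch requires the whole word to consist of zero or more copies of "ab".
    return re.fullmatch(r"(?:ab)*", word) is not None


print(in_ab_star(""))        # the empty string is always in a Kleene star
print(in_ab_star("ababab"))  # three repetitions of "ab"
print(in_ab_star("aba"))     # not a whole number of repetitions
```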
concatenation
In formal language theory and computer programming, string concatenation is the operation of joining character strings end-to-end. For example, the concatenation of "snow" and "ball" is "snowball". In certain formalizations of concatenation theory, also called string theory, string concatenation is a primitive notion.
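The "snow" and "ball" example can be sketched in Python as follows. The helper name `concat` is hypothetical; Python's built-in `+` operator on strings does the actual work.

```python
def concat(left: str, right: str) -> str:
    """Join two strings end-to-end."""
    return left + right


snowball = concat("snow", "ball")
print(snowball)

# The empty string is the identity element for concatenation.
print(concat("", "ball") == "ball")
```

Concatenation is associative but not commutative: `concat("snow", "ball")` and `concat("ball", "snow")` are different strings.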
diff
diff is a shell command that compares the content of files and reports differences. The term diff is also used to identify the output of the command and is used as a verb for running the command. To diff files, one runs diff to create a diff.
regular language
formal language that can be expressed using a regular expression
extended Backus–Naur form
family of metasyntax notations, any of which can be used to express a context-free grammar
Chomsky normal form
form for context-free grammars
abstract syntax tree
tree representation of the abstract syntactic structure of source code
context-free language
formal language that is a member of the set of languages defined by context-free grammars
empty string
the unique string of length zero
proof
sufficient evidence or a sufficient argument for the truth of a proposition
context-sensitive grammar
formal grammar in which the components of production rules may be surrounded by contextual symbols
regular grammar
formal grammar that is right-regular or left-regular
recursive language
recursive subset of the set of all possible finite sequences over the alphabet of the language
context-sensitive language
formal language that can be defined by a context-sensitive grammar (equivalently, by a noncontracting grammar); context-sensitive grammars are one of the four types of grammars in the Chomsky hierarchy
recursively enumerable language
a formal language that can be output (enumerated) by an algorithm (mathematical logic, computability theory)
Greibach normal form
form for context-free grammars
interpretation
assignment of meaning to the symbols of a formal language
pumping lemma for regular languages
lemma giving a property shared by all regular languages
rewriting
In mathematics, linguistics, computer science, and logic, rewriting covers a wide range of methods of replacing subterms of a formula with other terms. Such methods may be achieved by rewriting systems (also known as rewrite systems, rewrite engines, or reduction systems). In their most basic form, they consist of a set of objects, plus relations on how to transform those objects.
bigram
A bigram or digram is a sequence of two adjacent elements from a string of tokens, which are typically letters, syllables, or words. A bigram is an n-gram for n=2.
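As an illustration, the bigrams of a token sequence can be listed by pairing each element with its successor. This is a minimal sketch; the function name `bigrams` is an assumption, not taken from any particular library.

```python
def bigrams(tokens):
    """Return the list of adjacent pairs in `tokens` (a string or a list)."""
    # zip stops at the shorter input, so the last element has no successor pair.
    return list(zip(tokens, tokens[1:]))


print(bigrams("snow"))                    # letter bigrams
print(bigrams(["to", "be", "or", "not"]))  # word bigrams
```

A sequence of n tokens yields n - 1 bigrams, and a sequence with fewer than two tokens yields none.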
well-formed formula
finite sequence of symbols from a given alphabet that is part of a formal language
ambiguous grammar
context-free grammar for which some string has more than one leftmost derivation or parse tree
Myhill–Nerode theorem
theorem giving a necessary and sufficient condition for a formal language to be regular
terminal and nonterminal symbols
categories of symbols in formal grammars
longest increasing subsequence
problem of finding the longest subsequence of an array of numbers whose elements are in increasing order
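One common way to compute the length of a longest strictly increasing subsequence is the binary-search ("patience sorting") approach, sketched below with Python's standard `bisect` module. The function name `lis_length` is hypothetical, and this is only one of several known techniques.

```python
from bisect import bisect_left


def lis_length(values):
    """Length of a longest strictly increasing subsequence of `values`."""
    # tails[k] holds the smallest possible last element of an
    # increasing subsequence of length k + 1 seen so far.
    tails = []
    for v in values:
        i = bisect_left(tails, v)
        if i == len(tails):
            tails.append(v)  # v extends the longest subsequence found so far
        else:
            tails[i] = v     # v gives a smaller tail for length i + 1
    return len(tails)


print(lis_length([10, 9, 2, 5, 3, 7, 101, 18]))
```

The sketch runs in O(n log n) time and returns only the length; recovering the subsequence itself needs extra bookkeeping that is not shown here.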
Pumping lemma for context-free languages
lemma giving a property shared by all context-free languages
formal proof
establishment of a theorem using inference from the axioms
SCIgen
SCIgen is a paper generator that uses context-free grammar to randomly generate nonsense in the form of computer science research papers. Its original data source was a collection of computer science papers downloaded from CiteSeer. All elements of the papers are formed, including graphs, diagrams, and citations. Created by scientists at the Massachusetts Institute of Technology, its stated aim is "to maximize amusement, rather than coherence." Originally created in 2005 to expose the lack of scrutiny of submissions to conferences, the generator subsequently became used, primarily by Chinese a
syntax
rules used for constructing or transforming the symbols of a formal language
attribute grammar
formal way to define attributes for the productions of a formal grammar, associating these attributes with values
Ogden's lemma
generalization of the pumping lemma for context-free languages
Kuroda normal form
probabilistic context-free grammar
context-free grammar in which each production rule is assigned a probability; grammar model used in linguistics
syntax diagram
visual description of context-free grammar
Dyck language
formal language consisting of balanced strings of square brackets
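Membership in the Dyck language over one pair of square brackets can be checked with a single depth counter. This is a minimal sketch; the name `is_dyck` is an assumption, and it treats any character other than "[" or "]" as making the word invalid.

```python
def is_dyck(word: str) -> bool:
    """Return True if `word` is a balanced string of square brackets."""
    depth = 0
    for ch in word:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:  # a closing bracket with no matching opener
                return False
        else:
            return False   # only "[" and "]" are in the alphabet
    return depth == 0      # every opener must have been closed


print(is_dyck("[[][]]"))
print(is_dyck("]["))
```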
substring
contiguous sequence of characters within a string; for example, "string" is a substring of "substring"
metacharacter
A metacharacter is a character that has a special meaning to a computer program, such as a shell interpreter or a regular expression (regex) engine. For instance, in XML, the character < is interpreted not as ordinary text (in which it would be the less-than sign), but rather as a metacharacter signalling the beginning of an XML tag.
left recursion
special case of recursion in a formal grammar, in which a production for a nonterminal begins with that same nonterminal
deterministic context-free language
context-free language that can be accepted by a deterministic pushdown automaton
unrestricted grammar
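formal grammar whose production rules are subject to no restrictions, other than that the left-hand side must be non-empty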
deterministic context-free grammar
formal grammar derived from a deterministic pushdown automaton
deterministic pushdown automaton
abstract machine in computer science
Augmented Backus–Naur Form
metalanguage based on Backus–Naur Form (BNF)
Montague grammar
approach to natural language semantics
symbol
basic element of strings in a formal language
indexed language
formal language generated by an indexed grammar
production
in computer science, a rewrite rule specifying a substitution that can be recursively performed to generate new sequences
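To illustrate how a production is applied recursively, the sketch below uses a hypothetical two-rule grammar, S → aSb and S → ε, chosen only as an example. The function name `derive` is likewise an assumption.

```python
def derive(n: int) -> str:
    """Apply S -> aSb exactly n times, then S -> (empty string) once."""
    sentential_form = "S"
    for _ in range(n):
        # Rewrite the single occurrence of the nonterminal S.
        sentential_form = sentential_form.replace("S", "aSb", 1)
    # Finish the derivation by erasing S with the empty-string production.
    return sentential_form.replace("S", "", 1)


print(derive(0))
print(derive(3))
```

Each call yields n copies of "a" followed by n copies of "b", which shows how one recursive production can generate arbitrarily long strings.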
Indexed grammar
generalization of context-free grammars in that nonterminals are equipped with lists of flags, or index symbols