Category: Bioinformatics

bioinformatics
Figure: Early bioinformatics, the computational alignment of experimentally determined sequences of a class of related proteins.
Figure: Map of the human X chromosome (from the National Center for Biotechnology Information (NCBI) website).
Bioinformatics is an interdisciplinary field of science that develops computational methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics integrates principles from biology, chemistry, physics, computer science, data science, computer p
Human Genome Project
research program for sequencing the human genome
biostatistics
Biostatistics (sometimes referred to as biometry) is a branch of statistics that applies statistical methods to a wide range of topics in the biological sciences, with a focus on clinical medicine and public health applications.
The field encompasses the design of experiments, the collection and analysis of experimental and observational data, and the interpretation of the results.
It is closely related to medical statistics.
Enzyme Commission number
hierarchical classification ID given to enzymes, identifying their membership in families
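The hierarchy encoded in an EC number can be illustrated with a small parser. This is a minimal sketch, assuming the usual four-field dotted form (for example "EC 3.4.21.4"); the function name `parse_ec` is hypothetical, and the table lists only the seven top-level enzyme classes.

```python
# Top-level EC classes (the first field of an EC number).
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}

def parse_ec(ec):
    """Split an EC number into its four levels and name its top-level class.

    Hypothetical helper for illustration; the levels are kept as strings
    because partial numbers such as "3.4.-.-" use "-" for unspecified fields.
    """
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    levels = text.split(".")
    if len(levels) != 4:
        raise ValueError("expected four dot-separated fields")
    top = int(levels[0])
    if top not in EC_CLASSES:
        raise ValueError("unknown top-level class: %d" % top)
    return EC_CLASSES[top], levels
```

For instance, `parse_ec("EC 3.4.21.4")` (the number assigned to trypsin) yields the class name "Hydrolases" together with the four levels, each narrowing the classification further.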
systems biology
computational and mathematical modeling of complex biological systems
Protein Data Bank
international open access database of protein and nucleic acid structures

biomimetics
Figure: Giant axons of the longfin inshore squid (Doryteuthis pealeii) were crucial for scientists to understand the action potential.
synthetic biology
interdisciplinary branch of biology and engineering

Folding@home
Folding@home (FAH or F@h) is a distributed computing project that aims to help scientists develop new therapeutics for a variety of diseases by simulating protein dynamics. This includes the process of protein folding and the movements of proteins, and relies on simulations run on volunteers' personal computers. Folding@home is currently based at the University of Pennsylvania and led by Greg Bowman, a former student of Vijay Pande.
DNA microarray
use of large set of oligonucleotide probes

metagenomics
Figure: In metagenomics, the genetic materials (DNA, C) are extracted directly from samples taken from the environment (e.g. soil, sea water, human gut, A) after filtering (B), and are sequenced (E) after multiplication by cloning (D) in an approach called shotgun sequencing. These short sequences can then be put together again using assembly methods (F) to deduce the individual genomes or parts of genomes that constitute the original environmental sample. This information can then be used to study the species diversity and functional potential of the microbial community of the env

GenBank
The GenBank sequence database is an open access, annotated collection of all publicly available nucleotide sequences and their protein translations. It is produced and maintained by the National Center for Biotechnology Information (NCBI; a part of the National Institutes of Health in the United States) as part of the International Nucleotide Sequence Database Collaboration (INSDC).
open reading frame
stretch of DNA, of variable length, bounded by a start codon and a stop codon
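The definition above can be sketched as a simple scan for a start codon followed by the first in-frame stop codon. This is an illustrative simplification, not a production gene finder: it assumes the standard start codon ATG and stop codons TAA, TAG and TGA, searches only the forward strand, and uses a hypothetical function name `find_orfs`.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code

def find_orfs(dna, min_codons=1):
    """Return (start, end, sequence) for each forward-strand ORF.

    `end` is exclusive and the returned sequence includes the stop codon.
    `min_codons` counts codons from the start codon up to, but not
    including, the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == START:
                # Walk codon by codon until the first in-frame stop.
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOPS:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, dna[i:j + 3]))
                        i = j  # resume scanning after this ORF
                        break
            i += 3
    return orfs
```

A start codon with no downstream in-frame stop is not reported, since the region is then not closed by a stop codon. Scanning the reverse complement as well would cover the other strand.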
computational biology
application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems
UniProt
UniProt is a freely accessible database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature. It is maintained by the UniProt consortium, which consists of several European bioinformatics organisations and a foundation from Washington, DC, US.
hidden Markov model
statistical Markov model
molecular modelling
discovering chemical properties by physical simulations
Foundational Model of Anatomy
ontology for the domain of human anatomy
heat map
statistical graphic of data in a 2D matrix represented as colors
Ensembl genome database project
gene sequence database
sensitivity and specificity
statistical measures of the performance of a binary classification test
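As a worked illustration, both measures follow directly from the counts in a confusion matrix: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The function name and argument order below are assumptions made for this sketch.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: condition-positive cases classified as positive/negative.
    tn/fp: condition-negative cases classified as negative/positive.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity
```

For example, a test that detects 90 of 100 affected individuals and correctly clears 80 of 100 unaffected individuals has a sensitivity of 0.9 and a specificity of 0.8.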

biopunk
Figure: Cover of Ribofunk by Paul Di Filippo, a seminal biopunk story collection.
Biopunk (a portmanteau of "biotechnology" or "biology" and "punk") is a subgenre of science fiction that focuses on biotechnology. It is derived from cyberpunk, but focuses on the implications of biotechnology rather than mechanical cyberware and information technology. Biopunk is concerned with synthetic biology and often involves bio-hackers, biotech megacorporations, and oppressive organizations that engineer DNA. Most often keeping with the dark atmosphere of cyberpunk,
molecular docking
attempt to predict the structure of the intermolecular complex formed between two or more molecules

Pfam
Pfam is a database of protein families that includes their annotations and multiple sequence alignments generated using hidden Markov models. The latest version of Pfam, 37.0, was released in June 2024 and contains 21,979 families. It is currently provided through the InterPro website.
whole genome sequencing
sequencing all the DNA of an individual at once
metabolome
Figure: General schema showing the relationships of the genome, transcriptome, proteome, and metabolome (lipidome, glycome).
protein family
group of proteins that share a common evolutionary origin, reflected by similarity in their sequence

morphometrics
Figure: Size of genera in the extinct bird family Confuciusornithidae, compared to a human (1.75 meters tall). A. Changchengornis, based on the holotype. B. Confuciusornis, based on several specimens of about the same size. C. Eoconfuciusornis, based on the holotype IVPP V11977.
Figure: Measuring shell length in bog turtles.
protein structure prediction
constructing an atomic-resolution model of a protein from its amino acid sequence
shotgun sequencing
method for sequencing random DNA strands
multiple sequence alignment
alignment of three or more molecular sequences

biochip
Figure: Hundreds of gel drops are visible on the biochip.
precision and recall
measures of relevance in pattern recognition and information retrieval
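In the information-retrieval setting, the two measures can be computed from the set of retrieved items and the set of relevant items: precision is the fraction of retrieved items that are relevant, and recall is the fraction of relevant items that were retrieved. The sketch below uses a hypothetical helper `precision_recall`, and returning 0.0 for an empty set is a convention chosen here, not a universal rule.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for a retrieval result.

    Both arguments are iterables of hashable item identifiers.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a search returns four sequences of which two belong to the three truly relevant ones, precision is 2/4 and recall is 2/3.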
neuroinformatics
Neuroinformatics is the emergent field that combines informatics and neuroscience. It deals with neuroscience data and with information processing by artificial neural networks. Neuroinformatics is applied in three main directions:
the development of computational models of the nervous system and neural processes;
the development of tools for analyzing and modeling neuroscience data; and
the development of tools and databases for management and sharing of neuroscience data at all levels of analysis.

GISAID
GISAID, the Global Initiative on Sharing All Influenza Data, previously the Global Initiative on Sharing Avian Influenza Data, is a global science initiative established in 2008 to provide access to genomic data of influenza viruses. The database was expanded to include the coronavirus responsible for the COVID-19 pandemic, as well as other pathogens. The database has been described as "the world's largest repository of COVID-19 sequences". GISAID facilitates genomic epidemiology and real-time surveillance to monitor the emergence of new COVID-19 viral strains across the planet.
consensus sequence
most common variant of a genetic sequence across samples
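One simple way to derive such a sequence is a per-column majority vote over sequences that are already aligned. The sketch below assumes equal-length, pre-aligned inputs and breaks ties alphabetically, which is an arbitrary choice made here for determinism; the function name `consensus` is hypothetical.

```python
from collections import Counter

def consensus(seqs):
    """Return the per-position most common symbol across aligned sequences."""
    if not seqs:
        return ""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*seqs):
        counts = Counter(column)
        top = max(counts.values())
        # Tie-break alphabetically (illustrative assumption).
        out.append(min(sym for sym, n in counts.items() if n == top))
    return "".join(out)
```

For the aligned inputs ACGT, ACGA and TCGT, the majority symbol in each column gives ACGT.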
FASTA format
file format used to store nucleotide or amino acid sequences
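A FASTA file consists of records that each begin with a header line starting with ">" followed by one or more lines of sequence. A minimal reader, assuming the whole file fits in memory and using a hypothetical function name `parse_fasta`, might look like this:

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples.

    The header is returned without the leading '>'; wrapped sequence
    lines are joined. Lines before the first header are ignored.
    """
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

Returning a list rather than a dictionary keeps the record order and tolerates duplicate headers.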

ExPASy
xenobiology
Xenobiology (XB) is a subfield of synthetic biology, the study of synthesizing and manipulating biological devices and systems. The name "xenobiology" derives from the Greek word xenos, which means "stranger, alien". Xenobiology is a form of biology that is not (yet) familiar to science and is not found in nature. In practice, it describes novel biological systems and biochemistries that differ from the canonical DNA–RNA-20 amino acid system (see central dogma of molecular biology). For example, instead of DNA or RNA, the field of xenobiology explores nucleic acid analogues, termed xeno nuclei
HomoloGene
HomoloGene, a tool of the United States National Center for Biotechnology Information (NCBI), is a system for automated detection of homologs (similarity attributable to descent from a common ancestor) among the annotated genes of several completely sequenced eukaryotic genomes.
Structural genomics
area of genetic research
gene prediction
process of identifying the regions of genomic DNA
brain mapping
imaging techniques used to colocalize sites of brain functions or physiological activity with brain structures
FASTQ format
nucleic acid sequence and base qualities file format
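Each FASTQ record commonly spans four lines: a header starting with "@", the sequence, a separator line starting with "+", and a quality string with one character per base. The sketch below assumes unwrapped four-line records and the Phred+33 quality encoding; both are assumptions about the input, and `parse_fastq` is a hypothetical name.

```python
def parse_fastq(text, offset=33):
    """Parse FASTQ text into (name, sequence, qualities) tuples.

    Qualities are decoded as ord(char) - offset (Phred+33 by default).
    """
    lines = text.strip().splitlines()
    if lines and len(lines) % 4 != 0:
        raise ValueError("expected four lines per record")
    records = []
    for k in range(0, len(lines), 4):
        header, seq, plus, qual = lines[k:k + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed record at line %d" % (k + 1))
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        records.append((header[1:], seq, [ord(c) - offset for c in qual]))
    return records
```

Under Phred+33, the character "I" decodes to a quality of 40 and "!" to 0.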
1000 Genomes Project
international research effort
distance matrix
square matrix (two-dimensional array) containing the distances, taken pairwise, between the elements of a set. Depending upon the application involved, the distance being used to define this matrix may or may not be a metric
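As a small illustration, the matrix can be built by evaluating a distance function on every pair of elements. The sketch below uses the Hamming distance between equal-length sequences as an example distance and fills only one triangle, which assumes the distance is symmetric with a zero diagonal; the function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def distance_matrix(items, dist=hamming):
    """Return the square matrix of pairwise distances as nested lists.

    Assumes dist(x, x) == 0 and dist(x, y) == dist(y, x).
    """
    n = len(items)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = dist(items[i], items[j])
    return matrix
```

For the sequences ACGT, ACGA and TCGA, the result is [[0, 1, 2], [1, 0, 1], [2, 1, 0]]. Any other pairwise function can be passed as `dist`, whether or not it satisfies the triangle inequality.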

Critical Assessment of protein Structure Prediction
Figure: A target structure (ribbons) and 354 template-based predictions superimposed (gray Calpha backbones); from CASP8.
Critical Assessment of Structure Prediction (CASP), sometimes called Critical Assessment of Protein Structure Prediction, is a community-wide, worldwide experiment for protein structure prediction that has taken place every two years since 1994. CASP provides research groups with an opportunity to objectively test their structure prediction methods and delivers an independent assessment of the state of the art in protein structure modeling to the research community and sof
attack rate
percentage of an at-risk population that contracts the disease during a specified time interval
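Expressed as arithmetic, the attack rate is the number of new cases divided by the size of the at-risk population, multiplied by 100. A minimal sketch, with a hypothetical function name and argument names:

```python
def attack_rate(new_cases, population_at_risk):
    """Return the attack rate as a percentage of the at-risk population."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    if not 0 <= new_cases <= population_at_risk:
        raise ValueError("cases must be between 0 and the population at risk")
    return 100.0 * new_cases / population_at_risk
```

For example, 30 new cases among 200 people at risk during the interval gives an attack rate of 15 percent.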
Human Microbiome Project
former research initiative
amino acid sequence
any continuous part of a peptide/protein
sequence analysis
process of analysis of one or more known biological sequences

interactome
In molecular biology, an interactome is the whole set of molecular interactions in a particular cell. The term specifically refers to physical interactions among molecules (such as those among proteins, also known as protein–protein interactions (PPIs), or between small molecules and proteins) but can also describe sets of indirect interactions among genes (genetic interactions).
Figure: Part of the DISC1 interactome, with genes represented by text in boxes and interactions noted by lines between the genes. From Hennah and Porteous, 2009.
The word "interactome" was originally coined

ontology engineering
field which studies the methods and methodologies for building ontologies, which are formal representations of a set of concepts within a domain and the relationships between those concepts

virtual screening
academic discipline in cheminformatics
computational genomics
probabilistic context-free grammar
Grammar model in linguistics
biological network
networks found in ecological, evolutionary, and physiological contexts
homology modeling
method of protein structure prediction

synteny
Figure: Synteny (in the modern sense) between human and mouse chromosomes. Colors in the human chromosomes indicate regions homologous with parts of the mouse chromosome of the same color. For instance, sequences homologous to mouse chromosome 1 are primarily on human chromosomes 1 and 2, but also 6, 8, and 18. The X chromosome is almost completely syntenic in both species.
In genetics, the term synteny refers to two related concepts:
In classical genetics, synteny describes the physical co-localization of genetic loci on the same chromosome within an individual or species.
In geno
BLOSUM
Figure: The BLOSUM62 matrix. The amino acids have been grouped and coloured based on Margaret Dayhoff's classification scheme. Positive and zero values have been highlighted.
In bioinformatics, the BLOSUM (BLOcks SUbstitution Matrix) matrix is a substitution matrix used for sequence alignment of proteins. BLOSUM matrices are used to score alignments between evolutionarily divergent protein sequences. They are based on local alignments. BLOSUM matrices were first introduced in a paper by Steven Henikoff and Jorja Henikoff. They scanned the BLOCKS database for very conserved