Category

Bioinformatics


Figures: early bioinformatics, a computational alignment of experimentally determined sequences of a class of related proteins; map of the human X chromosome (from the National Center for Biotechnology Information (NCBI) website).
Bioinformatics is an interdisciplinary field of science that develops computational methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics integrates principles from biology, chemistry, physics, computer science, data science, computer p

Human Genome Project

research program for sequencing the human genome

Biostatistics (sometimes referred to as biometry) is a branch of statistics that applies statistical methods to a wide range of topics in the biological sciences, with a focus on clinical medicine and public health applications. The field encompasses the design of experiments, the collection and analysis of experimental and observational data, and the interpretation of the results. It is closely related to medical statistics.

Enzyme Commission number

hierarchical classification ID given to enzymes, identifying their membership in families

systems biology

computational and mathematical modeling of complex biological systems

Protein Data Bank

international open access database of protein and nucleic acid structures

Figure: Giant axons of the longfin inshore squid (Doryteuthis pealeii) were crucial for scientists to understand the action potential.

synthetic biology

interdisciplinary branch of biology and engineering

Folding@home (FAH or F@h) is a distributed computing project that aims to help scientists develop new therapeutics for a variety of diseases by simulating protein dynamics. This includes the process of protein folding and the movements of proteins, and relies on simulations run on volunteers' personal computers. Folding@home is currently based at the University of Pennsylvania and led by Greg Bowman, a former student of Vijay Pande.

use of a large set of oligonucleotide probes

Figure: In metagenomics, the genetic materials (DNA, C) are extracted directly from samples taken from the environment (e.g. soil, sea water, human gut, A) after filtering (B), and are sequenced (E) after multiplication by cloning (D) in an approach called shotgun sequencing. These short sequences can then be put together again using assembly methods (F) to deduce the individual genomes or parts of genomes that constitute the original environmental sample. This information can then be used to study the species diversity and functional potential of the microbial community of the env

The GenBank sequence database is an open access, annotated collection of all publicly available nucleotide sequences and their protein translations. It is produced and maintained by the National Center for Biotechnology Information (NCBI; a part of the National Institutes of Health in the United States) as part of the International Nucleotide Sequence Database Collaboration (INSDC).

open reading frame

section of DNA, of variable length, delimited by a start codon and a stop codon
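As a minimal sketch of this definition, the following Python function scans the three forward reading frames of a DNA string for stretches that begin with ATG and run to the first in-frame stop codon. The name `find_orfs`, the `min_codons` parameter, and the decision to ignore the reverse strand are assumptions made for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) tuples for ORFs on the forward strand.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    the start and stop codons as well (an illustrative convention).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None  # close the ORF and look for the next ATG
    return orfs
```

For example, `find_orfs("CCATGAAATAGCC")` reports a single ORF in the third frame, `ATGAAATAG`, spanning positions 2 to 11.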

computational biology

application of data-analytical and theoretical methods, mathematical modeling, and computational simulation techniques to the study of biological, behavioral, and social systems

UniProt is a freely accessible database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature. It is maintained by the UniProt consortium, which consists of several European bioinformatics organisations and a foundation from Washington, DC, US.

hidden Markov model

statistical Markov model

molecular modelling

discovering chemical properties by physical simulations

Foundational Model of Anatomy

ontology for the domain of human anatomy

statistical graphic of data in a 2D matrix represented as colors

Ensembl genome database project

gene sequence database

sensitivity and specificity

statistical measures of the performance of a binary classification test
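A small sketch of how these two measures are computed from paired true and predicted binary labels: sensitivity is the fraction of actual positives that test positive, and specificity is the fraction of actual negatives that test negative. The function name and the choice to return NaN when a class is absent are illustrative assumptions.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for paired binary labels.

    Truthy values are treated as the positive class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1      # true positive
        elif truth and not pred:
            fn += 1      # false negative
        elif not truth and not pred:
            tn += 1      # true negative
        else:
            fp += 1      # false positive
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

With `y_true = [1, 1, 1, 0, 0, 0, 0, 1]` and `y_pred = [1, 0, 1, 0, 0, 1, 0, 1]`, three of four positives and three of four negatives are classified correctly, so both measures are 0.75.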

Figure: Cover of Ribofunk by Paul Di Filippo, a seminal biopunk story collection.
Biopunk (a portmanteau of "biotechnology" or "biology" and "punk") is a subgenre of science fiction that focuses on biotechnology. It is derived from cyberpunk, but focuses on the implications of biotechnology rather than mechanical cyberware and information technology. Biopunk is concerned with synthetic biology and often involves bio-hackers, biotech megacorporations, and oppressive organizations that engineer DNA. Most often keeping with the dark atmosphere of cyberpunk,

molecular docking

attempt to predict the structure of the intermolecular complex formed between two or more molecules

Pfam is a database of protein families that includes their annotations and multiple sequence alignments generated using hidden Markov models. The latest version of Pfam, 37.0, was released in June 2024 and contains 21,979 families. It is currently provided through the InterPro website.

whole genome sequencing

sequencing all the DNA of an individual at once

Figure: General schema showing the relationships of the genome, transcriptome, proteome, and metabolome (lipidome, glycome).

group of proteins that share a common evolutionary origin, reflected by similarity in their sequence

Figure: Size of genera in the extinct bird family Confuciusornithidae, compared to a human (1.75 meters tall). A. Changchengornis, based on the holotype. B. Confuciusornis, based on several specimens of about the same size. C. Eoconfuciusornis, based on the holotype IVPP V11977.
Figure: Measuring shell length in bog turtles.

protein structure prediction

constructing an atomic-resolution model of a protein from its amino acid sequence

shotgun sequencing

method for sequencing random DNA strands

multiple sequence alignment

alignment of more than two molecular sequences

Figure: Hundreds of gel drops are visible on the biochip.

precision and recall

measures of relevance in pattern recognition and information retrieval
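In the information-retrieval setting, both measures can be sketched with plain sets: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The function name and the NaN convention for empty inputs are assumptions for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set and a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else float("nan")
    recall = hits / len(relevant) if relevant else float("nan")
    return precision, recall
```

For instance, retrieving `{"a", "b", "c", "d"}` when the relevant items are `{"a", "c", "e"}` gives two hits, so precision is 2/4 and recall is 2/3.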

neuroinformatics

Neuroinformatics is the emergent field that combines informatics and neuroscience. Neuroinformatics is concerned with neuroscience data and with information processing by artificial neural networks. It is applied in three main directions: the development of computational models of the nervous system and neural processes; the development of tools for analyzing and modeling neuroscience data; and the development of tools and databases for management and sharing of neuroscience data at all levels of analysis.

GISAID, the Global Initiative on Sharing All Influenza Data, previously the Global Initiative on Sharing Avian Influenza Data, is a global science initiative established in 2008 to provide access to genomic data of influenza viruses. The database was expanded to include the coronavirus responsible for the COVID-19 pandemic, as well as other pathogens. The database has been described as "the world's largest repository of COVID-19 sequences". GISAID facilitates genomic epidemiology and real-time surveillance to monitor the emergence of new COVID-19 viral strains across the planet.

consensus sequence

most common variant of a genetic sequence across samples
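One simple reading of this definition, for sequences that are already aligned to the same length, is a column-by-column majority vote. The sketch below assumes exactly that; the function name is hypothetical, and ties are broken by whichever symbol appears first in the column, which is an arbitrary choice.

```python
from collections import Counter

def consensus(sequences):
    """Return the per-column most common symbol of equal-length sequences."""
    if not sequences:
        raise ValueError("no sequences given")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    # zip(*sequences) yields one tuple of symbols per alignment column.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )
```

Given `["ACGT", "ACGA", "ATGT"]`, the majority symbol in each column yields `ACGT`.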

file format used to store nucleotide or amino acid sequences

Figure: Expasy logo (2020).

Xenobiology (XB) is a subfield of synthetic biology, the study of synthesizing and manipulating biological devices and systems. The name "xenobiology" derives from the Greek word xenos, which means "stranger, alien". Xenobiology is a form of biology that is not (yet) familiar to science and is not found in nature. In practice, it describes novel biological systems and biochemistries that differ from the canonical DNA–RNA-20 amino acid system (see central dogma of molecular biology). For example, instead of DNA or RNA, the field of xenobiology explores nucleic acid analogues, termed xeno nuclei

HomoloGene, a tool of the United States National Center for Biotechnology Information (NCBI), is a system for automated detection of homologs (similarity attributable to descent from a common ancestor) among the annotated genes of several completely sequenced eukaryotic genomes.

Structural genomics

area of genetic research

gene prediction

process of identifying the regions of genomic DNA that encode genes

imaging techniques used to colocalize sites of brain functions or physiological activity with brain structures

nucleic acid sequence and base qualities file format

1000 Genomes Project

international research effort

distance matrix

square matrix (two-dimensional array) containing the distances, taken pairwise, between the elements of a set. Depending upon the application involved, the distance being used to define this matrix may or may not be a metric
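As an illustration, the sketch below builds such a matrix for a list of equal-length sequences using the Hamming distance (the number of mismatching positions). The helper names are hypothetical, and filling only the upper triangle assumes the supplied distance is symmetric with a zero diagonal, which need not hold for every application.

```python
def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def distance_matrix(items, dist=hamming):
    """Return an n x n list of lists of pairwise distances.

    Assumes dist(x, y) == dist(y, x) and dist(x, x) == 0.
    """
    n = len(items)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = dist(items[i], items[j])
            matrix[i][j] = matrix[j][i] = d  # mirror across the diagonal
    return matrix
```

For `["ACGT", "ACGA", "TCGA"]` the result is `[[0, 1, 2], [1, 0, 1], [2, 1, 0]]`.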

Critical Assessment of protein Structure Prediction

Figure: A target structure (ribbons) and 354 template-based predictions superimposed (gray Calpha backbones); from CASP8.
Critical Assessment of Structure Prediction (CASP), sometimes called Critical Assessment of Protein Structure Prediction, is a community-wide, worldwide experiment for protein structure prediction that has taken place every two years since 1994. CASP provides research groups with an opportunity to objectively test their structure prediction methods and delivers an independent assessment of the state of the art in protein structure modeling to the research community and sof

percentage of an at-risk population that contracts the disease during a specified time interval
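The arithmetic behind this description is a simple ratio. A minimal sketch, with a hypothetical function name and made-up example counts:

```python
def percent_affected(new_cases, population_at_risk):
    """Percentage of the at-risk population that contracted the disease
    during the interval over which new_cases were counted."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    if not 0 <= new_cases <= population_at_risk:
        raise ValueError("new_cases must lie between 0 and population_at_risk")
    return 100 * new_cases / population_at_risk
```

If 30 of 1,200 at-risk people fall ill during the interval, the function returns 2.5 (percent).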

Human Microbiome Project

former research initiative

amino acid sequence

any continuous part of a peptide/protein

sequence analysis

process of analysis of one or more known biological sequences

Figure: Part of the DISC1 interactome with genes represented by text in boxes and interactions noted by lines between the genes. From Hennah and Porteous, 2009.
In molecular biology, an interactome is the whole set of molecular interactions in a particular cell. The term specifically refers to physical interactions among molecules (such as those among proteins, also known as protein–protein interactions (PPIs), or between small molecules and proteins) but can also describe sets of indirect interactions among genes (genetic interactions). The word "interactome" was originally coined

ontology engineering

field which studies the methods and methodologies for building ontologies, which are formal representations of a set of concepts within a domain and the relationships between those concepts

virtual screening

academic discipline in cheminformatics

computational genomics

probabilistic context-free grammar

Grammar model in linguistics

biological network

networks found in ecological, evolutionary, and physiological contexts

homology modeling

method of protein structure prediction

Figure: Synteny (in the modern sense) between human and mouse chromosomes. Colors in the human chromosomes indicate regions homologous with parts of the mouse chromosome of the same color. For instance, sequences homologous to mouse chromosome 1 are primarily on human chromosomes 1 and 2, but also 6, 8, and 18. The X chromosome is almost completely syntenic in both species.
In genetics, the term synteny refers to two related concepts: In classical genetics, synteny describes the physical co-localization of genetic loci on the same chromosome within an individual or species. In geno

Figure: The BLOSUM62 matrix. The amino acids have been grouped and coloured based on Margaret Dayhoff's classification scheme. Positive and zero values have been highlighted.
In bioinformatics, the BLOSUM (BLOcks SUbstitution Matrix) matrix is a substitution matrix used for sequence alignment of proteins. BLOSUM matrices are used to score alignments between evolutionarily divergent protein sequences. They are based on local alignments. BLOSUM matrices were first introduced in a paper by Steven Henikoff and Jorja Henikoff. They scanned the BLOCKS database for very conserved