Category

Data mining

the process of extracting and discovering patterns in large data sets

cluster analysis

task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters)
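
As a rough illustration of grouping by similarity, the sketch below runs a one-dimensional k-means loop. k-means is only one of many clustering techniques and is chosen here purely for brevity; the function name `kmeans_1d`, the sample values, and the starting centers are all hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Alternate between assigning values to the nearest center and re-averaging."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Similarity here is plain absolute distance (an assumption).
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

With these inputs the values near 1 to 2 end up in one group and the values near 9 to 11 in the other, which matches the definition: members of a cluster are closer to each other than to members of the other cluster.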

anomaly detection

identification of rare items, events, or observations that raise suspicion by differing significantly from what is expected or from the majority of the data
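
One simple way to make "differing significantly from the majority" concrete is a z-score rule, sketched below. This is an illustrative assumption rather than a standard prescribed by the entry: the function name `zscore_outliers`, the threshold of 2.0, and the sample readings are hypothetical.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(zscore_outliers(readings))
```

A z-score rule assumes roughly bell-shaped data and is itself distorted by extreme values, so it is best read as a baseline rather than a robust detector.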

receiver operating characteristic

performance of a binary classifier system as its discrimination threshold is varied
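
The idea of varying the discrimination threshold can be sketched as follows: for each candidate threshold, count how many positives and negatives score at or above it, and record the resulting false positive rate and true positive rate. The function name `roc_points` and the sample scores and labels are hypothetical, and the sketch assumes both classes are present.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    # Lower the threshold through each distinct score, highest first.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


scores = [0.9, 0.8, 0.6, 0.4, 0.2]  # classifier outputs
labels = [1, 1, 0, 1, 0]            # 1 = positive class, 0 = negative class
for fpr, tpr in roc_points(scores, labels):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting these pairs with the false positive rate on the horizontal axis gives the ROC curve; the curve always runs from (0, 0) to (1, 1).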

association rule learning

method for discovering interesting relations between variables in databases
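
Two measures commonly used to judge whether a relation between items is interesting are support and confidence. The sketch below computes both over a toy list of transactions; the function names and the shopping-basket data are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(transactions, {"bread", "milk"}))
print(confidence(transactions, {"bread"}, {"milk"}))
```

Here the rule "bread implies milk" holds in two of the three baskets that contain bread. Note that `confidence` divides by the antecedent's support, so it assumes the antecedent occurs in at least one transaction.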

document classification

problem in library science, information science and computer science

automatic summarization

computer-based method for shortening a text

nearest neighbor search

optimization problem of finding the point in a given set that is closest (or most similar) to a given point; a form of proximity search in a metric space
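
The most direct way to solve this problem is a linear scan that measures the distance from the query to every candidate. The sketch below assumes Euclidean distance and Python 3.8 or later (for `math.dist`); the function name `nearest_neighbor` and the sample points are hypothetical.

```python
import math


def nearest_neighbor(points, query):
    """Return the point in `points` with the smallest Euclidean distance to `query`."""
    return min(points, key=lambda p: math.dist(p, query))


points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
print(nearest_neighbor(points, (0.9, 1.2)))
```

A linear scan costs one distance computation per stored point for every query. Spatial indexes such as k-d trees exist to reduce that cost on large sets, at the price of building and maintaining the index.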

in machine learning, individual measurable property or characteristic of a phenomenon being observed

data pre-processing

manipulation of data before it is analyzed

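local outlier factor

anomaly detection algorithm that scores each point by how much its local density deviates from that of its neighbors

data dredging

use of data mining to uncover patterns in data that can be presented as statistically significant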

formal concept analysis

a rigorous method of deriving an ontology from a collection of objects and their properties

affinity analysis

market research and business management technique

embedding of data within a manifold based on a similarity function

astrostatistics

Astrostatistics is a discipline which spans astrophysics, statistical analysis and data mining. It is used to process the vast amount of data produced by automated scanning of the cosmos, to characterize complex datasets, and to link astronomical data to astrophysical theory. Many branches of statistics are involved in astronomical analysis including nonparametrics, multivariate regression and multivariate classification, time series analysis, and especially Bayesian inference. The field is closely related to astroinformatics.

social profiling

process of constructing a social media user's profile from their social data. More generally, profiling refers to the data science process of generating a person's profile with computerized algorithms and technology.

sequence mining

data mining technique

simplified example problem used for research or exposition

web intelligence

area of scientific research and development

information delivered immediately after collection

Bibliomining is the use of a combination of data mining, data warehousing, and bibliometrics for the purpose of analyzing library services. The term was created in 2003 by Scott Nicholson, Assistant Professor, Syracuse University School of Information Studies, in order to distinguish data mining in a library setting from other types of data mining.

PatientsLikeMe (PLM) is an integrated community, health management, and real-world data platform. The platform currently has over 830,000 members who are dealing with more than 2,900 conditions, such as ALS, MS, and epilepsy. Data generated by patients themselves are collected and quantified with the goal of providing an environment for peer support and learning. These data capture the influences of different lifestyle choices, socio-demographics, conditions and treatments on a person's health.

change of statistical properties over time

outline of machine learning

Wikimedia list article