Category

Data mining

data mining
the process of extracting and discovering patterns in large data sets
cluster analysis
task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters)
anomaly detection
identification of rare items, events, or observations that raise suspicion by differing significantly from the expected pattern or from the majority of the data
receiver operating characteristic
graphical plot showing the performance of a binary classifier system as its discrimination threshold is varied
association rule learning
method for discovering interesting relations between variables in databases
document classification
problem in library science, information science, and computer science of assigning a document to one or more classes or categories
automatic summarization
computer-based method for shortening a text
nearest neighbor search
optimization problem of finding the point in a given set that is closest (or most similar) to a given point; a form of proximity search in a metric space
feature
in machine learning, individual measurable property or characteristic of a phenomenon being observed
data pre-processing
manipulation of data before it is analyzed
local outlier factor
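anomaly detection algorithm that compares the local density of a point with the local densities of its neighbors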
data dredging
misuse of data mining to uncover patterns in data that can be presented as statistically significant, at the cost of a greatly increased risk of false positives
formal concept analysis
a rigorous method of deriving an ontology from a collection of objects and their properties
affinity analysis
market research and business management technique
latent space
embedding of data within a manifold based on a similarity function
astrostatistics
Astrostatistics is a discipline which spans astrophysics, statistical analysis and data mining. It is used to process the vast amount of data produced by automated scanning of the cosmos, to characterize complex datasets, and to link astronomical data to astrophysical theory. Many branches of statistics are involved in astronomical analysis including nonparametrics, multivariate regression and multivariate classification, time series analysis, and especially Bayesian inference. The field is closely related to astroinformatics.
social profiling
process of constructing a social media user's profile from their social data; more generally, profiling is the data science process of generating a person's profile with computerized algorithms and technology
sequence mining
data mining technique for finding statistically relevant patterns in data whose values are delivered in a sequence
toy problem
simplified example problem used for research or exposition
ROUGE
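set of metrics (Recall-Oriented Understudy for Gisting Evaluation) for evaluating automatic summarization and machine translation output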
web intelligence
area of scientific research and development
real-time data
information delivered immediately after collection
bibliomining
Bibliomining is the use of a combination of data mining, data warehousing, and bibliometrics for the purpose of analyzing library services. The term was created in 2003 by Scott Nicholson, Assistant Professor, Syracuse University School of Information Studies, in order to distinguish data mining in a library setting from other types of data mining.
PatientsLikeMe
PatientsLikeMe (PLM) is an integrated community, health management, and real-world data platform. The platform currently has over 830,000 members who are dealing with more than 2,900 conditions, such as ALS, MS, and epilepsy. Data generated by patients themselves are collected and quantified with the goal of providing an environment for peer support and learning. These data capture the influences of different lifestyle choices, socio-demographics, conditions and treatments on a person's health.
concept drift
change of statistical properties over time
outline of machine learning
Wikimedia list article