computerized information extraction from images
Computer vision is technology that allows computers to automatically extract and understand information from images and videos, much as human eyes and brains work together. It matters because it lets machines recognize faces, read text, detect objects, and analyze visual data at scale, capabilities that are increasingly important in fields ranging from medicine to autonomous vehicles to security.
AI-generated from the Wikipedia summary — may contain errors.
Computer vision tasks include methods for acquiring, processing, analyzing, and understanding digital images, and for extracting high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the form of decisions. "Understanding" in this context signifies the transformation of visual images into descriptions of the world that make sense to thought processes and can elicit appropriate action. This image understanding can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory.
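As a rough illustration of the acquire, process, analyze, and understand stages described above, the following minimal sketch runs a toy pipeline on a synthetic grayscale image. Everything in it is an assumption made for illustration: the function names (`acquire`, `threshold`, `bounding_box`, `decide`), the 6x6 test frame, the threshold level of 128, and the minimum area of 4 pixels are all hypothetical, and real systems use far richer models than a single threshold.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]          # grayscale image: rows of 0-255 intensities
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def acquire() -> Image:
    """Stand-in for a camera: a dark 6x6 frame with a bright 2x3 patch."""
    img = [[10] * 6 for _ in range(6)]
    for r in range(2, 4):
        for c in range(1, 4):
            img[r][c] = 200
    return img


def threshold(img: Image, level: int = 128) -> List[List[bool]]:
    """Processing: turn intensities into a foreground/background mask."""
    return [[px > level for px in row] for row in img]


def bounding_box(mask: List[List[bool]]) -> Optional[Box]:
    """Analysis: a numerical description of where the foreground lies."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, on in enumerate(row) if on]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


def decide(box: Optional[Box], min_area: int = 4) -> str:
    """Understanding: map the numbers to a symbolic decision."""
    if box is None:
        return "no object"
    top, left, bottom, right = box
    area = (bottom - top + 1) * (right - left + 1)
    return "object present" if area >= min_area else "noise"


mask = threshold(acquire())
box = bounding_box(mask)
decision = decide(box)
print(box, decision)  # (2, 1, 3, 3) object present
```

The point of the sketch is the direction of the data flow: raw pixel values become a mask, the mask becomes a handful of numbers, and the numbers become a symbolic label that a downstream system could act on.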
The scientific discipline of computer vision is concerned with the theory behind artificial systems that extract information from images. Image data can take many forms, such as video sequences, views from multiple cameras, multi-dimensional data from a 3D scanner, 3D point clouds from LiDAR sensors, or data from medical scanning devices. The technological discipline of computer vision seeks to apply its theories and models to the construction of computer vision systems.
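To make the variety of image data more concrete, the sketch below builds tiny placeholder containers for several of the forms listed above and prints their dimensions. The axis orderings shown (for example frames x height x width x channels for video) are one common convention assumed here for illustration, not a standard the source prescribes, and the helper names `shape` and `zeros` are hypothetical.

```python
def zeros(*dims):
    """Build a regularly nested list of zeros with the given dimensions."""
    if len(dims) == 1:
        return [0] * dims[0]
    return [zeros(*dims[1:]) for _ in range(dims[0])]


def shape(data):
    """Return the dimensions of a regularly nested list, outermost first."""
    dims = []
    while isinstance(data, list) and data:
        dims.append(len(data))
        data = data[0]
    return tuple(dims)


# Assumed layouts, chosen only for illustration:
video = zeros(2, 4, 3, 3)      # frames x height x width x RGB channels
stereo_pair = zeros(2, 4, 3)   # cameras x height x width (grayscale)
scan_volume = zeros(5, 4, 3)   # slices x height x width (e.g. a medical scan)
point_cloud = [[0.0, 0.0, 0.0],
               [1.0, 0.5, 2.0]]  # N points x (x, y, z)

for name, data in [("video", video), ("stereo_pair", stereo_pair),
                   ("scan_volume", scan_volume), ("point_cloud", point_cloud)]:
    print(name, shape(data))
```

Note that the point cloud differs in kind from the others: it is an unordered set of coordinates rather than a regular grid of samples, which is one reason different data forms often call for different processing methods.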