
process of encoding information using fewer bits than the original representation
Data compression encodes information using fewer bits than its original representation, which makes files smaller. Smaller files take up less storage space and transfer faster over a network, saving time and resources.
In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.
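The difference between lossless and lossy compression can be illustrated with a minimal Python sketch. It assumes the standard-library zlib module as the lossless encoder and decoder; the function names encode, decode, and lossy_quantize are hypothetical and chosen only for this illustration, and the quantizer is a deliberately crude stand-in for real lossy codecs.

```python
import zlib


def encode(data: bytes) -> bytes:
    """Lossless encoder: DEFLATE removes statistical redundancy."""
    return zlib.compress(data, 9)


def decode(blob: bytes) -> bytes:
    """Lossless decoder: restores the original bytes exactly."""
    return zlib.decompress(blob)


def lossy_quantize(samples: list[int], step: int) -> list[int]:
    """Toy lossy step (an assumption for illustration): round each sample
    down to a multiple of `step`, discarding the low-order detail."""
    return [(s // step) * step for s in samples]


# Highly redundant input compresses well and round-trips bit for bit.
original = b"abcabcabc" * 100
blob = encode(original)
restored = decode(blob)
print(len(original), len(blob), restored == original)

# After quantization the original samples cannot be recovered.
samples = [0, 3, 7, 13, 18]
print(lossy_quantize(samples, 8))
```

In the lossless case the decoder output equals the input exactly. In the lossy case several distinct inputs map to the same output, so the discarded detail is gone for good.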
The process of reducing the size of a data file is often referred to as data compression. In the context of data transmission, it is called source coding: encoding is done at the source of the data before it is stored or transmitted. Source coding should not be confused with channel coding, which is used for error detection and correction, or with line coding, which is the means for mapping data onto a signal.
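The separation between source coding and channel coding can be sketched as two independent stages. The example below is only an illustration under stated assumptions: zlib stands in for the source coder, and a CRC-32 checksum appended to the payload stands in for channel coding (it detects errors but cannot correct them). All function names are hypothetical, and line coding is not modeled.

```python
import zlib


def source_encode(message: bytes) -> bytes:
    """Source coding: remove redundancy before storage or transmission."""
    return zlib.compress(message)


def source_decode(payload: bytes) -> bytes:
    """Inverse of source_encode."""
    return zlib.decompress(payload)


def channel_encode(payload: bytes) -> bytes:
    """Stand-in for channel coding: append a 4-byte CRC-32 so the
    receiver can detect corruption (detection only, no correction)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def channel_decode(frame: bytes) -> bytes:
    """Verify the CRC-32 and strip it; raise if the frame was damaged."""
    payload, tail = frame[:-4], frame[-4:]
    if zlib.crc32(payload) != int.from_bytes(tail, "big"):
        raise ValueError("checksum mismatch: frame was corrupted")
    return payload


message = b"source coding comes first, channel coding second. " * 20
frame = channel_encode(source_encode(message))

# A clean frame passes the check and decompresses to the original message.
received = source_decode(channel_decode(frame))
print(received == message)

# Flipping a single bit in transit is caught by the checksum stage.
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]
try:
    channel_decode(damaged)
    detected = False
except ValueError:
    detected = True
print(detected)
```

The source coder makes the data smaller, while the channel stage deliberately adds a few bytes back so that errors can be noticed. The two stages serve opposite purposes, which is why they are kept distinct.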