[Figure: A heavily redacted page from a 2004 lawsuit filed by the ACLU, American Civil Liberties Union v. Ashcroft.]
Redaction or sanitization is the process of removing sensitive information from a document so that it may be distributed to a broader audience. It is intended to allow the selective disclosure of information. Typically, the result is a document that is suitable for publication or for dissemination to people other than the intended audience of the original document.
When the intent is secrecy protection, such as in dealing with classified information, redaction attempts to reduce the document's classification level, possibly yielding an unclassified document. When the intent is privacy protection, it is often called data anonymization. Originally, the term sanitization was applied to printed documents; it has since been extended to apply to computer files and the problem of data remanence.