Data Security and Privacy Cheat Sheet

The core ideas of Data Security and Privacy distilled into a single, scannable reference — perfect for review or quick lookup.

PiqCue — piqcue.com/data-security-privacy/cheatsheet

Quick Reference

Symmetric Encryption

Symmetric encryption uses the same secret key for both encryption and decryption. It is fast and efficient for encrypting large volumes of data, making it the standard for bulk data encryption. The primary challenge is securely distributing the shared key to all authorized parties. AES (Advanced Encryption Standard) with 128-bit or 256-bit keys is the most widely used symmetric algorithm.

Asymmetric Encryption

Asymmetric encryption uses a mathematically related pair of keys: a public key (shared openly) for encryption and a private key (kept secret) for decryption. It solves the key distribution problem of symmetric encryption but is computationally slower, so in practice it is typically used to exchange or wrap a symmetric key rather than to encrypt bulk data. RSA and Elliptic Curve Cryptography (ECC) are common algorithms. Asymmetric cryptography also underpins digital signatures (the private key signs, the public key verifies), certificate-based authentication, and secure key exchange.
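The relationship between the two keys can be sketched with textbook RSA on deliberately tiny numbers. This is a toy illustration only: the primes, exponent, and message are assumed values chosen for readability, there is no padding, and real systems should rely on a vetted cryptography library with keys of 2048 bits or more.

```python
# Toy textbook RSA with tiny primes. Illustration only, NOT secure.
p, q = 61, 53                 # secret primes (assumed toy values)
n = p * q                     # public modulus: 3233
phi = (p - 1) * (q - 1)       # Euler's totient of n: 3120
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent: modular inverse of e (Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer message with the public key (n, e)."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private key (n, d)."""
    return pow(c, d, n)


message = 65
ciphertext = encrypt(message)
recovered = decrypt(ciphertext)
print(ciphertext, recovered)
```

Anyone who knows (n, e) can encrypt, but only the holder of d can decrypt. Recovering d from the public values requires factoring n, which is trivial here and infeasible at real key sizes.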

Hashing

A cryptographic hash function takes an input of any size and produces a fixed-length output (the hash or digest) that is deterministic, one-way (infeasible to reverse to recover the input), and collision-resistant (infeasible to find two different inputs that produce the same hash). Hashing is used for data integrity verification, password storage, and digital signatures. Common algorithms include SHA-256 and SHA-3. For password storage, a general-purpose hash is too fast on its own; use a salted, deliberately slow function such as bcrypt, scrypt, Argon2, or PBKDF2.
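A minimal sketch of both uses with Python's standard `hashlib`, under stated assumptions: the first part shows the deterministic, fixed-length digest used for integrity checks, and the second shows salted password hashing with PBKDF2. The helper names `hash_password` and `verify_password` and the iteration count are illustrative assumptions, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os

# Integrity: the same input always yields the same fixed-length digest,
# and a one-character change yields a completely different digest.
digest_a = hashlib.sha256(b"abc").hexdigest()
digest_b = hashlib.sha256(b"abd").hexdigest()
print(len(digest_a))          # 64 hex characters = 256 bits
print(digest_a == digest_b)   # False


# Password storage: salt each password and use a slow key-derivation function.
# The default iteration count is an assumed value; tune it for your hardware.
def hash_password(password, salt=None, iterations=600_000):
    """Return (salt, derived_key) for storage; a fresh salt is drawn if none is given."""
    salt = salt if salt is not None else os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived


def verify_password(password, salt, expected, iterations=600_000):
    """Recompute the derived key and compare it in constant time."""
    _, derived = hash_password(password, salt, iterations)
    return hmac.compare_digest(derived, expected)
```

Storing the salt next to the derived key is expected; the salt only needs to be unique per password, not secret.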

Data Classification

Data classification is the process of categorizing data based on its sensitivity, value, and regulatory requirements to determine appropriate security controls. Common classification levels include public, internal, confidential, and restricted (or top secret). Classification drives decisions about encryption, access controls, storage requirements, retention policies, and handling procedures.

Personally Identifiable Information (PII)

PII is any information that can be used to identify, contact, or locate a specific individual, either alone or when combined with other data. Direct identifiers include names, Social Security numbers, and biometric data. Quasi-identifiers like ZIP code, date of birth, and gender can identify individuals when combined. PII protection is central to privacy regulations and requires specific security controls.
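To illustrate how quasi-identifiers single people out, the sketch below counts how many records share each combination of values. The records, field names, and the `unique_combinations` helper are hypothetical; any combination that appears only once points to exactly one person even though no name is present.

```python
from collections import Counter

# Hypothetical records with direct identifiers already removed.
records = [
    {"zip": "02139", "birth_year": 1985, "gender": "F"},
    {"zip": "02139", "birth_year": 1985, "gender": "F"},
    {"zip": "02139", "birth_year": 1990, "gender": "M"},
    {"zip": "94105", "birth_year": 1990, "gender": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def unique_combinations(rows, keys=QUASI_IDENTIFIERS):
    """Return quasi-identifier combinations that match exactly one row."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [combo for combo, count in counts.items() if count == 1]


risky = unique_combinations(records)
print(risky)
```

In this sample the first two records hide each other, while the last two are each unique and therefore re-identifiable by anyone who knows those three attributes.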

GDPR (General Data Protection Regulation)

GDPR is a comprehensive data protection law adopted by the European Union in 2016 and enforceable since May 25, 2018. It governs how organizations collect, process, store, and share personal data of individuals in the EU. Key principles include lawful basis for processing, data minimization, purpose limitation, storage limitation, and individual rights (access, rectification, erasure, portability). GDPR applies to organizations that handle this data regardless of where the organization is located, with fines of up to EUR 20 million or 4% of worldwide annual turnover, whichever is higher.
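As an arithmetic illustration of the fine ceiling: for the most serious infringements, the regulation caps fines at the greater of EUR 20 million or 4% of worldwide annual turnover. The function name below is hypothetical, and the sketch computes only the upper bound, not what a regulator would actually impose.

```python
def gdpr_max_fine_eur(annual_turnover_eur: int) -> int:
    """Upper bound of a higher-tier GDPR fine, in whole euros."""
    four_percent = annual_turnover_eur * 4 // 100   # integer math avoids float rounding
    return max(20_000_000, four_percent)


print(gdpr_max_fine_eur(2_000_000_000))   # 80000000: 4% exceeds the EUR 20M floor
print(gdpr_max_fine_eur(100_000_000))     # 20000000: the EUR 20M floor applies
```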

Key Management

Key management encompasses the policies, procedures, and technology for generating, distributing, storing, rotating, revoking, and destroying cryptographic keys throughout their lifecycle. Poor key management can render even the strongest encryption useless. Key management systems (KMS) and hardware security modules (HSMs) provide centralized, secure key storage with access controls and audit logging.
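The lifecycle stages can be sketched as a small in-memory key ring. Everything here (`KeyRecord`, `KeyRing`, the state names) is a hypothetical illustration of generation, rotation, destruction, and audit logging; a real deployment would delegate these operations to a KMS or HSM rather than hold key material in application memory.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class KeyRecord:
    version: int
    material: bytes
    created_at: datetime
    state: str = "active"   # "active", "retired" (decrypt only), or "destroyed"


@dataclass
class KeyRing:
    keys: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def _log(self, event, version):
        self.audit_log.append((datetime.now(timezone.utc), event, version))

    def generate(self):
        """Create a new 256-bit key from a cryptographically secure source."""
        version = len(self.keys) + 1
        now = datetime.now(timezone.utc)
        self.keys[version] = KeyRecord(version, secrets.token_bytes(32), now)
        self._log("generate", version)
        return version

    def rotate(self):
        """Retire the current active key and issue a new one."""
        for record in self.keys.values():
            if record.state == "active":
                record.state = "retired"
                self._log("retire", record.version)
        return self.generate()

    def destroy(self, version):
        """Drop key material. Python cannot guarantee memory is wiped; HSMs can."""
        record = self.keys[version]
        record.material = b""
        record.state = "destroyed"
        self._log("destroy", version)

    def active_key(self):
        return next(r for r in self.keys.values() if r.state == "active")


ring = KeyRing()
v1 = ring.generate()
v2 = ring.rotate()      # v1 becomes decrypt-only, v2 is used for new data
ring.destroy(v1)        # only after all data under v1 has been re-encrypted
```

Keeping retired keys available for decryption until data is re-encrypted is what makes rotation safe; destroying a key too early makes the data it protected unrecoverable.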

Data Lifecycle Management

Data lifecycle management is the practice of governing data from its creation through storage, use, sharing, archival, and eventual destruction. Each stage has specific security requirements: creation requires classification, storage requires encryption and access controls, use requires audit logging, sharing requires data loss prevention, archival requires integrity verification, and destruction requires secure deletion that prevents recovery.

Data Loss Prevention (DLP)

Data Loss Prevention refers to technologies and strategies that detect and prevent the unauthorized transmission of sensitive data outside the organization. DLP systems monitor data in motion (network traffic), data at rest (storage), and data in use (endpoint activities) to enforce policies that prevent accidental or intentional data leakage. They can identify sensitive data using pattern matching, keywords, and machine learning.
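The pattern-matching approach can be sketched as follows. The regular expressions, the `scan` helper, and the sample text are simplified assumptions; production DLP rules are broader and tuned to reduce false positives. The Luhn checksum shown here is one common way to filter out digit strings that merely look like card numbers.

```python
import re

# Simplified, assumed patterns: US SSN format and 13-16 digit card numbers.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:        # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def scan(text: str):
    """Return (kind, matched_text) pairs for sensitive-looking data in `text`."""
    findings = [("ssn", m.group()) for m in SSN_PATTERN.finditer(text)]
    for m in CARD_PATTERN.finditer(text):
        if luhn_valid(m.group()):
            findings.append(("card", m.group()))
    return findings


sample = ("Ship to Jane, SSN 123-45-6789, card 4111 1111 1111 1111, "
          "order 4111111111111112.")
findings = scan(sample)
print(findings)
```

The order number has the right length for a card but fails the checksum, so it is not flagged; a policy engine would then decide whether to block, quarantine, or log each finding.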

Key Terms at a Glance

AES (Advanced Encryption Standard): A symmetric encryption algorithm adopted as a standard by NIST. Uses 128-, 192-, or 256-bit keys and is the most widely used symmetric cipher for data encryption.

RSA: An asymmetric encryption algorithm based on the mathematical difficulty of factoring the product of two large prime numbers. Used for encryption, digital signatures, and key exchange.

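SHA-256: A cryptographic hash function from the SHA-2 family that produces a 256-bit digest. Used for data integrity verification and digital signatures, and as a building block inside password-hashing schemes such as PBKDF2 (a bare SHA-256 hash is too fast to protect passwords on its own).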

PII (Personally Identifiable Information): Any information that can be used to identify a specific individual, including direct identifiers (name, SSN) and quasi-identifiers (ZIP code + birth date) that identify when combined.

GDPR (General Data Protection Regulation): EU regulation governing the collection, processing, and storage of personal data of individuals in the EU. Enforces requirements such as a lawful basis for processing (for example, consent), data minimization, and breach notification, with fines of up to EUR 20 million or 4% of worldwide annual turnover, whichever is higher.

CCPA (California Consumer Privacy Act): California law granting residents the right to know, delete, and opt out of the sale of their personal information, applying to businesses meeting certain revenue or data volume thresholds.

Data Classification: The process of categorizing data by sensitivity level (public, internal, confidential, restricted) to determine appropriate security controls, access permissions, and handling procedures.

Encryption at Rest: Protecting stored data by encrypting it on disk, in databases, or in storage systems, ensuring that data is unreadable without the decryption key even if the storage media is physically accessed.

Key Management System (KMS): A system for managing the lifecycle of cryptographic keys, including generation, distribution, storage, rotation, revocation, and destruction, often using HSMs for secure key storage.

HSM (Hardware Security Module): A tamper-resistant hardware device that securely generates, stores, and manages cryptographic keys and performs cryptographic operations within a protected boundary.

Data Loss Prevention (DLP): Technologies and strategies that detect and prevent unauthorized transmission of sensitive data outside the organization by monitoring data in motion, at rest, and in use.

Differential Privacy: A mathematical framework providing provable privacy guarantees by adding calibrated noise to data analysis results, so that the output changes very little whether or not any one individual's data was included. The strength of the guarantee is set by a privacy parameter (commonly called epsilon).

Pseudonymization: Replacing direct identifiers with artificial pseudonyms while maintaining a separate mapping that can re-identify individuals. Data remains personal data under GDPR.
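Pseudonymization with a separate mapping can be sketched as below. The `Pseudonymizer` class and the `p_` token format are hypothetical; the point is that the working data set carries only random tokens, while the re-identification table is held elsewhere under stricter access controls.

```python
import secrets


class Pseudonymizer:
    """Swap identifiers for random tokens and keep the mapping separately."""

    def __init__(self):
        self._forward = {}   # identifier -> pseudonym
        self._reverse = {}   # pseudonym -> identifier (store apart, restrict access)

    def pseudonymize(self, identifier: str) -> str:
        """Return a stable random token for the identifier."""
        if identifier not in self._forward:
            token = "p_" + secrets.token_hex(8)
            self._forward[identifier] = token
            self._reverse[token] = identifier
        return self._forward[identifier]

    def reidentify(self, pseudonym: str) -> str:
        """Look up the original identifier; only authorized code should call this."""
        return self._reverse[pseudonym]


pseudonymizer = Pseudonymizer()
token = pseudonymizer.pseudonymize("alice@example.com")
print(token != "alice@example.com")                     # True
print(pseudonymizer.reidentify(token))                  # alice@example.com
```

Because the mapping exists, the tokens can still be linked back to a person, which is why pseudonymized data is still treated as personal data.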
