- ACID
- Atomicity, Consistency, Isolation, Durability — the four guarantees of a reliable database transaction.
- Atomicity (ACID)
- A transaction completes fully or not at all — partial changes are rolled back.
- Consistency (ACID)
- A transaction moves the database from one valid state to another, honoring all constraints.
- Isolation (ACID)
- Concurrent transactions don't interfere; each behaves as if it ran alone.
- Durability (ACID)
- Once committed, changes survive crashes and power loss (persisted to non-volatile storage).
- Relational database
- Stores data in tables (relations) of rows and columns with a defined schema; queried with SQL.
- Primary key
- A column (or set) that uniquely identifies each row; cannot be null or duplicated.
- Foreign key
- A column that references another table's primary key, enforcing referential integrity.
- Composite key
- A primary key made up of two or more columns combined.
- Candidate key
- Any column or set of columns that could serve as the primary key (uniquely identifies a row).
- Surrogate key
- An artificial, system-generated key (e.g., auto-increment id) used instead of natural data.
- Natural key
- A key made from real, meaningful data that already exists (e.g., email or SSN).
- Referential integrity
- The rule that every foreign-key value must match an existing primary-key value (no orphans).
- Entity integrity
- The rule that every table has a primary key and no primary-key value is null.
- Schema
- The structure/blueprint of a database: tables, columns, data types, keys, and relationships.
- Tuple / row
- A single record in a table — one set of related values across all columns.
- Attribute / column
- A named field in a table holding one type of value for every row.
- Cardinality
- The nature of a relationship between tables: one-to-one, one-to-many, or many-to-many.
- Normalization
- Organizing tables to reduce redundancy and avoid update/insert/delete anomalies.
- First Normal Form (1NF)
- Each cell holds a single atomic value; no repeating groups; each row is unique.
- Second Normal Form (2NF)
- 1NF and every non-key column depends on the whole primary key (no partial dependencies).
- Third Normal Form (3NF)
- 2NF and no non-key column depends on another non-key column (no transitive dependencies).
- BCNF
- Boyce-Codd Normal Form — a stricter 3NF where every determinant is a candidate key.
- Denormalization
- Deliberately adding redundancy to speed up read-heavy queries, trading off integrity.
- DDL
- Data Definition Language — CREATE, ALTER, DROP, TRUNCATE; defines and changes structure.
- DML
- Data Manipulation Language — SELECT, INSERT, UPDATE, DELETE; reads and changes data.
- DCL
- Data Control Language — GRANT and REVOKE; controls permissions.
- TCL
- Transaction Control Language — COMMIT, ROLLBACK, SAVEPOINT; manages transactions.
- SELECT
- DML statement that retrieves rows from one or more tables.
- INSERT
- DML statement that adds a new row to a table.
- UPDATE
- DML statement that modifies existing rows that match a condition.
- DELETE
- DML statement that removes rows matching a WHERE clause; can be rolled back.
- TRUNCATE
- DDL statement that removes all rows from a table quickly and resets it.
- DROP
- DDL statement that removes an entire table (structure and data) from the database.
- WHERE clause
- Filters rows in a query to those that meet a condition.
- GROUP BY
- Groups rows that share values so aggregate functions apply per group.
- HAVING clause
- Filters groups after GROUP BY (WHERE filters rows before grouping).
- ORDER BY
- Sorts the result set by one or more columns, ascending or descending.
- DISTINCT
- Removes duplicate rows from a SELECT result set.
- INNER JOIN
- Returns only rows with matching values in both joined tables.
- LEFT JOIN
- Returns all rows from the left table plus matches from the right (NULLs where none).
- RIGHT JOIN
- Returns all rows from the right table plus matches from the left (NULLs where none).
- FULL OUTER JOIN
- Returns all rows from both tables, with NULLs where either side has no match.
- CROSS JOIN
- Returns the Cartesian product — every row of one table paired with every row of the other.
- SELF JOIN
- A table joined to itself, often to compare rows within the same table.
- Aggregate function
- Computes a single value over a set of rows: COUNT, SUM, AVG, MIN, MAX.
- COUNT()
- Aggregate function returning the number of rows (or non-null values).
- View
- A saved, named query that acts like a virtual table; simplifies access and can restrict columns.
- Index
- A structure that speeds row lookups by a column's value without scanning the whole table.
- Data redundancy
- The same data stored in more than one place — a source of inconsistency normalization reduces.
- NULL
- A marker for a missing or unknown value — not zero and not an empty string.
- Constraint
- A rule enforced on a column: PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL, CHECK, DEFAULT.
- UNIQUE constraint
- Ensures all values in a column (or set) are different across rows.
- CHECK constraint
- Restricts column values to those that satisfy a boolean condition.
- Data type
- Defines the kind of value a column stores: INT, VARCHAR, DATE, DECIMAL, BOOLEAN, etc.
- VARCHAR vs CHAR
- VARCHAR stores variable-length text; CHAR is fixed-length and padded.
- SQL vs NoSQL
- SQL = structured, fixed-schema relational tables; NoSQL = flexible schema, horizontal scale.
- Document database
- NoSQL store of self-describing documents (e.g., JSON) — example: MongoDB.
- Key-value store
- NoSQL store mapping unique keys to values — example: Redis.
- Column-family store
- NoSQL store organizing data by column families for wide rows — example: Cassandra.
- Graph database
- NoSQL store of nodes and relationships — example: Neo4j.
- ERD
- Entity-Relationship Diagram — a model of entities, attributes, and relationships in a design.
- DBMS
- Database Management System — software that stores, retrieves, and manages databases.
- Stored procedure
- A named, precompiled set of SQL statements executed by name, often with parameters.
- Trigger
- Code that runs automatically in response to an INSERT, UPDATE, or DELETE event.
- Function (SQL)
- A routine that returns a value and can be used inside a query.
- OLTP
- Online Transaction Processing — many short real-time transactions; normalized for fast writes.
- OLAP
- Online Analytical Processing — complex analytical queries over large historical data; often denormalized.
- ETL
- Extract, Transform, Load — moves and reshapes data from sources into a warehouse.
- ELT
- Extract, Load, Transform — loads raw data first, then transforms it in the target system.
- Data warehouse
- A central repository of integrated historical data optimized for analysis and reporting.
- Data lake
- A store of raw data in native format at scale, for varied later processing.
- Data mart
- A subject-focused subset of a data warehouse for a specific team or use.
- Star schema
- A warehouse design: one central fact table linked to denormalized dimension tables.
- Snowflake schema
- A star schema whose dimension tables are normalized into related sub-tables.
- Fact table
- The central table in a warehouse holding measurable, numeric business events.
- Dimension table
- A table of descriptive attributes (who, what, where, when) joined to facts.
- Database deployment
- Installing, configuring, and making a database available in an environment.
- Configuration baseline
- A documented standard config used to deploy databases consistently and securely.
- Infrastructure as code (IaC)
- Managing infrastructure and config as version-controlled files (e.g., Terraform, Ansible).
- Database migration
- Moving a database or its data/schema to a new version, server, or platform.
- Schema migration
- Applying versioned, repeatable changes to a database structure over time.
- On-premises database
- A database running on hardware the organization owns and manages itself.
- Cloud database
- A database hosted/managed in a cloud provider's environment (IaaS, PaaS, or DBaaS).
- DBaaS
- Database as a Service — a fully managed cloud database (provider handles patching, backups, HA).
- IaaS
- Infrastructure as a Service — you manage the OS and DBMS on provider-supplied VMs.
- PaaS
- Platform as a Service — the provider manages the platform; you deploy databases/apps on it.
- Scripting fundamentals
- Using shell/Python/PowerShell + SQL scripts to automate deployment and admin tasks.
- Idempotent script
- A script that produces the same result no matter how many times it runs.
- Connection string
- Parameters an app uses to connect: server, database, credentials, and options.
- ODBC / JDBC
- Standard APIs/drivers that let applications connect to and query databases.
- Containerized database
- A database packaged in a container (e.g., Docker) for portable, repeatable deployment.
- Staging environment
- A production-like environment for testing deployments before go-live.
- Sandbox / dev environment
- An isolated environment for development and experimentation, separate from production.
- Version control (DB)
- Tracking schema and script changes in a repository (e.g., Git) for review and rollback.
- Data import/export
- Bulk loading or extracting data via tools like BCP, COPY, or import/export utilities.
- CSV
- Comma-Separated Values — a plain-text format commonly used to import/export tabular data.
- Database instance
- A running copy of the DBMS engine managing one or more databases in memory and on disk.
- Clustered index
- Defines the physical sort order of a table's rows; only one per table.
- Non-clustered index
- A separate structure pointing back to rows; many allowed per table.
- Composite index
- An index on two or more columns, useful for multi-column lookups and sorting.
- B-tree index
- A balanced-tree structure that keeps lookups fast as data grows.
- Covering index
- An index that contains all columns a query needs, so the table isn't read.
- Index fragmentation
- Disordered index pages that slow reads; fixed by rebuilding or reorganizing.
- Index rebuild
- Recreates an index to remove fragmentation (heavier than a reorganize).
- Index reorganize
- Defragments index pages in place with lower overhead than a rebuild.
- Statistics (DB)
- Metadata about data distribution the optimizer uses to choose a good plan.
- Query optimizer
- The DBMS component that chooses the execution strategy for a SQL statement.
- Execution plan
- The optimizer's step-by-step plan for running a query (scans, seeks, joins, sorts).
- Table scan
- Reading every row of a table — slow on large tables; a sign a useful index is missing.
- Index seek
- Using an index to jump directly to matching rows — efficient lookup.
- Performance tuning
- Improving speed via indexing, query rewriting, statistics, and configuration.
- Query profiling
- Measuring a query's resource use and timing to find bottlenecks.
- Locking
- Preventing conflicting concurrent access to data to preserve isolation.
- Deadlock
- Two transactions each hold a lock the other needs; the DBMS kills one as the victim.
- Lock escalation
- The DBMS converting many fine-grained locks into a coarser one to save resources.
- Concurrency
- Multiple transactions executing at the same time without corrupting data.
- Isolation level
- Sets how visible one transaction's changes are to others (read committed, serializable, etc.).
- Dirty read
- Reading uncommitted changes that may later be rolled back.
- Phantom read
- Rows appearing/disappearing between reads in a transaction due to other inserts/deletes.
- Database job / scheduler
- Automated, scheduled task: backups, index maintenance, statistics updates, cleanup.
- Patching
- Applying vendor updates to the DBMS to fix bugs and close security holes.
- Maintenance plan
- A scheduled set of upkeep tasks: backups, integrity checks, index/statistics maintenance.
- Integrity check (DBCC)
- A consistency check (e.g., DBCC CHECKDB) that detects corruption in a database.
- Capacity planning
- Forecasting storage, memory, and CPU needs to keep performance acceptable as data grows.
- Monitoring
- Continuously tracking health metrics (CPU, memory, I/O, waits, blocking, errors).
- Baseline (performance)
- A record of normal performance used to spot abnormal drift.
- Alerting
- Automated notifications when a metric crosses a threshold (e.g., low disk, high latency).
- Log file (transaction log)
- Records all changes so transactions can be committed, rolled back, and recovered.
- Log truncation
- Reclaiming space in the transaction log after backups, preventing it from filling.
- Partitioning
- Splitting a large table across segments to improve manageability and performance.
- Sharding
- Horizontally splitting data across multiple servers to scale out.
- Archiving
- Moving old, rarely accessed data to cheaper storage to keep active tables lean.
- Vacuum / cleanup
- Reclaiming space from dead rows and updating statistics (e.g., PostgreSQL VACUUM).
- Connection pooling
- Reusing a set of database connections to reduce overhead and improve throughput.
- Resource governor
- A feature that caps CPU/memory per workload to prevent one query starving others.
- CIA triad
- Confidentiality, Integrity, Availability — the three core security goals.
- Confidentiality
- Keeping data accessible only to authorized parties (encryption, access control).
- Integrity
- Ensuring data is accurate and unaltered except by authorized changes.
- Availability
- Ensuring data and the database are accessible when needed.
- Authentication
- Verifying who a user or service is (passwords, keys, MFA, certificates).
- Authorization
- Determining what an authenticated user is allowed to do.
- RBAC
- Role-Based Access Control — assign permissions to roles, then users to roles.
- Least privilege
- Granting only the minimum access a role needs to do its job.
- Principle of separation of duties
- Splitting critical tasks among people so no one has unchecked control.
- GRANT
- SQL (DCL) command that gives a user or role a permission.
- REVOKE
- SQL (DCL) command that removes a previously granted permission.
- SQL injection
- Inserting malicious SQL via unsanitized input to read or alter data.
- Parameterized query
- A prepared statement that separates SQL code from input — the main SQL-injection defense.
- Input validation
- Checking and sanitizing user input to block malicious or malformed data.
- Encryption at rest
- Encrypting stored data on disk so files/backups are unreadable if stolen.
- Encryption in transit
- Encrypting data moving over the network (TLS) so it can't be intercepted.
- TDE
- Transparent Data Encryption — encrypts database files at rest automatically.
- TLS
- Transport Layer Security — the protocol that encrypts data in transit between client and server.
- AES
- Advanced Encryption Standard — a strong symmetric cipher widely used for data encryption.
- Hashing
- A one-way function producing a fixed-length digest; used to store passwords (with salt).
- Salt
- Random data added to a password before hashing to defeat precomputed (rainbow) attacks.
- Key management
- Securely generating, storing, rotating, and revoking encryption keys.
- Data masking
- Hiding sensitive values (e.g., showing only last 4 digits) for non-privileged users.
- Tokenization
- Replacing sensitive data with a non-sensitive token mapped in a secure vault.
- Auditing
- Recording who accessed or changed what and when, for accountability and compliance.
- Audit log
- An immutable record of security-relevant database events.
- Data classification
- Labeling data by sensitivity (public, internal, confidential, restricted) to drive controls.
- PII
- Personally Identifiable Information — data that can identify a person; requires protection.
- GDPR
- EU regulation governing personal-data privacy, consent, and breach notification.
- HIPAA
- U.S. law protecting the privacy and security of health information (PHI).
- PCI DSS
- Security standard for organizations that handle payment-card data.
- Privilege escalation
- An attacker gaining higher permissions than they were granted.
- Defense in depth
- Layering multiple security controls so no single failure exposes the data.
- Database firewall
- A control that monitors and filters SQL traffic to block malicious queries.
- Backup encryption
- Encrypting backup files so stolen backups don't expose data.
- Account lockout
- Disabling an account after repeated failed logins to slow brute-force attacks.
- Service account
- A non-human account an application uses; should run with least privilege.
- RPO
- Recovery Point Objective — the maximum tolerable data loss; drives backup frequency.
- RTO
- Recovery Time Objective — the maximum tolerable downtime; drives recovery design.
- Full backup
- A complete copy of all data; fastest restore, most storage and time.
- Differential backup
- Copies all changes since the last FULL backup; restore = full + latest differential.
- Incremental backup
- Copies changes since the last backup of ANY type; smallest, but slowest restore.
- 3-2-1 rule
- Keep 3 copies of data, on 2 media types, with 1 copy off-site.
- Backup verification
- Testing that backups actually restore — an untested backup is unproven.
- Cold backup
- A backup taken while the database is shut down (offline).
- Hot backup
- A backup taken while the database is running and online.
- Point-in-time recovery
- Restoring a database to a specific moment using backups plus transaction logs.
- Snapshot
- A read-only, point-in-time image of data used for fast recovery or cloning.
- RAID 0
- Striping for speed; no redundancy — one disk failure loses all data.
- RAID 1
- Mirroring — an identical copy on a second disk; survives one disk failure.
- RAID 5
- Striping with distributed parity; survives one disk failure; needs 3+ disks.
- RAID 6
- Double parity; survives two simultaneous disk failures; needs 4+ disks.
- RAID 10
- Mirrored pairs that are striped; high performance and redundancy.
- High availability (HA)
- Designing systems to minimize downtime through redundancy and failover.
- Failover
- Automatically switching to a standby system when the primary fails.
- Failover cluster
- Multiple nodes sharing storage so a standby can take over the database service.
- Replication
- Keeping synchronized copies of a database on multiple servers.
- Synchronous replication
- Commits on the primary and replica together — zero data loss, higher latency.
- Asynchronous replication
- Replica lags slightly behind — lower latency, small risk of data loss.
- Log shipping
- Periodically copying and restoring transaction logs to a standby server.
- Mirroring (DB)
- Maintaining a hot standby copy of a database for failover.
- Disaster recovery (DR)
- The plan and process to restore service after a major outage or disaster.
- DR site
- A secondary location (hot, warm, or cold) used to resume operations after a disaster.
- Hot site
- A fully equipped, running DR site that can take over almost immediately.
- Warm site
- A partially ready DR site needing some setup/data load before takeover.
- Cold site
- A facility with infrastructure but no live systems/data — slowest to bring up.
- Business continuity plan (BCP)
- An organization-wide plan to keep critical functions running during disruption.
- MTTR
- Mean Time To Repair — average time to fix a failed component.
- MTBF
- Mean Time Between Failures — average operating time between failures (reliability).
- Geo-redundancy
- Storing copies in geographically separate regions to survive a regional outage.
- Retention policy
- Rules for how long backups/data are kept before deletion or archive.
- LIKE operator
- Pattern-matches text in a WHERE clause using wildcards % (any string) and _ (one char).
- IN operator
- Tests whether a value matches any value in a list or subquery.
- BETWEEN operator
- Tests whether a value falls within an inclusive range.
- Subquery
- A query nested inside another query, used in SELECT, FROM, or WHERE.
- UNION
- Combines result sets of two queries and removes duplicates (UNION ALL keeps them).
- Alias (AS)
- A temporary name given to a column or table in a query for readability.
- COALESCE
- Returns the first non-null value from a list of expressions.
- CASE expression
- Adds if/then/else conditional logic inside a SQL query.
- Transaction
- A unit of work treated as a single, all-or-nothing operation.
- COMMIT
- Permanently saves the changes made in the current transaction.
- ROLLBACK
- Undoes the changes made in the current transaction.
- SAVEPOINT
- A marker within a transaction you can roll back to without undoing everything.
- One-to-many relationship
- One row in a table relates to many rows in another (the most common relationship).
- Many-to-many relationship
- Rows on both sides relate to many — implemented with a junction (bridge) table.
- Junction table
- A bridge table that resolves a many-to-many relationship using two foreign keys.
- Domain (column)
- The set of valid values a column may hold, enforced by data type and constraints.
- Heap
- A table with no clustered index — rows stored in no particular order.
- Auto-increment / identity
- A column that automatically assigns the next sequential number to new rows.
- Temporal data
- Time-based data; some databases support system-versioned (temporal) tables for history.
- JSON column
- A column storing JSON documents, blending relational and document storage.
- Blue-green deployment
- Two identical environments; switch traffic to the new one for zero-downtime releases.
- Rollback (deployment)
- Reverting a deployment to the previous known-good version after a failure.
- Smoke test
- A quick post-deployment check that core functionality works.
- Bulk insert
- Loading many rows at once efficiently, bypassing row-by-row overhead.
- Data transformation
- Cleaning, reshaping, and converting data during ETL/ELT.
- Data validation (load)
- Verifying imported data meets type, range, and integrity rules.
- Connection encryption
- Requiring TLS for client-server connections during deployment hardening.
- Environment variable
- A configuration value (e.g., credentials, host) supplied to a deployment at runtime.
- Secrets management
- Storing credentials/keys securely (vaults) rather than hard-coded in scripts.
- Database link / linked server
- A configured connection that lets one database query another remote database.
- Replication setup
- Configuring primary and replica roles during deployment for HA or read scaling.
- Capacity sizing
- Estimating CPU, memory, and storage before deploying a database.
- Wait statistics
- Metrics showing what the engine is waiting on (I/O, locks, CPU) to find bottlenecks.
- Blocking
- One transaction holding a lock that stalls another (not yet a deadlock).
- Query hint
- An instruction that overrides the optimizer's default choice for a query.
- Parameter sniffing
- The optimizer caching a plan based on the first parameter, hurting other values.
- Plan cache
- Stored execution plans reused to avoid recompiling frequent queries.
- Recompilation
- Generating a fresh execution plan when data or statistics change significantly.
- Database integrity
- The overall accuracy and consistency of data, protected by constraints and checks.
- Orphaned user
- A database user whose matching server login is missing — a common post-restore fix.
- Tempdb / temp space
- Workspace the engine uses for sorts, joins, and temporary objects.
- Autogrowth
- A setting that lets a data/log file expand automatically when it fills.
- Compression (DB)
- Reducing storage and I/O by compressing data or backups.
- Health check
- A routine review of performance, errors, security, and backups.
- Refresh statistics
- Updating data-distribution stats so the optimizer makes good choices.
- Maintenance window
- A scheduled period for disruptive upkeep (patching, rebuilds) with low user impact.
- MFA
- Multi-Factor Authentication — requiring two or more proofs of identity.
- Certificate authentication
- Using digital certificates to verify identity instead of passwords.
- Row-level security
- Restricting which rows a user can see based on their identity or role.
- Column-level security
- Restricting access to specific sensitive columns within a table.
- Dynamic data masking
- Masking column values at query time for unauthorized users without changing stored data.
- Vulnerability assessment
- Scanning a database for misconfigurations and known weaknesses.
- Penetration test
- An authorized simulated attack to find exploitable security gaps.
- Patch (security)
- A fix that closes a known vulnerability in the DBMS.
- Brute-force attack
- Trying many passwords/keys until one works; mitigated by lockout and strong policies.
- Privilege creep
- Users accumulating permissions over time beyond what they currently need.
- Data breach
- Unauthorized access to or disclosure of protected data.
- Incident response
- The plan to detect, contain, eradicate, and recover from a security incident.
- Compliance audit
- A formal review confirming controls meet a regulation or standard.
- Data sovereignty
- The principle that data is subject to the laws of the country where it is stored.
- Secure baseline
- A hardened default configuration applied to every database.
- Standby database
- A secondary copy kept ready to take over (warm/hot) if the primary fails.
- Active-active
- Both nodes serve traffic simultaneously, sharing load and providing redundancy.
- Active-passive
- One node serves traffic while a standby waits to take over on failure.
- Quorum
- The minimum number of cluster nodes that must agree for the cluster to stay online.
- Split-brain
- A cluster fault where two nodes both think they're primary — prevented by quorum/witness.
- Witness / arbiter
- An extra vote that breaks ties and prevents split-brain in a cluster.
- Backup window
- The time period available to run backups without harming performance.
- Backup retention
- How long backup copies are kept before deletion (driven by policy/compliance).
- Offsite backup
- A backup stored in a separate location to survive a site-level disaster.
- Tabletop exercise
- A walkthrough drill that tests a DR/BCP plan without real failover.
- Recovery testing
- Periodically restoring backups to prove they work and meet the RTO.
- Bare-metal recovery
- Restoring a full system, including OS, from backup onto new hardware.
- Continuous data protection
- Capturing every change so you can restore to nearly any point in time.
- SLA
- Service Level Agreement — the committed uptime/recovery targets for a service.