Career Employer

FREE CompTIA Data+ Study Guide 2026: DA0-002

The most important things the CompTIA Data+ (DA0-002) tests — an interactive study guide with built-in quizzes and flashcards, organized by all 5 official domains.

This free CompTIA Data+ study guide walks through every content domain the Data+ (DA0-002) exam tests, organized to the current CompTIA exam objectives.[1]

It’s interactive, not a wall of text: every module has built-in checkpoint quizzes, flashcards, and practice questions, so you learn by doing — not just reading.

Data+ tests five official domains, and this guide teaches them as five study modules organized to the official blueprint. Read a module, test yourself at each checkpoint, then drill gaps with our free practice test and flashcards. This guide is a high-yield overview that maps the official content, not a full data-analytics textbook.

CompTIA Data+ is one of the 14 CompTIA certifications — explore our CompTIA study guides to compare and prep across the whole family.

Data+ Exam Snapshot

CompTIA Data+ (DA0-002) at a glance
Detail | Data+ Exam
Exam code | DA0-002 (V2, current; replaced DA0-001)
Questions | Maximum of 90 (multiple choice + performance-based)
Time | 90 minutes
Passing score | 675 on a 100–900 scale (scaled score, not a percentage)
Certifying body | CompTIA (delivered by Pearson VUE)
Cost | About $255 (voucher; ~$304 with retake assurance)
Prerequisites | None required (18–24 months in a data role recommended)
Validity | 3 years
Renewal | 30 CEUs over 3 years, or pass a higher CompTIA cert

Data+ covers five domains. The largest — Data Analysis — and the next, Data Acquisition & Preparation, together make up nearly half the exam (46%), so that is where to invest first.[1] Study by weight:

Data+ DA0-002 weighting by domain (CompTIA exam objectives)
Domain | Weight | Focus
3.0 Data Analysis | 24% | Statistics + techniques
2.0 Data Acquisition & Preparation | 22% | ETL/ELT, cleansing
1.0 Data Concepts & Environments | 20% | Types, databases, lakes
4.0 Visualization & Reporting | 20% | Charts, dashboards
5.0 Data Governance, Quality & Controls | 14% | Governance, privacy

Every analysis follows the same arc: define the question, get and clean the data, analyze it, then visualize and communicate the result. Keep this lifecycle in mind as you work through the modules.

Module 1 · Data Concepts & Environments

One official domain, 20% of the exam. This is the foundation — the kinds of data you work with and the systems that store it. Nail the vocabulary here and the rest of the exam reads far more clearly.

1.1 Data Types & Structures

Start by classifying data two ways. By structure: structured data fits neat rows and columns; unstructured data (text, images, audio, video) does not; and semi-structured data (JSON, XML) sits between, carrying tags without a rigid table. By measurement: qualitative data is categorical (nominal or ordinal), while quantitative data is numeric (interval or ratio) and supports math.[1]

Data types and measurement scales
Type | Scale | Example | Math allowed?
Qualitative | Nominal (categories, no order) | Eye color, country | Count only
Qualitative | Ordinal (ordered categories) | Survey: poor/fair/good | Order, not arithmetic
Quantitative | Interval (no true zero) | Temperature in °C | Add / subtract
Quantitative | Ratio (true zero) | Sales, height, count | All arithmetic
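
To make the structure distinction concrete, here is a minimal sketch of turning semi-structured JSON into fixed rows and columns. The records and field names are invented for illustration, and this is only one simple way to flatten such data.

```python
import json

# Hypothetical semi-structured records: tagged fields, but no fixed set of columns.
raw = """[
  {"id": 1, "name": "Ana", "tags": ["vip"]},
  {"id": 2, "name": "Ben", "email": "ben@example.com"}
]"""

records = json.loads(raw)

# A structured table needs one fixed set of columns shared by every row.
columns = sorted({key for record in records for key in record})

# Fields a record lacks become None so every row has the same shape.
rows = [[record.get(column) for column in columns] for record in records]

print(columns)
for row in rows:
    print(row)
```

The second record has no tags and the first has no email, which is exactly the flexibility a rigid table does not allow without empty cells.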

1.2 Databases, Warehouses & Lakes

Data lives in different places for different jobs. A relational database runs operations day-to-day (OLTP), using normalized tables linked by a primary key and foreign key. For analysis (OLAP), data flows into a data warehouse (structured and modeled, often a star schema) or a data lake (raw, any format). A data lakehouse blends both.[1]

OLTP vs. OLAP
Aspect | OLTP (operations) | OLAP (analysis)
Purpose | Run the business (transactions) | Analyze the business (insight)
Workload | Many small reads/writes | Few large, complex queries
Design | Normalized for integrity | Denormalized/modeled for speed
Example | Order-entry system | Sales data warehouse
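
The contrast can be sketched with Python's built-in sqlite3 module. The table and column names below are hypothetical, and a real warehouse would use a separate, modeled store rather than querying the operational tables directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Normalized operational tables (hypothetical schema).
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    region      TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL
);
""")

# OLTP-style work: many small writes, one row at a time.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "East"), (2, "West")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0)],
)

# The foreign key blocks an order that points at a customer who does not exist.
try:
    conn.execute("INSERT INTO orders VALUES (13, 99, 5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# OLAP-style work: one larger query that joins and aggregates.
totals = conn.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()

print(rejected, totals)
```

The primary key identifies each row, the foreign key protects the relationship, and the aggregate query is the kind of question an analytical store is built to answer quickly.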

1.3 Big Data & the Analytics Lifecycle

Big data is data too large or complex for traditional tools, described by the V’s: volume, velocity, variety, veracity, and value.[5] Its scale and variety are exactly why data lakes and cloud platforms exist. Whatever the size, work follows the data analytics lifecycle, and the question always comes before the data.

Checkpoint · Data Concepts & Environments

In data analytics, what does the term "Data Lake" primarily refer to?

Module 2 · Data Acquisition & Preparation

One official domain, 22% of the exam. This domain — renamed from “Data Mining” in the DA0-001 era — is where raw data becomes analysis-ready. In practice it is where analysts spend most of their time, and it is heavily tested.

2.1 Acquiring & Integrating Data (ETL/ELT)

Data is acquired from databases, files, APIs, web scraping, surveys, and sensors, then combined through data integration. The two pipeline patterns to know cold are ETL (transform before loading, the classic warehouse approach) and ELT (load raw, then transform in the target, the cloud/lake/big-data approach).[1]
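
The difference between the two patterns is only where the transform step runs. The sketch below uses in-memory SQLite databases as stand-ins for the target system; the source rows and table names are made up for illustration.

```python
import sqlite3

# Hypothetical extract: amounts arrive as untidy text.
source_rows = [("2026-01-05", " 19.99 "), ("2026-01-06", "5.00")]

def clean(row):
    """Transform step: strip whitespace and convert the amount to a number."""
    day, amount = row
    return day, float(amount.strip())

# ETL: transform in the pipeline, then load only the cleaned rows.
etl_db = sqlite3.connect(":memory:")
etl_db.execute("CREATE TABLE sales (day TEXT, amount REAL)")
etl_db.executemany("INSERT INTO sales VALUES (?, ?)", [clean(r) for r in source_rows])

# ELT: load the raw text untouched, then transform inside the target with SQL.
elt_db = sqlite3.connect(":memory:")
elt_db.execute("CREATE TABLE raw_sales (day TEXT, amount TEXT)")
elt_db.executemany("INSERT INTO raw_sales VALUES (?, ?)", source_rows)
elt_db.execute(
    "CREATE TABLE sales AS "
    "SELECT day, CAST(TRIM(amount) AS REAL) AS amount FROM raw_sales"
)

etl_total = etl_db.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
elt_total = elt_db.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(etl_total, elt_total)
```

Both routes end with the same clean table. ELT also keeps the raw copy in the target, which is part of why it suits lakes and cloud platforms.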

When you can’t (or shouldn’t) use a whole population, you sample it. Good sampling — random, representative, large enough — keeps conclusions valid; biased sampling quietly breaks every downstream result.
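
A small simulation shows how a convenient but biased sample can miss part of the population. The customer records and the 75/25 regional split are invented for this sketch.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: ids 1-750 are East customers, 751-1000 are West.
population = [
    {"id": i, "region": "East" if i <= 750 else "West"} for i in range(1, 1001)
]

def west_share(records):
    """Fraction of records that come from the West region."""
    return sum(r["region"] == "West" for r in records) / len(records)

# Simple random sample: every record has the same chance of selection.
random_sample = rng.sample(population, k=100)

# Convenience "sample": the first 100 records on file, which here are all East.
convenience_sample = population[:100]

print(west_share(population))          # true share in the population
print(west_share(random_sample))       # should land near the true share
print(west_share(convenience_sample))  # misses the West region entirely
```

Any statistic computed from the convenience sample would describe East customers only, however carefully the later analysis is done.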

2.2 Cleansing & Preparing Data

Data cleansing fixes the problems that would otherwise poison analysis. Handle missing values (delete the record, or impute with the mean, median, or a predicted value), remove duplicates, and investigate each outlier (error or genuine extreme?). Then make values comparable with normalization or standardization, and convert data types as needed.[1]

Common data-preparation tasks
Problem | Technique
Missing values | Delete the row, or impute (mean/median/predicted)
Duplicate records | Deduplication (often via a unique key)
Outliers | Investigate; cap, transform, or remove if erroneous
Different scales | Normalize (0–1) or standardize (z-score)
Wrong data type | Type conversion / casting (e.g., text to date)
Inconsistent formats | Parsing and standardizing (dates, units, casing)
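
Several of these tasks fit in a few lines of standard-library Python. The records below are hypothetical, median imputation is just one of the options listed above, and the z-scores use the population standard deviation (a sample standard deviation would give slightly different values).

```python
from statistics import mean, median, pstdev

# Hypothetical records: (customer_id, monthly_spend); None marks a missing value.
records = [(1, 10.0), (2, None), (2, None), (3, 20.0), (4, 30.0), (5, 40.0)]

# 1. Deduplicate on the unique key, keeping the first record seen per id.
by_id = {}
for customer_id, spend in records:
    by_id.setdefault(customer_id, spend)

# 2. Impute missing values with the median of the known values.
known = [spend for spend in by_id.values() if spend is not None]
fill = median(known)
filled = [fill if spend is None else spend for spend in by_id.values()]

# 3a. Normalize to the 0-1 range (min-max scaling).
low, high = min(filled), max(filled)
normalized = [(x - low) / (high - low) for x in filled]

# 3b. Standardize to z-scores (mean 0, standard deviation 1).
mu, sigma = mean(filled), pstdev(filled)
z_scores = [(x - mu) / sigma for x in filled]

print(filled)
print(normalized)
print(z_scores)
```

Normalization pins the smallest value to 0 and the largest to 1, while standardization expresses each value as a number of standard deviations from the mean.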

2.3 Data Mining Techniques

Data mining finds patterns in large datasets. The four techniques to know are classification (assign to known categories), clustering (group similar records with no labels), regression (model a numeric relationship), and association rules (items that co-occur, as in the Apriori algorithm and market-basket analysis). Watch for overfitting, where a model memorizes the training data and fails on new data.[1]

Data mining techniques
Technique | Learning type | Use it to…
Classification | Supervised | Sort records into known categories (spam / not spam)
Regression | Supervised | Predict a numeric value (next month's sales)
Clustering | Unsupervised | Group similar records with no labels (customer segments)
Association rules | Unsupervised | Find items bought together (market-basket analysis)
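
Market-basket analysis rests on two simple measures, support and confidence. The sketch below computes them directly on a handful of invented baskets; it is not an implementation of Apriori, which adds an efficient search over candidate itemsets.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "jam"},
]

def support(items):
    """Share of baskets that contain every item in `items`."""
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """Of the baskets holding the antecedent, the share that also hold the consequent."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"bread", "butter"}))       # how often the pair appears together
print(confidence({"bread"}, {"butter"}))  # rule: bread -> butter
print(confidence({"butter"}, {"bread"}))  # rule: butter -> bread
```

The two rules differ in strength even though they involve the same pair, so the direction of a rule matters when you read it.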

Checkpoint · Data Acquisition & Preparation

Which concept in data management focuses on the use of data across different domains and formats for improved decision-making?

Module 3 · Data Analysis

One official domain, 24% of the exam — the single heaviest. This is the statistical core: summarizing data, measuring relationships, and choosing the right kind of analysis. Invest the most time here.

3.1 Descriptive Statistics

Descriptive statistics summarize a dataset. Central tendency: the mean (average, outlier-sensitive), the median (middle, robust), and the mode (most frequent). Spread: the range, variance, and standard deviation (spread around the mean, in the data’s own units). A percentile marks the value below which a given share of data falls.[1]

Descriptive statistics — what each one tells you
Measure | What it tells you | Watch out for
Mean | The arithmetic average | Distorted by outliers / skew
Median | The middle value | Best for skewed data
Mode | The most common value | Can be none or several
Range | Max minus min (total spread) | Driven by extremes
Standard deviation | Typical distance from the mean | Same units as the data
Percentile / quartile | Position within the distribution |
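
Python's standard statistics module covers every measure in the table. The salary figures are hypothetical, and the quartiles use the module's default interpolation method, so other tools may report slightly different cut points on the same data.

```python
from statistics import mean, median, multimode, pstdev, quantiles

# Hypothetical salaries in thousands; 200 is a genuine but extreme value.
salaries = [40, 42, 45, 45, 48, 50, 200]

average = mean(salaries)                # pulled upward by the outlier
middle = median(salaries)               # barely affected by the outlier
most_common = multimode(salaries)       # a list, since there can be several modes
spread = max(salaries) - min(salaries)  # range: driven entirely by the extremes
std_dev = pstdev(salaries)              # population standard deviation, in the data's units
q1, q2, q3 = quantiles(salaries, n=4)   # quartiles: 25th, 50th, and 75th percentiles

print(f"mean={average:.2f} median={middle} mode={most_common} range={spread}")
print(f"std_dev={std_dev:.2f} quartiles={q1}, {q2}, {q3}")
```

One extreme salary drags the mean well above the median, which is why the median is the safer summary for skewed data.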

3.2 Relationships & Inference

Correlation measures how two variables move together, from −1 to +1, but it never proves causation: a confounding third variable or coincidence can drive both. Hypothesis testing weighs sample evidence against a claim using a p-value (a small p-value, often below 0.05, lets you reject the null hypothesis), and a confidence interval gives a plausible range for the true value.[1]
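
As an illustration of the −1 to +1 scale, this sketch computes the Pearson correlation coefficient by hand. Pearson's r is the usual measure of linear correlation, but using it here is an assumption, and the data series are invented.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation: co-movement of xs and ys, scaled to the range -1..+1."""
    mean_x, mean_y = mean(xs), mean(ys)
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)

# Hypothetical series.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 6, 8, 10]    # rises in lockstep with ad_spend
returns = [10, 8, 6, 4, 2]  # falls in lockstep with ad_spend

print(pearson_r(ad_spend, sales))    # close to +1
print(pearson_r(ad_spend, returns))  # close to -1
```

Even a coefficient at either extreme describes only how the numbers move together; it says nothing about which variable, if either, drives the other.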

3.3 Types of Analytics

Match the analysis to the question. Descriptive analytics says what happened, diagnostic analytics why, predictive analytics what will happen, and prescriptive analytics what to do about it. Each step up the ladder delivers more value and demands more sophisticated technique.[1]

Checkpoint · Data Analysis

In data mining, what is the primary purpose of the Apriori algorithm?

Module 4 · Visualization & Reporting

One official domain, 20% of the exam. Analysis only matters if it’s communicated. This domain is about choosing the right chart, building clear dashboards and reports, and not misleading your audience.

4.1 Choosing the Right Chart

The single most-tested visualization skill is matching a chart to a goal. Use a bar chart to compare categories, a line chart for a trend over time, a pie or stacked bar for parts of a whole, a scatter plot for the relationship between two variables, a histogram or box plot for distribution, and a Pareto chart to find the vital few.[1]

High-yield chart types and when to use them
Chart | Best for | Example
Bar / column | Comparing categories | Sales by region
Line | Trends over time | Monthly revenue
Pie / stacked bar | Parts of a whole | Market share
Scatter plot | Relationship between two variables | Ad spend vs. sales
Histogram | Distribution of one variable | Customer ages
Box plot | Distribution + outliers | Salary spread by team
Pareto chart | The vital few (80/20) | Top defect causes
Heat map | Magnitude across two dimensions | Activity by hour/day
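
The numbers behind a Pareto chart are a sorted count plus a running cumulative share. This sketch prepares that data without drawing anything; the defect counts are hypothetical, and the 80% cutoff is the usual rule of thumb rather than a fixed rule.

```python
# Hypothetical defect counts by cause.
defects = {"scratch": 50, "dent": 30, "misprint": 10, "stain": 6, "other": 4}

# Bars: causes ordered from largest to smallest.
ranked = sorted(defects.items(), key=lambda item: item[1], reverse=True)
total = sum(defects.values())

# Line: cumulative share of all defects after each bar.
running = 0
pareto = []
for cause, count in ranked:
    running += count
    pareto.append((cause, count, running / total))

# The "vital few": the leading causes needed to reach 80% of all defects.
vital_few = []
for cause, _, cumulative in pareto:
    vital_few.append(cause)
    if cumulative >= 0.8:
        break

for cause, count, cumulative in pareto:
    print(f"{cause:<10}{count:>4}{cumulative:>8.0%}")
print(vital_few)
```

In this made-up dataset two of the five causes account for 80% of defects, which is the pattern a Pareto chart is designed to expose.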

4.2 Dashboards & Reports

A dashboard surfaces the right KPIs for an audience at a glance, with interactivity such as filters and drill-downs. Choose the report type for the need (ad hoc for one-off questions, recurring for monitoring, or self-service for exploration) and design for clarity. Above all, never mislead: truncated axes, distorted proportions, and cherry-picked ranges are accuracy and ethics failures.[1]

Designing honest, useful visuals
Do | Avoid
Start bar-chart axes at zero | Truncating the y-axis to exaggerate differences
Pick the chart that fits the data | Forcing a 3-D or fancy chart that distorts
Label clearly; show units and source | Clutter (chartjunk) that hides the message
Match KPIs to the audience's decision | Dumping every metric onto one screen

Checkpoint · Visualization & Reporting

What is the primary purpose of using a box plot in data analysis?

Module 5 · Data Governance, Quality & Controls

One official domain, 14% of the exam. Smaller in weight but conceptually important — this domain is about trusting your data and using it responsibly: governance, quality, privacy, and security controls.

5.1 Governance & Master Data

Data governance is the framework of policies, roles, and standards controlling data across its lifecycle. Data owners are accountable; a data steward handles day-to-day quality. Supporting tools include a data catalog (an inventory of data assets), data lineage (where data came from and how it moved), and Master Data Management, or MDM (one authoritative “golden record” per entity).[1]

5.2 Data Quality

Data quality is measured across dimensions: accuracy, completeness, consistency, timeliness, uniqueness, validity, and integrity. A failure in any one can invalidate an entire analysis, which is exactly why cleansing (Module 2) and governance exist.[1]
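
Three of these dimensions are easy to score as simple ratios. The rows below are invented, the email pattern is deliberately simplistic, and real data-quality tooling would define each metric more carefully.

```python
import re

# Hypothetical customer rows with a missing email, a duplicate id, and a malformed email.
rows = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "ben@example"},
    {"id": 4, "email": "cy@example.com"},
]

# Deliberately simple pattern: something@something.something, with no spaces.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Completeness: share of rows where the field is present at all.
completeness = sum(r["email"] is not None for r in rows) / len(rows)

# Uniqueness: share of rows that carry a distinct key.
uniqueness = len({r["id"] for r in rows}) / len(rows)

# Validity: share of rows whose value matches the expected format.
validity = sum(bool(r["email"] and EMAIL.match(r["email"])) for r in rows) / len(rows)

print(completeness, uniqueness, validity)
```

A dataset can score well on one dimension and poorly on another, so each one needs its own check.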

5.3 Privacy, Security & Controls

Sensitive data demands controls. Identify PII (and PHI for health data), then protect it with data classification, data masking or anonymization, encryption (at rest and in transit), and access controls.[6] Regulations dictate the rules: GDPR (EU privacy), HIPAA (U.S. health), PCI-DSS (payment cards), and CCPA (California).

Privacy and security controls
Control | What it does
Data classification | Labels data by sensitivity to apply the right controls
Data masking | Replaces sensitive values with realistic fakes for safe use
Anonymization | Removes identifiers so individuals can't be re-identified
Encryption | Protects data at rest and in transit from unauthorized reading
Access controls | Limits who can see or change data (least privilege)
Retention & disposal | Keeps data only as long as needed, then securely destroys it
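
Masking can be as simple as hiding most of a value while keeping its shape. The two helper functions below are hypothetical, and they show character masking, a simpler variant than substituting realistic fake values.

```python
def mask_card(number: str) -> str:
    """Hide all but the last four digits of a card number (hypothetical helper)."""
    digits = [ch for ch in number if ch.isdigit()]
    return "*" * (len(digits) - 4) + "".join(digits[-4:])

def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[0] + "*" * (len(local) - 1) + "@" + domain

print(mask_card("4111 1111 1111 1111"))
print(mask_email("ana@example.com"))
```

Masked values like these remain useful for testing and support work, but they are not anonymized: the last four digits or the email domain can still help narrow down an individual.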

Checkpoint · Data Governance, Quality & Controls

What does the term "Data Governance" primarily refer to?

How to Use This Data+ Study Guide

This guide is built to be worked, not just read. The most efficient path to a pass:

  • Study by weight. Data Analysis (24%) and Data Acquisition & Preparation (22%) are nearly half the exam — master statistics, correlation vs. causation, and data cleansing first.
  • Check off as you go. Use the Study Guide Contents to mark each section done; it raises your exam-readiness score.
  • Take every checkpoint. The end-of-module quizzes show you exactly which domains need another pass.
  • Drill the weak domain. Send your weak area into the flashcards and a practice test until the score climbs.
  • Practice the PBQs. Performance-based questions reward applied skill — read a dataset, pick the right chart, and interpret a statistic until it’s automatic.

Data+ Concept Questions

Common Data+ concepts candidates search while studying — each answered briefly and backed by an official source. Test yourself, then drill them as flashcards.

Data+ Glossary

The high-yield Data+ terms in one place; work through the list as a self-graded flashcard deck.

Association rules
Finding items that frequently occur together (e.g., the Apriori algorithm for market-basket analysis).
Bar chart
A chart that compares values across distinct categories (bars have gaps).
Big data
Datasets too large or complex for traditional tools, characterized by the V's: volume, velocity, variety, veracity, value.
Box plot
A chart showing a distribution's median, quartiles, and outliers.
Causation
A relationship in which one variable directly causes a change in another; not proven by correlation alone.
Classification
A supervised technique that assigns records to predefined categories.
Clustering
An unsupervised technique that groups similar records without predefined labels.
Confidence interval
A range of plausible values for a population parameter, with a stated level of confidence.
Correlation
A measure of how two variables move together, from −1 (perfect negative) to +1 (perfect positive).
Dashboard
An interactive display of the most important metrics and visuals for an audience, at a glance.
Data analytics lifecycle
The end-to-end process: define the question, acquire, prepare, analyze, visualize, then communicate and act.
Data catalog
An organized inventory of an organization's data assets with descriptions and metadata.
Data classification
Labeling data by sensitivity (e.g., public, internal, confidential) to apply the right controls.
Data cleansing
Detecting and correcting errors and inconsistencies — missing values, duplicates, outliers — to improve data quality.
Data governance
The framework of policies, roles, and standards controlling how data is managed across its lifecycle.
Data integration
Combining data from multiple sources into a unified store for analysis.
Data lake
A repository that stores vast amounts of raw data in its native format (schema-on-read); cheap and flexible.
Data lakehouse
A hybrid architecture that adds warehouse-style structure and management on top of a data lake.
Data lineage
A record of data's origin and how it moves and transforms through systems.
Data mart
A subject-specific subset of a data warehouse serving a single department or function.
Data masking
Replacing sensitive values with realistic but fake data so it can be used without exposing the real values.
Data mining
Discovering patterns and relationships in large datasets using techniques like classification and clustering.
Data quality
The degree to which data is fit for purpose across dimensions like accuracy, completeness, and consistency.
Data steward
A person responsible for the day-to-day quality and proper use of a data domain.
Data warehouse
A central analytical store of structured, modeled data (schema-on-write) optimized for reporting and analysis.
Descriptive analytics
Analytics that summarizes what happened (reports, KPIs).
Diagnostic analytics
Analytics that explains why something happened (drill-down, correlation).
ELT
Extract, Load, Transform — load raw data first, then transform it inside the target (cloud/lake/big-data pattern).
ETL
Extract, Transform, Load — clean and shape data before loading it into the target (classic data-warehouse pattern).
Foreign key
A column that references the primary key of another table, enforcing relationships between tables.
GDPR
General Data Protection Regulation — the EU law governing personal-data privacy and protection.
Heat map
A chart that uses color intensity to show magnitude across two dimensions.
HIPAA
U.S. law protecting health information (PHI — Protected Health Information).
Histogram
A chart showing the distribution of one continuous variable by grouping values into bins (bars touch).
Hypothesis testing
A method for deciding whether sample evidence supports a claim about a population, using a p-value.
Imputation
Filling in missing values using a strategy such as the mean, median, or a predicted value.
KPI
Key Performance Indicator — a measurable value that shows how well a goal is being met.
Master Data Management (MDM)
Maintaining a single authoritative version of core business entities (the 'golden record').
Mean
The arithmetic average of all values; sensitive to outliers.
Median
The middle value of sorted data; robust to outliers.
Mode
The most frequently occurring value in a dataset.
Normalization
Rescaling numeric values to a fixed range, typically 0 to 1, so features are comparable.
OLAP
Online Analytical Processing — systems optimized for complex queries and aggregations over large, historical datasets.
OLTP
Online Transaction Processing — systems optimized for many fast, small reads and writes that run daily operations.
Outlier
A value far outside the typical range of a dataset; may be an error or a genuine extreme to investigate.
Overfitting
When a model memorizes training data and performs poorly on new, unseen data.
p-value
The probability of seeing results at least as extreme as the data if the null hypothesis is true.
Pareto chart
A bar chart ordered largest-to-smallest with a cumulative line, highlighting the vital few (80/20).
Percentile
A value below which a given percentage of observations fall (e.g., the 90th percentile).
PII
Personally Identifiable Information — data that can identify an individual (name, SSN, email).
Predictive analytics
Analytics that forecasts what is likely to happen using models.
Prescriptive analytics
Analytics that recommends what action to take (optimization).
Primary key
A column (or set of columns) whose value uniquely identifies each row in a table.
Qualitative data
Descriptive, categorical data (e.g., colors, names), measured on a nominal or ordinal scale.
Quantitative data
Numeric, measurable data that supports mathematical operations (interval or ratio scales).
Relational database
A structured store organizing data into related tables linked by keys; queried with SQL.
Scatter plot
A chart that plots two numeric variables as points to reveal their relationship or correlation.
Semi-structured data
Data that carries tags or markers (JSON, XML) but does not fit a rigid table structure.
Standard deviation
A measure of spread around the mean, in the same units as the data; the square root of variance.
Standardization
Rescaling values to a mean of 0 and standard deviation of 1 (a z-score).
Star schema
A data-warehouse design with a central fact table linked to surrounding dimension tables.
Structured data
Data organized into a defined schema of rows and columns, such as a relational database table.
Unstructured data
Data with no predefined model — text documents, images, audio, and video — that cannot be stored in simple rows and columns.
Variance
A measure of how far values spread from the mean (the square of the standard deviation).

Data+ Study Guide FAQ

DA0-002 (V2) is the current version — it launched October 14, 2025, and DA0-001 (V1) retired in English on April 14, 2026. The exam has a maximum of 90 questions (multiple choice plus performance-based questions) and a 90-minute time limit.

References

  1. CompTIA. “CompTIA Data+ (DA0-002) Certification — Exam Details & Objectives.” comptia.org.
  2. CompTIA. “CompTIA Data+ (DA0-001) — Retiring Version.” comptia.org.
  3. CompTIA. “CompTIA Data+: Your Questions Answered (FAQ).” comptia.org.
  4. CompTIA. “Continuing Education — Renewal Fees & CEU Requirements.” comptia.org.
  5. National Institute of Standards and Technology. “NIST Big Data Program.” nist.gov.
  6. National Institute of Standards and Technology. “NIST Privacy Framework.” nist.gov.