DeepPCB: PCB Defect Detection Dataset

Updated 9 February 2026

DeepPCB is a comprehensive dataset featuring 1,500 image pairs and full-board variants with six annotated defect categories for PCB defect detection.
The dataset enables rigorous evaluation of detection, classification, and registration methods using metrics such as mAP and per-class precision.
Advanced preprocessing and augmentation techniques, including template matching, global binarization, and random rotations, support robust model development.

DeepPCB is a dataset for benchmarking and developing printed circuit board (PCB) defect detection algorithms, particularly deep learning-based methods. It supports evaluation of object detection, classification, and registration methods on PCB imagery, with annotated defects and standardized experimental protocols. The dataset was introduced to address limitations of earlier PCB inspection datasets, such as lack of public availability, insufficient annotation precision, and limited defect diversity (Tang et al., 2019; Huang et al., 2019).

1. Dataset Composition and Statistics

DeepPCB comprises 1,500 image pairs, each consisting of a defect-free template image and a corresponding tested image of the same PCB, both at 640×640 pixel resolution. Defect annotations span six common fault categories: open, short, mousebite, spur, pin-hole, and spurious copper. Each tested image contains roughly 3–12 defect instances distributed among these classes. Approximate per-class totals are:

Defect Type	Total Count (~)
Open	3,200
Short	3,100
Mousebite	3,800
Spur	3,600
Spurious copper	3,900
Pin-hole	3,700

The dataset split consists of 1,000 image pairs for training and 500 for testing. No explicit validation split is included, but users may allocate 10–20% of training data for that purpose. Images are stored as “template” (defect-free) and “tested” (potentially defective) pairs, supporting comparative algorithms that operate on aligned templates and samples (Tang et al., 2019).
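The validation carve-out mentioned above can be sketched as a deterministic shuffle of training pair IDs. This is a minimal illustration, not part of the dataset tooling: the function name `split_train_val`, the `pair_XXXX` identifiers, and the 15% fraction are all assumptions chosen for the example.

```python
import random


def split_train_val(pair_ids, val_fraction=0.1, seed=0):
    """Hold out a fraction of training pairs for validation (hypothetical helper)."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be in (0, 1)")
    ids = sorted(pair_ids)              # sort so the result ignores input order
    random.Random(seed).shuffle(ids)    # seeded shuffle keeps the split reproducible
    n_val = round(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]


# Placeholder IDs standing in for the 1,000 training pairs.
all_train = [f"pair_{i:04d}" for i in range(1000)]
train_ids, val_ids = split_train_val(all_train, val_fraction=0.15)
print(len(train_ids), len(val_ids))
```

Splitting by pair ID keeps each template and its tested image on the same side of the split, so a pair is never divided between training and validation.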

A related variant of DeepPCB described in (Huang et al., 2019) contains 1,386 full-board RGB images (4608×3456 px), with similar defect categories. Defects are simulated onto photographs of ten standard PCB templates, and images are annotated following the Pascal VOC XML schema.
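Because the full-board annotations follow the Pascal VOC XML schema, they can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree`; the sample document, its filename, and the class name `spur` are invented for illustration and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Invented sample in Pascal VOC layout; values are illustrative only.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>4608</width><height>3456</height><depth>3</depth></size>
  <object>
    <name>spur</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>140</xmax><ymax>230</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        coords = [int(float(bb.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return boxes


voc_boxes = parse_voc(SAMPLE_XML)
print(voc_boxes)
```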

2. Defect Taxonomy and Annotation Methodology

DeepPCB enumerates six defect types, captured via manual annotation and augmented through artificial “defect stamping” to ensure consistent sample numbers:

Open: Interruption or absence in a copper trace.
Short: Unintended electrical connection bridging two conductors.
Mousebite: Notches or missing segments at trace borders.
Spur: Small copper protrusions from traces.
Pin-hole: Small circular omissions within conductive areas.
Spurious copper: Isolated copper blobs not part of the intended design.

Annotations employ axis-aligned bounding boxes represented as $(x_{min}, y_{min}, x_{max}, y_{max})$, with integer class labels (1–6, background as 0). Templates are manually verified as defect-free and are aligned to tested samples by template matching to correct for planar misalignment. Defect regions are then marked; every annotation is verified by at least two human annotators. Illumination artifacts are suppressed by stringent thresholding and global binarization during annotation (Tang et al., 2019).

In the synthesized dataset version (Huang et al., 2019), orientation variation is introduced via random rotation (–180° to +180°), with per-image .txt files encoding ground-truth angles.
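To illustrate how a random rotation interacts with axis-aligned boxes, the sketch below samples an angle in the stated range and computes the axis-aligned box that encloses a rotated annotation. The rotation centre, the sign convention, and all function names are assumptions made for the example; the dataset's own convention should be checked against its ground-truth angle files.

```python
import math
import random


def sample_angle(rng):
    """Draw a rotation angle in degrees from [-180, 180]."""
    return rng.uniform(-180.0, 180.0)


def rotate_point(x, y, angle_deg, cx, cy):
    """Rotate (x, y) about (cx, cy) with the standard rotation matrix.

    In image coordinates (y pointing down) a positive angle appears clockwise.
    """
    t = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(t) - dy * math.sin(t),
            cy + dx * math.sin(t) + dy * math.cos(t))


def rotated_box(box, angle_deg, cx, cy):
    """Axis-aligned box enclosing the four rotated corners of `box`."""
    xmin, ymin, xmax, ymax = box
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    pts = [rotate_point(x, y, angle_deg, cx, cy) for x, y in corners]
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


angle = sample_angle(random.Random(0))
print(angle, rotated_box((100, 200, 140, 230), angle, cx=320, cy=320))
```

The enclosing box generally grows under rotation, which is one reason rotated images are often annotated afresh rather than by transforming the original boxes.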

3. File Structure and Access Protocol

DeepPCB is distributed via GitHub (https://github.com/tangsanli5201/DeepPCB); the repository documents the canonical directory structure.

Annotation files are line-delimited, with one defect per line carrying the bounding-box coordinates and class label described in Section 2. The full-board image set (Huang et al., 2019) features JPEG image folders structured by defect type, annotation XMLs, rotation metadata, and original template boards. No license restrictions exist beyond the specification “free for research.”

4. Benchmarking and Evaluation Protocols

DeepPCB supports evaluation using mean Average Precision (mAP) at an Intersection-over-Union threshold of $\mathrm{IoU}=0.33$, averaged across the six defect classes:

$\mathrm{mAP} = \frac{1}{N}\,\sum_{i=1}^{N}\mathrm{AP}_i$

where $N = 6$ and $\mathrm{AP}_i$ is the average precision for class $i$. The precision/recall F-measure is also reported.
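The two ingredients of this protocol, the IoU test at 0.33 and the mean over per-class AP, can be sketched as follows. The example boxes and the per-class AP values are illustrative placeholders, not dataset results, and the sketch does not cover how predictions are ranked or matched one-to-one when computing each AP.

```python
IOU_THRESHOLD = 0.33  # threshold stated in the evaluation protocol


def iou(a, b):
    """Intersection-over-Union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold=IOU_THRESHOLD):
    """A prediction counts as a hit when its IoU with the ground truth meets the threshold."""
    return iou(pred, gt) >= threshold


def mean_ap(ap_per_class):
    """mAP = (1/N) * sum of AP_i over the N classes."""
    return sum(ap_per_class.values()) / len(ap_per_class)


pred_box, gt_box = (10, 10, 50, 50), (20, 20, 60, 60)
print(iou(pred_box, gt_box), is_match(pred_box, gt_box))

# Placeholder AP values, one per defect class (not measured results).
example_ap = {"open": 0.90, "short": 0.95, "mousebite": 1.00,
              "spur": 0.85, "spurious copper": 0.90, "pin-hole": 0.80}
print(mean_ap(example_ap))
```

For the two example boxes the intersection is 30×30 = 900 and the union is 1600 + 1600 − 900 = 2300, so the IoU is about 0.39 and the pair counts as a match at the 0.33 threshold.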

Using the default VGG16-tiny backbone with Group Pyramid Pooling (GPP, max pooling), the detector proposed by Tang et al. (2019) achieves:

mAP: 98.6% @ 62 FPS
Per-class AP: open 98.5%, short 98.5%, mousebite 99.1%, spur 98.2%, spurious copper 98.5%, pin-hole 99.4%

Comparative mAP results using the same test split:

Method	mAP
Image Processing	89.3%
YOLO	92.6%
SSD	95.9%
Faster R-CNN	97.6%
Tang et al. (avg pooling)	97.1%
Tang et al. (max pooling)	98.6%

The synthesized dataset (Huang et al., 2019) enables registration (rotation/affine estimation), detection (localization error $P_n$), and classification (per-class precision $P_c$, AP) tasks. For reference-based methods (SURF + adaptive threshold + XOR + morphology), detection error rates are <0.2% for all defect classes. CNN-based classifiers achieve per-class AP $> 97\%$ on test crops.
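The comparison step of a reference-based pipeline can be illustrated with a deliberately simplified stand-in. The sketch below assumes the template and tested images are already registered (no SURF step), uses a single global threshold in place of adaptive thresholding, and omits morphological cleanup; the function names and the toy 4×4 images are hypothetical.

```python
def binarize(img, threshold=128):
    """Map a grayscale image (list of rows) to 0/1 using one global threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in img]


def xor_difference(template, tested, threshold=128):
    """Pixel-wise XOR of the two binarized images; 1 marks a disagreement."""
    t, s = binarize(template, threshold), binarize(tested, threshold)
    if len(t) != len(s) or any(len(a) != len(b) for a, b in zip(t, s)):
        raise ValueError("template and tested images must have the same shape")
    return [[a ^ b for a, b in zip(rt, rs)] for rt, rs in zip(t, s)]


def difference_bbox(mask):
    """Inclusive pixel bounds (xmin, ymin, xmax, ymax) of all 1s, or None if clean."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return None
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy pair: a horizontal trace in row 1, broken at column 2 in the tested image.
toy_template = [[0, 0, 0, 0], [255, 255, 255, 255], [0, 0, 0, 0], [0, 0, 0, 0]]
toy_tested = [[0, 0, 0, 0], [255, 255, 0, 255], [0, 0, 0, 0], [0, 0, 0, 0]]
toy_mask = xor_difference(toy_template, toy_tested)
print(difference_bbox(toy_mask))
```

On real imagery the XOR mask would contain many small misalignment artifacts, which is the role morphological filtering plays in the full pipeline.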

5. Preprocessing and Data Augmentation

DeepPCB employs multiple preprocessing and augmentation routines:

Offline: Template matching for geometric alignment and global binarization for noise suppression.
Online: Synchronized horizontal/vertical flips (probability 0.5) and random cropping to $512 \times 512$ during minibatch training. No photometric distortions or color transformations are applied in standard benchmarks.
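The online augmentation can be sketched in pure Python on images stored as lists of rows. The key point is that the same flip decisions and crop window are applied to the template, the tested image, and the boxes together. The function names, the `(xmin, ymin, xmax, ymax, label)` box layout, the continuous-coordinate flip rule `w - x`, and the choice to clip boxes at the crop border are assumptions for illustration.

```python
import random


def hflip(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Reverse the row order top to bottom."""
    return img[::-1]


def augment_pair(template, tested, boxes, crop=512, rng=random):
    """Apply synchronized flips (p=0.5 each) and one shared random crop."""
    h, w = len(tested), len(tested[0])
    if crop > h or crop > w:
        raise ValueError("crop size exceeds image size")
    if rng.random() < 0.5:  # horizontal flip, same decision for both images
        template, tested = hflip(template), hflip(tested)
        boxes = [(w - x2, y1, w - x1, y2, c) for x1, y1, x2, y2, c in boxes]
    if rng.random() < 0.5:  # vertical flip
        template, tested = vflip(template), vflip(tested)
        boxes = [(x1, h - y2, x2, h - y1, c) for x1, y1, x2, y2, c in boxes]
    x0, y0 = rng.randint(0, w - crop), rng.randint(0, h - crop)
    template = [row[x0:x0 + crop] for row in template[y0:y0 + crop]]
    tested = [row[x0:x0 + crop] for row in tested[y0:y0 + crop]]
    kept = []
    for x1, y1, x2, y2, c in boxes:
        # Shift into crop coordinates and clip to the crop window.
        nx1, ny1 = max(x1 - x0, 0), max(y1 - y0, 0)
        nx2, ny2 = min(x2 - x0, crop), min(y2 - y0, crop)
        if nx2 > nx1 and ny2 > ny1:  # drop boxes that fall outside the crop
            kept.append((nx1, ny1, nx2, ny2, c))
    return template, tested, kept


# Small 8x8 demo with a 6x6 crop; the real setting would be 640x640 and crop=512.
grid = [[r * 8 + c for c in range(8)] for r in range(8)]
aug_t, aug_s, aug_boxes = augment_pair(grid, grid, [(1, 1, 3, 3, 2)],
                                       crop=6, rng=random.Random(0))
print(len(aug_t), len(aug_t[0]), aug_boxes)
```

Passing a seeded `random.Random` instance makes an augmentation sequence reproducible, which helps when debugging box transforms.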

The synthesized set (Huang et al., 2019) includes orientation augmentation via random large-angle rotations.

6. Applications and Research Significance

DeepPCB is utilized primarily for the development and comparison of deep object detection architectures adapted for PCB fault inspection. Its dual-image input (template + tested) supports algorithms exploiting PCB repeatability and structural alignment for fine-grained defect localization. The high annotation density, object-level bounding boxes, artificial and real defects, and established benchmarks provide a reproducible experimental foundation. DeepPCB also supports registration and patch-level classification pipelines (Tang et al., 2019, Huang et al., 2019).

7. Limitations and Practical Considerations

No explicit validation split is specified in the canonical distribution of DeepPCB; empirical protocols may adopt cross-validation or carve-outs from training data for early stopping. Color augmentation, domain adaptation, and tile-based multi-scale testing are not benchmarked in published results. Because the dataset includes artificially added defects for completeness, these synthetic defects may not fully capture real manufacturing variability.

DeepPCB is publicly accessible and free for research use, but users should consult the hosting repository for any updates regarding versioning or usage guidelines. The dataset supports both detection and classification benchmarks using its annotation and evaluation protocols.

References:

(Tang et al., 2019) "Online PCB Defect Detector On A New PCB Defect Dataset"
(Huang et al., 2019) "A PCB Dataset for Defects Detection and Classification"
