Real-IAD Variety: Pushing Industrial Anomaly Detection Dataset to a Modern Era

(a) Fudan University   (b) Youtu Lab, Tencent   (c) Shanghai Jiao Tong University   (d) Rongcheer Co., Ltd
(e) City University of Hong Kong   (f) National University of Singapore   (g) Shanghai Ocean University

Dataset Download

If you are interested in using the dataset, you can download it from Hugging Face. Access requests are approved automatically. For further questions, please email realiad4ad@outlook.com.

Dataset Organization

Real-IAD Variety comprises 198,950 high-resolution images across 160 distinct object categories. The dataset ensures unprecedented diversity by covering 28 industries, 24 material types, 22 color variations, and 27 defect types. The category count of Real-IAD Variety is approximately 5.3 times that of its predecessor, Real-IAD, making it the first truly multi-industry benchmark that overcomes the constraints of prior datasets for unified and zero-/few-shot anomaly detection.

The dataset is partitioned into training and testing subsets, comprising 3,991 normal samples (19,955 images) for training and 35,799 samples (3,990 normal and 31,809 anomalous, totaling 178,995 images) for testing. Notably, the testing subset exhibits a nearly balanced distribution between normal and anomalous images.
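
The reported image totals can be cross-checked against the sample counts. The following is a minimal sketch that assumes every sample contributes exactly five images, one per camera; the variable names are illustrative and not part of any released tooling. The value 30 is the category count of Real-IAD.

```python
# Sample counts as reported for Real-IAD Variety.
# Assumption: each sample is captured once by each of the five cameras.
VIEWS_PER_SAMPLE = 5

train_normal_samples = 3_991
test_normal_samples = 3_990
test_anomalous_samples = 31_809

test_samples = test_normal_samples + test_anomalous_samples
train_images = train_normal_samples * VIEWS_PER_SAMPLE
test_images = test_samples * VIEWS_PER_SAMPLE
total_images = train_images + test_images

# Category count relative to Real-IAD, which has 30 categories.
category_ratio = 160 / 30

print(f"test samples:   {test_samples:,}")    # 35,799
print(f"train images:   {train_images:,}")    # 19,955
print(f"test images:    {test_images:,}")     # 178,995
print(f"total images:   {total_images:,}")    # 198,950
print(f"category ratio: {category_ratio:.1f}")  # 5.3
```

Under this assumption the per-split totals reproduce the reported 198,950 images.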


Figure 1. The distribution of normal and abnormal images (first row), defect spatial locations (second row), defect scales (third row), and defect area ratios (last row) in our Real-IAD Variety dataset.

Collection Pipeline of Real-IAD Variety

We construct a rigorous three-stage data collection pipeline inspired by Real-IAD to ensure dataset comprehensiveness and high-fidelity annotations:

  • Stage 1: Diverse Material Preparation. Through 11,000 working hours, we assembled 160 object categories spanning 24 material types across 28 industrial domains. Leveraging extensive production line experience, we artificially introduced realistic defect variations, aggregating to 27 defect types across the dataset.
  • Stage 2: Acquisition Equipment Design. Our high-precision capture apparatus features a multi-spectral light source (RGBW) and five cameras: one top-down camera (5,328×3,040 resolution, 0.01 mm/pixel lateral accuracy) and four peripheral cameras symmetrically arranged at 40°–45° angles (4,096×3,000 resolution, 0.028 mm/pixel).
  • Stage 3: Data Collection and Annotation Refinement. This stage involves meticulous pixel-level manual annotation, followed by algorithmic cross-validation. The process iteratively refines the masks until the model's predicted Average Precision (AP) scores stabilize below a predetermined threshold.
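
The camera specifications in Stage 2 imply an approximate field of view for each camera. The sketch below assumes the quoted mm/pixel figures are the footprint of one pixel on the object plane, and it ignores the perspective distortion that the tilted peripheral cameras would introduce. The `Camera` class and its field names are hypothetical and exist only for this illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Camera:
    """Hypothetical container for the camera figures quoted in Stage 2."""

    name: str
    width_px: int
    height_px: int
    mm_per_px: float  # assumed to be the size of one pixel on the object plane

    def field_of_view_mm(self) -> tuple[float, float]:
        """Return the approximate (width, height) of the imaged area in mm."""
        return (self.width_px * self.mm_per_px, self.height_px * self.mm_per_px)


top_down = Camera("top-down", 5328, 3040, 0.01)
peripheral = Camera("peripheral", 4096, 3000, 0.028)

for cam in (top_down, peripheral):
    w_mm, h_mm = cam.field_of_view_mm()
    print(f"{cam.name}: about {w_mm:.1f} mm x {h_mm:.1f} mm")
```

Under these assumptions the top-down camera covers roughly 53 mm × 30 mm at fine detail, while each peripheral camera covers roughly 115 mm × 84 mm at coarser detail.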

Figure 2. Data collection and annotation pipeline for the proposed Real-IAD Variety. Panels (a)-(c) show the three sequential stages of the pipeline, and panel (d) shows the defect taxonomy. (a) Material Preparation: This initial stage assembles a diverse array of materials, spanning 160 distinct categories sourced from 28 industrial domains and encompassing 24 material compositions. (b) Acquisition Equipment Design: The second stage designs the data capture apparatus, comprising one top-down camera for overhead views and four peripheral cameras to capture lateral perspectives. (c) Data Collection and Annotation: The third stage covers data collection, which includes meticulous pixel-level manual annotation, rigorous algorithmic cross-validation, and iterative refinement. (d) Defect Taxonomy: The lower section illustrates distinct defect types alongside their characteristic visual representations.

Real-IAD Variety Dataset

Real-IAD Variety is the largest and most diverse IAD benchmark, featuring 198,950 images across 160 distinct object categories. The dataset offers several distinctive advantages:

  • Unprecedented scale: With 159,045 anomalous images among its 198,950 images, it provides substantially more training and evaluation data than prior datasets.
  • Comprehensive annotations: Pixel-level defect masks with rigorous quality control ensure high annotation fidelity.
  • Multi-view coverage: Five distinct viewpoints per sample enable robust evaluation of view-invariant detection methods.
  • Diverse defect taxonomy: 27 defect types represent the most comprehensive defect categorization among existing IAD datasets.
  • Extended resolution range: Image resolutions ranging from 260 to 5,328 pixels accommodate industrial products of varying scales, with over 90% of images exceeding 2,000 pixels in resolution.

Diverse anomaly patterns

Figure 3. Visual illustration of diverse anomaly patterns in the Real-IAD Variety dataset. We present representative samples across 27 defect types, with anomaly regions highlighted by red boundaries. Each sample is captured from five distinct viewpoints, demonstrating that certain defects are detectable only from specific angles and remain invisible from others. Text annotations follow the format "object (defect type)", covering both structural damage (e.g., scratch) and logical inconsistencies (e.g., missing part).

Industry distribution

Figure 4. Industry distribution of the proposed Real-IAD Variety dataset. The dataset encompasses 8 major industrial groups (denoted as c): electrical, transport, cultural products, metal, general, electronics, rubber plastic, and other manufacturing sectors. These major categories are further subdivided into 28 industrial subcategories (denoted as s), providing comprehensive coverage from a practical application perspective.

Statistical characteristics

Figure 5. Statistical characteristics of Real-IAD Variety across multiple dimensions. (a) Anomalous region proportion: Real-IAD Variety exhibits a broader and more balanced distribution of anomalous region proportions relative to total image area compared to Real-IAD, substantially increasing dataset complexity. (b) Defect aspect ratio: Real-IAD Variety provides diverse aspect ratios for minimum bounding rectangles of defects, introducing additional diversity and detection challenges. (c) Material distribution: Real-IAD Variety encompasses 24 material types for practical applications, imposing higher requirements on method robustness. (d) Color distribution: Real-IAD Variety captures a wide color spectrum, which is essential for color-based anomaly detection research.
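
The two mask-based statistics in panels (a) and (b) can be computed from a binary defect mask. The sketch below is only an illustration under stated assumptions: it treats the whole mask as a single defect and uses an axis-aligned bounding box, whereas the dataset's own statistics may be computed per connected region or with rotated minimum-area rectangles. The function name `defect_statistics` is hypothetical.

```python
import numpy as np


def defect_statistics(mask):
    """Return area ratio and bounding-box aspect ratio for one binary mask.

    Simplifications (assumptions, not the dataset's documented procedure):
    all foreground pixels count as one defect, and the bounding rectangle
    is axis-aligned.
    """
    mask = np.asarray(mask).astype(bool)
    if not mask.any():
        # A normal image has no defect region and hence no aspect ratio.
        return {"area_ratio": 0.0, "aspect_ratio": None}

    rows = np.flatnonzero(mask.any(axis=1))  # row indices containing defect pixels
    cols = np.flatnonzero(mask.any(axis=0))  # column indices containing defect pixels
    height = int(rows[-1] - rows[0] + 1)
    width = int(cols[-1] - cols[0] + 1)

    return {
        "area_ratio": float(mask.mean()),  # defect pixels / all pixels
        "aspect_ratio": width / height,    # bounding-box width over height
    }


# Toy example: a 2 x 10 pixel scratch in a 10 x 20 image.
example = np.zeros((10, 20), dtype=np.uint8)
example[2:4, 5:15] = 1
stats = defect_statistics(example)
print(stats)
```

For the toy mask, the defect covers 20 of 200 pixels (ratio 0.1) and its bounding box is five times wider than it is tall.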

Performance comparison

Figure 6. Performance trends of I-AUROC and P-AUPR metrics with increasing categories for MUAD, ZSAD and FSAD methods on Real-IAD Variety. The results demonstrate that traditional multi-class unsupervised methods suffer significant performance degradation (ranging from 10% to 20%) when scaled from 30 to 160 categories, while zero-shot and few-shot approaches exhibit remarkable robustness to category scale-up.
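
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them, reading I-AUROC as image-level AUROC and P-AUPR as pixel-level area under the precision-recall curve, approximated here by average precision. It is a plain NumPy illustration with hypothetical function names, not the benchmark's evaluation code; the average-precision routine does not merge tied scores, and the pairwise AUROC is meant for image-level score lists rather than millions of pixels.

```python
import numpy as np


def auroc(scores, labels):
    """AUROC as the probability that a positive outscores a negative.

    Ties count as one half. Uses a pairwise comparison, so it is only
    suitable for modest numbers of scores (e.g., one score per image).
    """
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).astype(bool).ravel()
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative")
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive.

    Inputs of any shape are flattened, so per-pixel anomaly maps and
    masks can be passed directly. Tied scores are not grouped.
    """
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).astype(bool).ravel()
    if not labels.any():
        raise ValueError("need at least one positive")
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = labels[order]
    precision = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision[hits].mean())


# Image-level toy example: four images, two of them anomalous.
image_scores = [0.9, 0.8, 0.7, 0.6]
image_labels = [1, 0, 1, 0]
i_auroc = auroc(image_scores, image_labels)

# Pixel-level toy example: one 2 x 2 anomaly map with its ground-truth mask.
anomaly_map = np.array([[0.9, 0.1], [0.2, 0.8]])
gt_mask = np.array([[1, 0], [0, 1]])
p_aupr = average_precision(anomaly_map, gt_mask)

print(f"I-AUROC = {i_auroc:.3f}, P-AUPR = {p_aupr:.3f}")
```

In the toy example, three of the four positive-negative image pairs are ranked correctly, giving an I-AUROC of 0.75, and both defect pixels outrank every normal pixel, giving a P-AUPR of 1.0.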


BibTeX

@article{zhu2025realiad-variety,
  title={Real-IAD Variety: Pushing Industrial Anomaly Detection Dataset to a Modern Era},
  author={Zhu, Wenbing and Wang, Chengjie and Gao, Bin-Bin and Zhang, Jiangning and Jiang, Guannan and Hu, Jie and Gan, Zhenye and Wang, Lidong and Zhou, Ziqing and Zhang, Jianghui and Cheng, Linjie and Pan, Yurui and Peng, Bo and Chi, Mingmin and Ma, Lizhuang},
  journal={Pattern Recognition},
  year={2026}
}