Open Source Datasets
We identify a critical gap in open-source datasets for the implementation security testing of post-quantum cryptography. The following are our guidelines for generating this dataset for side-channel analysis. Fault injection datasets would also be useful, but need more research maturity before discussing specifics.
Dataset Scope and Strategy
- Full algorithm execution traces may result in large, impractical datasets.
- A more effective approach may involve generating side-channel traces for unit-level testing.
Proposed Units for Trace Collection
- Targeted units include:
- Number Theoretic Transform (NTT)
- Samplers (e.g., binomial, ternary, Gaussian, rejection sampling)
- Hash functions (particularly those used in PRFs for PQC)
- Comparison functions (e.g., for rejection sampling)
- Arithmetic-to-Boolean and Boolean-to-Arithmetic conversions used in masking
- Each unit can be further subdivided based on algorithmic variations and implementation strategies.
Implementation Variants
- Each unit should ideally be implemented in four forms:
- Baseline (no countermeasures)
- First-order masking
- Shuffling
- Higher-order masking
Hardware Measurement Emphasis
- While software-based measurements are acceptable, there is a strong community preference for hardware-based side-channel measurements, particularly from FPGA implementations
Metadata Requirements
- Trace datasets should be accompanied by detailed metadata, including:
- FPGA implementation specifics
- Description of applied defense techniques
- Methodology for input generation
- The metadata description could further follow details in our earlier document.
Examples of existing datasets: Kyber https://eprint.iacr.org/2025/811