AES-NI Hardware Acceleration Rust is a development Claude skill built by Londo Spark. Best for: backend engineers who process large encrypted files and need hardware-accelerated AES cryptography with measurable performance benchmarking and trade-off analysis.
- What it does
- Configure Rust projects to enable AES-NI CPU instructions and optimize chunk sizes for 1.5-1.7x cryptographic performance gains.
- Category
- development
- Created by
- Londo Spark
AES-NI Hardware Acceleration Rust
Skill instructions
SKILL: AES-NI Hardware Acceleration in Rust
Context
Optimizing AES-CTR decryption performance using hardware acceleration (AES-NI) and cache-aware chunk sizing.
Pattern
1. Enable Native CPU Features
Create .cargo/config.toml:
[build]
rustflags = ["-C", "target-cpu=native"]
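This enables every instruction-set extension the build machine's CPU supports, which on recent x86_64 hardware typically includes: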
- AES-NI instructions (x86_64)
- AVX/AVX2 vector operations
- SSE4.2 instructions
- All other CPU-native features
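To confirm that the flag took effect, one option is to ask the compiler which features it was allowed to assume. The sketch below uses only the standard `cfg!` macro; the function name `compiled_features` is hypothetical, and the three feature names are the ones listed above.

```rust
/// Reports which of the features listed above the compiler was told it may
/// assume at build time. `compiled_features` is an illustrative name.
fn compiled_features() -> [(&'static str, bool); 3] {
    [
        ("aes", cfg!(target_feature = "aes")),
        ("avx2", cfg!(target_feature = "avx2")),
        ("sse4.2", cfg!(target_feature = "sse4.2")),
    ]
}

fn main() {
    for (name, enabled) in compiled_features() {
        let status = if enabled { "enabled at compile time" } else { "not enabled" };
        println!("{name}: {status}");
    }
}
```

Building once with and once without the `rustflags` entry should show the difference on a machine that supports these extensions.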
2. Verify Runtime Detection
Check if your crypto crate uses cpufeatures:
grep cpufeatures Cargo.lock
RustCrypto crates (aes, sha2, etc.) automatically detect and use hardware acceleration when available.
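Runtime detection can also be checked directly, independent of any crate. The following is a minimal sketch using the standard library's `is_x86_feature_detected!` macro; the function name `aes_ni_available` is an assumption for illustration and is not part of any crate mentioned here.

```rust
/// Returns true when the CPU running this process reports AES-NI support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn aes_ni_available() -> bool {
    std::arch::is_x86_feature_detected!("aes")
}

/// On other architectures this sketch simply reports no AES-NI.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn aes_ni_available() -> bool {
    false
}

fn main() {
    if aes_ni_available() {
        println!("AES-NI detected at runtime");
    } else {
        println!("AES-NI not detected; expect a software fallback");
    }
}
```

Running this on the deployment machine, rather than the build machine, is what tells you whether the accelerated path will be available in production.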
3. Benchmark Chunk Sizes
For streaming AES-CTR operations, chunk size matters:
Test methodology:
// Test powers of 2 from 1 MB to 64 MB
let chunk_sizes: [usize; 7] = [1, 2, 4, 8, 16, 32, 64];
for size_mb in chunk_sizes {
    let chunk_size = size_mb * 1024 * 1024;
    // Benchmark decryption with this chunk size
}
Expected results:
- 2-8 MB: Optimal (fits L3 cache on modern CPUs)
- 16+ MB: Degraded (cache pressure)
- 32+ MB: Significant degradation (memory bandwidth bottleneck)
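The methodology above could be fleshed out roughly as follows. This is a sketch under stated assumptions: `process_chunk` is a hypothetical stand-in (a plain XOR pass) for the real AES-CTR call, `throughput_mib_s` is an illustrative name, and a single reused buffer measures compute and cache behaviour only, not file I/O.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical stand-in for the real decryption call: XORs every byte so the
/// loop touches the whole chunk. Replace with the actual AES-CTR operation.
fn process_chunk(chunk: &mut [u8]) {
    for byte in chunk.iter_mut() {
        *byte ^= 0x5A;
    }
}

/// Processes `total_bytes` in pieces of at most `chunk_size` bytes and
/// returns the observed throughput in MiB/s.
fn throughput_mib_s(total_bytes: usize, chunk_size: usize) -> f64 {
    assert!(chunk_size > 0, "chunk size must be non-zero");
    let mut buffer = vec![0u8; chunk_size];
    let mut remaining = total_bytes;
    let start = Instant::now();
    while remaining > 0 {
        let n = remaining.min(chunk_size);
        process_chunk(&mut buffer[..n]);
        // Keep the optimizer from discarding the work.
        black_box(&buffer);
        remaining -= n;
    }
    let secs = start.elapsed().as_secs_f64();
    (total_bytes as f64 / (1024.0 * 1024.0)) / secs
}

fn main() {
    let total = 256 * 1024 * 1024; // 256 MiB of work per chunk size
    for size_mb in [1usize, 2, 4, 8, 16, 32, 64] {
        let mib_s = throughput_mib_s(total, size_mb * 1024 * 1024);
        println!("{size_mb:>2} MB chunks: {mib_s:.0} MiB/s");
    }
}
```

Build with `--release` before comparing numbers; debug builds will hide any cache effect behind unoptimized code.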
4. Measure Consistently
Run multiple iterations and calculate standard deviation:
$times = @()
for ($i = 1; $i -le 5; $i++) {
    # Time operation and store the elapsed seconds in $elapsed
    $times += $elapsed
}
$avg = ($times | Measure-Object -Average).Average
$stdDev = [math]::Sqrt(...)
Select the configuration with the lowest variance across runs, not just the one with the fastest single run.
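If the timings are collected in Rust instead of PowerShell, the mean and standard deviation can be computed as sketched below. The function name `mean_and_stddev` is hypothetical, the sample (n - 1) form of the standard deviation is an assumption, and the timings in `main` are made-up placeholders, not measurements.

```rust
/// Returns (mean, sample standard deviation) for a set of timings,
/// or None when there are fewer than two samples.
fn mean_and_stddev(samples: &[f64]) -> Option<(f64, f64)> {
    if samples.len() < 2 {
        return None;
    }
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    // Sample variance: squared deviations divided by n - 1.
    let variance = samples
        .iter()
        .map(|x| (x - mean).powi(2))
        .sum::<f64>()
        / (n - 1.0);
    Some((mean, variance.sqrt()))
}

fn main() {
    // Placeholder timings in seconds, for illustration only.
    let timings = [1.92, 1.88, 1.95, 1.90, 1.91];
    if let Some((mean, stddev)) = mean_and_stddev(&timings) {
        println!("mean = {mean:.3} s, stddev = {stddev:.3} s");
    }
}
```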
Typical Speedups
- target-cpu=native alone: 1.15-1.25x (AES-NI + vectorization)
- Chunk size optimization: 1.2-1.4x (cache locality)
- Combined: 1.5-1.7x speedup over baseline
Verification
Always verify correctness with a cryptographic hash (SHA256) of the output after changes:
$hash = (Get-FileHash $outputFile -Algorithm SHA256).Hash
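When a known-good output from the baseline build is still on disk, a byte-for-byte comparison is a dependency-free alternative to hashing. Note that this swaps in a different technique from the SHA256 check above (hashing in Rust would need an external crate such as sha2). The names `fill` and `streams_equal` are hypothetical, and the sketch does not retry interrupted reads.

```rust
use std::env;
use std::fs::File;
use std::io::{self, Read};

/// Reads until `buf` is full or the reader reaches end of input.
fn fill<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..])? {
            0 => break,
            n => filled += n,
        }
    }
    Ok(filled)
}

/// Returns true when both readers yield exactly the same bytes.
fn streams_equal<A: Read, B: Read>(mut a: A, mut b: B) -> io::Result<bool> {
    let mut buf_a = vec![0u8; 64 * 1024];
    let mut buf_b = vec![0u8; 64 * 1024];
    loop {
        let n_a = fill(&mut a, &mut buf_a)?;
        let n_b = fill(&mut b, &mut buf_b)?;
        if n_a != n_b || buf_a[..n_a] != buf_b[..n_b] {
            return Ok(false);
        }
        if n_a == 0 {
            return Ok(true); // both streams ended with no mismatch
        }
    }
}

fn main() -> io::Result<()> {
    let args: Vec<String> = env::args().collect();
    if args.len() != 3 {
        eprintln!("usage: {} <baseline-output> <optimized-output>", args[0]);
        return Ok(());
    }
    let same = streams_equal(File::open(&args[1])?, File::open(&args[2])?)?;
    println!("{}", if same { "outputs match" } else { "outputs differ" });
    Ok(())
}
```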
When to Apply
- ✅ AES-CTR, AES-GCM, or other hardware-accelerated crypto
- ✅ Large file processing (>100 MB)
- ✅ Desktop/server targets (not embedded)
- ❌ Cross-compilation targets without AES-NI
- ❌ Web/WASM targets (use portable build)
Trade-offs
Pros:
- Zero code changes (config-only)
- Significant speedup (1.5x+)
- No dependencies added
Cons:
- Binary may not run on CPUs that lack the build machine's features (for AES-NI alone, roughly pre-2010 x86_64)
- Larger binary size (~5-10%)
- Must test on target hardware
Related
- Crates: aes, ctr, cipher, cpufeatures
- Flags: -C target-cpu=native, -C target-feature=+aes
- Tools: cargo bench, PowerShell Measure-Command
Use this skill
Most skills are portable instruction packages. Claude Code supports SKILL.md directly. Other agents can use adapted files like AGENTS.md, .cursorrules, and GEMINI.md.
Claude Code
Save SKILL.md into your Claude Skills folder, then restart Claude Code.
mkdir -p ~/.claude/skills/aes-ni-hardware-acceleration-rust && curl -L "https://raw.githubusercontent.com/londospark/citrust/6764d8d0b88f3405f1b0a8849de9f10160967d3b/.squad/skills/aes-ni-optimization/SKILL.md" -o ~/.claude/skills/aes-ni-hardware-acceleration-rust/SKILL.md
Installs to ~/.claude/skills/aes-ni-hardware-acceleration-rust/SKILL.md.
Use cases
Backend engineers processing large encrypted files need hardware-accelerated AES cryptography with measurable performance benchmarking and trade-off analysis.
Creator
Londo Spark
@londospark