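Stop Testing Only Classification Models! Reproducing a Paper in PyTorch: Adversarial Attacks on Autonomous-Driving Regression Models in Practice (with the Udacity Dataset)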
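Adversarial Attacks on Autonomous-Driving Regression Models in Practice: From Paper Reproduction to a PyTorch Implementation

Autonomous driving systems are moving from the lab onto real roads, yet adversarial attacks, an invisible threat, can push even state-of-the-art models into fatal misjudgments. Unlike the familiar attacks on image classifiers, attacks on driving models target a regression task, so they need their own success criterion and their own implementation. This article is a hands-on walkthrough: we use PyTorch to reproduce the attack methods from a classic paper and build a complete attack-and-validation pipeline on the Udacity dataset.

1. Environment Setup and Data Loading

1.1 Base Environment

Reproducing the adversarial attack experiments requires the following core components:

```bash
# Base environment
conda create -n adv_drive python=3.8
conda activate adv_drive
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch
pip install opencv-python pandas matplotlib tqdm
```

Recommended hardware:

- GPU: NVIDIA RTX 3060 or better (at least 8 GB of VRAM)
- RAM: 16 GB or more
- Storage: at least 50 GB free for the dataset and model checkpoints

1.2 Processing the Udacity Dataset

The Udacity self-driving dataset contains 33805 training images and 5614 test images. Each image is paired with a steering angle normalized to the range [-1, 1]. We need a custom PyTorch `Dataset` class:

```python
import os

import cv2
import pandas as pd
import torch
from torch.utils.data import Dataset


class UdacityDataset(Dataset):
    def __init__(self, csv_path, img_dir, transform=None):
        self.df = pd.read_csv(csv_path)
        self.img_dir = img_dir
        self.transform = transform

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        img_path = os.path.join(self.img_dir, row["filename"])
        # OpenCV loads images as BGR; convert to RGB for torchvision-style models
        image = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)
        angle = row["steering_angle"]
        if self.transform:
            image = self.transform(image)
        # Shape (1,) so the label matches the model's single regression output
        return image, torch.tensor([angle], dtype=torch.float32)
```

Note: the original images are 640x480. Resizing them to 224x224 is recommended so that they fit common CNN architectures.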
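Because every label is expected to lie in [-1, 1], a quick scan of the label file before training can catch normalization mistakes early. The following is a minimal standard-library sketch of such a check. The helper name `check_angle_range` and the inline sample rows are illustrative assumptions, not part of the dataset tooling; only the column names `filename` and `steering_angle` come from the dataset class.

```python
import csv
import io


def check_angle_range(csv_file, low=-1.0, high=1.0):
    """Return (row_count, out_of_range_filenames) for a steering-angle CSV.

    `csv_file` is any open text file with `filename` and `steering_angle` columns.
    """
    count = 0
    bad = []
    for row in csv.DictReader(csv_file):
        count += 1
        angle = float(row["steering_angle"])
        if not (low <= angle <= high):
            bad.append(row["filename"])
    return count, bad


# Hypothetical sample rows standing in for the real label file.
sample = io.StringIO(
    "filename,steering_angle\n"
    "a.jpg,0.12\n"
    "b.jpg,-0.98\n"
    "c.jpg,1.75\n"
)
count, bad = check_angle_range(sample)
print(count, bad)
```

With a real label file, pass `open(csv_path, newline="")` instead of the in-memory sample. Any filenames reported are worth inspecting before training, since the attack threshold used later is defined on the normalized scale.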
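2. Building and Training the Driving Models

2.1 Implementing the Three Classic Models

The paper compares three architectures: Epoch, DAVE-2, and VGG16. The key points of the PyTorch implementation follow.

DAVE-2 architecture (the lightweight network proposed by NVIDIA):

```python
import torch.nn as nn


class Dave2(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv_layers = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ELU(),
            nn.Conv2d(24, 36, 5, stride=2), nn.ELU(),
            nn.Conv2d(36, 48, 5, stride=2), nn.ELU(),
            nn.Conv2d(48, 64, 3), nn.ELU(),
            nn.Conv2d(64, 64, 3), nn.ELU(),
        )
        self.linear_layers = nn.Sequential(
            # 1152 = 64 * 1 * 18, the flattened size for a 3x66x200 input
            nn.Linear(1152, 100), nn.ELU(),
            nn.Linear(100, 50), nn.ELU(),
            nn.Linear(50, 10), nn.ELU(),
            nn.Linear(10, 1),
        )

    def forward(self, x):
        x = self.conv_layers(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, features)
        return self.linear_layers(x)
```

Note: `nn.Linear(1152, 100)` only matches inputs of size 66x200. With 224x224 inputs the convolution stack outputs 64x21x21 = 28224 features, so either resize the images to 66x200 for DAVE-2 or change the first linear layer to `nn.Linear(28224, 100)`.

Key training parameters:

| Parameter | Epoch model | DAVE-2 | VGG16 |
|---|---|---|---|
| Learning rate | 1e-4 | 1e-4 | 1e-5 |
| Batch size | 32 | 32 | 16 |
| Training epochs | 50 | 50 | 30 |
| Optimizer | Adam | Adam | AdamW |
| Loss function | MSE | MSE | Huber |

2.2 Training Tips

A few points are specific to training a steering-angle regressor.

Data augmentation strategy:

- Random brightness adjustment (simulates lighting changes)
- Horizontal flip (the steering angle must be negated at the same time)
- Random translation (simulates changes in vehicle position)

```python
from torchvision import transforms

# The horizontal flip is deliberately left out of this pipeline:
# transforms.RandomHorizontalFlip flips only the image, so the flip belongs
# in the Dataset, where the steering angle can be negated together with it.
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomApply([transforms.ColorJitter(brightness=0.2)], p=0.5),
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```

Evaluation metrics:

- Mean absolute error (MAE)
- Root mean squared error (RMSE)
- Histogram of the predicted angle distribution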
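The first two metrics in the list above are easy to verify by hand. Here is a small sketch in plain Python; the function names `mae` and `rmse` and the sample numbers are illustrative assumptions rather than values from the experiments.

```python
import math


def mae(preds, targets):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


def rmse(preds, targets):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds))


# Hypothetical normalized steering angles: one small error, one large error.
preds = [0.10, -0.20, 0.50, 0.00]
targets = [0.00, -0.20, 0.10, 0.00]

print(f"MAE:  {mae(preds, targets):.4f}")
print(f"RMSE: {rmse(preds, targets):.4f}")
```

RMSE is never smaller than MAE, and the gap widens when a few frames have large errors. For steering, reporting both therefore says something about occasional large deviations as well as the average error.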
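3. Implementing the Adversarial Attacks

3.1 What Makes Regression Attacks Different

Unlike a classification attack, an attack on a driving regressor counts as successful when $|f(x') - f(x)| > \Delta$, where $x'$ is the adversarial image and $\Delta$ is the adversarial threshold, set to 0.3 in the paper. In other words, the attack must shift the predicted steering angle by more than 0.3 (in normalized units) from the original prediction.

3.2 Core Attack Implementations

IT-FGSM attack (iterative targeted FGSM):

```python
import torch
import torch.nn.functional as F


def it_fgsm_attack(model, image, target_angle, epsilon=0.05, alpha=0.01, iters=10):
    # The clamp to [0, 1] assumes un-normalized inputs in [0, 1];
    # if Normalize is used, apply it inside the model instead.
    original = image.clone().detach()
    perturbed = original.clone()
    for _ in range(iters):
        perturbed.requires_grad_(True)
        loss = F.mse_loss(model(perturbed), target_angle)
        grad, = torch.autograd.grad(loss, perturbed)
        with torch.no_grad():
            # Targeted attack: step against the gradient to approach target_angle
            perturbed = perturbed - alpha * grad.sign()
            # Project back into the epsilon-ball around the original image
            perturbed = original + torch.clamp(perturbed - original, -epsilon, epsilon)
            perturbed = torch.clamp(perturbed, 0, 1)
    return perturbed.detach()
```

AdvGAN attack skeleton:

```python
import torch
import torch.nn as nn


class AdvGAN(nn.Module):
    def __init__(self, target_model):
        super().__init__()
        self.generator = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),  # raw perturbation in [-1, 1]
        )
        self.discriminator = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 3, stride=2), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 1),
        )
        self.target_model = target_model

    def forward(self, x):
        perturbation = self.generator(x)
        return torch.clamp(x + perturbation, 0, 1)
```

Tip: training AdvGAN requires alternating between generator and discriminator updates, with an attack loss against the target model added to the generator objective.

3.3 Attack Evaluation Metrics

A complete evaluation routine:

```python
import torch
import torch.nn.functional as F


def evaluate_attack(model, test_loader, attack_fn, delta=0.3):
    model.eval()
    total = 0
    success = 0
    mae_before = 0.0
    mae_after = 0.0
    for images, angles in test_loader:
        # Target an angle offset by delta from the ground truth
        perturbed = attack_fn(model, images, angles + delta)
        with torch.no_grad():
            orig_output = model(images)
            perturbed_output = model(perturbed)
        mae_before += F.l1_loss(orig_output, angles, reduction="sum").item()
        mae_after += F.l1_loss(perturbed_output, angles, reduction="sum").item()
        # Success: the prediction moved by more than delta
        success += (torch.abs(perturbed_output - orig_output) > delta).sum().item()
        total += len(images)
    return {
        "success_rate": success / total,
        "mae_increase": (mae_after - mae_before) / total,
        "avg_perturbation": ...,  # compute the mean perturbation size
    }
```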
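To make the success criterion concrete, the sketch below applies it to a handful of made-up predictions in plain Python. The function name `attack_success_rate` and the sample values are illustrative assumptions; only the strict `> delta` comparison and the threshold of 0.3 come from the text.

```python
def attack_success_rate(orig_preds, adv_preds, delta=0.3):
    """Fraction of samples whose prediction moved by strictly more than delta."""
    if len(orig_preds) != len(adv_preds):
        raise ValueError("prediction lists must have the same length")
    if not orig_preds:
        return 0.0
    hits = sum(1 for o, a in zip(orig_preds, adv_preds) if abs(a - o) > delta)
    return hits / len(orig_preds)


# Hypothetical normalized steering predictions before and after an attack.
orig = [0.05, -0.10, 0.40, 0.00]
adv = [0.45, -0.15, 0.05, 0.30]

# Deviations are roughly 0.40, 0.05, 0.35 and exactly 0.30;
# the last one sits on the threshold and does not count.
print(attack_success_rate(orig, adv))
```

Because the comparison is strict, a deviation of exactly 0.3 is not a success. Note also that the criterion compares against the model's original prediction rather than the ground-truth label, so it measures how far the attack moved the output regardless of how accurate the model was to begin with.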
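4. Defense Strategies in Practice

4.1 Adversarial Training

Add the generated adversarial examples to the training data:

```python
import torch
import torch.nn.functional as F


def adversarial_train(model, train_loader, optimizer, attack_fn, epochs=10):
    model.train()
    for epoch in range(epochs):
        for images, angles in train_loader:
            # Generate adversarial examples
            adv_images = attack_fn(model, images, angles + 0.3)
            # Mix clean and adversarial samples; both keep the true labels
            mixed_images = torch.cat([images, adv_images])
            mixed_angles = torch.cat([angles, angles])
            # Training step
            optimizer.zero_grad()
            outputs = model(mixed_images)
            loss = F.mse_loss(outputs, mixed_angles)
            loss.backward()
            optimizer.step()
```

4.2 Feature Squeezing Defense

Two feature squeezing methods:

```python
import torch
import torch.nn.functional as F


def bit_depth_reduction(x, bits=4):
    """Reduce the color bit depth of an image in [0, 1] to the given number of bits."""
    levels = 2 ** bits - 1
    return torch.round(x * levels) / levels


def median_filter(x, kernel_size=3):
    """Apply a per-channel median filter to a batch of shape (N, C, H, W)."""
    n, c, h, w = x.shape
    pad = kernel_size // 2
    x_padded = F.pad(x, (pad, pad, pad, pad), mode="reflect")
    # unfold returns (N, C*k*k, H*W); separate the k*k window axis per channel
    unfolded = F.unfold(x_padded, kernel_size).view(n, c, kernel_size ** 2, h * w)
    median = unfolded.median(dim=2).values
    return median.view(n, c, h, w)
```

Detection logic:

```python
def detect_attack(model, x, threshold=0.1):
    with torch.no_grad():
        original_pred = model(x)
        # Apply both squeezing methods
        pred_bit = model(bit_depth_reduction(x))
        pred_median = model(median_filter(x))
    # Measure how much each squeezer changes the prediction
    diff_bit = torch.abs(pred_bit - original_pred)
    diff_median = torch.abs(pred_median - original_pred)
    return (diff_bit > threshold) | (diff_median > threshold)
```

5. Complete Experimental Workflow and Results Analysis

5.1 Experiment Design Matrix

A systematic evaluation framework:

| Dimension | Settings |
|---|---|
| Model architecture | Epoch / DAVE-2 / VGG16 |
| Attack method | IT-FGSM / Opt / AdvGAN / universal perturbation |
| Attack strength | ε ∈ {0.01, 0.05, 0.1} |
| Defense strategy | Adversarial training / feature squeezing / anomaly detection |
| Evaluation metrics | Attack success rate / MAE change / perturbation visualization |

5.2 Comparison of Typical Attack Results

Performance of the attack methods on the DAVE-2 model:

| Attack method | White-box success rate | Black-box success rate | Mean perturbation (L2 norm) |
|---|---|---|---|
| IT-FGSM | 98.2% | 15.7% | 0.032 |
| Opt | 99.1% | 18.3% | 0.028 |
| AdvGAN | 97.5% | 22.4% | 0.035 |
| Opt_uni | 96.8% | 25.1% | 0.038 |
| AdvGAN_uni | 95.3% | 30.2% | 0.042 |

5.3 Troubleshooting

Common problems during reproduction and how to address them.

Low attack success rate:

- Check that gradients propagate back to the input
- Adjust the adversarial threshold Δ
- Verify that the model's prediction range matches the label range

Artifacts in generated images:

- Add an image range constraint (a clip operation)
- Add a perceptual loss to the GAN objective
- Adjust the generator architecture

Weak defense results:

- Check the defense parameters (for example, the bit depth used in feature squeezing)
- Verify that the defense module actually takes part in the forward pass
- Balance defense strength against the model's clean accuracy

```python
# Gradient check example
def check_gradient_flow(model, x):
    x.requires_grad = True
    output = model(x)
    loss = output.mean()
    loss.backward()
    grad_norm = x.grad.norm()
    print(f"Gradient norm: {grad_norm.item():.6f}")
    return grad_norm.item() > 1e-6  # True if the gradient is non-zero
```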
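When checking the bit depth used for feature squeezing, it can help to see what the quantization does to a few individual pixel values. The sketch below reproduces the same rounding formula in plain Python on scalars. The function name `quantize` and the sample values are illustrative assumptions.

```python
def quantize(value, bits=4):
    """Quantize a value in [0, 1] to 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    return round(value * levels) / levels


# Hypothetical pixel intensities, including two that differ by a small perturbation.
samples = [0.0, 0.03, 0.5, 0.52, 1.0]

for bits in (1, 4, 8):
    step = 1 / (2 ** bits - 1)
    squeezed = [round(quantize(v, bits), 4) for v in samples]
    print(f"bits={bits} step={step:.4f} -> {squeezed}")
```

At 4 bits the step between levels is 1/15, or roughly 0.067, so a perturbation much smaller than that often rounds back to the same level as the clean pixel. At 8 bits the step is about 0.004 and most of the perturbation survives. This is the trade-off behind the "balance defense strength against clean accuracy" item: fewer bits erase more of the perturbation but also more of the image detail.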
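6. Visualization and Case Studies

6.1 Visualizing Adversarial Examples

```python
import matplotlib.pyplot as plt


def visualize_attack(original, perturbed, delta):
    """Show HxWx3 images in [0, 1]: original, perturbed, and their difference."""
    plt.figure(figsize=(10, 5))
    # Original and perturbed images
    plt.subplot(1, 3, 1)
    plt.imshow(original)
    plt.title("Original")
    plt.subplot(1, 3, 2)
    plt.imshow(perturbed)
    plt.title("Perturbed")
    # Perturbation, rescaled for display
    plt.subplot(1, 3, 3)
    perturbation = (perturbed - original + 1) / 2  # map [-1, 1] to [0, 1]
    plt.imshow(perturbation)
    plt.title(f"Perturbation (Δ={delta:.2f})")
    plt.tight_layout()
    plt.show()
```

6.2 Comparing Steering Angle Predictions

Plot the predicted trajectories side by side:

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_steering_comparison(orig_angles, adv_angles, delta=0.3):
    orig_angles = np.asarray(orig_angles)
    adv_angles = np.asarray(adv_angles)
    plt.figure(figsize=(12, 6))
    frames = range(len(orig_angles))
    plt.plot(frames, orig_angles, "b-", label="Original Prediction")
    plt.plot(frames, adv_angles, "r--", label="Adversarial Prediction")
    # Shade the frames where the attack succeeded
    mask = np.abs(adv_angles - orig_angles) > delta
    plt.fill_between(frames, -1, 1, where=mask, color="red", alpha=0.1,
                     label="Attack Success")
    plt.axhline(y=delta, color="k", linestyle=":", label="Threshold")
    plt.axhline(y=-delta, color="k", linestyle=":")
    plt.ylim(-1.1, 1.1)
    plt.xlabel("Frame Sequence")
    plt.ylabel("Steering Angle")
    plt.legend()
    plt.title("Steering Angle Deviation Under Attack")
    plt.show()
```

In our own projects we found that the DAVE-2 model is especially sensitive to perturbations on the right lane line, which may be related to the distribution of its training data. Analyzing this kind of vulnerability in a targeted way makes it possible to design more focused defenses.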