On a batch normalization problem

1. The problem: "ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 1024])"

Today's note covers a small problem I ran into after adding batch normalization:

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 1024])
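For reference, the error can be reproduced in a few lines. This is a minimal sketch, not my actual model: the standalone BatchNorm1d layer with 1024 features and the random input are assumptions chosen only to match the shape in the message.

```python
import torch
import torch.nn as nn

# Assumption: a standalone BatchNorm1d layer stands in for the real model.
bn = nn.BatchNorm1d(1024)
bn.train()  # training mode: statistics come from the current batch

x = torch.randn(1, 1024)  # a "batch" holding a single sample

try:
    bn(x)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

The same layer accepts the same input once it is switched to eval mode, which is what the rest of this post is about.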

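I went through the usual steps first: importing packages, preparing the data, image augmentation and so on. But once I added the code below, the error was raised as soon as the train function ran.

My training function is as follows: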

def train(epoches, model, train_loader, test_loader):
    correct = 0
    total = 0
    running_loss = 0

    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        # Forward pass
        y_pred = model(x)
        # Compute the loss
        loss = loss_func(y_pred, y)
        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        with torch.no_grad():
            # argmax() returns the index of the largest value, i.e. the predicted class
            y_pred = torch.argmax(y_pred, dim=1)
            correct += (y_pred == y).sum().item()
            total += y.size(0)
            running_loss += loss.item()

    lr_scheduler.step()
    epoch_loss = running_loss / total
    epoch_acc = correct / total

    # Evaluation on the test set
    test_correct = 0
    test_total = 0
    test_running_loss = 0
    with torch.no_grad():
        for x, y in test_loader:
            x, y = x.to(device), y.to(device)
            y_pred = model(x)
            loss = loss_func(y_pred, y)
            y_pred = torch.argmax(y_pred, dim=1)
            test_correct += (y_pred == y).sum().item()
            test_total += y.size(0)
            test_running_loss += loss.item()

    test_epoch_loss = test_running_loss / test_total
    test_epoch_acc = test_correct / test_total

    print('Epoch: {}, Loss: {:.4f}, Acc: {:.4f}, test_Loss: {:.4f}, test_Acc: {:.4f}'.format(epoches + 1, epoch_loss,
                                                                                             epoch_acc,
                                                                                             test_epoch_loss,
                                                                                             test_epoch_acc))
    return epoch_loss, epoch_acc, test_epoch_loss, test_epoch_acc

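After a long search for the cause, I found the following two fixes:

2. Solutions

Method 1: drop_last=True

Some people suggest adding drop_last=True to the DataLoader.

This does make the runtime error go away, but I don't recommend it when the dataset is small: there is little data to begin with, and part of it is then thrown away. With this approach my train_accuracy was close to 1 while test_accuracy was terrible, only about 28% or 29%. Note that drop_last=True discards at most batch_size - 1 samples per epoch, so a gap this large probably also reflects ordinary overfitting rather than drop_last alone.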
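To see what drop_last actually does, here is a small sketch. The dataset of 9 random samples and batch_size=4 are made-up numbers, picked so that the last batch contains exactly one sample, which is the situation that triggers the error.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Assumption: 9 fake samples with 1024 features each, and integer labels.
dataset = TensorDataset(torch.randn(9, 1024), torch.zeros(9, dtype=torch.long))


def batch_sizes(drop_last):
    """Return the size of every batch the loader yields in one epoch."""
    loader = DataLoader(dataset, batch_size=4, shuffle=False, drop_last=drop_last)
    return [x.size(0) for x, _ in loader]


kept = batch_sizes(drop_last=False)    # the last batch has 9 % 4 = 1 sample
dropped = batch_sizes(drop_last=True)  # the incomplete last batch is skipped

print(kept, dropped)
```

With shuffle=True the skipped samples differ from epoch to epoch, so no sample is permanently lost to training.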

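Method 2: eval()

In the end, out of other options, I asked ChatGPT for an answer.

I then changed the code to call eval() on the model.

With that change, this mess of code finally ran.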
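The exact change is not reproduced here, so the following is only a sketch of the usual pattern, using a made-up toy model (the layer sizes, the 8-feature input and the 3 classes are all assumptions): call model.train() before the training loop and model.eval() before the test loop.

```python
import torch
import torch.nn as nn

# Hypothetical toy model containing a BatchNorm1d layer.
model = nn.Sequential(
    nn.Linear(8, 1024),
    nn.BatchNorm1d(1024),
    nn.ReLU(),
    nn.Linear(1024, 3),
)

# Training phase: BatchNorm uses batch statistics, so batches need > 1 sample.
model.train()
train_out = model(torch.randn(4, 8))

# Evaluation phase: BatchNorm switches to its running statistics.
model.eval()
with torch.no_grad():
    out = model(torch.randn(1, 8))  # a single sample is fine here

print(out.shape)
```

As far as I can tell, eval() only helps when the single-sample batch shows up during evaluation. If the last training batch has one sample, the options are drop_last=True or a batch size that does not leave a remainder of 1.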

3. Reflection

In PyTorch, BatchNorm layers (such as BatchNorm1d and BatchNorm2d) have two working modes:

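| Mode | Source of mean / variance | Behavior |
| --- | --- | --- |
| `train()` | Statistics of the current batch | Needs more than one value per channel (for a 2D input, batch size ≥ 2); otherwise no meaningful variance can be computed |
| `eval()` | Running mean and variance accumulated during training (moving average) | No longer depends on the batch, so it works even with batch size = 1 |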
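The two modes can be observed directly on a single layer. This is a sketch with illustrative numbers (3 features, a batch of 16 with mean near 10); it relies on the default momentum of 0.1 and on running_mean starting at zero.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(3)           # running_mean starts at 0, momentum defaults to 0.1
x = torch.randn(16, 3) * 5 + 10  # illustrative batch, mean near 10

with torch.no_grad():
    bn.train()
    y_train = bn(x)              # normalized with this batch's own statistics
    mean_after_train = bn.running_mean.clone()

    bn.eval()
    y_eval = bn(x)               # normalized with the stored running statistics
    mean_after_eval = bn.running_mean.clone()

print(y_train.mean(dim=0))       # per-channel mean of the train-mode output
print(mean_after_train)          # running mean after one training step
```

In train mode the output is centered per channel and the running mean moves a tenth of the way toward the batch mean. In eval mode the running statistics are read but not updated.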

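⚠️ The cause of the error in detail:

The error encountered:

`ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 1024])`

means the model was running in training mode (model.train()) and received an input with batch size = 1, that is, a single value per channel.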

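In that case BatchNorm would have to estimate the variance from a single sample. The unbiased sample variance, which PyTorch uses when updating the running variance, is:

$Var(x) = \frac{1}{N - 1} \sum_{i=1}^{N}(x_i - \bar{x})^2$

When N = 1 (only one sample), this formula divides by N - 1 = 0 and is undefined. The biased variance (dividing by N) that BatchNorm uses to normalize the batch itself would be exactly 0, so the output would carry no information. PyTorch therefore checks the input size up front and raises this error.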
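The arithmetic is easy to check outside PyTorch. The sketch below uses Python's standard statistics module rather than BatchNorm itself, with made-up values, purely to show what the two estimators do with a single sample.

```python
import statistics

single = [3.0]  # one sample, as in a batch of size 1

# Biased (population) variance divides by N, so one sample gives 0.
biased = statistics.pvariance(single)

# Unbiased (sample) variance divides by N - 1 and needs at least two points.
try:
    unbiased = statistics.variance(single)
except statistics.StatisticsError:
    unbiased = None  # undefined for N = 1

# With two samples the unbiased estimator is well defined.
pair_unbiased = statistics.variance([3.0, 5.0])

print(biased, unbiased, pair_unbiased)
```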

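✅ Why does `eval()` avoid the problem?

After calling:

`model.eval()`

BatchNorm no longer tries to compute the mean and variance from the current batch. It uses the running mean and variance (moving mean & var) accumulated during training instead, so it runs whether you feed it 1 image or 100.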
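As a check on that description, the eval-mode output for a single sample can be recomputed by hand from the layer's stored buffers. This is a sketch under assumed sizes (4 features, ten warm-up batches of 32); it only uses attributes that BatchNorm1d exposes (running_mean, running_var, eps, weight, bias).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)

with torch.no_grad():
    # Warm-up in train mode so the running statistics hold something useful.
    bn.train()
    for _ in range(10):
        bn(torch.randn(32, 4) * 2 + 3)

    # Eval mode: a single sample is normalized with the running statistics.
    bn.eval()
    x = torch.randn(1, 4)
    y = bn(x)

    # The same computation written out by hand.
    expected = (x - bn.running_mean) / torch.sqrt(bn.running_var + bn.eps)
    expected = expected * bn.weight + bn.bias

print(torch.allclose(y, expected, atol=1e-5))
```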

http://example.com/2025/07/07/关于batch-normlization的问题/

Author: Alaskaboo

Published: July 7, 2025

Updated: July 7, 2025
