Handwritten Digit Recognition with PyTorch

1. Introduction to the MNIST Handwritten Digit Database

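What is MNIST
MNIST stands for the Modified National Institute of Standards and Technology database. It is derived from databases maintained by the U.S. National Institute of Standards and Technology (NIST). MNIST is an entry-level computer vision dataset and a standard benchmark used throughout deep learning.
It consists of images of handwritten digits, each paired with a label that tells us (and the computer) which digit the image shows.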

2. Structure of the MNIST Handwritten Digit Database

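Composition of MNIST
The MNIST dataset in common use (http://yann.lecun.com/exdb/mnist/) contains 60,000 training examples and 10,000 test examples.
The data (images and labels) is distributed as .gz archives:
Training images: train-images-idx3-ubyte.gz, training set images (9912422 bytes)
Training labels: train-labels-idx1-ubyte.gz, training set labels (28881 bytes)
Test images: t10k-images-idx3-ubyte.gz, test set images (1648877 bytes)
Test labels: t10k-labels-idx1-ubyte.gz, test set labels (4542 bytes)
Each image is usually represented as a 28×28 matrix, or equivalently as a 784-dimensional vector, and each label is a digit from 0 to 9.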
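The relationship between the 28×28 matrix and the 784-dimensional vector can be sketched in a few lines. This is only an illustration: the tensor below is a made-up stand-in rather than a real MNIST image, and the names image and vector are chosen here for the example.

```python
import torch

# A stand-in for one MNIST image: 28 rows x 28 columns of pixel values.
# (Not real data; the values are just 0..783 so positions are easy to track.)
image = torch.arange(28 * 28, dtype=torch.float32).view(28, 28)

# Flatten the matrix row by row into a single 784-dimensional vector.
vector = image.view(-1)

print(image.shape)
print(vector.shape)

# Row r, column c of the matrix lands at index r * 28 + c of the vector.
r, c = 3, 5
same_pixel = bool(image[r, c] == vector[r * 28 + c])
print(same_pixel)
```

The main program below does the same flattening for a whole batch at once with x.view(-1, 28*28).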

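3. Main Program
(Environment setup is not covered here.)

First, install the libraries:

pip install numpy torch torchvision matplotlib

Run the script (saved here as test.py):

python test.py

Full listing: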

import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import MNIST
import matplotlib.pyplot as plt


class Net(torch.nn.Module):

    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(28*28, 64)
        self.fc2 = torch.nn.Linear(64, 64)
        self.fc3 = torch.nn.Linear(64, 64)
        self.fc4 = torch.nn.Linear(64, 10)
    
    def forward(self, x):
        x = torch.nn.functional.relu(self.fc1(x))
        x = torch.nn.functional.relu(self.fc2(x))
        x = torch.nn.functional.relu(self.fc3(x))
        x = torch.nn.functional.log_softmax(self.fc4(x), dim=1)
        return x


def get_data_loader(is_train):
    to_tensor = transforms.Compose([transforms.ToTensor()])
    data_set = MNIST("", is_train, transform=to_tensor, download=True)
    return DataLoader(data_set, batch_size=15, shuffle=True)


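def evaluate(test_data, net):
    """Return the fraction of samples in test_data that net classifies correctly."""
    n_correct = 0
    n_total = 0
    # Gradients are not needed for evaluation
    with torch.no_grad():
        for (x, y) in test_data:
            # Flatten each 28x28 image into a 784-dimensional vector
            outputs = net.forward(x.view(-1, 28*28))
            for i, output in enumerate(outputs):
                # The predicted digit is the index of the largest output
                if torch.argmax(output) == y[i]:
                    n_correct += 1
                n_total += 1
    return n_correct / n_total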


def main():

    train_data = get_data_loader(is_train=True)
    test_data = get_data_loader(is_train=False)
    net = Net()
    
    print("initial accuracy:", evaluate(test_data, net))
    optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
    for epoch in range(2):
        for (x, y) in train_data:
            net.zero_grad()
            output = net.forward(x.view(-1, 28*28))
            loss = torch.nn.functional.nll_loss(output, y)
            loss.backward()
            optimizer.step()
        print("epoch", epoch, "accuracy:", evaluate(test_data, net))

    for (n, (x, _)) in enumerate(test_data):
        if n > 3:
            break
        predict = torch.argmax(net.forward(x[0].view(-1, 28*28)))
        plt.figure(n)
        plt.imshow(x[0].view(28, 28))
        plt.title("prediction: " + str(int(predict)))
    plt.show()


if __name__ == "__main__":
    main()

Next, let's go through it piece by piece:

Defining the neural network model:

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(28*28, 64)
        self.fc2 = torch.nn.Linear(64, 64)
        self.fc3 = torch.nn.Linear(64, 64)
        self.fc4 = torch.nn.Linear(64, 10)
    
    def forward(self, x):
        x = torch.nn.functional.relu(self.fc1(x))
        x = torch.nn.functional.relu(self.fc2(x))
        x = torch.nn.functional.relu(self.fc3(x))
        x = torch.nn.functional.log_softmax(self.fc4(x), dim=1)
        return x

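First, the Net class inherits from torch.nn.Module and defines a simple feed-forward neural network with four fully connected layers (fc1, fc2, fc3, fc4). The input has 28*28 = 784 features (the size of a flattened MNIST image), and the output has 10 values (one per digit class, 0 to 9).
The forward method defines the forward pass: the first three layers are each followed by a ReLU activation, and the last layer is followed by log_softmax, so the network outputs log-probabilities over the 10 classes.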
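As a quick sanity check of the forward pass described above, the sketch below feeds a random batch through the same Net and inspects the output. The random input is only a stand-in for real images, and the batch size of 5 is an arbitrary choice for the example.

```python
import torch
import torch.nn.functional as F


class Net(torch.nn.Module):
    """Same four-layer network as in the main program."""

    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(28 * 28, 64)
        self.fc2 = torch.nn.Linear(64, 64)
        self.fc3 = torch.nn.Linear(64, 64)
        self.fc4 = torch.nn.Linear(64, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        return F.log_softmax(self.fc4(x), dim=1)


torch.manual_seed(0)
net = Net()

# Random stand-in for a batch of 5 flattened images (not real MNIST data).
batch = torch.rand(5, 28 * 28)

with torch.no_grad():
    log_probs = net(batch)

# One row per image, one column per digit class.
print(log_probs.shape)

# Exponentiating log-probabilities gives probabilities; each row sums to 1.
row_sums = log_probs.exp().sum(dim=1)
print(row_sums)
```

Because the outputs are log-probabilities, every entry is at most 0, and the largest entry in a row marks the predicted digit.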

Data loading and preprocessing:

def get_data_loader(is_train):
    to_tensor = transforms.Compose([transforms.ToTensor()])
    data_set = MNIST("", is_train, transform=to_tensor, download=True)
    return DataLoader(data_set, batch_size=15, shuffle=True)

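The get_data_loader function returns a data loader for either the training set or the test set, depending on the is_train argument. transforms.Compose wraps the ToTensor() transform, which converts each image into a tensor. The MNIST class from torchvision loads the dataset; the transform argument specifies the preprocessing applied to each sample, and download=True downloads the dataset if it is not already present. The DataLoader then serves the data in shuffled batches of 15 samples.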
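To see what such a loader yields without downloading anything, here is a minimal sketch that swaps in a synthetic dataset. The TensorDataset of 100 random images is an assumption made for illustration; it merely mimics the shape that ToTensor() produces for MNIST, namely one channel of 28×28 float values.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for MNIST: 100 "images" of shape (1, 28, 28) with
# values in [0, 1), plus 100 random labels in 0..9. Not real data.
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))

# Same batching settings as get_data_loader.
loader = DataLoader(TensorDataset(images, labels), batch_size=15, shuffle=True)

# 100 samples in batches of 15: six full batches and one final batch of 10.
batch_sizes = [x.shape[0] for x, _ in loader]
print(batch_sizes)

# A single batch, and the flattened form that is fed to the network.
x, y = next(iter(loader))
flat = x.view(-1, 28 * 28)
print(x.shape, y.shape, flat.shape)
```

Note that the last batch is smaller than the rest, which is why the code uses x.view(-1, 28*28) rather than hard-coding a batch size of 15.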

Model training and evaluation:

train_data = get_data_loader(is_train=True)
test_data = get_data_loader(is_train=False)
net = Net()

# Accuracy of the untrained model, as a baseline
print("initial accuracy:", evaluate(test_data, net))

# Train the model
optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
for epoch in range(2):
    for (x, y) in train_data:
        # Forward pass, loss computation, backpropagation, parameter update
        net.zero_grad()
        output = net.forward(x.view(-1, 28*28))
        loss = torch.nn.functional.nll_loss(output, y)
        loss.backward()
        optimizer.step()
    # Print the test accuracy after each epoch
    print("epoch", epoch, "accuracy:", evaluate(test_data, net))

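get_data_loader provides the training and test data loaders.
An instance of Net serves as the neural network model.
Before training, evaluate reports the accuracy of the untrained model on the test set as a baseline.
torch.optim.Adam is used as the optimizer, with a learning rate of 0.001.
The model is trained for two epochs; each epoch iterates over the training set, and for every batch performs the forward pass, loss computation, backpropagation, and parameter update.
After each epoch, evaluate measures the accuracy on the test set and the result is printed.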
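The loss used in the training loop pairs nll_loss with the log_softmax output of the network. The sketch below, on made-up logits and targets, illustrates that this combination gives the same value as torch.nn.functional.cross_entropy applied to the raw logits, and that it equals the mean negative log-probability assigned to the correct class.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up raw scores (logits) for 4 samples and 10 classes, with made-up labels.
logits = torch.randn(4, 10)
targets = torch.tensor([3, 0, 9, 1])

# What the tutorial does: log_softmax in the model, nll_loss in the loop.
log_probs = F.log_softmax(logits, dim=1)
loss_nll = F.nll_loss(log_probs, targets)

# The same quantity computed in one call on the raw logits.
loss_ce = F.cross_entropy(logits, targets)

# And by hand: average of -log p(correct class) over the batch.
loss_manual = -log_probs[torch.arange(4), targets].mean()

print(loss_nll.item(), loss_ce.item(), loss_manual.item())
```

If the final log_softmax were ever removed from the model, nll_loss would have to be replaced by cross_entropy to keep the loss the same.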

Displaying predictions:

for (n, (x, _)) in enumerate(test_data):
    if n > 3:
        break
    predict = torch.argmax(net.forward(x[0].view(-1, 28*28)))
    plt.figure(n)
    plt.imshow(x[0].view(28, 28))
    plt.title("prediction: " + str(int(predict)))
plt.show()

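A few samples are taken from the test set (the first image of each of the first four batches) and passed through the model.
For each sample, the model predicts the corresponding digit.
matplotlib displays each sample image with the model's prediction as the title.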
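The prediction step takes the argmax of the model output for one image at a time. As a hedged sketch, the same idea can be applied to a whole batch with dim=1, which also gives a compact way to compute accuracy. The log-probabilities and labels below are hand-written for illustration and use only 3 classes to keep the example small.

```python
import torch

# Hand-written log-probabilities for 3 samples over 3 classes (illustrative only).
log_probs = torch.log(torch.tensor([
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.6, 0.1],
]))
labels = torch.tensor([0, 2, 0])

# argmax along dim=1 picks the most likely class for every row at once.
predictions = torch.argmax(log_probs, dim=1)
print(predictions)  # tensor([0, 2, 1])

# Fraction of rows where the prediction matches the label: 2 out of 3 here.
accuracy = (predictions == labels).float().mean().item()
print(accuracy)
```

This computes the same quantity as the per-sample loop in evaluate, just without iterating over individual outputs in Python.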

This completes a simple neural network that is trained and evaluated on the MNIST dataset and displays its predictions for a few sample images.