I trained my neural network and then tried to evaluate its test accuracy, using the code at the bottom of this post. For other neural networks, the same evaluation code works without issue. However, for this network (which I constructed according to the paper's description of the architecture), the evaluation fails and gives me the traceback below. So maybe something is wrong in my forward pass? (I've put the shape-trace sketch I'm using to check this at the very end of the post.)
Here is the training and testing code:
# Imports (spelled out; the model definition comes from deepnet.py below,
# and I assume CIFAR-10 is loaded via keras.datasets, which matches the
# load_data() call that follows)
import numpy as np
import torch
import torch.backends.cudnn as cudnn
import torch.nn.functional as F
from torch.autograd import Variable
from copy import deepcopy
from keras.datasets import cifar10
import deepnet
cudnn.benchmark = True
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train = X_train.astype('float32')
X_train = np.transpose(X_train, axes=(0, 3, 1, 2))
X_test = X_test.astype('float32')
X_test = np.transpose(X_test, axes=(0, 3, 1, 2))
X_train /= 255
X_test /= 255
device = torch.device('cuda:0')

# This is where you can load any model of your choice.
# I stole PyTorch Vision's VGG network and modified it to work on CIFAR-10.
# You can take this line out and add any other network and the code
# should run just fine.
model = deepnet.cifar10_deep()
#model.to(device)

# Forward pass through the network given the input
opfun = lambda X: model.forward(Variable(torch.from_numpy(X)))

# Do the forward pass, then compute the accuracy
predsfun = lambda op: np.argmax(op.data.numpy(), 1)
accfun = lambda op, y: np.mean(np.equal(predsfun(op), y.squeeze())) * 100

# Initial point
x0 = deepcopy(model.state_dict())

# Number of epochs to train for.
# Choose a large value since LB training needs higher values.
# Changed from 150 to 30.
nb_epochs = 30
batch_range = [25, 40, 50, 64, 80, 128, 256, 512, 625, 1024, 1250, 1750,
               2048, 2500, 3125, 4096, 4500, 5000]
# parametric plot (i.e., don't train the network if set to True)
hotstart = False

if not hotstart:
    for batch_size in batch_range:
        optimizer = torch.optim.Adam(model.parameters())
        model.load_state_dict(x0)
        #model.to(device)
        average_loss_over_epoch = '-'
        print('Optimizing the network with batch size %d' % batch_size)
        np.random.seed(1337)  # So that both networks see the same sequence of batches
        for e in range(nb_epochs):
            model.eval()
            print('Epoch:', e, ' of ', nb_epochs, 'Average loss:', average_loss_over_epoch)
            average_loss_over_epoch = 0
            # Checkpoint the model every epoch
            torch.save(model.state_dict(), "./models/DeepNetC2BatchSize" + str(batch_size) + ".pth")
            array = np.random.permutation(range(X_train.shape[0]))
            slices = X_train.shape[0] // batch_size
            beginning = 0
            end = 1
            # Training loop!
            for _ in range(slices):
                start_index = batch_size * beginning
                end_index = batch_size * end
                smpl = array[start_index:end_index]
                model.train()
                optimizer.zero_grad()
                ops = opfun(X_train[smpl])
                tgts = Variable(torch.from_numpy(y_train[smpl]).long().squeeze())
                loss_fn = F.nll_loss(ops, tgts)
                average_loss_over_epoch += loss_fn.data.numpy() / (X_train.shape[0] // batch_size)
                loss_fn.backward()
                optimizer.step()
                beginning += 1
                end += 1

grid_size = 18  # How many points of interpolation between [0, 5000]
data_for_plotting = np.zeros((grid_size, 3))  # Uncomment this line if running entire code from scratch
sharpnesses1eNeg3 = []
sharpnesses5eNeg4 = []
#data_for_plotting = np.load("DeepNetCIFAR10-intermediate-values.npy") #Uncomment this line to use an existing NumPy array
print(data_for_plotting)
i = 0

# Fill in test accuracy values for `grid_size' points in the interpolation
for batch_size in batch_range:
    mydict = {}
    batchmodel = torch.load("./models/DeepNetC2BatchSize" + str(batch_size) + ".pth")
    for key, value in batchmodel.items():
        mydict[key] = value
    model.load_state_dict(mydict)
    j = 0
    for datatype in [(X_train, y_train), (X_test, y_test)]:
        dataX = datatype[0]
        datay = datatype[1]
        for smpl in np.split(np.random.permutation(range(dataX.shape[0])), 10):
            ops = opfun(dataX[smpl])
            tgts = Variable(torch.from_numpy(datay[smpl]).long().squeeze())
            var = F.nll_loss(ops, tgts).data.numpy() / 10
            if j == 1:
                data_for_plotting[i, j-1] += accfun(ops, datay[smpl]) / 10.
        j += 1
    print(data_for_plotting[i])
    np.save('DeepNetCIFAR10-intermediate-values', data_for_plotting)
    i += 1
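One thing I am not sure about: the accuracy loop above never calls model.eval(), so Dropout and BatchNorm still behave as in training while I measure accuracy. In case it is related, this is the evaluation pattern I would expect (a minimal sketch reusing opfun and accfun from above; torch.no_grad() assumes PyTorch >= 0.4):

# Sketch: measure test accuracy in inference mode.
# `model`, `opfun`, `accfun`, `X_test`, `y_test` are as defined above.
model.eval()              # put Dropout/BatchNorm into eval behavior
with torch.no_grad():     # skip autograd bookkeeping during evaluation
    acc = 0.
    for smpl in np.split(np.random.permutation(range(X_test.shape[0])), 10):
        acc += accfun(opfun(X_test[smpl]), y_test[smpl]) / 10.
print('Test accuracy: %.2f%%' % acc)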
Here is the model code, including the forward pass:
import torch
import torch.nn as nn
F = nn.functional
__all__ = ['cifar10_deep', 'cifar100_deep']


class VGG(nn.Module):

    def __init__(self, num_classes=10):
        super(VGG, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Dropout(0.3),
            nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(64, 128, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(128, 128, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(128, 256, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(256, 512, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(512, 512, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(512, 512, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(512, 512, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(512, 512, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(512, 512, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(512, 512, bias=False),
            nn.Dropout(0.5),
            nn.BatchNorm1d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(-1, 512)
        x = self.classifier(x)
        return F.log_softmax(x)


def cifar10_deep(**kwargs):
    num_classes = kwargs.get('num_classes', 10)  # kwargs is a dict, so .get, not getattr
    return VGG(num_classes)


def cifar100_deep(**kwargs):
    num_classes = kwargs.get('num_classes', 100)
    return VGG(num_classes)
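Since I suspect the forward pass, this is the shape trace I am using to check where the activations stop matching the 512-dim input the classifier expects (a minimal sketch; model and X_test as defined above, and it assumes a PyTorch version with torch.no_grad()):

# Sketch: print the activation shape after every layer of model.features
# for one tiny CIFAR-10 batch, to spot where the spatial size collapses.
model.eval()
with torch.no_grad():
    x = torch.from_numpy(X_test[:4])  # tiny batch, shape (4, 3, 32, 32)
    for idx, layer in enumerate(model.features):
        x = layer(x)
        print(idx, type(layer).__name__, tuple(x.shape))
    print('flattened:', tuple(x.view(x.size(0), -1).shape))  # should be (4, 512)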