PyTorch: validation loss is nan in the validation function

zynd9foi, posted on 2023-06-23 in Other

I ran a simple linear regression with PyTorch on this dataset:

DATASET_URL = "https://gist.github.com/BirajCoder/5f068dfe759c1ea6bdfce9535acdb72d/raw/c84d84e3c80f93be67f6c069cbdc0195ec36acbd/insurance.csv"

The dataframe was converted to numpy arrays:

def dataframe_to_arrays(dataframe):
    # Make a copy of the original dataframe
    dataframe1 = dataframe.copy(deep=True)
    # Convert non-numeric categorical columns to numbers
    for col in categorical_cols:
        dataframe1[col] = dataframe1[col].astype('category').cat.codes
    # Extract inputs & outputs as numpy arrays
    inputs_array = dataframe1[input_cols].to_numpy()
    targets_array = dataframe1[output_cols].to_numpy()
    return inputs_array, targets_array

inputs_array, targets_array = dataframe_to_arrays(dataframe)
inputs_array, targets_array

Then I converted them to tensors:

inputs = torch.tensor(inputs_array, dtype=torch.float32)
targets = torch.tensor(targets_array, dtype=torch.float32)

The tensors were wrapped in a dataset with dataset = TensorDataset(inputs, targets), and the dataset was then split:

from torch.utils.data import random_split
val_percent = 0.2 # between 0.1 and 0.2
val_size = int(num_rows * val_percent)
train_size = num_rows - val_size
train_ds, val_ds = random_split(dataset,[train_size,val_size])

DataLoaders were created with a batch size of 128:

train_loader = DataLoader(train_ds, batch_size, shuffle=True)
val_loader = DataLoader(val_ds, batch_size)

The model is defined as a class extending nn.Module:

class InsuranceModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)  # input_size & output_size are defined above
        
    def forward(self, xb):
        # xb = xb.view(xb.size(0), -1)
        # xb = xb.reshape(-1, 5)
        out = self.linear(xb)                          
        return out
    
    def training_step(self, batch):
        inputs, targets = batch 
        # Generate predictions
        out = self(inputs)          
        # Calculate loss
        loss = F.mse_loss(out,targets)                          
        return loss
    
    def validation_step(self, batch):
        inputs, targets = batch
        # Generate predictions
        out = self(inputs)
        # Calculate loss
        loss = F.mse_loss(out,targets)                            
        return {'val_loss': loss.detach()}
        
    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()   # Combine losses
        return {'val_loss': epoch_loss.item()}
    
    def epoch_end(self, epoch, result, num_epochs):
        # Print result every 20th epoch
        if (epoch+1) % 20 == 0 or epoch == num_epochs-1:
            print("Epoch [{}], val_loss: {:.4f}".format(epoch+1, result['val_loss']))

Now the model is created with model = InsuranceModel() and trained with:

def evaluate(model, val_loader):
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase 
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result, epochs)
        history.append(result)
    return history

This gives me val_loss: nan

epochs = 30
lr = 1e-2
history1 = fit(epochs, lr, model, train_loader, val_loader)

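The warning I get is:
UserWarning: Using a target size (torch.Size([128, 5])) that is different to the input size (torch.Size([128, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  loss = F.mse_loss(out,targets)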
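The warning is about broadcasting: the model output has shape (128, 1) while the targets have shape (128, 5), so the subtraction inside the loss stretches the single prediction across all five target columns. The sketch below only illustrates that effect with dummy tensors. The helper `checked_mse` is a hypothetical name introduced here, not part of PyTorch or of the question's code.

```python
import torch
import torch.nn.functional as F

# Dummy tensors with the shapes reported in the warning (values are made up).
out = torch.zeros(128, 1)      # one prediction per row
targets = torch.ones(128, 5)   # five target columns per row

# Broadcasting expands `out` to (128, 5): each prediction is compared
# against all five target columns instead of a single one.
diff = out - targets
print(diff.shape)  # torch.Size([128, 5])


def checked_mse(out, targets):
    """Hypothetical helper: refuse to compute MSE on mismatched shapes."""
    if out.shape != targets.shape:
        raise ValueError(
            f"shape mismatch: out {tuple(out.shape)} vs targets {tuple(targets.shape)}"
        )
    return F.mse_loss(out, targets)
```

Calling `checked_mse(out, targets)` with the tensors above raises a `ValueError`, which makes the mismatch visible before training starts. Whether the right fix is to change `output_cols` or `output_size` depends on the column lists, which the question does not show.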

Answer 1, by km0tfn4u

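The problem is that the loss grows too large, so it ends up as nan. The fix is to re-initialize the model and train again with different hyperparameters (such as the learning rate), repeating until the loss comes down to a small number.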
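To see how a loss that is "too high" turns into nan, here is a minimal sketch in plain Python (no PyTorch) of gradient descent on a one-feature linear regression. The data values, the function name `train_linear`, and the two learning rates are assumptions chosen for illustration; they are not taken from the insurance dataset. With unscaled inputs and a learning rate of 1e-2, each step overshoots, the parameters grow until they overflow to inf, and inf minus inf then yields nan. A smaller learning rate on the same data stays finite.

```python
import math


def train_linear(xs, ys, lr, steps):
    """Gradient descent on MSE for y = w * x + b; returns the final loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2.0 * sum(errs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    errs = [w * x + b - y for x, y in zip(xs, ys)]
    return sum(e * e for e in errs) / n


# Made-up, unscaled data: inputs in the tens, targets in the thousands.
xs = [20.0, 30.0, 40.0, 50.0]
ys = [250.0 * x for x in xs]

diverged = train_linear(xs, ys, lr=1e-2, steps=500)  # step size too large
stable = train_linear(xs, ys, lr=1e-4, steps=500)    # smaller step size

print(math.isnan(diverged))    # True
print(math.isfinite(stable))   # True
```

The same reasoning suggests what to try in the original `fit` call: create a fresh `InsuranceModel()` and pass a smaller `lr`, then check whether `val_loss` is still nan.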
