PyTorch is, along with TensorFlow, one of the most widely used deep learning frameworks. This post describes the basic steps for training a model with PyTorch. Briefly, the steps are as follows.

  1. Load & pre-process the dataset
  2. Set a model
  3. Set an optimizer & a loss function
  4. Train the model
  5. Evaluate the model

The full code can be found at the following GitHub repository.

This post was written with reference to the following materials.


1. Load & pre-process the dataset

We used the FashionMNIST dataset for the experiment. The dataset can be downloaded through torchvision.datasets, and preprocessing can be performed through transforms.Compose.


2. Set a model

We constructed a simple convolutional neural network model.
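A model of this kind might look like the sketch below. The class name SimpleCNN, the layer sizes, and the number of layers are assumptions made for illustration; the original architecture may differ.

```python
import torch
from torch import nn


class SimpleCNN(nn.Module):
    """Illustrative two-block CNN for 1x28x28 inputs (layer sizes are assumptions)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1x28x28 -> 16x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 16x28x28 -> 16x14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # 16x14x14 -> 32x14x14
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 32x14x14 -> 32x7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)  # keep the batch dimension, flatten the rest
        return self.classifier(x)  # raw logits, one per class
```

The model returns raw logits rather than probabilities, which is the form nn.CrossEntropyLoss expects.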


3. Set an optimizer & a loss function

Next, we set the optimizer and the loss function used to train the model built in the previous step. We chose both according to the characteristics of the input data and the task the model is meant to solve. For a multi-class classification task such as FashionMNIST, for example, cross-entropy loss is a common choice.
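To make this step concrete, here is a minimal sketch that pairs nn.CrossEntropyLoss with the Adam optimizer. Both choices, the learning rate, and the small stand-in model are assumptions for illustration; the original post may have used different settings.

```python
import torch
from torch import nn, optim

# Stand-in model so the snippet runs on its own (an assumption, not the post's CNN).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# Cross-entropy is a common loss for multi-class classification.
criterion = nn.CrossEntropyLoss()

# The optimizer receives the parameters it is allowed to update.
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# Sanity check on a random batch: logits of shape (8, 10), integer class labels.
logits = model(torch.randn(8, 1, 28, 28))
labels = torch.randint(0, 10, (8,))
loss = criterion(logits, labels)  # a scalar tensor
```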


4. Train the model

Training a model consists of the following five steps, which are repeated for every batch.

  1. initialize the gradient
  2. forward propagation
  3. calculate the loss
  4. backward propagation
  5. optimize (update) the weights using the gradients computed in backward propagation
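The five steps above can be sketched as a single training function. To keep the example runnable without a download, it trains a small stand-in model on random tensors; the function name train_one_epoch, the synthetic data, and the hyperparameters are assumptions for illustration.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, criterion, optimizer, device="cpu"):
    """Run one pass over the loader and return the average loss per sample."""
    model.train()
    running_loss = 0.0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()              # 1. initialize (reset) the gradients
        outputs = model(images)            # 2. forward propagation
        loss = criterion(outputs, labels)  # 3. calculate the loss
        loss.backward()                    # 4. backward propagation
        optimizer.step()                   # 5. update the weights
        running_loss += loss.item() * images.size(0)
    return running_loss / len(loader.dataset)


# Synthetic stand-in data with FashionMNIST-like shapes (an assumption).
torch.manual_seed(0)
dataset = TensorDataset(torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,)))
loader = DataLoader(dataset, batch_size=16, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

losses = [train_one_epoch(model, loader, criterion, optimizer) for _ in range(5)]
```

Resetting the gradients at the start of each iteration matters because PyTorch accumulates gradients across backward() calls by default.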


5. Evaluate the model

We used the validation dataset to measure the performance of the trained model. The evaluation process is very similar to the training process, except that there is no backward propagation and no optimizer step, so the weights do not change. Because gradients are not needed, the evaluation loop is wrapped in torch.no_grad(), which disables gradient tracking and saves memory and computation.
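A minimal evaluation loop might look like the following sketch. The function name evaluate, the synthetic validation data, and the stand-in model are assumptions; the call to model.eval() is also an addition here, commonly used to switch layers such as dropout and batch normalization to inference behavior.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, criterion, device="cpu"):
    """Return (average loss, accuracy) over the loader without touching the weights."""
    model.eval()  # inference behavior for dropout / batch norm layers
    total_loss, correct = 0.0, 0
    with torch.no_grad():  # no gradient tracking inside this block
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            total_loss += criterion(outputs, labels).item() * images.size(0)
            correct += (outputs.argmax(dim=1) == labels).sum().item()
    n = len(loader.dataset)
    return total_loss / n, correct / n


# Synthetic stand-in validation data (an assumption, not the real dataset).
torch.manual_seed(0)
val_set = TensorDataset(torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,)))
val_loader = DataLoader(val_set, batch_size=16, shuffle=False)

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
weights_before = model[1].weight.detach().clone()
val_loss, val_acc = evaluate(model, val_loader, nn.CrossEntropyLoss())
```

Compared with the training loop, the gradient reset, backward pass, and optimizer step are all absent, which is why the weights are identical before and after the call.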