Deep learning models use neural networks, which contain neurons arranged in multiple layers. Each neuron is connected to the neurons of the next layer, and every connection carries a weight. Optimizers play a major role in deep learning, as they reduce the loss value by managing these weights. Different optimizers, such as SGD and Adam, are used in deep learning to improve the accuracy of the model.

**What is Adam Optimizer**

**Adam** or the **Adaptive Moment Estimation** optimizer adapts the learning rate for each parameter of the model. Neural networks use a feedforward pass to produce predictions and backpropagation to improve their performance. At the start of training, the weights are initialized randomly, and the model updates or fine-tunes them while backpropagating. This is done over multiple iterations while training the model.
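Under the hood, Adam keeps running estimates of the mean and the squared magnitude of each weight's gradient and uses them to scale every update. The following is a minimal plain-Python sketch of a single update for one scalar weight; the function name is made up for illustration, and the default values follow the standard Adam formulation rather than PyTorch internals:

```python
# Sketch of one Adam update for a single scalar weight (illustrative,
# not PyTorch's implementation). Defaults follow the usual Adam paper values.
def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad       # first moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (v_hat ** 0.5 + eps)  # scaled parameter update
    return w, m, v

# One step on weight 1.0 with gradient 0.5, starting from zero moments:
w, m, v = adam_step(1.0, 0.5, 0.0, 0.0, t=1)
print(round(w, 3))  # 0.999
```

Because the update divides by the (bias-corrected) gradient magnitude, the first step is roughly `lr` in size regardless of how large the raw gradient is.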

**Syntax**

To use the Adam optimizer in PyTorch, call it with the following syntax:

`torch.optim.Adam(params, lr=<value>, betas=<value>, eps=<value>)`

- Use the **optim** package from the **torch** library to call the **Adam()** method with its arguments.
- The **params** argument passes the model parameters, such as weights, that the optimizer updates during training.
- The **learning rate (lr)** sets the size of the steps taken to improve the accuracy.
- The **betas** argument sets the decay rates for the optimizer's running averages of past gradients, which let it remember its previous movements.
- The **eps** argument is a small stability value added to the denominator to avoid division by zero.
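As a quick illustration of this syntax, the snippet below creates an Adam optimizer for a small layer. The layer and its sizes are made up for the example, and the `betas` and `eps` values shown are the library defaults, written out explicitly to match the syntax above:

```python
import torch

# Illustrative layer: it only exists to give the optimizer parameters to manage.
layer = torch.nn.Linear(4, 2)

# Adam with explicit arguments; these values are the PyTorch defaults.
optimizer = torch.optim.Adam(layer.parameters(), lr=0.001,
                             betas=(0.9, 0.999), eps=1e-08)

# The optimizer records its hyperparameters and exposes them for inspection.
print(optimizer.defaults['lr'])     # 0.001
print(optimizer.defaults['betas'])  # (0.9, 0.999)
```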

**How to Use Adam Optimizer in PyTorch**

Use the Adam optimizer in neural networks to manage the parameters and learning rate of the model. First, build the structure of the model by defining the dimensions of the neural network. After that, use a dataset to train the model and an optimizer to enhance its performance throughout the training phase. To learn the process of using the Adam optimizer in deep learning, go through the following steps:

**Step 1: Importing Torch Library**

The torch library contains multiple packages and methods to build and optimize deep learning models in Python:

`import torch`

**Step 2: Setting up the Model’s Dimensions**

The next step is to set the dimensions for the multiple layers of the neural network and the neurons they contain. These dimensions are required when building the network's structure and define how the input flows through the model to produce the final output:

```
batch = 128
input_dim = 2000
hidden_dim = 200
output_dim = 20
```

- The **batch** variable sets the batch size, meaning each training step processes 128 samples at a time.
- The **input_dim** variable holds the value 2000, the dimension of the input layer.
- The **hidden_dim** variable holds the value 200, the dimension of the hidden layer.
- The **output_dim** variable sets 20 dimensions at the output layer, which produces the predictions as the output.

**Step 3: Building the Dataset**

After setting up the model’s dimensions, create tensors of random values to build the dataset using the following code. The input tensor holds the training samples, and the output tensor holds the target values the model learns to predict:

```
input = torch.randn(batch, input_dim)
output = torch.randn(batch, output_dim)
```

- The **input** variable stores a tensor of random values with the **batch** and **input** dimensions.
- The **output** variable stores a second tensor of random values with the **batch** and **output** dimensions.
- Both tensors draw their random numbers from a normal distribution with **mean** = 0 and **variance** = 1.
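A quick sanity check confirms the shapes and the approximate statistics of the two tensors. The seed below is an addition, used only so the sketch is reproducible:

```python
import torch

torch.manual_seed(0)  # assumption: seeded only for reproducibility of the sketch
batch, input_dim, output_dim = 128, 2000, 20

input = torch.randn(batch, input_dim)
output = torch.randn(batch, output_dim)

# Shapes follow the (batch, dimension) layout described above.
print(tuple(input.shape))   # (128, 2000)
print(tuple(output.shape))  # (128, 20)

# With 256,000 samples, the empirical mean and std are close to 0 and 1.
print(round(input.mean().item(), 2), round(input.std().item(), 2))
```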

**Step 4: Building the Neural Network Model**

Now, build the structure of the neural network using the model functions, the layers, and the activation functions. This defines how the layers and their neurons transform the input into the output values:

```
model = torch.nn.Sequential(
    torch.nn.Linear(input_dim, hidden_dim),
    torch.nn.ReLU(),
    torch.nn.Linear(hidden_dim, output_dim),
)
```

- Design a **neural network** structure stored in the **model** variable using the **Sequential()** method.
- The **Sequential()** container runs its layers in order, passing each layer's output to the next.
- The first **Linear()** layer takes the **input** and produces the values fed to the hidden layer.
- The **ReLU()** method applies a non-linear **activation** function to the output of the hidden layer.
- The second **Linear()** layer, created with the hidden and output dimensions, takes the hidden layer's output and produces the final output.
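To confirm that the layers line up, a single forward pass with a random batch can be pushed through the same structure. This check is an addition to the walkthrough, shown purely for illustration:

```python
import torch

batch, input_dim, hidden_dim, output_dim = 128, 2000, 200, 20

# Same structure as in the step above.
model = torch.nn.Sequential(
    torch.nn.Linear(input_dim, hidden_dim),
    torch.nn.ReLU(),
    torch.nn.Linear(hidden_dim, output_dim),
)

# One forward pass: 2000 input features in, 20 predictions out per sample.
x = torch.randn(batch, input_dim)
y = model(x)
print(tuple(y.shape))  # (128, 20)
```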

**Step 5: Calling the Adam Optimizer**

Create the loss function and the optimizer to improve the performance of the model in the training process:

```
loss_fn = torch.nn.MSELoss(reduction='sum')
optim = torch.optim.Adam(model.parameters(), lr=0.05)
```

- Before the **Adam** optimizer, call **MSELoss** (or any other loss function offered by the **torch** library) with `reduction='sum'` to get the summed squared error of the predictions.
- Create the **optim** variable to store the optimizer method with its arguments.
- The **parameters()** method passes the model's learnable parameters, such as weights and biases, for the optimizer to fine-tune.
- The **learning rate** controls the size of the optimization steps taken during backpropagation.

**Step 6: Model’s Training**

Now, head into the model's training phase, which runs multiple iterations so the model can improve its accuracy. Repeated iterations let the model evaluate its performance and move toward the best possible outcome:

```
for epoch in range(1):
    running_loss = 0.0
    for i in range(batch):
        optim.zero_grad()
        pred = model(input)
        loss = loss_fn(pred, output)
        loss.backward()
        optim.step()
        running_loss += loss.item()
        print('[%d, %5d] loss: %.3f' %
              (epoch + 1, i + 1, running_loss))
        running_loss = 0.0
print('Finished Training')
```

- Use a **for** loop over any number of epochs, set the starting loss value to 0, and add a **nested for** loop over the batch.
- Integrate all the components created earlier to apply **backpropagation** after getting the predictions at each step.
- Pass the predicted and target values to the loss_fn() method to get the loss value, then apply the **optimizer** step.
- At the end, print the loss value for each step of the epoch.
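The same loop can be watched shrinking the loss on a scaled-down version of the setup. The smaller dimensions and the seed below are assumptions chosen only so the sketch runs quickly and reproducibly:

```python
import torch

torch.manual_seed(0)  # assumption: seeded for reproducibility

# Scaled-down stand-ins for the dataset and model dimensions above.
x = torch.randn(32, 10)
y = torch.randn(32, 2)

model = torch.nn.Sequential(
    torch.nn.Linear(10, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 2),
)
loss_fn = torch.nn.MSELoss(reduction='sum')
optim = torch.optim.Adam(model.parameters(), lr=0.05)

losses = []
for step in range(100):
    optim.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optim.step()
    losses.append(loss.item())

# After repeated Adam steps, the loss should be lower than where it started.
print(losses[-1] < losses[0])
```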

The loss values printed by the above snippet show that the model improves itself with each step. The main idea of using an optimizer in machine learning is to get closer to the optimal result after each iteration. The torch library offers many optimizers, but the most widely used are SGD and Adam. Let’s look at some of the differences between them when optimizing deep learning models:

**Difference Between Adam and SGD Optimizer**

There are several differences between the SGD and Adam optimizers when training deep learning models, as mentioned below:

- The SGD optimizer keeps a single learning rate (optionally with momentum) for all parameters, while Adam keeps changing the effective learning rate throughout the iterations.
- SGD can overshoot or oscillate around the minimum, which is where the Adam optimizer provides better optimization.
- The Adam optimizer extends the SGD algorithm by dynamically adjusting the learning rate for each weight.
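Both optimizers are created the same way in PyTorch; the sketch below (the model and values are illustrative) shows SGD with its single learning rate and momentum alongside Adam, whose betas drive the per-weight adaptive steps:

```python
import torch

model = torch.nn.Linear(4, 2)  # illustrative model to supply parameters

# SGD: one global learning rate, optionally combined with momentum.
sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Adam: per-parameter adaptive steps driven by the betas moment estimates.
adam = torch.optim.Adam(model.parameters(), lr=0.01)

print(sgd.defaults['momentum'])  # 0.9
print(adam.defaults['betas'])    # (0.9, 0.999)
```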

That’s all about using the Adam optimizer in the PyTorch framework.

**Conclusion**

To use the Adam optimizer in PyTorch, build a neural network using the torch library and call the Adam() optimizer to enhance its predictions. Optimizers improve the performance of a deep learning model while applying backpropagation. This guide built a Linear() neural network structure, generated random data to train the model, then called the Adam() method as the optimizer during the training process and printed the loss value for each step of the epochs.