# Day 4 - PyTorch for Dummies

Soooo… day 4! Exciting!

Yesterday I didn’t do any work on **PyTorch**, so today I decided to simply translate my previous RNN (Recurrent Neural Network),
created with TensorFlow and Keras, into PyTorch code.

I thought this would be a trivial task, but it turned out to be quite challenging! I don’t know why people say that PyTorch is easier for beginners than TensorFlow; I don’t think that’s true, especially if you have no previous machine learning experience.

I spent the day re-implementing that simple TensorFlow model in PyTorch. By the end I had learned how to define new models and write training and evaluation loops. I think the biggest difference between the two frameworks is that TensorFlow abstracts away a lot of the logic involved in model training. That can be a good thing if you want to get simple things done quickly, but implementing the model in PyTorch forced me to think about and understand how my model works and which input and hidden layers I needed to use, and I actually got to design my own training loop.

## Lessons Learned

- After today I have a much better understanding of how models, training loops and evaluation loops work. The PyTorch approach gives greater flexibility to people who want to build automation around model definition and training, as the sketch below shows.
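
For example, because you own the loop, adding custom behaviour is just another line of Python. Here is a minimal sketch (purely illustrative: `model`, `loader`, `loss_fn`, `optimizer` and `epochs` are placeholder names, not part of my actual code):

```
# A hand-written training loop makes custom steps trivial to slot in.
# All names here are placeholders for illustration.
for epoch in range(epochs):
    for batch, labels in loader:
        loss = loss_fn(model(batch), labels)
        optimizer.zero_grad()
        loss.backward()
        # custom step: clip gradients before the weight update
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
```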

## TensorFlow vs PyTorch

For completeness, I thought I’d share the two implementations so you can see the difference for yourself.

**The TensorFlow way**

```
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(units=128, input_shape=[x_train.shape[1], x_train.shape[2]])
    ),
    tf.keras.layers.Dropout(.2),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(y_train.shape[1], activation='softmax'),
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=['accuracy'])
history = model.fit(x_train,
                    y_train,
                    validation_split=0.3,
                    epochs=15)
```
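
Note how much `fit` does on your behalf here: batching, shuffling, the epoch loop and the validation split all happen inside that single call. The PyTorch version below has to spell each of those steps out explicitly.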

**The PyTorch way**

```
import torch
from torch import nn
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class LstmClassification(torch.nn.Module):
    def __init__(self, lstm_size: int, num_layers: int = 1):
        super().__init__()
        self.num_layers = num_layers
        self.lstm_size = lstm_size
        self.lstm = nn.LSTM(
            input_size=5,
            hidden_size=lstm_size,
            bidirectional=True,
            batch_first=True,
            num_layers=num_layers,
        )
        hidden_size = self.lstm_size * 2  # bidirectional: forward + backward outputs
        self.sequential = nn.Sequential(
            nn.Dropout(0.2),
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 2),
        )

    def forward(self, x):
        _x = x.float()
        # initial hidden and cell states (num_layers * 2 because the LSTM is bidirectional)
        h0 = torch.zeros(self.num_layers * 2, x.size(0), self.lstm_size).to(device)
        c0 = torch.zeros(self.num_layers * 2, x.size(0), self.lstm_size).to(device)
        out, _ = self.lstm(_x, (h0, c0))
        out = out[:, -1, :]  # keep only the last timestep for classification
        out = self.sequential(out)
        return out


def train(
    train_loader: DataLoader,
    test_loader: DataLoader,
    lstm_size: int = 128,
    lstm_layers: int = 1,
    learning_rate: float = 1e-3,
    batch_size: int = 12,
    epochs: int = 1,
):
    model = LstmClassification(lstm_size=lstm_size, num_layers=lstm_layers).to(device)
    print("Using model: ", model)
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(params=model.parameters(), lr=learning_rate)
    total_steps = len(train_loader)
    for epoch in range(epochs):
        model.train()
        for i, (input_seq, target_labels) in enumerate(train_loader):
            input_seq = input_seq.to(device)
            target_labels = target_labels.to(device)
            outputs = model(input_seq)  # forward pass
            loss = loss_fn(outputs, target_labels)  # calculate loss
            optimizer.zero_grad()  # reset gradients
            loss.backward()  # calculate gradients
            optimizer.step()  # update weights
            if (i + 1) % 100 == 0:
                print(
                    f"Epoch [{epoch+1}/{epochs}], Step [{i+1}/{total_steps}], Loss: {loss.item():.4f}"
                )
    model.eval()
    with torch.no_grad():
        n_correct = 0
        n_samples = 0
        for input_seq, target_labels in test_loader:
            input_seq = input_seq.to(device)
            target_labels = target_labels.to(device)
            outputs = model(input_seq)
            _, predicted_labels = torch.max(outputs, 1)  # index of the highest score
            n_samples += target_labels.size(0)
            n_correct += (predicted_labels == target_labels).sum().item()
        acc = 100.0 * n_correct / n_samples
        print(f"Accuracy: {acc:.2f} %")
    return model
```
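
To tie it together, here is a quick sketch of how you might call `train`, assuming the data is already in NumPy arrays shaped `(n_samples, seq_len, 5)` for the sequences and `(n_samples,)` for the integer labels of the 2 classes (the array names below are placeholders, not part of the code above):

```
# Hypothetical usage sketch: x_train, y_train, x_test, y_test are placeholders.
from torch.utils.data import TensorDataset

train_ds = TensorDataset(torch.from_numpy(x_train), torch.from_numpy(y_train).long())
test_ds = TensorDataset(torch.from_numpy(x_test), torch.from_numpy(y_test).long())

model = train(
    DataLoader(train_ds, batch_size=12, shuffle=True),
    DataLoader(test_ds, batch_size=12),
    epochs=15,
)
```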