Training a deep neural network can take hours or even days. Retraining such a network every time you want to make predictions is not practical. Instead, you can save the trained model and reload it later.
You can save a model once training has finished, or save checkpoints at regular intervals during training. We will cover both techniques in this article.
Let's start by importing the required libraries, loading the data, and then defining, compiling, and training our model.
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
import tensorflow as tf
# load the dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train,
                                                      test_size=0.25, random_state=42)
# rescale and reshape the data
X_train, X_valid, X_test = X_train / 255.0, X_valid / 255.0, X_test / 255.0
X_train = X_train.reshape(-1, 28 * 28)
X_valid = X_valid.reshape(-1, 28 * 28)
X_test = X_test.reshape(-1, 28 * 28)
# build the model
model = Sequential([
    Flatten(input_shape=[784]),
    Dense(200, activation="relu"),
    Dense(100, activation="relu"),
    Dense(10, activation="softmax"),
])
# compile the model
model.compile(
    optimizer="sgd",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# train the model
model.fit(X_train, y_train,
          validation_data=(X_valid, y_valid),
          epochs=5)
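As you can see, we are using the Fashion MNIST dataset, which contains 28×28 grayscale images of clothing items in 10 classes. We hold out 25% of the training set for validation and train for 5 epochs.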
output:
Epoch 1/5
1407/1407 [==============================] - 7s 5ms/step - loss: 0.7755 - accuracy: 0.7419 - val_loss: 0.6479 - val_accuracy: 0.7638
...
Epoch 5/5
1407/1407 [==============================] - 5s 3ms/step - loss: 0.4134 - accuracy: 0.8554 - val_loss: 0.4151 - val_accuracy: 0.8535
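To save the model, we can use either the model.save() method or the tf.keras.models.save_model() function.
There are two formats to choose from: the TensorFlow SavedModel format, which model.save() uses here when the path has no extension, and the HDF5 format.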
# Save the model as a SavedModel
model.save("my_model")
This will create a new folder named my_model containing the model architecture, weights, and training configuration.
You can save the model in the HDF5 format by simply adding .h5 at the end of the filename.
# save in the H5 format
model.save("my_model.h5")
Now to reload our model, we can use the tf.keras.models.load_model() function:
my_model = tf.keras.models.load_model("my_model")
Let's evaluate the model:
loss, accuracy = my_model.evaluate(X_test, y_test)
print(f"accuracy: {accuracy * 100:.2f}%")
output:
313/313 [==============================] - 1s 2ms/step - loss: 0.4435 - accuracy: 0.8404
accuracy: 84.04%
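A quick way to convince yourself that a reloaded model is the same as the one you saved is to compare the predictions of both on the same inputs. The sketch below does this with a small stand-in model and random data rather than the Fashion MNIST model above. The names demo_model, X_demo, and demo_model.h5 are made up for illustration, and the HDF5 format is used here only because a single file is easy to clean up.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Random stand-in inputs (hypothetical data, not Fashion MNIST)
rng = np.random.default_rng(0)
X_demo = rng.random((32, 784)).astype("float32")

# Small stand-in model with the same input and output sizes as in the article
demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
demo_model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")

# Save to a temporary HDF5 file and reload it right away
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.h5")
    demo_model.save(path)
    reloaded = tf.keras.models.load_model(path)

# Both models should produce the same predictions
preds_before = demo_model.predict(X_demo, verbose=0)
preds_after = reloaded.predict(X_demo, verbose=0)
same_predictions = np.allclose(preds_before, preds_after, atol=1e-6)
print("predictions match:", same_predictions)
```

The same check should apply to a model saved as a SavedModel folder; only the path passed to save() and load_model() would change.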
In some cases, you may want to save only the model's weights, for example if you only need the model for inference or you want to do transfer learning.
In this case, you can use the model.save_weights() method:
model.save_weights("weights") # or model.save_weights("weights.h5") to save in the HDF5 format
And then you can create a new model with the same architecture as the previous one and load its weights:
from keras import Input
from keras import Model
# The two models need to share the same architecture
# I created a new model using the functional API
inputs = Input(shape=(784,))
x = Dense(200, activation="relu")(inputs)
x = Dense(100, activation="relu")(x)
outputs = Dense(10, activation="softmax")(x)
new_model = Model(inputs=inputs, outputs=outputs)
# compile the new model
new_model.compile(
    optimizer="sgd",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# evaluate the new model before loading the weights
loss, accuracy = new_model.evaluate(X_test, y_test)
print(f"accuracy of new model before loading weights: {accuracy * 100:.2f}%")
output:
313/313 [==============================] - 1s 2ms/step - loss: 2.3907 - accuracy: 0.0725
accuracy of new model before loading weights: 7.25%
Since the new model has randomly initialized weights, its accuracy is very low, close to the 10% you would expect from guessing at random among 10 classes.
Let's now load the weights from the previous model:
# load the weights from the previous model
new_model.load_weights("weights")
# re-evaluate the new model
loss, accuracy = new_model.evaluate(X_test, y_test)
print(f"accuracy of new model after loading weights: {accuracy * 100:.2f}%")
output:
313/313 [==============================] - 1s 2ms/step - loss: 0.4435 - accuracy: 0.8404
accuracy of new model after loading weights: 84.04%
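Instead of comparing accuracies, you can also compare the weight arrays directly. The following is a minimal sketch under a few assumptions: build_demo_model() and weights_equal() are hypothetical helpers, the network is a smaller stand-in for the one above, and the weights are written to a temporary file named demo.weights.h5.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_demo_model():
    """Build a small functional model (hypothetical stand-in architecture)."""
    inputs = tf.keras.Input(shape=(784,))
    x = tf.keras.layers.Dense(16, activation="relu")(inputs)
    outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
    return tf.keras.Model(inputs=inputs, outputs=outputs)


def weights_equal(model_a, model_b):
    """Return True if every weight array of the two models is identical."""
    return all(
        np.array_equal(wa, wb)
        for wa, wb in zip(model_a.get_weights(), model_b.get_weights())
    )


# Two models with the same architecture but different random initializations
source_model = build_demo_model()
target_model = build_demo_model()
equal_before = weights_equal(source_model, target_model)

# Save the weights of the first model and load them into the second one
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.weights.h5")
    source_model.save_weights(path)
    target_model.load_weights(path)

equal_after = weights_equal(source_model, target_model)
print("equal before loading:", equal_before)
print("equal after loading:", equal_after)
```

Note that saving and loading weights does not require the models to be compiled; compiling is only needed before calling fit() or evaluate().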
In the previous section, we saw how to save a model after training. But what if you want to save your model during training, for example after each epoch, so that you do not lose your progress if your computer crashes?
In this case, you can use the ModelCheckpoint callback. By default, this callback saves a checkpoint of the model at the end of each epoch; you can change this behavior with the save_freq argument.
from keras.callbacks import ModelCheckpoint
cp_checkpoint = ModelCheckpoint("my_model", verbose=1)
model.fit(X_train, y_train,
          validation_data=(X_valid, y_valid),
          epochs=5,
          callbacks=[cp_checkpoint])
# later ...
my_model = tf.keras.models.load_model("my_model")
The callback also lets you keep only the model that has achieved the best performance so far by setting save_best_only=True. By default, "best" means the lowest validation loss (monitor="val_loss").
cp_checkpoint = ModelCheckpoint("my_model", save_best_only=True, verbose=1)
model.fit(X_train, y_train,
          validation_data=(X_valid, y_valid),
          epochs=5,
          callbacks=[cp_checkpoint])
# later...
# load the best model
my_model = tf.keras.models.load_model("my_model")
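If you would rather judge the best model by another metric, the callback's monitor and mode arguments let you choose it. The sketch below tracks validation accuracy instead of validation loss. It is only an illustration under stated assumptions: the data is random, the model is a small stand-in, the names best_checkpoint and best.weights.h5 are made up, and only the weights are saved to keep the example light.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Random stand-in data (hypothetical, not Fashion MNIST)
rng = np.random.default_rng(0)
X_demo = rng.random((64, 784)).astype("float32")
y_demo = rng.integers(0, 10, size=64)

demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
demo_model.compile(
    optimizer="sgd",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    best_path = os.path.join(tmp_dir, "best.weights.h5")
    # Overwrite the checkpoint only when the validation accuracy improves
    best_checkpoint = tf.keras.callbacks.ModelCheckpoint(
        best_path,
        monitor="val_accuracy",
        mode="max",
        save_best_only=True,
        save_weights_only=True,
        verbose=0,
    )
    history = demo_model.fit(
        X_demo, y_demo,
        validation_split=0.25,
        epochs=3,
        callbacks=[best_checkpoint],
        verbose=0,
    )
    checkpoint_written = os.path.exists(best_path)

print("checkpoint written:", checkpoint_written)
print("validation accuracy per epoch:", history.history["val_accuracy"])
```

The monitored name has to match a key in the training logs, so "val_accuracy" is only available when the model is compiled with the accuracy metric and validation data is provided.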
Another option the callback provides is to save only the model's weights:
cp_checkpoint = ModelCheckpoint("weights",
                                save_weights_only=True,
                                verbose=1)
model.fit(X_train, y_train,
          validation_data=(X_valid, y_valid),
          epochs=5,
          callbacks=[cp_checkpoint])
And then we can create a new model and load the weights:
# Now create a new model
inputs = Input(shape=(784,))
x = Dense(200, activation="relu")(inputs)
x = Dense(100, activation="relu")(x)
outputs = Dense(10, activation="softmax")(x)
new_model = Model(inputs=inputs, outputs=outputs)
# compile the new model
new_model.compile(
    optimizer="sgd",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# and load the weights
new_model.load_weights("weights")
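With a fixed path such as "weights", each checkpoint overwrites the previous one. If you want to keep one file per epoch, the filepath passed to ModelCheckpoint can contain formatting placeholders such as {epoch:02d}, which the callback fills in when it saves. Here is a minimal sketch, assuming random stand-in data, a small stand-in model, and a made-up file name pattern:

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Random stand-in data (hypothetical, not Fashion MNIST)
rng = np.random.default_rng(0)
X_demo = rng.random((64, 784)).astype("float32")
y_demo = rng.integers(0, 10, size=64)

demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
demo_model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")

with tempfile.TemporaryDirectory() as tmp_dir:
    # {epoch:02d} is replaced by the epoch number, starting at 01
    template = os.path.join(tmp_dir, "weights_epoch_{epoch:02d}.weights.h5")
    per_epoch_checkpoint = tf.keras.callbacks.ModelCheckpoint(
        template,
        save_weights_only=True,
        verbose=0,
    )
    demo_model.fit(
        X_demo, y_demo,
        epochs=3,
        callbacks=[per_epoch_checkpoint],
        verbose=0,
    )
    saved_files = sorted(os.listdir(tmp_dir))

print(saved_files)
```

Keeping every epoch uses more disk space, so this pattern is mostly useful when you want to go back and compare or resume from a specific epoch.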
In this article, you learned how to save your models and how to save only the model's weights.
Basically, there are two ways to save a model (or only the model's weights): after training, with model.save() (or model.save_weights()), or during training, with the ModelCheckpoint callback.
I hope this has given you an overall idea of how to save your trained models.
The final code used in this tutorial is available on GitHub in my repository.
You can also directly run the code on Google Colab.