{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Guide to Unsupervised Pretraining\n", "\n", "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/masterfulai/masterful-docs/blob/main/notebooks/guide_pretraining.ipynb)        \n", "[![Download](images/download.png)](https://masterful-public.s3.us-west-1.amazonaws.com/933013963/latest/guide_pretraining.ipynb)[Download this Notebook](https://masterful-public.s3.us-west-1.amazonaws.com/933013963/latest/guide_pretraining.ipynb)\n", "\n", "In this guide, you'll learn how to train a backbone without using labels. At the end of the guide, you'll build a supervised k-nearest-neighbors (KNN) classifier head on top of this backbone. The KNN classifier only demonstrates that the backbone has learned useful representations of the data; for the recommended applications of a backbone (e.g. a linear or MLP head for classification, a feature pyramid network for detection, a U-Net for segmentation), see the \"Training with a backbone\" guide." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites\n", "\n", "Before running this guide, follow the [Masterful installation instructions](../tutorials/tutorial_installation.md)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports\n", "\n", "Import libraries and register Masterful." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import tensorflow as tf\n", "import masterful\n", "\n", "masterful = masterful.register()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this guide you'll be using the CIFAR10 dataset, which consists of 60,000 color images in 10 separate, non-overlapping classes of objects. There are 6,000 images of each class: 5,000 in the training set and 1,000 held out for testing."
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Load CIFAR10 as NumPy arrays and wrap each split in a tf.data.Dataset.\n", "(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()\n", "training_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))\n", "validation_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a [DataParams](../api/api_data.rst#masterful.data.DataParams) instance for each dataset, capturing CIFAR10's relevant metadata:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "training_dataset_params = masterful.data.learn_data_params(\n", " dataset=training_dataset,\n", " task=masterful.enums.Task.CLASSIFICATION,\n", " image_range=masterful.enums.ImageRange.ZERO_255,\n", " num_classes=10,\n", " sparse_labels=False,\n", ")\n", "validation_dataset_params = masterful.data.learn_data_params(\n", " dataset=validation_dataset,\n", " task=masterful.enums.Task.CLASSIFICATION,\n", " image_range=masterful.enums.ImageRange.ZERO_255,\n", " num_classes=10,\n", " sparse_labels=False,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now set up the [OptimizationParams](../api/api_optimization.rst#masterful.optimization.OptimizationParams), which set the pretraining-related optimization hyperparameters, and create the SSL parameters, which select the pretraining algorithm. In this guide, you will use [Barlow Twins](https://arxiv.org/abs/2103.03230) to learn the self-supervised representation."
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "optimization_params = masterful.optimization.OptimizationParams(\n", " batch_size=512,\n", " epochs=3,\n", " warmup_epochs=1,\n", ")\n", "ssl_params = masterful.ssl.SemiSupervisedParams(\n", " algorithms=[\"barlow_twins\"],\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, create the model to pretrain. This guide uses the convolutional base of the simple CNN from [this TensorFlow tutorial](https://www.tensorflow.org/tutorials/images/cnn) on labeled training:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Convolutional base only: there is no Flatten/Dense head, because only the backbone is pretrained.\n", "model = tf.keras.models.Sequential()\n", "model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))\n", "model.add(tf.keras.layers.MaxPooling2D((2, 2)))\n", "model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu'))\n", "model.add(tf.keras.layers.MaxPooling2D((2, 2)))\n", "model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu'))\n", "\n", "model_params = masterful.architecture.learn_architecture_params(\n", " model=model,\n", " task=masterful.enums.Task.CLASSIFICATION,\n", " input_range=masterful.enums.ImageRange.ZERO_255,\n", " num_classes=10,\n", " prediction_logits=True,\n", " backbone_only=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you're ready to pretrain! 
Pretraining with Masterful will return a training report, including fields like\n", "- `loss`: the unsupervised loss at the end of pretraining.\n", "- `accuracy`: the accuracy achieved by performing K-Nearest Neighbors classification on the pretrained backbone's output features\n", "\n", "(Note that the labels in the dataset passed to the `training_dataset` parameter below are only used for KNN classification, _not_ for pretraining the backbone.)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/3\n", "98/98 [==============================] - 128s 1s/step - loss: 997.4935\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Feature extracting: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 98/98 [00:02<00:00, 34.59it/s]\n", "Test Epoch: Acc@1:42.27%: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 14.59it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "kNN Test Accuracy at epoch 0: 42.27000045776367 Max Accuracy so far: 42.27000045776367\n", "Epoch 2/3\n", "98/98 [==============================] - 126s 1s/step - loss: 796.1797\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Feature extracting: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 98/98 [00:02<00:00, 35.67it/s]\n", "Test Epoch: Acc@1:44.84%: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 14.96it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "kNN Test Accuracy at epoch 1: 
44.84000015258789 Max Accuracy so far: 44.84000015258789\n", "Epoch 3/3\n", "98/98 [==============================] - 125s 1s/step - loss: 715.0130\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Feature extracting: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 98/98 [00:02<00:00, 36.52it/s]\n", "Test Epoch: Acc@1:45.50%: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 15.09it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "kNN Test Accuracy at epoch 2: 45.5 Max Accuracy so far: 45.5\n", "KnnEvaluator: Restoring model weights from epoch 3 with accuracy 45.5.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "training_report = masterful.ssl.learn_representation(\n", " model=model,\n", " model_params=model_params,\n", " optimization_params=optimization_params,\n", " ssl_params=ssl_params,\n", " training_dataset=training_dataset,\n", " training_dataset_params=training_dataset_params,\n", " validation_dataset=validation_dataset,\n", " validation_dataset_params=validation_dataset_params,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The pretraining policy above only runs for a few epochs, to save time; you should expect better results with more epochs (and quicker results for a smaller model). 
However, even without the use of labels, the pretrained model's output features already give a KNN accuracy well above the 10% chance level for 10 balanced classes:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Final pretraining loss: 715.0130004882812\n", "Final pretraining KNN accuracy: 45.5\n" ] } ], "source": [ "loss = training_report.validation_results['loss']\n", "acc = training_report.validation_results['accuracy']\n", "print(f'Final pretraining loss: {loss}')\n", "print(f'Final pretraining KNN accuracy: {acc}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you're ready to use Masterful's unsupervised pretraining API!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" } }, "nbformat": 4, "nbformat_minor": 4 }