{ "cells": [ { "cell_type": "markdown", "source": [ "# Guide to Ensembling (Work In Progress)\n", "\n", "In this guide, you'll learn how to create ensemble model. For a more conceptual discussion, see the concepts documents. First, import the necessary libraries." ], "metadata": {} }, { "cell_type": "code", "execution_count": 1, "source": [ "import tensorflow as tf\n", "import tensorflow_datasets as tfds\n", "import tensorflow_addons as tfa\n", "\n", "import sys\n", "sys.path.append('../../../')\n", "\n", "import masterful" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "Let's prepare our dataset. We'll use a subset of Imagenet called Imagenette, with some minimal preprocessing. In a real production environment, we would follow the [dataset performance guide](https://www.tensorflow.org/guide/data_performance). " ], "metadata": {} }, { "cell_type": "code", "execution_count": 8, "source": [ "BATCHSIZE = 128\n", "\n", "train, val = tfds.load('imagenette/160px', split=['train', 'validation'], as_supervised=True, shuffle_files=True)\n", "\n", "train = train.map(lambda image, label: (float(image) / 127.5 - 1.0, tf.one_hot(label, 10)), num_parallel_calls=tf.data.AUTOTUNE)\n", "val = val.map(lambda image, label: (float(image) / 127.5 - 1.0, tf.one_hot(label, 10)), num_parallel_calls=tf.data.AUTOTUNE)\n", "\n", "train = train.map(lambda image, label: (tf.image.resize(image, (160,160)), label), num_parallel_calls=tf.data.AUTOTUNE)\n", "val = val.map(lambda image, label: (tf.image.resize(image, (160,160)), label), num_parallel_calls=tf.data.AUTOTUNE)\n", "\n", "train.cache()\n", "val.cache()\n", "\n", "train = train.shuffle(1000)\n", "\n", "train = train.prefetch(tf.data.AUTOTUNE)\n", "val = val.prefetch(tf.data.AUTOTUNE)" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "Now, let's train a model on a simple dataset. We will use a very small model for demonstration purposes. " ], "metadata": {} }, { "cell_type": "code", "execution_count": 9, "source": [ "def get_model():\n", " backbone = tf.keras.applications.EfficientNetB0(include_top=False, weights=None, input_shape=(160,160,3))\n", " retval = tf.keras.models.Sequential()\n", " retval.add(backbone)\n", " retval.add(tf.keras.layers.GlobalAveragePooling2D())\n", " retval.add(tf.keras.layers.Dense(10, activation='softmax'))\n", "\n", " retval.compile(tfa.optimizers.LAMB(tf.sqrt(float(BATCHSIZE)) / tf.sqrt(2.) / 32000), 'categorical_crossentropy', 'acc')\n", " return retval\n", "\n", "model = get_model()\n", "model.summary()" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential_3\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "efficientnetb0 (Functional) (None, 5, 5, 1280) 4049571 \n", "_________________________________________________________________\n", "global_average_pooling2d_1 ( (None, 1280) 0 \n", "_________________________________________________________________\n", "dense_2 (Dense) (None, 10) 12810 \n", "=================================================================\n", "Total params: 4,062,381\n", "Trainable params: 4,020,358\n", "Non-trainable params: 42,023\n", "_________________________________________________________________\n" ] } ], "metadata": {} }, { "cell_type": "code", "execution_count": 10, "source": [ "model.fit(x=train.batch(BATCHSIZE), validation_data=val.batch(BATCHSIZE), epochs=20)" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "101/101 [==============================] - 45s 289ms/step - loss: 2.2532 - acc: 0.1597 - val_loss: 2.3500 - val_acc: 0.1000\n", "Epoch 2/20\n", "101/101 [==============================] - 28s 274ms/step - loss: 1.8171 - acc: 0.3709 - val_loss: 3.0967 - val_acc: 0.1000\n", "Epoch 3/20\n", "101/101 [==============================] - 28s 273ms/step - loss: 1.4765 - acc: 0.5042 - val_loss: 4.2789 - val_acc: 0.1000\n", "Epoch 4/20\n", "101/101 [==============================] - 28s 273ms/step - loss: 1.2076 - acc: 0.6028 - val_loss: 4.3226 - val_acc: 0.1000\n", "Epoch 5/20\n", "101/101 [==============================] - 28s 274ms/step - loss: 1.0039 - acc: 0.6675 - val_loss: 3.5492 - val_acc: 0.1000\n", "Epoch 6/20\n", "101/101 [==============================] - 28s 276ms/step - loss: 0.8081 - acc: 0.7363 - val_loss: 2.9440 - val_acc: 0.1720\n", "Epoch 7/20\n", "101/101 [==============================] - 28s 275ms/step - loss: 0.6417 - acc: 0.7908 - val_loss: 1.3759 - val_acc: 0.5700\n", "Epoch 8/20\n", "101/101 [==============================] - 28s 274ms/step - loss: 0.4930 - acc: 0.8363 - val_loss: 1.0968 - val_acc: 0.7080\n", "Epoch 9/20\n", "101/101 [==============================] - 28s 275ms/step - loss: 0.3975 - acc: 0.8708 - val_loss: 1.1089 - val_acc: 0.7120\n", "Epoch 10/20\n", "101/101 [==============================] - 28s 274ms/step - loss: 0.3300 - acc: 0.8945 - val_loss: 1.5058 - val_acc: 0.6700\n", "Epoch 11/20\n", " 61/101 [=================>............] - ETA: 10s - loss: 0.2752 - acc: 0.9121" ] } ], "metadata": {} }, { "cell_type": "markdown", "source": [ "(TODO: Access masterful.core.ensemble)" ], "metadata": {} } ], "metadata": { "interpreter": { "hash": "e11de040a44de2599d5826916dec5532a989d7fc6a7daf05571191351ea2bbfc" }, "kernelspec": { "name": "python3", "display_name": "Python 3.6.9 64-bit ('tf24': venv)" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 2 }