Coursera

Ungraded Lab: The Hello World of Deep Learning with Neural Networks

As with any first app, you should start with something super simple that shows the overall scaffolding for how your code works. When creating neural networks, one simple case is a network that learns the relationship between two numbers. So, for example, if you were writing code for a function like this, you would already know the ‘rules’:

def hw_function(x):
    y = (2 * x) - 1
    return y

So how would you train a neural network to do the equivalent task? By using data! By feeding it with a set of x’s and y’s, it should be able to figure out the relationship between them.

This is obviously a very different paradigm from what you might be used to. So let’s step through it piece by piece.

Imports

Let’s start with the imports. Here, you are importing TensorFlow and calling it tf, both by convention and for ease of use.

You then import a library called numpy which helps to represent data as arrays easily and to optimize numerical operations.

The framework you will use to build a neural network as a sequence of layers is called keras so you will import that too.

import tensorflow as tf
import numpy as np
from tensorflow import keras

print(tf.__version__)

Define and Compile the Neural Network

Next, you will create the simplest possible neural network. It has 1 layer with 1 neuron, and the input shape to it is just 1 value. You will build this model using Keras’ Sequential class which allows you to define the network as a sequence of layers. You can use a single Dense layer to build this simple network as shown below.

# Build a simple Sequential model
model = keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
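To make the "1 layer with 1 neuron" idea concrete, here is a minimal plain-Python sketch of what such a layer computes. It assumes the layer has no activation function (the Dense default), and the helper name single_neuron is hypothetical, used only for illustration. It is not how Keras implements the layer internally.

```python
# Conceptual sketch only: a hypothetical helper, not Keras code.
def single_neuron(x, weight, bias):
    """Output of one neuron with one input and no activation function."""
    return weight * x + bias

# With weight=2 and bias=-1 the neuron reproduces hw_function exactly.
print(single_neuron(10.0, weight=2.0, bias=-1.0))
```

Training the network amounts to finding values for these two numbers, the weight and the bias.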

Now, you will compile the neural network. When you do so, you have to specify 2 functions: a loss and an optimizer.

If you’ve seen lots of math for machine learning, here’s where it’s usually used. In this case, though, it’s nicely encapsulated in functions and classes for you. So what actually happens here?

You know that in the function declared at the start of this notebook, the relationship between the numbers is y=2x-1. When the computer is trying to ‘learn’ that, it makes a guess… maybe y=10x+10. The loss function compares the guessed answers against the known correct answers and measures how well or how badly the guess did.
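As a rough illustration of what a loss function does, the sketch below computes the mean squared error of the guess y=10x+10 by hand. It uses the six training points that appear later in this notebook. The function name mean_squared_error is just an illustrative helper written in plain Python, not the Keras implementation.

```python
# The six (x, y) training points used later in this notebook.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

def mean_squared_error(weight, bias):
    """Average of the squared differences between guesses and correct answers."""
    squared_errors = [((weight * x + bias) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / len(squared_errors)

bad_guess_loss = mean_squared_error(10.0, 10.0)  # the guess y = 10x + 10
true_rule_loss = mean_squared_error(2.0, -1.0)   # the real rule y = 2x - 1
print(bad_guess_loss, true_rule_loss)
```

The bad guess produces a large loss, while the true rule produces a loss of zero.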

It then uses the optimizer function to make another guess. Based on how the loss function went, it will try to minimize the loss. At that point maybe it will come up with something like y=5x+5, which, while still pretty bad, is closer to the correct result (i.e. the loss is lower).
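The "make another guess" step can be sketched as a single gradient descent update. This is a simplified plain-Python illustration, not the Keras optimizer itself. The learning rate of 0.01 and the helper names are assumptions made for the example.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

def mse(w, b):
    """Mean squared error of the guess y = w*x + b on the training points."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradients(w, b):
    """Slope of the loss with respect to the weight and the bias."""
    n = len(xs)
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2.0 * sum(errors) / n
    return grad_w, grad_b

learning_rate = 0.01  # assumed step size for this sketch
w, b = 10.0, 10.0     # the first guess: y = 10x + 10

loss_before = mse(w, b)
grad_w, grad_b = gradients(w, b)
w_new = w - learning_rate * grad_w  # step downhill on the loss
b_new = b - learning_rate * grad_b
loss_after = mse(w_new, b_new)

print(loss_before, loss_after)
print(mse(5.0, 5.0))  # loss of the improved guess y = 5x + 5
```

Each update nudges the weight and bias in the direction that lowers the loss, so repeated updates move the guess toward y=2x-1.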

It will repeat this for the number of epochs which you will see shortly. But first, here’s how you will tell it to use mean squared error for the loss and stochastic gradient descent for the optimizer. You don’t need to understand the math for these yet, but you can see that they work!

Over time, you will learn the different and appropriate loss and optimizer functions for different scenarios.

# Compile the model
model.compile(optimizer='sgd', loss='mean_squared_error')

Providing the Data

Next up, you will feed in some data. In this case, you are taking 6 X’s and 6 Y’s. You can see that the relationship between these is y=2x-1, so where x=-1, y=-3, and so on.

The de facto standard way of declaring model inputs and outputs is to use numpy, a Python library that provides lots of array type data structures. You can specify these values by building numpy arrays with np.array().

# Declare model inputs and outputs for training
xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)
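If you want to confirm that these values really follow y=2x-1, a quick plain-Python check against hw_function from the start of this notebook might look like the following. Plain lists are used here instead of numpy arrays only to keep the sketch dependency-free.

```python
def hw_function(x):
    y = (2 * x) - 1
    return y

xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

# Compare each known y against what the rule produces for its x.
matches = [hw_function(x) == y for x, y in zip(xs, ys)]
print(all(matches))
```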

Training the Neural Network

Training the neural network, where it ‘learns’ the relationship between the x’s and y’s, happens in the model.fit() call. This is where it goes through the loop described above: making a guess, measuring how good or bad it is (aka the loss), using the optimizer to make another guess, and so on. It will do this for the number of epochs you specify. When you run this code, you’ll see the loss on the right-hand side.

# Train the model
model.fit(xs, ys, epochs=500)
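To get a feel for what a loop like this does, here is a simplified plain-Python sketch of 500 rounds of guess, measure, and adjust. It is an approximation for illustration, not what model.fit() literally runs. The starting values of zero, the learning rate of 0.01, and the function name train are all assumptions.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

def train(epochs, learning_rate=0.01):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # assumed starting guess
    n = len(xs)
    losses = []
    for _ in range(epochs):
        # Measure how far the current guess is from the correct answers.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / n)
        # Adjust the weight and bias in the direction that lowers the loss.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, losses

w, b, losses = train(epochs=500)
for epoch in range(0, 500, 100):
    print(f"epoch {epoch + 1}: loss = {losses[epoch]:.6f}")
print(f"learned weight = {w:.4f}, learned bias = {b:.4f}")
```

The loss shrinks as the epochs go by, and the learned weight and bias end up close to 2 and -1.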

Ok, now you have a model that has been trained to learn the relationship between x and y. You can use the model.predict() method to have it figure out the y for a previously unknown x. So, for example, if x=10, what do you think y will be? Take a guess before you run this code:

# Make a prediction
print(model.predict(np.array([10.0])))

You might have thought 19, right? But it ended up being a little under. Why do you think that is?

Remember that a neural network learns an estimate of the relationship, not the exact rule. Given the data that we fed the model, training settled on a weight and bias that are very close to, but not exactly, the 2 and -1 in y=2x-1. With only 6 data points and a limited number of training epochs, it can’t pin the relationship down for sure. As a result, the result for 10 is very close to 19, but not necessarily 19.
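One way to see why the result lands near 19 rather than exactly on it is to watch how the prediction changes with the amount of training. The sketch below is a plain-Python stand-in for the model, not Keras itself. The zero starting values, the 0.01 learning rate, and the helper name predict_after are assumptions for illustration.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

def predict_after(epochs, x_new, learning_rate=0.01):
    """Train y = w*x + b for the given number of epochs, then predict x_new."""
    w, b = 0.0, 0.0  # assumed starting guess
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * 2.0 * sum(errors) / n
    return w * x_new + b

# The gap to 19 shrinks as training continues.
predictions = {epochs: predict_after(epochs, 10.0) for epochs in (50, 500, 5000)}
for epochs, prediction in predictions.items():
    print(f"{epochs} epochs: prediction = {prediction:.4f}")
```

In this sketch, the estimate after 500 epochs is already close to 19, and further training narrows the remaining gap.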

As you work with neural networks, you’ll see this pattern recurring. You will almost always deal with probabilities, not certainties, and will do a little bit of coding to figure out what the result is based on the probabilities, particularly when it comes to classification.