Lab 1. PyTorch and ANNs

Deadline: Friday May 14th, 11:59pm.

Total: 30 Points

Late Penalty: There is a penalty-free grace period of one hour past the deadline. Any work that is submitted between 1 hour and 24 hours past the deadline will receive a 20% grade deduction. No other late work is accepted. Quercus submission time will be used, not your local computer time. You can submit your labs as many times as you want before the deadline, so please submit often and early.

Grading TA: Justin Beland, Ali Khodadadi

This lab is based on assignments developed by Jonathan Rose, Harris Chan, Lisa Zhang, and Sinisa Colic.

This lab is a warm-up to get you used to the PyTorch programming environment used in the course, and to help you review and renew your knowledge of Python and the relevant Python libraries. The lab must be done individually. Please recall that the University of Toronto plagiarism rules apply.

By the end of this lab, you should be able to:

  1. Perform basic PyTorch tensor operations.
  2. Load data into PyTorch.
  3. Configure an Artificial Neural Network (ANN) using PyTorch.
  4. Train ANNs using PyTorch.
  5. Evaluate different ANN configurations.

You will need to use the NumPy and PyTorch documentation for this assignment.

You can also reference Python API documentations freely.

What to submit

Submit a PDF file containing all your code, outputs, and write-up from Parts 1-5. You can produce a PDF of your Google Colab file by going to File -> Print and then saving as PDF. The Colab instructions have more information.

Do not submit any other files produced by your code.

Include a link to your Colab file in your submission.

Please use Google Colab to complete this assignment. If you want to use Jupyter Notebook, please complete the assignment and upload your Jupyter Notebook file to Google Colab for submission.

Adjust the scaling to ensure that the text is not cut off at the margins.

Make sure to include a link to your Colab file here.

Colab Link: https://drive.google.com/file/d/14lKBku4IenpGADHgFtxEef8082Y_ijnj/view?usp=sharing

Part 1. Python Basics [3 pt]

The purpose of this section is to get you used to the basics of Python, including working with functions, numbers, lists, and strings.

Note that we will be checking your code for clarity and efficiency.

If you have trouble with this part of the assignment, please review http://cs231n.github.io/python-numpy-tutorial/

Part (a) -- 1pt

Write a function sum_of_cubes that computes the sum of cubes up to n, i.e. 1^3 + 2^3 + ... + n^3. If the input to sum_of_cubes is invalid (e.g. a negative or non-integer n), the function should print out "Invalid input" and return -1.
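A minimal sketch of one possible implementation (the exact validity checks are an assumption about what counts as invalid input):

```python
def sum_of_cubes(n):
    """Return 1**3 + 2**3 + ... + n**3, or -1 if the input is invalid."""
    # Reject non-integers (bool is a subclass of int, so exclude it explicitly)
    # and negative values.
    if not isinstance(n, int) or isinstance(n, bool) or n < 0:
        print("Invalid input")
        return -1
    return sum(i ** 3 for i in range(1, n + 1))
```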

Part (b) -- 1pt

Write a function word_lengths that takes a sentence (string), computes the length of each word in that sentence, and returns the length of each word in a list. You can assume that words are always separated by a space character " ".

Hint: recall the str.split function in Python. If you are not sure how this function works, try typing help(str.split) into a Python shell, or check out https://docs.python.org/3.6/library/stdtypes.html#str.split
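A minimal sketch, assuming words are separated by single spaces as stated above:

```python
def word_lengths(sentence):
    """Return a list containing the length of each word in the sentence."""
    return [len(word) for word in sentence.split(" ")]

# Example: word_lengths("hello to you") returns [5, 2, 3]
```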

Part (c) -- 1pt

Write a function all_same_length that takes a sentence (string), and checks whether every word in the string is the same length. You should call the function word_lengths in the body of this new function.
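A minimal sketch that reuses word_lengths as required:

```python
def all_same_length(sentence):
    """Return True if every word in the sentence has the same length."""
    lengths = word_lengths(sentence)
    # A set keeps only unique values, so one unique length means all words match.
    return len(set(lengths)) <= 1

# Example: all_same_length("cat dog fox") returns True
```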

Part 2. NumPy Exercises [5 pt]

In this part of the assignment, you'll be manipulating arrays using NumPy. Normally, we use the shorter name np to represent the package numpy.

Part (a) -- 1pt

The below variables matrix and vector are numpy arrays. Explain what you think <NumpyArray>.size and <NumpyArray>.shape represent.

Answer:

<NumpyArray>.size represents the number of elements in the <NumpyArray>.

<NumpyArray>.shape represents the dimensions of the <NumpyArray>.
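For example, with a hypothetical 3x2 matrix and a length-2 vector (the actual arrays defined in the notebook may differ):

```python
import numpy as np

matrix = np.array([[1., 2.], [3., 4.], [5., 6.]])  # hypothetical 3x2 array
vector = np.array([10., 20.])                       # hypothetical length-2 array

print(matrix.size)    # 6      -> total number of elements
print(matrix.shape)   # (3, 2) -> length along each dimension
print(vector.size)    # 2
print(vector.shape)   # (2,)
```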

Part (b) -- 1pt

Perform matrix multiplication output = matrix x vector by using for loops to iterate through the columns and rows. Do not use any built-in NumPy functions. Cast your output into a NumPy array, if it isn't one already.

Hint: be mindful of the dimension of output.
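A sketch of one loop-based approach, assuming matrix has shape (m, n) and vector has shape (n,) (as with the hypothetical arrays in part (a)):

```python
output = []
for row in matrix:                 # one pass per row of the matrix
    total = 0
    for j in range(len(vector)):   # accumulate across the columns
        total += row[j] * vector[j]
    output.append(total)
output = np.array(output)          # cast the Python list to a NumPy array
```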

Part (c) -- 1pt

Perform matrix multiplication output2 = matrix x vector by using the function numpy.dot.

We will never actually write code as in part (b): not only is numpy.dot more concise and easier to read/write, it is also much faster, since it is written in C and highly optimized. In general, we will avoid for loops in our code.
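A minimal sketch (the @ operator calls the same routine):

```python
output2 = np.dot(matrix, vector)
# Equivalently: output2 = matrix @ vector
```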

Part (d) -- 1pt

As a way to test for consistency, show that the two outputs match.
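One way to check, allowing for tiny floating-point differences:

```python
print(np.allclose(output, output2))     # True if the results match within tolerance
print(np.array_equal(output, output2))  # True only for an exact element-wise match
```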

Part (e) -- 1pt

Show that using np.dot is faster than using your code from part (b).

You may find the below code snippet helpful:
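The snippet itself is not reproduced here; a sketch of one way to time the two approaches with Python's time module (the repetition count of 1000 is an arbitrary choice to make the timings measurable):

```python
import time

n_repeats = 1000   # arbitrary; a single run is too fast to time reliably

# Time the loop-based implementation from part (b).
start = time.time()
for _ in range(n_repeats):
    out_loop = []
    for row in matrix:
        total = 0
        for j in range(len(vector)):
            total += row[j] * vector[j]
        out_loop.append(total)
loop_seconds = time.time() - start

# Time numpy.dot from part (c) over the same number of repetitions.
start = time.time()
for _ in range(n_repeats):
    out_dot = np.dot(matrix, vector)
dot_seconds = time.time() - start

print("loops: ", loop_seconds, "s")
print("np.dot:", dot_seconds, "s")
```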

Part 3. Images [6 pt]

A picture or image can be represented as a NumPy array of “pixels”, with dimensions H × W × C, where H is the height of the image, W is the width of the image, and C is the number of colour channels. Typically we will use an image whose channels give the Red, Green, and Blue “level” of each pixel, which is referred to with the short form RGB.

You will write Python code to load an image, and perform several array manipulations to the image and visualize their effects.

Part (a) -- 1 pt

This is a photograph of a dog whose name is Mochi.


Load the image from its URL (https://drive.google.com/uc?export=view&id=1oaLVR2hr1_qzpKQ47i9rVUIklwbDcews) into the variable img using the plt.imread function.

Hint: You can enter the URL directly into the plt.imread function as a Python string.
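A minimal sketch:

```python
import matplotlib.pyplot as plt

# plt.imread accepts the URL directly, as noted in the hint above.
img = plt.imread("https://drive.google.com/uc?export=view&id=1oaLVR2hr1_qzpKQ47i9rVUIklwbDcews")
```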

Part (b) -- 1pt

Use the function plt.imshow to visualize img.

This function will also show the coordinate system used to identify pixels. The origin is at the top left corner; the first dimension indicates the Y (row) direction, and the second dimension indicates the X (column) direction.

Part (c) -- 2pt

Modify the image by adding a constant value of 0.25 to each pixel in img and store the result in the variable img_add. Note that, since the pixel values need to lie in the range [0, 1], you will also need to clip img_add to the range [0, 1] using numpy.clip. Clipping sets any value that is outside of the desired range to the closest endpoint. Display the image using plt.imshow.
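A sketch, assuming img holds floating-point pixel values in [0, 1] (as plt.imread returns for PNG images):

```python
img_add = np.clip(img + 0.25, 0.0, 1.0)   # add the constant, then clip back into [0, 1]
plt.imshow(img_add)
```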

Part (d) -- 2pt

Crop the original image (the img variable) to a 130 x 150 image that includes Mochi's face. Discard the alpha colour channel (i.e. the resulting img_cropped should only have RGB channels).

Display the image.
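A sketch with hypothetical row and column offsets; the exact slice bounds depend on where Mochi's face sits in the image:

```python
# Rows 20:170 and columns 40:170 are hypothetical offsets giving a crop with
# 150 rows and 130 columns (matching the img_torch shape reported in Part 4);
# keeping only the first three channels drops the alpha channel.
img_cropped = img[20:170, 40:170, :3]
plt.imshow(img_cropped)
```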

Part 4. Basics of PyTorch [6 pt]

PyTorch is a Python-based neural networks package. Along with TensorFlow, PyTorch is currently one of the most popular machine learning libraries.

PyTorch, at its core, is similar to NumPy in the sense that both make it easier to write code for scientific computing and achieve better performance than vanilla Python by leveraging a highly optimized C back-end. However, compared to NumPy, PyTorch offers much better GPU support and provides many high-level features for machine learning. Technically, NumPy can be used to do almost everything PyTorch does, but it would be a lot slower, especially on a CUDA GPU, and writing machine-learning code would take more effort than with PyTorch.

Part (a) -- 1 pt

Use the function torch.from_numpy to convert the numpy array img_cropped into a PyTorch tensor. Save the result in a variable called img_torch.
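A minimal sketch:

```python
import torch

img_torch = torch.from_numpy(img_cropped)   # the tensor shares memory with the NumPy array
```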

Part (b) -- 1pt

Use the attribute <Tensor>.shape to find the shape (dimension and size) of img_torch.

Part (c) -- 1pt

How many floating-point numbers are stored in the tensor img_torch?
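One way to check this programmatically; the count equals the product of the shape entries (150 × 130 × 3 = 58,500 for the crop above):

```python
print(img_torch.numel())   # total number of elements stored in the tensor
print(150 * 130 * 3)       # the same value, computed from the shape
```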

Part (d) -- 1 pt

What does the code img_torch.transpose(0,2) do? What does the expression return? Is the original variable img_torch updated? Explain.

Answer:

img_torch.transpose(0,2) computes the transpose of the img_torch tensor by swapping the input dimensions 0 and 2. This can be observed through differences in the shape of input and output tensors. The shape of img_torch is [150, 130, 3] and the shape of img_torch.transpose(0,2) is [3, 130, 150], where dimensions 0 and 2 are indeed swapped after the transpose.

The expression returns the transposed version of img_torch as a new tensor.

The original variable img_torch is not modified by this expression, but the tensor returned by img_torch.transpose(0,2) shares its underlying storage with img_torch. This means that if the contents of the transposed tensor are later changed, the contents of img_torch change accordingly.
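A short demonstration of these points (the printed shapes assume the crop from Part 3):

```python
t = img_torch.transpose(0, 2)
print(img_torch.shape)   # torch.Size([150, 130, 3]) -- the original is unchanged
print(t.shape)           # torch.Size([3, 130, 150])
print(t.data_ptr() == img_torch.data_ptr())   # True: both tensors share the same storage
```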

Part (e) -- 1 pt

What does the code img_torch.unsqueeze(0) do? What does the expression return? Is the original variable img_torch updated? Explain.

Answer:

img_torch.unsqueeze(0) inserts a dimension of size 1 into the img_torch tensor at the specified position 0. This can be observed from the shapes of the input and output tensors. The shape of img_torch is [150, 130, 3] and the shape of img_torch.unsqueeze(0) is [1, 150, 130, 3], where a new dimension of size 1 is indeed inserted at the 0th index.

The expression returns a new tensor: img_torch with the extra dimension inserted.

The original variable img_torch is not modified by this expression, but the tensor returned by img_torch.unsqueeze(0) shares its underlying storage with img_torch. This means that if the contents of the unsqueezed tensor are later changed, the contents of img_torch change accordingly.
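A short demonstration, analogous to the one above:

```python
u = img_torch.unsqueeze(0)
print(img_torch.shape)   # torch.Size([150, 130, 3]) -- the original is unchanged
print(u.shape)           # torch.Size([1, 150, 130, 3])
```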

Part (f) -- 1 pt

Find the maximum value of img_torch along each colour channel. Your output should be a one-dimensional PyTorch tensor with exactly three values.

Hint: look up the function torch.max.
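A sketch of one approach: flatten the two spatial dimensions so each colour channel becomes a column, then reduce with torch.max (torch.amax(img_torch, dim=(0, 1)) is an equivalent alternative):

```python
channel_max = img_torch.reshape(-1, 3).max(dim=0).values
print(channel_max)   # a 1-D tensor with three values: the max of R, G, and B
```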

Part 5. Training an ANN [10 pt]

The sample code provided below is a 2-layer ANN trained on the MNIST dataset to identify whether a digit is less than 3 or greater than or equal to 3. Modify the code by changing any of the following and observe how the accuracy and error are affected:
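The sample code itself is not reproduced here. Purely as an illustration of the kind of model being described, the following is a minimal sketch of a 2-layer ANN for this binary task; the layer sizes, learning rate, loss, and loop structure are assumptions, not the course's actual code:

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Load MNIST; the binary label is 1 if the digit is >= 3 and 0 otherwise.
train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

# A 2-layer ANN: 784 inputs -> 30 hidden units -> 1 output logit (sizes assumed).
model = nn.Sequential(
    nn.Linear(28 * 28, 30),
    nn.ReLU(),
    nn.Linear(30, 1),
)

criterion = nn.BCEWithLogitsLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

for epoch in range(1):   # raise this to show each image more than once
    for images, digits in train_loader:
        labels = (digits >= 3).float().unsqueeze(1)       # binary targets
        logits = model(images.view(images.size(0), -1))   # flatten the 28x28 images
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```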

Part (a) -- 3 pt

Which of the above changes resulted in the best accuracy on the training data? What accuracy were you able to achieve?

Answer:

As shown in the code above, simply increasing the number of training iterations, from showing each image only once to showing it 30 times, produced the best accuracy on the training data. The resulting training accuracy is 100%.

Part (b) -- 3 pt

Which of the above changes resulted in the best accuracy on the testing data? What accuracy were you able to achieve?

Answer:

As shown in the code above, in addition to the increased number of training iterations from (a), adding an extra hidden layer and increasing the number of hidden nodes produced the best accuracy on the testing data. The resulting test accuracy is 94.4%.

Part (c) -- 4 pt

Which model hyperparameters should you use, the ones from (a) or (b)?

Answer:

While it is tempting to use the model hyperparameters from (b) since they produce a higher test accuracy, the hyperparameters from (a) should be used. The hyperparameters in (b) were tuned using the test set, which defeats the purpose of the training/testing split and effectively overfits the model selection to the test data.