Implement a 2D Convolutional Filter
Below is a supplemental written solution that utilizes our interview framework and has been checked for completeness and accuracy. Check out the mock video and read through the written solution to evaluate how you would structure your answer.
Introduction
In this mock video, Angie prompts Vikram (ex-Meta) to "implement a 2D convolutional filter."
Step 1: Understand the problem
A 2D convolutional filter involves sliding a smaller 2D kernel matrix over a larger 2D data matrix. This process takes the following steps:
- Place the kernel aligned with the top left-hand corner of the data matrix (without overflow).
- Compute the element-wise products of the overlapping data matrix values and the kernel matrix values.
- Sum these products and store the result at position (0, 0) of the output matrix.
- Slide the kernel one step to the right and compute another product sum; when a row is exhausted, move the kernel down one step and continue.
Several nuances and edge cases (e.g., padding and strides) must be considered for a fully functioning convolutional filter implementation. Due to time constraints, we’ll ignore these for this simpler implementation.
For simplicity, we’ll assume that the data matrix and the kernel matrix contain integer values (positive, negative, and zero). We’ll also assume that the kernel is a k x k matrix, where k << m, n (m and n are the number of rows and columns of the data matrix).
We’ll perform this coding in plain Python rather than NumPy or framework libraries, since their convenience methods typically conceal the intricacies of these computations.
Function signature:
- Input: an m x n 2D data matrix of integers, and a k x k 2D kernel matrix of integers
- Output: a 2D matrix of size (m - k + 1) x (n - k + 1)
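As a quick sanity check on the output size, here is a small sketch (the helper name output_shape is ours, not part of the solution):

```python
def output_shape(m, n, k):
    # "Valid" convolution (no padding, stride 1): the k x k kernel fits in
    # (m - k + 1) positions vertically and (n - k + 1) horizontally.
    return (m - k + 1, n - k + 1)

print(output_shape(3, 3, 2))  # (2, 2), matching the example below
```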
We'll pause here to check in with the interviewer regarding any course corrections or adjustments.
Let’s look at this example to check we are on the same page:
Data:
[ [1, -1, 0],
  [-3, 0, 2],
  [8, 9, 1] ]
Kernel:
[ [1, -1],
  [-1, 1] ]
Expected output:
At cell (0, 0) → 1*1 + (-1)*(-1) + (-3)*(-1) + 0*1 = 5
At cell (0, 1) → (-1)*1 + 0*(-1) + 0*(-1) + 2*1 = 1
At cell (1, 0) → (-3)*1 + 0*(-1) + 8*(-1) + 9*1 = -2
At cell (1, 1) → 0*1 + 2*(-1) + 9*(-1) + 1*1 = -10
[ [5, 1],
  [-2, -10] ]
Step 2: Discuss the approach
In simple Python, the high-level approach entails the following:
- Scan through the positions of the output cells in the range(0, m - k + 1) and range(0, n - k + 1)
- For each position, carefully identify the correspondence between the data matrix cells and the kernel matrix cells
- Compute the product and sum
- Capture into the output grid
- Time complexity: O((m - k + 1)(n - k + 1)k²). For each overlay position of the kernel matrix, we perform k² multiplications and k² - 1 additions.
- Space complexity: O(1) beyond the output matrix; no auxiliary data structures are used.
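The multiplication count can be checked concretely with a small sketch (the helper name mul_count is ours, for illustration only):

```python
def mul_count(m, n, k):
    # Multiplications for a stride-1, no-padding convolution:
    # one k x k patch of products per output cell.
    return (m - k + 1) * (n - k + 1) * k * k

# For the 3x3 data / 2x2 kernel example: 4 output cells, 4 products each.
print(mul_count(3, 3, 2))  # 16
```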
However, if we write it using PyTorch libraries, we would:
- Read the input and capture its shape
- Read the weight and capture the shape
- Use PyTorch nn.Conv2d with the input and weight, optionally specifying bias, stride, and padding
Step 3: Implement the algorithm
Implementation in simple Python:
def conv2d(data, kernel):
    m, n = len(data), len(data[0])
    k = len(kernel)
    # Assume the input is valid; otherwise, validate with asserts here.
    res = []
    for i in range(m - k + 1):
        row = []
        for j in range(n - k + 1):
            val = 0
            for p in range(k):
                for q in range(k):
                    val += data[i + p][j + q] * kernel[p][q]
            row.append(val)
        res.append(row)
    return res
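Although strides are out of scope for the interview solution above, the same loop structure extends to them with only a change to the range steps. A hedged sketch (conv2d_strided and its stride parameter are our additions, not part of the original solution):

```python
def conv2d_strided(data, kernel, stride=1):
    # Same valid (no-padding) convolution as above, but the kernel
    # advances `stride` cells at a time in both directions.
    m, n = len(data), len(data[0])
    k = len(kernel)
    res = []
    for i in range(0, m - k + 1, stride):
        row = []
        for j in range(0, n - k + 1, stride):
            val = sum(data[i + p][j + q] * kernel[p][q]
                      for p in range(k) for q in range(k))
            row.append(val)
        res.append(row)
    return res

data = [[1, -1, 0], [-3, 0, 2], [8, 9, 1]]
kernel = [[1, -1], [-1, 1]]
print(conv2d_strided(data, kernel))            # [[5, 1], [-2, -10]]
print(conv2d_strided(data, kernel, stride=2))  # [[5]] - only the top-left overlay fits
```

With stride=1 this reduces exactly to the solution above, which makes it easy to cross-check.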
Implementation in PyTorch:
# prompt: write a function that computes the 2D convolution using PyTorch
import torch
import torch.nn as nn

def conv2d(input, weight, bias=None, stride=1, padding=0):
    """
    Computes the 2D convolution of an input tensor with a weight tensor.

    Args:
        input: A 4D tensor of shape (batch_size, in_channels, in_height, in_width).
        weight: A 4D tensor of shape (out_channels, in_channels, kernel_height, kernel_width).
        bias: An optional 1D tensor of shape (out_channels,) containing the bias values.
        stride: The stride of the convolution.
        padding: The padding of the convolution.

    Returns:
        A 4D tensor of shape (batch_size, out_channels, out_height, out_width)
        containing the convolution result.
    """
    # Get the dimensions of the input and weight tensors.
    batch_size, in_channels, in_height, in_width = input.shape
    out_channels, _, kernel_height, kernel_width = weight.shape
    # Create a 2D convolution layer. Note that nn.Conv2d takes bias as a
    # boolean flag, not a tensor.
    conv_layer = nn.Conv2d(in_channels, out_channels,
                           kernel_size=(kernel_height, kernel_width),
                           bias=bias is not None, stride=stride, padding=padding)
    # Initialize the convolution layer with the weight and bias tensors.
    conv_layer.weight = nn.Parameter(weight)
    if bias is not None:
        conv_layer.bias = nn.Parameter(bias)
    # Perform the convolution.
    return conv_layer(input)
Step 4: Test code and discuss results
We’ll use assert statements to ensure that the actual outputs match the expected hand-computed outputs. It will be helpful to show the output for easily computable inputs and/or kernels (e.g. integer valued inputs, sparse and zero kernels).
Similarly, checking the outputs for strides greater than 1 will be helpful.
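As one such stride check, a small sketch using torch.nn.functional.conv2d as a reference (this uses the library directly, independent of the implementations above):

```python
import torch
import torch.nn.functional as F

# The 3x3 data and 2x2 kernel from the worked example, reshaped to
# (batch, channels, height, width) as conv2d expects.
inp = torch.tensor([[1., -1., 0.], [-3., 0., 2.], [8., 9., 1.]]).reshape(1, 1, 3, 3)
w = torch.tensor([[1., -1.], [-1., 1.]]).reshape(1, 1, 2, 2)

# With stride=2, only the top-left kernel placement fits, so the output
# is 1x1 and equals the hand-computed value at cell (0, 0).
out = F.conv2d(inp, w, stride=2)
assert out.shape == (1, 1, 1, 1)
assert out.item() == 5.0
```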
For the Python implementation of 2D convolution operation:
assert conv2d([[1, -1, 0], [-3, 0, 2], [8, 9, 1]], [[1, -1], [-1, 1]]) == [[5, 1], [-2, -10]]
For the PyTorch implementation with batches, channels, etc:
# prompt: write a test function and 2 tests to test the above conv2d function
import unittest

class Conv2dTest(unittest.TestCase):
    def test_simple(self):
        input = torch.tensor([[1, -1, 0], [-3, 0, 2], [8, 9, 1]]).float()
        # 1 batch, 1 channel
        input = input.unsqueeze(0).unsqueeze(0)
        print('input shape: ', input.shape)
        # kernel
        weight = torch.tensor([[1, -1], [-1, 1]]).float()
        # 1 out_channel, 1 in_channel
        weight = weight.unsqueeze(0).unsqueeze(0)
        print('weight shape: ', weight.shape)
        output = conv2d(input, weight)
        print(output.shape)
        print(output)
        self.assertEqual(output.shape, (1, 1, 2, 2))
        self.assertTrue(torch.equal(output, torch.tensor([[[[5., 1.], [-2., -10.]]]])))

if __name__ == '__main__':
    unittest.main(argv=['first-arg-is-ignored'], exit=False)