Undergraduates will extend their basic image loading program to load images, process them through a collection of image processing filters, and then save the results to a file.

Objectives

This assignment is designed to teach you techniques that relate to:

  • Color spaces and representations of the image range space.
  • Processing color spaces to provide adjustments common to how images are displayed.
  • Implementing these adjustments through rescaling filters.
  • Processing images within a signal processing framework.
  • Implementing convolution filters to better understand how processing local regions of data relates to signal processing.

Part 1: Modifying your image loader

Starting with your previous assignment, you should modify your code so that you can both load and also save an image. To do so, you should implement basic file I/O to write a PPM file. You should also modify your main.cpp to give the user the option to save an image to a filename that they specify (specifically, your executable should accept parameters for both a filename to read from and a filename to write to).
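Writing a plain-text (P3) PPM requires only a small amount of file I/O. As a rough sketch, assuming an image class that stores 8-bit RGB data in row-major order (the `Image` struct and its member names below are placeholders for whatever your Assignment 01 class provides):

```cpp
#include <fstream>
#include <string>
#include <vector>

// Placeholder for your Assignment 01 image class.
struct Image {
    int width = 0, height = 0;
    std::vector<unsigned char> data;  // RGB, row-major, 3 bytes per pixel
};

// Write a plain-text (P3) PPM: magic number, dimensions, max value,
// then one R G B triple per pixel.
bool writePPM(const Image& img, const std::string& filename) {
    std::ofstream out(filename);
    if (!out) return false;
    out << "P3\n" << img.width << " " << img.height << "\n255\n";
    for (int y = 0; y < img.height; ++y) {
        for (int x = 0; x < img.width; ++x) {
            const int i = 3 * (y * img.width + x);
            out << int(img.data[i]) << " "
                << int(img.data[i + 1]) << " "
                << int(img.data[i + 2]) << "\n";
        }
    }
    return bool(out);
}
```

The binary P6 variant stores the same header followed by raw bytes and produces much smaller files, but P3 is easier to debug because you can inspect it in a text editor.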

Part 2: Implementing Filters

Next, you will modify your code to support two types of image processing operations:

  • Rescaling filters, which adjust the displayed colors on a per-pixel basis. In particular, the user must be able to adjust the gain, bias, and gamma of the displayed image.

  • Convolution-based filters, which adjust the displayed color of each pixel by analyzing a local region. In particular, the user should be able to apply three filters: a box filter, a Gaussian filter, and an unsharp mask. These filters should allow the user to both smooth (via the box or Gaussian) and sharpen (via the unsharp mask) the image. The user should be able to control the extent of these filters by specifying a radius for the filter.

Both of these filters will take a collection of parameters, and the user should be able to adjust these parameters while the program is executing, to dynamically set them before applying the filter. You are encouraged to use whatever interface you like for this, but please make sure your README documents both how to run and how to use your program.

After adjusting the image using a combination of the above filters, the user should then be able to save the modified image to the aforementioned filename.

Details on Rescaling

As discussed in class, your rescaling filter should modify the resulting RGB values of the input data. It is most straightforward to think about these filters as processing data in the range \([0,1]\), so you may want to change how your image class from Assignment 01 stores the underlying data. After specifying the gain, bias, and gamma, the user should be able to scale all color channels.

The easiest method to do this is compute a scale value that you will multiply each channel with separately. This scale value should be computed based on the luminance, \(L\), of the pixel. There are a variety of equations that one could use to go from \(RGB\) to \(L\), but for this assignment we will use one of the simplest:

\[L = \frac{1.0}{61.0}(20.0R + 40.0G + B)\]

This equation uses somewhat different weights than the Y channel in YUV color or the B in HSB, but it provides a good approximation to how humans perceive intensity from color.

After computing the luminance, you can use the gain, bias, and gamma to compute an updated luminance, \(L'\) (in particular, \(L' = (\texttt{gain}*L + \texttt{bias})^\texttt{gamma}\)). You can then compute your scale value as \(\texttt{scale} = L'/L\).

After computing the scale, you can update the RGB values by multiplying each channel by it. Note, this might produce values outside of the range \([0,1]\). In these situations you should clamp your values back into the appropriate range. If you fail to clamp the data, you will produce a variety of visual artifacts (which might be fun to test with, but you will be penalized if you do not correct them!).
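The full per-pixel rescaling step can be sketched as follows, assuming channels are stored as floats in \([0,1]\). The function and parameter names are illustrative, and the guard against \(L = 0\) (which would make \(L'/L\) undefined) is one reasonable choice rather than part of the specification:

```cpp
#include <algorithm>
#include <cmath>

// Rescale one pixel: compute luminance, apply gain/bias/gamma to get L',
// scale each channel by L'/L, and clamp the results back into [0,1].
void rescalePixel(float& r, float& g, float& b,
                  float gain, float bias, float gamma) {
    const float L = (20.0f * r + 40.0f * g + b) / 61.0f;
    if (L <= 0.0f) return;  // black pixel: avoid dividing by zero
    const float Lp = std::pow(gain * L + bias, gamma);
    const float scale = Lp / L;
    r = std::clamp(r * scale, 0.0f, 1.0f);
    g = std::clamp(g * scale, 0.0f, 1.0f);
    b = std::clamp(b * scale, 0.0f, 1.0f);
}
```

Note that with gain = 1, bias = 0, and gamma = 1, the scale works out to 1 and the pixel is unchanged, which is a useful sanity check for your own implementation.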

Details on Convolution

For this family of filters, the user should be able to specify an integer radius that determines the size of the convolution kernel. In my implementation I used the convention that the kernel size was always \((2*\texttt{radius}+1) \times (2*\texttt{radius}+1)\). Thus, a radius of 1 produced a filter of size \(3\times3\), a radius of 2 corresponded to a \(5\times5\) filter, etc. Using only odd-sized filters makes coding a bit easier, since you always know precisely where the filter center is.
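Building odd-sized kernels from an integer radius might be sketched like this. The function names are illustrative, and the Gaussian's sigma choice (half the radius, floored at a small positive value) is one common heuristic rather than a requirement of the assignment:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Box kernel: a (2*radius+1) x (2*radius+1) grid of equal weights.
std::vector<float> boxKernel(int radius) {
    const int size = 2 * radius + 1;
    return std::vector<float>(size * size, 1.0f);
}

// Gaussian kernel: weights fall off with squared distance from the center.
// The sigma = radius/2 heuristic is an illustrative choice, not a spec.
std::vector<float> gaussianKernel(int radius) {
    const int size = 2 * radius + 1;
    const float sigma = std::max(radius / 2.0f, 0.5f);
    std::vector<float> k(size * size);
    for (int y = -radius; y <= radius; ++y)
        for (int x = -radius; x <= radius; ++x)
            k[(y + radius) * size + (x + radius)] =
                std::exp(-(x * x + y * y) / (2.0f * sigma * sigma));
    return k;
}
```

Neither kernel is normalized here; as described below, dividing by the sum of the weights during convolution keeps the output in a useful range.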

Your filters should be applied to each of the \(R\), \(G\), and \(B\) channels separately.

You will run into two edge cases; you must handle both and document precisely how you handled them. The first arises when you are working with a pixel near the boundary of the image: the kernel centered at that pixel will extend beyond the image extents. In these cases, you must implement a boundary condition. There were three different approaches discussed in class, and you are welcome to pick any of them.

The second condition you must deal with is how to weight the filter appropriately. For smoothing filters like the box and Gaussian, you need only divide by the sum of the kernel values to keep the range of the data within useful bounds. However, without proper weighting, the unsharp mask can produce RGB values that are outside the range \([0,1]\). The typical convention, discussed in class, is to use the same weight value that you would have used for just the smoothing filter. Even with this denominator, you will still have to clamp the RGB values back into the range \([0,1]\).
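Putting the pieces together, applying a kernel to one channel might look like the sketch below. It uses a clamp-to-edge boundary condition (one of the options discussed in class; you may choose another) and divides by the sum of the kernel weights, which is appropriate for the box and Gaussian filters. The names and single-channel layout are illustrative:

```cpp
#include <algorithm>
#include <vector>

// Convolve one channel (floats in [0,1], row-major) with an odd-sized
// kernel, using clamp-to-edge for pixels near the boundary and dividing
// by the sum of kernel weights. For the unsharp mask, you would instead
// divide by the weight of the corresponding smoothing kernel.
std::vector<float> convolve(const std::vector<float>& channel,
                            int width, int height,
                            const std::vector<float>& kernel, int radius) {
    const int size = 2 * radius + 1;
    float weight = 0.0f;
    for (float k : kernel) weight += k;
    std::vector<float> out(channel.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int ky = -radius; ky <= radius; ++ky) {
                for (int kx = -radius; kx <= radius; ++kx) {
                    // Clamp-to-edge: repeat border pixels outside the image.
                    const int sx = std::clamp(x + kx, 0, width - 1);
                    const int sy = std::clamp(y + ky, 0, height - 1);
                    sum += channel[sy * width + sx] *
                           kernel[(ky + radius) * size + (kx + radius)];
                }
            }
            out[y * width + x] = std::clamp(sum / weight, 0.0f, 1.0f);
        }
    }
    return out;
}
```

A quick sanity check: smoothing a constant image with a box filter should return the same constant, since the clamp-to-edge samples all carry the same value.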

Part 3: Written Questions

Please answer the following written questions. You are not required to typeset these questions in any particular format, but you may want to take the opportunity to include images (either photographed hand-drawings or produced using an image editing tool).

These questions are intended both to give you additional material for considering the conceptual aspects of the course and to provide sample questions in a format similar to the midterm and final exams. Most questions should be answerable in 100 words or less of text.

Please create a separate directory in your repo called written and commit all files for this part (text answers and images) to this directory.

  1. What is a pixel? How big is a pixel? Both of these questions have multiple answers, briefly explain yours.

  2. 3 × 3 convolution kernels can create a variety of effects. Consider the following three kernels. First, list the appropriate scale factor you would use for each kernel (see the instructions in the slides and the lab for a definition). Next, briefly describe the output image that is produced as a result of convolution with each kernel:

    a. \(H_a = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}\)

    b. \(H_b = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}\)

    c. \(H_c = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}\)

  3. Given an image \(I\) of \(100 \times 200\), and a kernel \(K\) of size \(7 \times 7\), how many multiplications are required to compute \(K \otimes I\)? Be sure to state your boundary condition.

  4. Draw and label a diagram of the HSV color space. Include a brief description of each variable, its role in the final color, and a possible numeric range.

  5. The simplest possible approach to tone mapping is to take the HDR input data and normalize it to produce values between \([0,1]\). What are the potential problems with using this technique?

Grading

Deductions

  • Program does not compile: -100. (The first instance across all assignments will receive a warning with a chance to resubmit, but subsequent non-compiling assignments will receive the full penalty.)
  • Program crashes due to bugs: -10 per bug, at the grader's discretion to fix.


Point Breakdown of Features

  • Consistent, modular coding style: 10
  • External documentation (README.md) and a working CMakeLists.txt: 5
  • Class documentation and internal documentation (block and inline), wherever applicable / for all files: 15
  • Expected output / behavior based on the assignment specification (50 total), including:
      - Implementing rescaling filters: 10
      - Converting the internally represented data range to a displayable representation, implementing clamping so that values do not overflow: 5
      - Correctly implementing convolution and the boundary condition: 5
      - Providing implementations of all of the required convolution kernels: 10
      - Correctly computing the weight of the kernel: 5
      - Allowing the user a mechanism to vary parameters: 10
      - Supporting writing the output filtered image as a PPM: 5
  • Written Questions: 20
  • Total: 100