Undergraduates will extend their basic image loading program to load images, process them with a set of image processing filters, and then save the filtered image to a file.

Objectives

This assignment is designed to teach techniques that relate to:

  • Color spaces and representations of the image range space.
  • Processing color spaces to provide adjustments commonly applied when images are displayed.
  • Implementing these adjustments through rescaling filters.
  • Processing images within a signal processing framework.
  • Implementing a basic resizing filter that relies on signal processing of regions of data to control for sampling artifacts.

Part 1: Preliminaries

Note that this repository does not include any default code to start with. I have distributed a few new image files to test with (although you can and should test with some of the previous images as well). I expect that you will transfer over code as needed from the previous assignment.

Your main task is to modify your code from Assignment 01 so as to support two types of image processing operations:

  • Rescaling, by adjusting the displayed colors on a per-pixel basis. In particular, the user must be able to adjust the gain, bias, and gamma of the displayed image.

  • Resizing, by producing and displaying the image at a different resolution and using signal processing concepts to reconstruct the input image and control for artifacts before saving the resulting output image.

Both of these filters take a collection of parameters, and the user should be able to adjust those parameters while the program is running, setting them dynamically before the filter is applied. You are encouraged to use whatever interface you like for this (e.g., HTML GUI widgets such as inputs), but please make sure your README documents how to use your program.

After adjusting the image using a combination of all of the above filters (i.e., both rescaling and resizing), the user should then be able to save the modified image to a file.
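For saving, you can reuse your PPM writer from the previous assignment. As a minimal sketch (the function name is illustrative, and saving from the displayed ImageData is just one possible design), an ASCII (P3) writer might look like:

```js
// Minimal sketch of an ASCII (P3) PPM writer. Assumes `imageData` is the
// RGBA ImageData currently being displayed; the alpha channel is dropped.
function toPPM(imageData) {
  const { width, height, data } = imageData;
  const lines = ['P3', `${width} ${height}`, '255'];
  for (let i = 0; i < data.length; i += 4) {
    lines.push(`${data[i]} ${data[i + 1]} ${data[i + 2]}`);
  }
  return lines.join('\n') + '\n';
}
```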

Part 2: Implementing Rescaling Filters

Your rescaling filter should modify the resulting RGB values of the input data. I find it most straightforward to think about these filters as processing data in the range \([0,1]\), so you may want to change how you store your image data from Assignment 01. In my implementation, I used a Float64Array internally for storage, and then mapped the values to \([0,255]\) only when displaying them.
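As a sketch of that scheme (the function names here are illustrative, not part of the assignment), the conversion in both directions might look like:

```js
// Internal storage: Float64Array of RGBA values in [0, 1].
// Conversion happens only at the display boundary.
function fromImageData(imageData) {
  const floats = new Float64Array(imageData.data.length);
  for (let i = 0; i < floats.length; i++) {
    floats[i] = imageData.data[i] / 255;
  }
  return floats;
}

function toImageData(floats, width, height) {
  const out = new ImageData(width, height);
  for (let i = 0; i < floats.length; i++) {
    // Assignment into a Uint8ClampedArray clamps to [0, 255] and rounds.
    out.data[i] = floats[i] * 255;
  }
  return out;
}
```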

After the user specifies values for gain, bias, and gamma, your program should rescale all color channels. To do this, you’ll have to create some mechanism to update the underlying data and then redisplay the image, which allows the user to test various combinations of these parameters. For this, I found combinations of HTML range and number inputs to be useful; see https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input for more information. I then added event handlers to reprocess the image on change.
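For example, assuming your page pairs a range input and a number input for each parameter (the element IDs and the `applyFilters` function below are placeholders for whatever your program uses), the two can be kept in sync like this:

```js
// Keep a paired range/number input in sync and re-filter on any change.
const gammaRange  = document.getElementById('gamma-range');
const gammaNumber = document.getElementById('gamma-number');
for (const input of [gammaRange, gammaNumber]) {
  input.addEventListener('input', () => {
    gammaRange.value = gammaNumber.value = input.value;
    applyFilters();  // placeholder: recompute the filtered image and redisplay
  });
}
```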

For the actual computation, you should take each image channel \(C\) and separately update it using the following expression:

\[C' = (\texttt{gain}*C + \texttt{bias})^\texttt{gamma}\]

Note that this might produce values outside of the range \([0,1]\). In these situations you should clamp your values back into the appropriate range. If you fail to clamp the data, you will produce a variety of visual artifacts (which might be fun to test with, but you will be penalized if you do not correct them!). Alternatively, you might want to consider storing values outside of the range \([0,1]\) and then using a Uint8ClampedArray (the default type for ImageData’s data property) to do the final clamping before display.
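A sketch of the per-channel computation with clamping, assuming the Float64Array RGBA storage described above, might be:

```js
// Apply C' = (gain*C + bias)^gamma to each color channel, then clamp to [0, 1].
function rescale(data, gain, bias, gamma) {
  const out = new Float64Array(data.length);
  for (let i = 0; i < data.length; i++) {
    if (i % 4 === 3) { out[i] = data[i]; continue; }  // leave alpha untouched
    // Clamp the base to 0 first: Math.pow of a negative base with a
    // non-integer exponent would produce NaN.
    const v = Math.max(0, gain * data[i] + bias);
    out[i] = Math.min(1, Math.pow(v, gamma));
  }
  return out;
}
```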

Part 3: Implementing Resizing Filters

Resizing is a bit more complicated because resizing the image on the fly requires reallocating an ImageData. I used an HTML number input for both the target width and height, so that these could be set independently.

To resize, I first load the input image from file and always maintain this original data. After the user has specified a target width and height, I next initialize a second image that I will display. I then populate the second image using the “inverse” approach described in class, computing the color for each output pixel using a reconstruction filter. The filter should be applied to each of the \(R\), \(G\), and \(B\) channels separately.
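A sketch of that outer loop (again assuming the Float64Array RGBA storage from Part 2; `reconstruct` stands in for whatever reconstruction filter you design, one possibility of which is sketched in the upsampling discussion below):

```js
// "Inverse" resize: for each output pixel, find its floating-point position
// in the source image and reconstruct a color there, channel by channel.
function resize(src, srcW, srcH, dstW, dstH) {
  const dst = new Float64Array(dstW * dstH * 4);
  for (let j = 0; j < dstH; j++) {
    for (let i = 0; i < dstW; i++) {
      // Map the output pixel center back into source coordinates.
      const x = (i + 0.5) * (srcW / dstW) - 0.5;
      const y = (j + 0.5) * (srcH / dstH) - 0.5;
      for (let c = 0; c < 4; c++) {
        dst[(j * dstW + i) * 4 + c] = reconstruct(src, srcW, srcH, x, y, c);
      }
    }
  }
  return dst;
}
```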

Any basic resizing filter will receive partial credit. To receive full points for this portion of the assignment you should creatively design a reconstruction filter so as to best remove artifacts that are created upon resizing. Specifically, there are two flavors: artifacts caused by decreasing the size of the image and artifacts caused by increasing the size of the image.

The key here is to treat the reconstruction filter as a discrete-to-continuous operation. When decreasing the size of the image, you will want to do some amount of smoothing to account for high-frequency features that are impossible to represent with fewer samples. One way to achieve this is to first smooth the image using a discrete-to-discrete convolution filter whose extent roughly matches the scale of features that can be expressed at the reduced resolution. Any of a variety of smoothing kernels (box, tent, Gaussian, etc.) should achieve reasonable results.
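For instance, a box filter over a \((2r+1)\times(2r+1)\) window, applied per channel before downsampling, is one reasonable starting point. In this sketch the radius choice is an assumption you should tune (something on the order of half the shrink factor is plausible), and `clampIndex` is the boundary-condition helper sketched at the end of this part:

```js
// Smooth one channel of src into dst with a (2r+1)x(2r+1) box kernel,
// prior to downsampling. `clampIndex` applies the boundary condition
// (see the sketch below).
function boxBlurChannel(src, dst, w, h, c, r) {
  for (let j = 0; j < h; j++) {
    for (let i = 0; i < w; i++) {
      let sum = 0;
      for (let dj = -r; dj <= r; dj++) {
        for (let di = -r; di <= r; di++) {
          const x = clampIndex(i + di, w);
          const y = clampIndex(j + dj, h);
          sum += src[(y * w + x) * 4 + c];
        }
      }
      dst[(j * w + i) * 4 + c] = sum / ((2 * r + 1) * (2 * r + 1));
    }
  }
}
```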

For increasing the size of the image, however, the key is to interpolate the data using something better than nearest-neighbor interpolation. As discussed in class, this can be achieved with a discrete-to-continuous convolution, where the filtering kernel is not a discrete array but a continuous function. Each pixel in the enlarged image can be thought of as having a floating-point position \((x,y)\) in the original image, and you can use convolution to reconstruct what \(f(x,y)\) is.
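As one concrete (but not required) choice, a tent kernel yields bilinear interpolation. A sketch of reconstructing channel `c` at a fractional \((x,y)\), again relying on the `clampIndex` boundary helper sketched below:

```js
// Bilinear (tent-kernel) reconstruction of channel c at a fractional (x, y).
function reconstruct(src, w, h, x, y, c) {
  const x0 = Math.floor(x), y0 = Math.floor(y);
  const fx = x - x0, fy = y - y0;  // fractional offsets within the pixel cell
  const sample = (xi, yi) =>
    src[(clampIndex(yi, h) * w + clampIndex(xi, w)) * 4 + c];
  // Weight the four surrounding samples by their tent-kernel values.
  return (1 - fx) * (1 - fy) * sample(x0,     y0)
       +       fx * (1 - fy) * sample(x0 + 1, y0)
       + (1 - fx) *       fy * sample(x0,     y0 + 1)
       +       fx *       fy * sample(x0 + 1, y0 + 1);
}
```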

You are encouraged to experiment with different types of filters and mechanisms for allowing the user to mitigate artifacts; for example, you may want to let the user adjust the size of the smoothing kernel and recompute the output image while the program is running. To receive full credit for artifact mitigation, you must consider both the case of enlarging and the case of shrinking the image, and document your solution to artifact removal in the README.

Since you are using convolution for this task, you will run into an implementation edge case that you must also handle, and you must document precisely how you handled it. This case occurs when you are working with a pixel near the boundary of the image: a convolution kernel centered at such a pixel will extend beyond the image extents. To address this, you must implement a boundary condition. Three different approaches were discussed in class, and you are welcome to pick any of them.
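For example, a clamp-to-edge boundary condition (one of several valid options; reflecting or wrapping indices would work equally well) is a one-liner:

```js
// Clamp-to-edge boundary condition: out-of-range indices snap to the
// nearest valid row/column, effectively extending the border pixels.
function clampIndex(i, n) {
  return Math.min(n - 1, Math.max(0, i));
}
```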

Part 4: Written Questions

Please answer the following written questions. You are not required to typeset your answers in any particular format, but you may want to take the opportunity to include images (either photographed hand drawings or images produced using an image editing tool).

These questions are intended both to give you additional material for thinking through the conceptual aspects of the course and to provide sample questions in a format similar to those on the midterm and final exams. Most questions can be answered in 100 words or less of text.

Please create a separate directory in your repo called written and post all files (text answers and any accompanying images) to this directory. Recall that the written component is due BEFORE the programming component.

  1. Briefly describe your design choice(s) for BOTH how you store the color data as well as how you maintain the data between filtering operations.

  2. In rescaling images, individually adjusting only gain or bias typically is not sufficient to improve the image. Explain why we need both. In particular, discuss the resulting effects on the image when adjusting gain vs. bias.

  3. What is a pixel? How big is a pixel? Both of these questions have multiple answers, briefly explain yours.

  4. 3 × 3 convolution kernels can create a variety of effects. Consider the following three kernels. Briefly describe the output image that is produced as a result of convolution with each kernel (you may assume each is scaled differently if necessary):

    a. \(H_a = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\)

    b. \(H_b = \frac{1}{12} \begin{bmatrix} 1 & 2 & 1 \\ 1 & 2 & 1 \\ 1 & 2 & 1 \end{bmatrix}\)

    c. \(H_c = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}\)

  5. Draw and label a diagram of the HSV color space. Include a brief description of each variable, its role in the final color, and a possible numeric range.

Grading

Deductions

  • Program crashes due to bugs: -10 per bug, at the grader's discretion to fix


Point Breakdown of Features

  • Consistent modular coding style: 10
  • External documentation (README.md): 5
  • Class documentation and internal documentation (block and inline), wherever applicable / for all files: 15
  • Expected output / behavior based on the assignment specification (70 total), including:
      • Dynamically updating the displayed image as a result of filtering it with a combination of filters: 10
      • Converting the internally represented data range to a displayable representation, implementing clamping so that values do not overflow: 10
      • Allowing the user a mechanism to vary parameters: 10
      • Correctly implementing rescaling filters to adjust gain/bias/gamma: 15
      • Correctly implementing resizing in some way: 10
      • Additional points based on how well artifacts caused by increasing the size of the image are mitigated: 5
      • Additional points based on how well artifacts caused by reducing the size of the image are mitigated: 5
      • Ensuring that the output, filtered image can be saved as a PPM: 5

Total: 100