Assignment 02 - Graduate
Images
Due: Feb. 13, 2018 11:59:59 PM
Graded: Feb. 20, 2018
Percentage of Grade: 10%
Assignment Description: Finalized
- Objectives
- Part 1: Reading HDR data
- Part 2: Basic Tone Mapping
- Part 3: Tone Mapping with Bilateral Filters
- Part 4: Written Questions
- Grading
Graduate students will implement two forms of tone mapping, based on topics discussed in class, in particular the 2002 SIGGRAPH paper Fast Bilateral Filtering for the Display of High-Dynamic-Range Images by Frédo Durand and Julie Dorsey.
Tone mapping is a process by which high dynamic range (HDR) images are converted to a form in which they can be displayed. Our approach will use the filtering techniques we’ve discussed in class. HDR images store a wide range of possible values, typically five or more orders of magnitude. Converting them to a display space (which is limited to two orders of magnitude) while also preserving interesting features can be quite tricky.
Objectives
This assignment is designed to teach you techniques that relate to:
- Color spaces and representations of the image range space.
- Processing color spaces to provide adjustments common to how images are displayed.
- Implementing these adjustments through rescaling filters that convert from HDR to low dynamic range.
- Implementing convolution filters, in particular the bilateral filter, to better understand the connection between processing regions of data and how these relate to signal processing.
Part 1: Reading HDR data
Starting with your previous assignment, you should modify your code so that you can both load and save an image. To do so, you should implement basic file I/O to write a PPM file. You should also modify your main.cpp to give the user the option to save an image to a filename that they specify (specifically, your executable should accept parameters for both a filename to read from and a filename to write to).
In addition to being able to load PPM files, your code must also support loading HDR images stored in the Radiance RGBE format (.hdr extension). For this assignment, you may either write your own file parser (the format is a bit more complicated than .ppm, but not insane) or use the parser already in your repository: I have initialized your repositories to include a slightly modified version of the RGBE File Format parsing code from Bruce Walter.
Note that when you read an HDR file, you will read the data in as an array of floats of size \(3 \times \texttt{width} \times \texttt{height}\). However, since the data is high dynamic range, the values could be quite large. Your task will be to convert these to values that can be displayed through the following tone mapping approaches.
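As a rough illustration of where those floats come from, the sketch below decodes a single RGBE pixel (three 8-bit mantissas sharing one 8-bit exponent) into floating-point RGB. The names RGBf and rgbeToFloat are hypothetical and are not taken from the provided parser, which already performs this conversion for you; treat this as a sketch of the idea rather than the code you are expected to write.

```cpp
#include <cmath>
#include <cstdint>

// Plain float RGB triple (hypothetical helper type for this sketch).
struct RGBf { float r, g, b; };

// Decode one 4-byte RGBE pixel: 8-bit mantissas r, g, b plus a shared exponent e.
// An exponent byte of 0 denotes black.
inline RGBf rgbeToFloat(std::uint8_t r, std::uint8_t g, std::uint8_t b, std::uint8_t e) {
    if (e == 0) return {0.0f, 0.0f, 0.0f};
    // 128 is the exponent bias; the extra 8 divides the 8-bit mantissa by 256.
    const float f = std::ldexp(1.0f, static_cast<int>(e) - (128 + 8));
    return {r * f, g * f, b * f};
}
```

Because the exponent is shared, decoded values can be far larger than 1, which is why the tone mapping steps below are needed before display.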
Your RGBE reader should be able to support all of the test images here:
- http://www.cs.utah.edu/~reinhard/cdrom/
- http://people.csail.mit.edu/fredo/PUBLI/Siggraph2002/
- http://www.anyhere.com/gward/hdrenc/pages/originals.html
Be sure to include which images you tested with. At a minimum, you should try lamp.hdr, smalldesignCenter.hdr, and memorial.hdr from the first, second, and third links above, respectively.
Finally, your program must support the writing of these files, after applying tone mapping, as low dynamic range PPMs.
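One possible shape for the output step is sketched below. It assumes the tone-mapped data is an interleaved RGB float array with values nominally in \([0,1]\), and it assumes the binary P6 flavor of PPM with a maximum value of 255; if your previous assignment used the ASCII P3 flavor, the header and write loop would differ. The names toByte and writePPM are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Quantize a display-range value to 8 bits, clamping to [0,1] first.
inline std::uint8_t toByte(float v) {
    const float c = std::min(1.0f, std::max(0.0f, v));
    return static_cast<std::uint8_t>(c * 255.0f + 0.5f);
}

// Write interleaved RGB floats (3 * width * height values) as a binary P6 PPM.
// Returns false on a size mismatch or if the file cannot be written.
inline bool writePPM(const std::string& path, const std::vector<float>& rgb,
                     int width, int height) {
    if (width <= 0 || height <= 0) return false;
    if (rgb.size() != static_cast<std::size_t>(3) * width * height) return false;

    std::ofstream out(path, std::ios::binary);
    if (!out) return false;

    // Header: magic number, dimensions, maximum channel value.
    out << "P6\n" << width << " " << height << "\n255\n";

    std::vector<char> bytes(rgb.size());
    for (std::size_t i = 0; i < rgb.size(); ++i) {
        bytes[i] = static_cast<char>(toByte(rgb[i]));
    }
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

A call such as writePPM(outputName, pixels, width, height) would then be made from main.cpp once tone mapping has finished, where outputName is the second command-line parameter.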
Part 2: Basic Tone Mapping
Your first tone mapping operator will employ a simple gamma correction method, applied in the log space of the luminance. This is a three-step process, based on computing a scale value by which you will multiply each channel separately.
- This scale value should be computed based on the luminance, \(L\), of the pixel. There are a variety of equations that one could use to go from \(RGB\) to \(L\), but for this assignment we will use one of the simplest:
  \[L = \frac{1.0}{61.0}(20.0R + 40.0G + B)\]
- Next, your goal will be to compute a target display luminance value \(L' = L^{\gamma}\). The user should be able to dynamically adjust the variable gamma of the displayed image. While you could directly use the powf() function for this, it turns out that if you work in \(\log\) space you can achieve this (and the next part of the assignment) with less computational effort. To do so, first compute and store log(L) for each pixel. We will rely on the equivalence \(\log(L') = \gamma \times \log(L)\). Once we have computed \(\log(L')\), we can recover \(L'\) by taking \(L' = \exp(\log(L'))\).
- Finally, you can compute your scale value as \(\texttt{scale} = L'/L\). After computing the scale, you can update the RGB values by multiplying each channel by it. Note that this might produce values outside of the range \([0,1]\). In these situations you should clamp your values back into the appropriate range. If you fail to clamp the data, you will produce a variety of visual artifacts (which might be fun to test with, but you will be penalized if you do not correct them!).
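The three steps above can be sketched as follows. This is a minimal illustration, not a reference solution: the function names are hypothetical, the pixel layout is assumed to be interleaved RGB floats, and the small floor eps is an added assumption so that log() stays finite on black pixels. For brevity the sketch computes log(L) on the fly, whereas the assignment asks you to store it per pixel so it can be reused in Part 3.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Luminance as defined in this assignment.
inline float luminance(float r, float g, float b) {
    return (20.0f * r + 40.0f * g + b) / 61.0f;
}

// Gamma-correct the luminance in log space and rescale each channel in place.
inline void gammaToneMap(std::vector<float>& rgb, float gamma) {
    const float eps = 1e-6f;  // assumed floor; avoids log(0) on black pixels
    for (std::size_t i = 0; i + 2 < rgb.size(); i += 3) {
        const float L = std::max(luminance(rgb[i], rgb[i + 1], rgb[i + 2]), eps);
        const float logL = std::log(L);
        const float Lp = std::exp(gamma * logL);  // L' = exp(gamma * log(L))
        const float scale = Lp / L;
        for (int k = 0; k < 3; ++k) {
            // Clamp to the displayable range after scaling.
            rgb[i + k] = std::min(1.0f, std::max(0.0f, rgb[i + k] * scale));
        }
    }
}
```

For example, a gray pixel with value 0.25 and gamma set to 0.5 has \(L = 0.25\), \(L' = 0.5\), and a scale of 2, so each channel becomes 0.5.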
Part 3: Tone Mapping with Bilateral Filters
If you try varying gamma, you should be able to improve on the original display of the input. In general, however, a number of issues will remain, which we will attempt to fix.
As discussed in class, one issue with using a gamma corrected space is that while it compresses the HDR range space in a perceptually sensitive way, it fails to properly distinguish between features of different scales. To address this, we will modify our simple approach to also take advantage of a feature-dependent measure using convolution. By separating the image into coarse scale and fine scale features, we can separately apply gamma correction to them.
In particular, you should first implement a convolution operator, \(g\), that performs low-pass smoothing by doing \(\log(L) \otimes g\). \(g\) can be anything you like, but I suggest using a box filter of varying sizes (even \(5\times5\) will improve the tone map, but up to \(21\times21\) may do better).
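A straightforward box filter over a single-channel image (such as the stored log(L) values) might look like the sketch below. The function name is hypothetical, the image is assumed to be row-major, and the boundary condition is an assumption: coordinates that fall outside the image are clamped to the nearest edge pixel. Other boundary conditions are equally valid as long as you state which one you chose.

```cpp
#include <algorithm>
#include <vector>

// Box-filter a row-major single-channel image with a (2*radius+1)^2 window.
// Out-of-range coordinates are clamped to the nearest edge pixel (assumed).
inline std::vector<float> boxFilter(const std::vector<float>& src,
                                    int width, int height, int radius) {
    std::vector<float> dst(src.size());
    const int side = 2 * radius + 1;
    const float norm = 1.0f / static_cast<float>(side * side);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int sx = std::min(std::max(x + dx, 0), width - 1);
                    const int sy = std::min(std::max(y + dy, 0), height - 1);
                    sum += src[sy * width + sx];
                }
            }
            dst[y * width + x] = sum * norm;
        }
    }
    return dst;
}
```

A radius of 2 gives the \(5\times5\) filter mentioned above and a radius of 10 gives \(21\times21\). This direct form costs one addition per window entry per pixel, which is fine for the suggested sizes.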
Our goal is to apply gamma correction only to the low-pass component of the luminance channel (we will call this \(B\)) and to recombine with the preserved high-pass component (we will call this \(S\)). The procedure is as follows:
- \(B = \log(L) \otimes g\). (first compute low-pass B with convolution)
- \(S = \log(L) - B\). (next, separate log(L) into high-pass S and low-pass B)
- \(\log(L') = \gamma \times B + S\). (gamma correct B and recombine with S)
- \(L' = \exp(\log(L'))\). (convert back from log space to original)
- \(\texttt{scale} = L'/L\). (produce scale value)
After computing scale, one can then update the RGB values by multiplying each channel by it as before.
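Assuming you already have the per-pixel log(L) values and their low-pass version \(B\) in two arrays of equal length, the procedure above can be sketched as below. The function name is hypothetical. The sketch uses the identity \(L'/L = \exp(\log(L') - \log(L))\) to obtain the scale with a single exp() call, which is one possible shortcut rather than a required one.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Given per-pixel log(L) and its low-pass component B (same length),
// return the per-pixel scale = L'/L with gamma applied only to B.
inline std::vector<float> detailPreservingScale(const std::vector<float>& logL,
                                                const std::vector<float>& B,
                                                float gamma) {
    std::vector<float> scale(logL.size());
    for (std::size_t i = 0; i < logL.size(); ++i) {
        const float S = logL[i] - B[i];         // high-pass detail
        const float logLp = gamma * B[i] + S;   // compress base, keep detail
        scale[i] = std::exp(logLp - logL[i]);   // L'/L without a division
    }
    return scale;
}
```

When a pixel has no detail (\(S = 0\)), this reduces to the Part 2 result; when \(B = 0\), the scale is exactly 1 and the pixel is left unchanged.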
Setting gamma in this situation can be tricky to understand. In general, the idea is that you want to preserve some contrast threshold \(c\). On Durand’s website (near the bottom) he suggests using a gamma, called the “compression factor”, set relative to the minimum and maximum of \(B\). Specifically, \(\gamma = \log(c) / (\max(B) - \min(B))\). I found that \(c \in [5,100]\) worked well. He also suggests subtracting an absolute scale from the formulation. I found that both of these changes improved my results.
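The compression factor formula can be computed directly from the low-pass array, as in the sketch below. The function name and the fallback value of 1 for an empty or perfectly flat \(B\) are assumptions made for this illustration, and the natural logarithm is assumed to match the exp() used elsewhere. The absolute scale adjustment is not shown.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// gamma = log(c) / (max(B) - min(B)), where c is the target contrast.
inline float compressionFactor(const std::vector<float>& B, float c) {
    if (B.empty()) return 1.0f;  // assumed fallback: nothing to compress
    const auto mm = std::minmax_element(B.begin(), B.end());
    const float range = *mm.second - *mm.first;
    if (range <= 0.0f) return 1.0f;  // assumed fallback: flat base layer
    return std::log(c) / range;
}
```

For instance, if \(B\) spans \([0, \log(100)]\) and \(c = 10\), the result is \(\log(10)/\log(100) = 0.5\).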
While this simple approach improves the results, it also blurs features that cross over edges in the input. After getting to this stage, you should next modify your convolution-based tone mapper to use a bilateral filter instead of a box filter. The idea is that when you tone map by convolving with a smoothing filter, you will create halos whose size depends on how big a window you convolve against. These halos are the result of crossing edges in the image.
To fix this, you must modify your convolution to produce a non-linear operator instead of a standard box filter. Durand suggests quite a few options for this, and you are welcome to experiment with your own. In my implementation, I multiplied by a weight of \(w = \exp(-\textrm{clamp}(d^2))\) where \(d\) equals the difference in \(\log(L)\) between the center pixel of the convolution and whatever other pixel you are summing.
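One way to read that weighting is sketched below: each neighbor in the window contributes with weight \(w\), and the result is normalized by the sum of the weights. The text does not give the bounds of the clamp, so the upper bound maxD2 is a hypothetical parameter introduced here, as are the function name and the clamp-to-edge boundary handling. This is one possible variant, not necessarily the one used in the reference implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Edge-aware smoothing of a row-major log-luminance image.
// Each neighbor is weighted by exp(-min(d*d, maxD2)), where d is its
// difference in log(L) from the window's center pixel.
inline std::vector<float> bilateralFilter(const std::vector<float>& logL,
                                          int width, int height,
                                          int radius, float maxD2) {
    std::vector<float> dst(logL.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const float center = logL[y * width + x];
            float sum = 0.0f;
            float wsum = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int sx = std::min(std::max(x + dx, 0), width - 1);
                    const int sy = std::min(std::max(y + dy, 0), height - 1);
                    const float v = logL[sy * width + sx];
                    const float d = v - center;
                    const float w = std::exp(-std::min(d * d, maxD2));
                    sum += w * v;
                    wsum += w;
                }
            }
            // wsum >= 1 because the center pixel always has weight exp(0) = 1.
            dst[y * width + x] = sum / wsum;
        }
    }
    return dst;
}
```

On a hard step such as the row 0, 0, 10, 10, pixels on the far side of the edge receive a weight near zero, so the step survives the filter, whereas a box filter of the same size would smear it and produce the halos described above.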
Part 4: Written Questions
Please answer the following written questions. You are not required to typeset these questions in any particular format, but you may want to take the opportunity to include images (either photographed hand-drawings or produced using an image editing tool).
These questions are intended both to provide you with additional material for considering the conceptual aspects of the course and to provide sample questions in a format similar to the questions on the midterm and final exam. Most questions should be answerable in 100 words or fewer.
Please create and commit a separate directory in your repo called written and post all files (text answers and any images) to this directory.
- What is a pixel? How big is a pixel? Both of these questions have multiple answers; briefly explain yours.
- 3 × 3 convolution kernels can create a variety of effects. Consider the following three kernels. First, list the appropriate scale factor you would use for each kernel (see the instructions in the slides and the lab for a definition). Next, briefly describe the output image that is produced as a result of convolution with each kernel:
  a. \(H_a = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}\)
  b. \(H_b = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}\)
  c. \(H_c = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}\)
- Given an image \(I\) of \(100 \times 200\), and a kernel \(K\) of size \(7 \times 7\), how many multiplications are required to compute \(K \otimes I\)? Be sure to state your boundary condition.
- Draw and label a diagram of the HSV color space. Include a brief description of each variable, its role in the final color, and a possible numeric range.
- The simplest possible approach to tone mapping is to take the HDR input data and normalize it to produce values in \([0,1]\). What are the potential problems with using this technique?
Grading
Deductions
Reason | Value |
Program does not compile. (The first instance across all assignments will receive a warning with a chance to resubmit, but subsequent non-compiling assignments will receive the full penalty.) | -100 |
Program crashes due to bugs | -10 each bug at grader's discretion to fix |
Point Breakdown of Features
Requirement | Value |
Consistent modular coding style | 10 |
External documentation (README.md), Providing a working CMakeLists.txt | 5 |
Class documentation, Internal documentation (Block and Inline). Wherever applicable / for all files | 15 |
Expected output / behavior based on the assignment specification | 50 |
Written Questions | 20 |
Total | 100 |