CS 180 Project 1: Colorizing the Prokudin-Gorskii photo collection

Kevin Sheng


Project Overview

The goal of this project is to reconstruct color images from the digitized Prokudin-Gorskii glass plate images. To achieve this, each plate image is divided into 3 parts, one per color channel. The 3 channels are then overlaid on top of each other to form the color image. The challenge comes in the form of alignment, or rather, misalignment. Due to the age and handcrafted nature of the plates, simply layering the channels on top of each other produces an image with significant visual artifacts. As such, I needed a way to align the channels and minimize these artifacts.
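The splitting and naive overlay described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names are hypothetical, and it assumes the plate is a single grayscale array with the blue, green, and red exposures stacked top to bottom.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into (blue, green, red).

    Assumption: the three exposures are stacked top to bottom in the
    order blue, green, red, each taking one third of the height.
    """
    height = plate.shape[0] // 3          # leftover rows at the bottom are dropped
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def stack_rgb(red, green, blue):
    """Overlay three single-channel images into one H x W x 3 RGB image."""
    return np.dstack([red, green, blue])

# Tiny synthetic "plate": 7 rows, so one leftover row is discarded.
plate = np.arange(7 * 4, dtype=np.float64).reshape(7, 4)
blue, green, red = split_plate(plate)
naive = stack_rgb(red, green, blue)       # unaligned overlay
print(naive.shape)
```

On a real plate this unaligned overlay is what shows the color fringing that the rest of the write-up tries to remove.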


Initial Approach

Since the problem lies in channel misalignment, I first experimented with different ways to measure it. I implemented the L2 norm and normalized cross-correlation (NCC) as measures of the difference between two channels, then exhaustively searched a range of (-15, 15) pixels of rollover to find the alignment that gives the least visual distortion. As a note, since the images often contain borders, I also ignored 15% of the image on all four sides. This worked well for the smaller .jpg images, such as cathedral.jpg. However, images in .tif format have much higher resolution and may therefore require a much larger shift between channels. For those, an exhaustive search is too time-consuming to be practical.
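A minimal sketch of this single-scale search is shown below. The helper names, the `cost` parameter, and the convention that a shift is a (rows, columns) pair passed to np.roll are all assumptions made for illustration; only the two metrics, the 15-pixel range, and the 15% border crop come from the description above.

```python
import numpy as np

def crop_interior(channel, fraction=0.15):
    """Drop `fraction` of the image on each side so borders do not affect the score."""
    h, w = channel.shape
    dh, dw = int(h * fraction), int(w * fraction)
    return channel[dh:h - dh, dw:w - dw]

def l2_distance(a, b):
    """Sum of squared differences; lower means more similar."""
    return np.sum((a - b) ** 2)

def ncc_score(a, b):
    """Normalized cross-correlation; higher means more similar."""
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def exhaustive_search(moving, reference, radius=15, cost=l2_distance):
    """Try every (dy, dx) within +/- radius and return the lowest-cost shift."""
    ref = crop_interior(reference)
    best_shift, best_cost = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))   # wrap-around shift
            current = cost(crop_interior(shifted), ref)
            if current < best_cost:
                best_shift, best_cost = (dy, dx), current
    return best_shift

# Synthetic check: displace a random image and recover the correcting shift.
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
moving = np.roll(reference, (-3, 5), axis=(0, 1))
shift_l2 = exhaustive_search(moving, reference)
# NCC is a similarity, so it is negated to act as a cost.
shift_ncc = exhaustive_search(moving, reference, cost=lambda a, b: -ncc_score(a, b))
print(shift_l2, shift_ncc)
```

The cost of this approach grows with the square of the search radius times the number of pixels, which is why it becomes impractical for the large .tif scans.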


Image Pyramids

To avoid the prohibitively time-consuming exhaustive search, I implemented an image pyramid search, which vastly speeds up the processing of .tif images. I first downscaled the images until both dimensions were no more than 256 pixels, then performed the same exhaustive search as before. With the optimal shift found for the lower-resolution image, I translated that shift to its higher-resolution counterpart and performed another exhaustive search centered on that shift. This vastly cuts down on processing time: I kept the search range at (-15, 15), and even though I may have had to search multiple times, the time taken is still only a fraction of what it would be had I expanded the range to anywhere close to what a .tif image may need.
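The coarse-to-fine idea can be sketched as a short recursive function. This is an illustration under stated assumptions, not the project's implementation: the factor-of-2 downscaling by 2x2 block averaging, the function names, and the (rows, columns) shift convention are all hypothetical choices, since the write-up does not say how the images were downscaled.

```python
import numpy as np

def halve(image):
    """Downscale by 2 with 2x2 block averaging (an assumed, simple filter)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    image = image[:h, :w]                      # make both dimensions even
    return (image[0::2, 0::2] + image[1::2, 0::2]
            + image[0::2, 1::2] + image[1::2, 1::2]) / 4.0

def search(moving, reference, center, radius):
    """Exhaustive L2 search over shifts within +/- radius of `center`."""
    h, w = reference.shape
    dh, dw = int(h * 0.15), int(w * 0.15)      # ignore 15% on every side
    ref = reference[dh:h - dh, dw:w - dw]
    best, best_cost = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))[dh:h - dh, dw:w - dw]
            cost = np.sum((shifted - ref) ** 2)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def pyramid_align(moving, reference, max_size=256, radius=15):
    """Align at the coarsest level first, then refine at each finer level."""
    if max(reference.shape) <= max_size:
        return search(moving, reference, (0, 0), radius)
    coarse = pyramid_align(halve(moving), halve(reference), max_size, radius)
    center = (2 * coarse[0], 2 * coarse[1])    # scale the shift up one level
    return search(moving, reference, center, radius)

# Small synthetic check; max_size and radius are reduced only to keep it fast.
rng = np.random.default_rng(1)
reference = rng.random((128, 128))
moving = np.roll(reference, (20, -12), axis=(0, 1))
shift = pyramid_align(moving, reference, max_size=32, radius=6)
print(shift)
```

In this example the true displacement of 20 pixels is far outside the 6-pixel window, yet the pyramid recovers it because the window covers 4 times as many original pixels at the coarsest level.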


Results and Comparisons

As mentioned, I tried multiple approaches to difference calculation, and they gave varying results. With emir.tif as an example, both L2 and NCC gave subpar results under the default configuration, where the red and green channels are aligned to the blue channel. Aligning the red and blue channels to the green channel instead fixed the issue, and I used this setting for all other images.

emir_l2_blue
emir.tif with L2 difference and blue-alignment
emir_ncc
emir.tif with NCC difference and blue-alignment
emir_l2_green
emir.tif with L2 difference and green-alignment

Final Results

The final results displayed are all made with L2 difference and green-alignment. No additional post-processing or cropping is done.


cathedral | B: (-5, -2) R: (7, 1)
tobolsk | B: (-3, -3) R: (4, 1)
monastery | B: (3, -2) R: (6, 1)
church | B: (-25, -4) R: (33, -8)
emir | B: (-49, -24) R: (57, 17)
harvesters | B: (-59, -17) R: (65, -3)
icon | B: (-41, -17) R: (48, 5)
lady | B: (-55, -9) R: (62, 4)
melons | B: (-82, -11) R: (96, 3)
onion_church | B: (-51, -27) R: (57, 10)
sculpture | B: (-33, 11) R: (107, -16)
self_portrait | B: (-79, -29) R: (98, 8)
three_generations | B: (-53, -14) R: (59, -3)
train | B: (-43, -6) R: (43, 27)
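As an illustration of how displacements like those listed above could be applied, the sketch below keeps the green channel fixed and rolls the blue and red channels onto it. The function name is hypothetical, and reading each pair as a (rows, columns) shift passed to np.roll is an assumption, since the write-up does not state the axis order. The example values are the ones listed for cathedral.

```python
import numpy as np

def compose_green_aligned(blue, green, red, shift_b, shift_r):
    """Shift blue and red onto the fixed green channel and stack as RGB.

    Assumption: each shift is (rows, columns), applied with wrap-around.
    """
    blue_aligned = np.roll(blue, shift_b, axis=(0, 1))
    red_aligned = np.roll(red, shift_r, axis=(0, 1))
    return np.dstack([red_aligned, green, blue_aligned])

# Synthetic channels that are misaligned by the opposite of the cathedral shifts.
rng = np.random.default_rng(0)
green = rng.random((40, 30))
blue = np.roll(green, (5, 2), axis=(0, 1))
red = np.roll(green, (-7, -1), axis=(0, 1))

rgb = compose_green_aligned(blue, green, red, shift_b=(-5, -2), shift_r=(7, 1))
print(rgb.shape)
```

Because np.roll wraps pixels around the edges, a thin band along the borders of a real result would contain wrapped content, which is consistent with the note that no cropping was applied.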

Some other images from the Prokudin-Gorskii collection

house
arch
river

Other Experimentations

These are results from other forms of processing I tried. Some worked out, some didn't. L2 + green-alignment seems to be the most consistent configuration across all images, so that is the one I chose, but certain other approaches also produced interesting results.


Default alignment for church.jpg with no shifts at all.
NCC blue-alignment for harvesters.tif. In reality, most of the images turned out fine with all approaches.
L2 blue-alignment for church.tif. This result was due to a bug in how I used np.roll to shift the channels.
L2 green-alignment for emir.tif. I accidentally passed in the blue channel as green, which caused his clothes to turn green.