The goal of this project is to reconstruct color images from the digitized Prokudin-Gorskii glass plate images. To achieve this, each plate image is divided into three parts, one per color channel. The three channels are then overlaid on top of each other to produce the color image. The challenge comes in the form of alignment, or rather, misalignment. Due to the age and handcrafted nature of the plates, simply layering the channels on top of each other produces an image with significant visual artifacts. As such, I needed to find a way to align the channels and minimize these artifacts.
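As a rough illustration of the split-and-overlay step, here is a minimal sketch. It assumes the plate is a single grayscale array with the three exposures stacked vertically in blue, green, red order from top to bottom; the function names `split_plate` and `stack_rgb` are hypothetical and not taken from the project code.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into (blue, green, red) thirds.

    Assumes the exposures are ordered blue, green, red from top to bottom.
    """
    height = plate.shape[0] // 3              # leftover rows at the bottom are dropped
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def stack_rgb(red, green, blue):
    """Overlay three single-channel images into one H x W x 3 color image."""
    return np.dstack([red, green, blue])
```

Stacking the three parts directly, without any shifting, gives the naive overlay that shows the artifacts described above.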
Since the problem lies in channel misalignment, I first experimented with different ways to measure it. I implemented the L2 norm and normalized cross-correlation (NCC) as measures of the difference between two channels, and I then exhaustively searched over a range of (-15, 15) pixels of rollover to find the alignment that gives the least visual distortion.
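The two difference measures and the exhaustive search could look roughly like the sketch below. It assumes "rollover" means a wrap-around shift such as `np.roll`, that the (-15, 15) range is applied to both axes and includes both endpoints, and that NCC is negated so that lower is always better. All function names are hypothetical.

```python
import numpy as np

def l2_distance(a, b):
    """Sum of squared differences; lower means more similar."""
    a = a.astype(np.float64)                  # avoid uint8 wrap-around on subtraction
    b = b.astype(np.float64)
    return np.sum((a - b) ** 2)

def negative_ncc(a, b):
    """Negated normalized cross-correlation, so that lower means more similar.

    Assumes neither image is constant (otherwise the norm is zero).
    """
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    return -np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))

def exhaustive_align(channel, reference, radius=15, metric=l2_distance):
    """Try every (dy, dx) shift within +/- radius and return the best one."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = metric(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The returned `(dy, dx)` is the shift to apply to `channel` with `np.roll` so that it lines up with `reference`. The cost grows with the square of the radius, which is why this approach only suits small images.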
As a note, since the images often contain borders, I also ignored 15% of the image on all four sides. This worked well for smaller images, such as cathedral.jpg and the other jpgs. However, images in .tif format have much higher resolution, and therefore may require a much larger shift between channels. For those, an exhaustive search is too time-consuming to be practical.
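The 15% border exclusion mentioned above can be sketched as a simple crop applied to both channels before they are compared. This assumes the borders are ignored only when scoring a candidate shift and that the output image itself is left uncropped; `crop_interior` is a hypothetical helper name.

```python
import numpy as np

def crop_interior(image, fraction=0.15):
    """Return the central region, dropping `fraction` of the image on each side."""
    h, w = image.shape[:2]
    dy, dx = int(h * fraction), int(w * fraction)
    return image[dy:h - dy, dx:w - dx]
```

A difference measure would then be evaluated on `crop_interior(shifted)` and `crop_interior(reference)`, so that plate borders and the pixels wrapped around by the shift contribute less to the score.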
To avoid the prohibitively slow exhaustive search, I implemented an image pyramid search, which vastly speeds up the processing of .tif images. I first downscaled the images until both dimensions were no more than 256 pixels, then performed the same exhaustive search as before.
With the optimal shift found at the lower resolution, I scaled that shift up to the higher-resolution image, then performed another exhaustive search, centered on the scaled shift. This vastly cuts down on processing time: I kept the search range at (-15, 15), and even though I may have had to search multiple times, the total time is still only a fraction of what it would be if I expanded the range to anywhere close to what a .tif image may need.
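The coarse-to-fine procedure described above can be sketched recursively. This is only an illustration under several assumptions: each pyramid level halves the resolution using 2x2 block averaging (the project may use a different downscaling routine or factor), the shift found at a coarse level is doubled before refining, and the L2 difference is used for scoring. The names `halve`, `search_around`, and `pyramid_align` are hypothetical.

```python
import numpy as np

def l2_distance(a, b):
    """Sum of squared differences; lower means more similar."""
    return np.sum((a.astype(np.float64) - b.astype(np.float64)) ** 2)

def search_around(channel, reference, center=(0, 0), radius=15):
    """Exhaustive search over shifts within +/- radius of `center`."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = l2_distance(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def halve(image):
    """Downscale by 2 using 2x2 block averaging (an assumed, simple filter)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    image = image[:h, :w]                     # trim to even dimensions
    return image.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(channel, reference, max_size=256, radius=15):
    """Estimate the shift on a small image first, then refine at each finer level."""
    if max(channel.shape) <= max_size:
        return search_around(channel, reference, (0, 0), radius)
    coarse = pyramid_align(halve(channel), halve(reference), max_size, radius)
    center = (2 * coarse[0], 2 * coarse[1])   # translate the shift to this level
    return search_around(channel, reference, center, radius)
```

Because every level searches the same small window, the work per level is bounded by the window size, while the largest shift that can be recovered roughly doubles with each extra level.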
As mentioned, I tried several approaches to the difference calculation, and they gave varying results. With emir.tif as an example, both L2 and NCC gave subpar results when using the default configuration, where the red and green channels are aligned to the blue channel.
That said, aligning the other two channels to the green channel instead did fix the issue, and I went with this setting for all other images.
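Using green as the reference only changes which channel stays fixed. A minimal sketch, assuming the red and blue shifts have already been estimated against green and are applied as wrap-around shifts with `np.roll` (the function name is hypothetical):

```python
import numpy as np

def compose_with_green_reference(red, green, blue, red_shift, blue_shift):
    """Shift red and blue onto the fixed green channel and stack as RGB."""
    red_aligned = np.roll(red, red_shift, axis=(0, 1))
    blue_aligned = np.roll(blue, blue_shift, axis=(0, 1))
    return np.dstack([red_aligned, green, blue_aligned])
```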
The final results displayed were all produced with the L2 difference and green alignment. No additional post-processing or cropping was done.
Some other images from the Prokudin-Gorskii collection
These are results from other forms of processing I tried. Some worked out, some didn't. L2 with green alignment seems to be the most consistent combination across all images, so that is the one I chose, but certain other approaches also produced interesting results.