The goal of this project is to deepen my understanding of image representation and manipulation, and to produce large composite images stitched together from multiple individual photographs.
Before we begin processing the images, we need source photos that satisfy certain properties, since not just any two images can be stitched together seamlessly. Ideally, the images should overlap substantially and have similar lighting. Below are some of the images I used:
The pairs of photos are shot from the same center of projection but at different angles, so the first task is to recover the homography that warps one image into the perspective of the other. This can be done by solving a system of equations, similar to what we did in project 3. What is different, however, is that this process is highly susceptible to noise: four correspondence points are technically enough to determine the homography, but they would need to be marked extremely accurately. I therefore opted to use many more correspondences (usually in the double digits) and solve the resulting overconstrained system with least squares. The rest of the warping process is similar to project 3, where we use inverse warping to bring both images into the same perspective.
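Concretely, each correspondence (x, y) → (x', y') contributes two linear equations once we fix the bottom-right entry of H to 1. A minimal sketch of the least-squares solve (the helper name compute_homography is mine):

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate the 3x3 homography mapping src -> dst by least squares.

    src, dst: (n, 2) arrays of corresponding (x, y) points, n >= 4.
    Fixes h33 = 1 and solves the overconstrained 2n x 8 linear system.
    """
    n = src.shape[0]
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        A[2 * i]     = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i], b[2 * i + 1] = xp, yp
    h, *_ = np.linalg.lstsq(A, b, rcond=None)  # least squares over all points
    return np.append(h, 1).reshape(3, 3)
```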
To make sure that our warp function is working properly, we can take pictures of objects and warp them into arbitrary shapes. One example is photographing a rectangular object from an angle and rectifying the image so that the object appears rectangular again. Since we are only working with one image, we set the correspondences to the corners of said rectangle and map them to an arbitrary set of rectangular or square coordinates. For the examples below, I used a set of square coordinates, so the door is also compressed into a square shape.
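The inverse warp itself can be sketched as below, assuming a color image and reusing the compute_homography helper above; for rectification, src would be the four clicked corners and dst the chosen square coordinates:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_image(img, H, out_shape):
    """Inverse-warp a color image by homography H onto a canvas of out_shape.

    Every output pixel is mapped back through H^-1 and bilinearly sampled
    from the source, which avoids the holes a forward warp would leave.
    """
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # homogeneous
    src = np.linalg.inv(H) @ pts
    src /= src[2]                              # back to (x, y) coordinates
    out = np.zeros((h_out, w_out, img.shape[2]))
    for c in range(img.shape[2]):              # sample each color channel
        out[..., c] = map_coordinates(img[..., c], [src[1], src[0]],
                                      order=1).reshape(h_out, w_out)
    return out
```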
Now that we know our warp function works as intended, I can use it to stitch images together. To do so, I warp the first image as in part 1, and shift the second image so the shared features line up. To ensure a smooth transition, I also blended the overlapping regions of the two images. Unfortunately, even though the blending worked for the most part, some mosaics still came out with visible seams, since the lighting of the two images was different.
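A distance-transform feather is one simple way to realize this blend; the sketch below assumes both images have already been warped onto a shared canvas, with boolean masks marking their valid pixels:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def feather_blend(im1, im2, mask1, mask2):
    """Alpha-blend two aligned images with feathered weights.

    Each pixel is weighted by its distance to the edge of its own image,
    so the weights fall off smoothly across the overlap region.
    """
    w1 = distance_transform_edt(mask1)
    w2 = distance_transform_edt(mask2)
    total = w1 + w2
    total[total == 0] = 1                 # avoid divide-by-zero outside both
    alpha = (w1 / total)[..., None]       # broadcast over color channels
    return alpha * im1 + (1 - alpha) * im2
```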
The goal of this project is to create a system for automatically stitching images into a mosaic. A secondary goal is to learn how to read and implement a research paper.
Starting off, we want to find significant points in the image: corners, i.e. points where the image changes sharply in more than one direction. To do this, we use the Harris Interest Point Detector. Below is an example of the corners detected in an image.
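With skimage, this detection step can be sketched as follows (the min_distance value is a choice of mine, not prescribed):

```python
from skimage.color import rgb2gray
from skimage.feature import corner_harris, corner_peaks

def harris_corners(img, min_dist=3):
    """Detect Harris corners in a color image.

    Returns an (n, 2) array of (row, col) corner coordinates and the full
    Harris response map, so each corner's strength h can be looked up later.
    """
    gray = rgb2gray(img)
    response = corner_harris(gray)
    coords = corner_peaks(response, min_distance=min_dist)
    return coords, response
```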
As one can see, there are so many corners in the previous image that they cover essentially everything. To improve our results and speed up computation, we pick out the important corners using adaptive non-maximal suppression (ANMS). In my implementation, I calculated the L2 distance between a given corner and every other corner with a stronger response h (the "strength" of a corner as reported by the Harris detector). To make the suppression robust, a neighbor only counts as stronger if its h, scaled by a constant slightly below 1, still exceeds the corner's own h; each corner's radius is then the distance to its nearest such neighbor. We keep the 500 corners with the largest radii, which ensures an even spread of corners throughout the image.
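A vectorized sketch of this suppression, with strengths obtained by indexing the Harris response map at each corner (c_robust = 0.9 follows the MOPS paper; the O(n²) distance matrix is fine at this scale):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression.

    For each corner, find the distance to the nearest corner that is
    sufficiently stronger (c_robust * h_neighbor > h_self), then keep the
    n_keep corners with the largest such radii.
    """
    dists = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    # stronger[i, j] is True when corner j is strong enough to suppress i
    stronger = c_robust * strengths[None, :] > strengths[:, None]
    radii = np.where(stronger, dists, np.inf).min(axis=1)
    keep = np.argsort(-radii)[:n_keep]     # unsuppressed corners sort first
    return coords[keep]
```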
To match corners between the images, we extract a patch of pixels around each corner and use that information to find the closest match among the corners of the other image. In my case, I took a 40×40 patch around each corner, downscaled it to 8×8, and normalized it. I did this for every corner in both images, then cross-compared the L2 distances between descriptors across the two images. To minimize false positives, we apply Lowe's trick and only admit a match if the ratio L2_nearest / L2_second is below a given constant. After all this, the corners with reciprocal matches in both images are taken as the true pairings.
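A sketch of both steps, assuming grayscale input and corners at least 20 pixels from the border (the resize-based downsampling, bias/gain normalization, and the 0.6 ratio cutoff are my choices):

```python
import numpy as np
from skimage.transform import resize

def describe(gray, coords, patch=40, out=8):
    """Extract normalized 8x8 descriptors from 40x40 windows around corners."""
    half = patch // 2
    descs = []
    for r, c in coords:
        window = gray[r - half:r + half, c - half:c + half]
        d = resize(window, (out, out), anti_aliasing=True).ravel()
        descs.append((d - d.mean()) / (d.std() + 1e-8))  # bias/gain normalize
    return np.array(descs)

def match(desc1, desc2, ratio=0.6):
    """Lowe's ratio test: accept a match only when the best L2 distance
    is clearly smaller than the second best."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j, k = np.argsort(dists)[:2]
        if dists[j] / dists[k] < ratio:
            matches.append((i, j))
    return matches
```

Running match in both directions and intersecting the results yields the reciprocal pairings described above.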
The previous part provides a solid set of matching corners, but some outliers remain that can heavily skew the result. To filter our matches further, we use RANSAC. In each iteration, we select 4 random pairs of points, compute a homography H from them, and measure the error of every pair under that H as an SSD. Pairs whose SSD exceeds some threshold are likely outliers with respect to that H and are rejected; the rest are inliers. I repeat this process 10,000 times and keep the largest inlier set found, which makes it very unlikely that any outliers remain in the final dataset. At last, we use this clean set to compute the overall H with least squares, and use that H to stitch our images.
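Sketched below, reusing the compute_homography helper from part A (the 2-pixel error threshold is illustrative):

```python
import numpy as np

def ransac_homography(src, dst, n_iter=10000, thresh=2.0):
    """4-point RANSAC: keep the homography with the most inliers, then
    refit with least squares on that inlier set."""
    best_inliers = np.zeros(len(src), dtype=bool)
    src_h = np.hstack([src, np.ones((len(src), 1))])   # homogeneous points
    for _ in range(n_iter):
        idx = np.random.choice(len(src), 4, replace=False)
        H = compute_homography(src[idx], dst[idx])
        proj = src_h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]              # project and dehomogenize
        err = np.sum((proj - dst) ** 2, axis=1)        # per-pair SSD error
        inliers = err < thresh ** 2
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return compute_homography(src[best_inliers], dst[best_inliers]), best_inliers
```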
Below is a comparison of manual stitching versus autostitching:
Overall, this project was extremely demanding, both in difficulty and in workload. With that said, the end results are also very interesting. The RANSAC results turned out particularly well, and I'm proud of how clean the final mosaics look.