In the first part of project 4, I took pictures with a fixed center of projection by rotating my camera. Then, I recovered linear homographies, warped the images accordingly, and blended the warped images together into a mosaic.
I resized the images to a reasonable size to avoid long runtimes in the following parts.
I took 3 groups of pictures:
(1) a building scene from my balcony
(2) the door of my apartment
(3) Minecraft views
Images of buildings
Images of the door
Images from Minecraft
I used the tool provided in project 3 to manually define correspondence points. The points are spread out across the whole image.
Given a set of correspondence points $(x, y) \leftrightarrow (x', y')$:

The homography equation is:

$$w \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

These can be rearranged to eliminate the unknown scale $w$. Each point pair (x, y) and (x', y') generates two equations.

For the first equation:

$$a x + b y + c - g x x' - h y x' = x'$$

For the second equation:

$$d x + e y + f - g x y' - h y y' = y'$$

Thus, I build up a linear system in the 8 unknowns of H, solve it with least squares via SVD, and assemble H.
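A minimal sketch of this setup, assuming the correspondence points are passed as N×2 arrays of (x, y) coordinates (np.linalg.lstsq solves the stacked system via SVD internally):

```python
import numpy as np

def computeH(pts1, pts2):
    """Fit a homography H mapping pts1 -> pts2 (both N x 2 arrays, N >= 4)."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        # two equations per correspondence, in the 8 unknowns a..h
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        b.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.append(yp)
    A, b = np.array(A), np.array(b)
    # least-squares solution (SVD-based); exact when N == 4
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1).reshape(3, 3)
```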
Before warping the images according to the homography matrix computed in part 2, I added an alpha channel to the unwarped image: I set it to 1 at the center of each (unwarped) image and made it fall off linearly until it reaches 0 at the edges. This alpha channel helps identify the pixels that are not "painted" after warping, and can be used as the weight when blending the images into a mosaic.
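A sketch of one way to build such an alpha channel, assuming a float RGB image in [0, 1]:

```python
import numpy as np

def add_alpha(im):
    """Append a linear-falloff alpha channel to an H x W x 3 float image."""
    h, w = im.shape[:2]
    # 1 at the image center along each axis, falling linearly to 0 at the edges
    ys = 1 - np.abs(np.linspace(-1, 1, h))
    xs = 1 - np.abs(np.linspace(-1, 1, w))
    alpha = np.minimum.outer(ys, xs)   # pyramid-shaped weight, 0 on every border
    return np.dstack([im, alpha])      # H x W x 4
```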
Then, I computed the output shape of the warped image. If it is not specified, I transform the corners of the original image using the homography, find the minimum and maximum coordinates in the x and y directions, and take the differences between them as the output window size.
Finally, I used inverse warping to paint the warped image. I take all pixels in the output window, add the offsets (min_x, min_y if offsets are not specified), and compute their corresponding locations in the original image. Then I use scipy.interpolate.griddata to interpolate all 4 channels (R, G, B, and alpha) and paint the pixels in a vectorized way. The function returns the warped image with 4 channels.
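A sketch of warpImage along these lines, interpolating channel by channel with griddata (linear griddata over a full image grid is slow on large images, so this is illustrative rather than optimized):

```python
import numpy as np
from scipy.interpolate import griddata

def warpImage(im, H, out_shape=None, offset=None):
    """Inverse-warp a 4-channel (RGB + alpha) image with homography H."""
    h, w = im.shape[:2]
    # forward-map the corners to find the bounding box of the warped image
    corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                        [w - 1, h - 1, 1], [0, h - 1, 1]], dtype=float).T
    proj = H @ corners
    proj = proj[:2] / proj[2]
    min_x, min_y = np.floor(proj.min(axis=1)).astype(int)
    max_x, max_y = np.ceil(proj.max(axis=1)).astype(int)
    if out_shape is None:
        out_shape = (max_y - min_y + 1, max_x - min_x + 1)
    if offset is None:
        offset = (min_x, min_y)

    out_h, out_w = out_shape
    xs, ys = np.meshgrid(np.arange(out_w) + offset[0], np.arange(out_h) + offset[1])
    # inverse-warp every output pixel back into the source image
    pts = np.linalg.inv(H) @ np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src_xy = (pts[:2] / pts[2]).T

    grid_y, grid_x = np.mgrid[0:h, 0:w]
    src_pts = np.stack([grid_x.ravel(), grid_y.ravel()], axis=1)
    out = np.zeros((out_h, out_w, im.shape[2]))
    for c in range(im.shape[2]):   # interpolate R, G, B, and alpha separately
        vals = griddata(src_pts, im[..., c].ravel(), src_xy,
                        method='linear', fill_value=0)
        out[..., c] = vals.reshape(out_h, out_w)
    return out
```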
I have two examples: my keyboard and my notebook. I used the correspondence point tool to get the 4 corner points of each object. Then I manually made an array of the 4 corners of a square, and stretched the square according to the height and width of the object's "rectangle". Finally, I used the computeH and warpImage functions defined in the previous parts to warp the object in the image into a rectangle.
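For example, reusing the sketches above (the filename, clicked corner coordinates, and target rectangle size below are placeholders):

```python
import cv2
import numpy as np

im = cv2.cvtColor(cv2.imread("notebook.jpg"), cv2.COLOR_BGR2RGB) / 255.0
im_rgba = add_alpha(im)   # image with the linear-falloff alpha channel

# clicked corners of the notebook, ordered top-left, top-right, bottom-right, bottom-left
src = np.array([[215, 180], [1130, 240], [1085, 905], [160, 860]], dtype=float)
# target rectangle, scaled to a rough guess of the notebook's aspect ratio
w, h = 800, 600
dst = np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype=float)

H = computeH(src, dst)
rectified = warpImage(im_rgba, H, out_shape=(h, w), offset=(0, 0))
```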
Original Notebook
Rectified Notebook
Original Keyboard
Rectified Keyboard
(1) Compute the output shape
Firstly, I need to compute the size of the mosaic. To do this, I iterate over all images, transform their corners, and then find the maximum and minimum x and y coordinates over all corners. If min_x or min_y goes below 0, a universal translation is applied to all images to avoid cutting off those parts. After that, the difference between the maximum and minimum, plus the potential universal translation, gives the size of the resulting mosaic.
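A sketch of this bounding-box computation (the helper name and return convention are mine):

```python
import numpy as np

def mosaic_shape(images, homographies):
    """Bounding box of all warped images: returns (height, width) and the (min_x, min_y) offset."""
    all_corners = []
    for im, H in zip(images, homographies):
        h, w = im.shape[:2]
        corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                            [w - 1, h - 1, 1], [0, h - 1, 1]], dtype=float).T
        proj = H @ corners
        all_corners.append(proj[:2] / proj[2])
    pts = np.hstack(all_corners)
    min_x, min_y = np.floor(pts.min(axis=1)).astype(int)
    max_x, max_y = np.ceil(pts.max(axis=1)).astype(int)
    # a negative min_x or min_y means a global translation by (-min_x, -min_y) is needed
    return (max_y - min_y + 1, max_x - min_x + 1), (min_x, min_y)
```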
(2) Warp each image to the target image
I used computeH to compute the homography for each image from its correspondence points. There is one reference image that needs no transform, whose homography is the identity matrix. Then, I add the universal translation computed before to each homography so that the correspondence points align in the final mosaic, and warp each image with warpImage using the computed mosaic shape.
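Continuing with hypothetical `images` and `homographies` lists, the universal translation can be folded into each homography as a matrix product before warping:

```python
import numpy as np

(mos_h, mos_w), (min_x, min_y) = mosaic_shape(images, homographies)
T = np.array([[1, 0, -min_x],
              [0, 1, -min_y],
              [0, 0, 1]], dtype=float)   # global translation into positive coordinates
warped = [warpImage(im, T @ H, out_shape=(mos_h, mos_w), offset=(0, 0))
          for im, H in zip(images, homographies)]
```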
(3) Blend the warped images together
I used the extra alpha channel returned by warpImage to blend the warped images. The warped alpha value serves as the weight for each pixel, and I record the sum of alpha values to compute a weighted average of the R, G, and B channels. Finally, I remove the alpha channel after blending.
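A sketch of this weighted average, assuming the warped images have already been padded to the common mosaic size:

```python
import numpy as np

def blend(warped_images):
    """Alpha-weighted average of same-sized 4-channel warped images; returns RGB."""
    acc = np.zeros(warped_images[0].shape[:2] + (3,))
    weight = np.zeros(warped_images[0].shape[:2])
    for im in warped_images:
        alpha = im[..., 3]
        acc += im[..., :3] * alpha[..., None]   # weight each pixel by its warped alpha
        weight += alpha
    weight[weight == 0] = 1                      # avoid division by zero where nothing was painted
    return acc / weight[..., None]               # alpha channel dropped in the result
```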
Original images of buildings
Warped Left
Middle image (Unwarped)
Warped Right
Original images of the door
Warped Up
Middle image (Unwarped)
Warped Down
Original images from Minecraft
Warped Left
Middle image (Unwarped)
Warped Right
In the second part of project 4, I find Harris corners, compute descriptors, and match features automatically to find the best homography. This enables automatic keypoint detection and matching, which can be used to produce mosaics.
I reused the resized images from part A. I read the images in with cv2, saved a gray-scale version for Harris interest point detection, and saved an RGB version for the final mosaic. Using the provided get_harris_corners function, I adjusted sigma and min_dis to 2 to avoid producing too many corner points (my images are large), and got the following results:
Image 1
Image 2
Image 1
Image 2
Image 1
Image 2
To filter the corners down to a smaller, spatially well-distributed subset, I used the adaptive non-maximal suppression (ANMS) algorithm described in the paper. Firstly, I precompute the pairwise distances between all corners so the function stays efficient. Then, I initialize each r[i] to infinity and iterate over the corners to find the minimum distance to a stronger corner, i.e., I update r[i] whenever there is a stronger corner closer than the current r[i]. Finally, I sort the points by r in descending order and keep the top 500.
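A sketch of this step, assuming the corners are given as an N×2 array of (x, y) positions with an N-vector of Harris strengths (the interface here is mine, not the provided starter code's):

```python
import numpy as np
from scipy.spatial.distance import cdist

def anms(coords, strengths, num_keep=500):
    """Adaptive non-maximal suppression: keep strong, spatially spread-out corners."""
    dists = cdist(coords, coords)             # pairwise distances, precomputed once
    r = np.full(len(coords), np.inf)
    for i in range(len(coords)):
        stronger = strengths > strengths[i]   # corners stronger than corner i
        if stronger.any():
            # suppression radius: distance to the nearest stronger corner
            r[i] = dists[i, stronger].min()
    keep = np.argsort(-r)[:num_keep]          # largest radii first
    return coords[keep]
```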
Here are my results:
Image 1
Image 2
Image 1
Image 2
Image 1
Image 2
For each selected keypoint, I extract a 40×40 patch around it and downsample it to 8×8. I then normalize the 8×8 subimage and flatten it into a descriptor.
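A sketch of the descriptor extraction; the zero-mean, unit-variance normalization and the cv2.INTER_AREA downsampling are my concrete choices for "normalize" and "resize":

```python
import numpy as np
import cv2

def extract_descriptors(gray, coords, patch=40, out=8):
    """8x8 normalized descriptors from 40x40 axis-aligned patches around keypoints."""
    half = patch // 2
    kept, descs = [], []
    for x, y in coords.astype(int):
        window = gray[y - half:y + half, x - half:x + half]
        if window.shape != (patch, patch):    # skip keypoints too close to the border
            continue
        small = cv2.resize(window.astype(np.float32), (out, out),
                           interpolation=cv2.INTER_AREA)        # area-averaged downsample
        small = (small - small.mean()) / (small.std() + 1e-8)   # zero mean, unit variance
        kept.append((x, y))
        descs.append(small.ravel())
    return np.array(kept), np.array(descs)
```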
To make the function more efficient, I precompute the matrix of pairwise Euclidean distances between descriptors. Then, for each descriptor of image 1, I find its two nearest neighbors among the descriptors of image 2 and compute Lowe's ratio, i.e., the 1-NN distance divided by the 2-NN distance. If the ratio is below the threshold, I add the 1-NN pair as a match.
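A rough sketch of the matching step (the threshold value here is illustrative):

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_features(desc1, desc2, ratio=0.6):
    """Match descriptors with Lowe's ratio test; returns index pairs (i, j)."""
    dists = cdist(desc1, desc2)                  # pairwise Euclidean distances
    matches = []
    for i, row in enumerate(dists):
        nn = np.argsort(row)[:2]                 # indices of the two nearest neighbors
        if row[nn[0]] / row[nn[1]] < ratio:      # 1-NN / 2-NN below threshold -> keep
            matches.append((i, nn[0]))
    return np.array(matches)
```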
After matching, there are still some outliers, so I implemented RANSAC to further filter out incorrect matches.
To do RANSAC, each time I randomly pick 4 matching pairs and compute a homography from them. Then, I transform all of keypoints1 with that H and compare the results with keypoints2: for each point, if the difference is small, it is an inlier under this homography. Over the iterations, I keep the largest set of inliers. Finally, I compute the best homography using that largest inlier set.
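A sketch of the RANSAC loop, reusing computeH from above (the iteration count and pixel threshold are illustrative values):

```python
import numpy as np

def ransac_homography(pts1, pts2, n_iter=2000, eps=2.0):
    """4-point RANSAC: returns H fitted to the largest inlier set, plus the inlier indices.

    pts1, pts2: N x 2 arrays of matched keypoints (pts1[i] <-> pts2[i]).
    """
    best_inliers = np.array([], dtype=int)
    pts1_h = np.hstack([pts1, np.ones((len(pts1), 1))])   # homogeneous coordinates
    for _ in range(n_iter):
        sample = np.random.choice(len(pts1), 4, replace=False)
        H = computeH(pts1[sample], pts2[sample])          # exact fit to 4 pairs
        proj = H @ pts1_h.T
        proj = (proj[:2] / proj[2]).T
        err = np.linalg.norm(proj - pts2, axis=1)         # reprojection error per match
        inliers = np.where(err < eps)[0]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # refit on the largest inlier set
    return computeH(pts1[best_inliers], pts2[best_inliers]), best_inliers
```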
With the automatically computed homography, I can warp images and build mosaics just like in part A. Here are my results:
Auto
Manual
Auto
Manual
Auto
Manual
The coolest thing I learned from this project is that sometimes we can think about things in reverse. When warping images, inverse warping is more effective because we get a value for every pixel we need; when trying to filter outliers in the automatic matches, we loop many times and keep the largest set of inliers. It's very cool to look at the same problem from different directions.