CS180 Project 4

[Auto]Stitching Photo Mosaics

by Ruiqi Wang

Part A: Image Warping and Mosaicing

Overview

In the first part of Project 4, I took pictures with a fixed center of projection, rotating the camera between shots. I then recovered the homographies relating the images, warped the images accordingly, and blended the warped images together into a mosaic.

Part 1: Shoot the Pictures

Preprocessing

I resized the images to a manageable size to avoid long runtimes in the following parts.

I took 3 groups of pictures:
(1) buildings scene from my balcony
(2) the door of my apartment
(3) Minecraft views

Images of buildings

Images of the door

Images from Minecraft

Part 2: Recover Homographies

Defining Correspondences

I used the tool provided in Project 3 to manually define correspondence points, spreading the points across the whole image.

Compute the Homography matrix

Given a set of correspondence points:

The homography equation is:

w [x', y', 1]^T = H [x, y, 1]^T

where w is an arbitrary scale factor (the equality holds only up to scale).

These can be rearranged to:

h1·x + h2·y + h3 - x'(h7·x + h8·y + 1) = 0
h4·x + h5·y + h6 - y'(h7·x + h8·y + 1) = 0

Each point pair (x, y) and (x', y') generates two equations:

For the first equation:

[x, y, 1, 0, 0, 0, -x'x, -x'y] · [h1, h2, h3, h4, h5, h6, h7, h8] = x'

For the second equation:

[0, 0, 0, x, y, 1, -y'x, -y'y] · [h1, h2, h3, h4, h5, h6, h7, h8] = y'

Stacking these two equations for every correspondence yields an overdetermined linear system in the 8 unknowns of H (with h9 fixed to 1), which I solved in the least-squares sense using SVD to obtain H.
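The system setup can be sketched as follows (the function name computeH matches later sections; np.linalg.lstsq stands in here for the SVD solve, minimizing the same least-squares objective):

```python
import numpy as np

def computeH(pts1, pts2):
    """Least-squares homography mapping pts1 -> pts2.

    pts1, pts2: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        # h1*x + h2*y + h3 - x'*(h7*x + h8*y) = x'
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y]); b.append(xp)
        # h4*x + h5*y + h6 - y'*(h7*x + h8*y) = y'
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y]); b.append(yp)
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)   # fix h9 = 1
```

With more than 4 correspondences the system is overdetermined, and the least-squares solution averages out small clicking errors.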

Part 3: Warp the Images

Approach

Before warping the images according to the homography matrix computed in part 2, I added an alpha channel to the unwarped image: I set it to 1 at the center of each (unwarped) image and made it fall off linearly until it hits 0 at the edges. This alpha channel helps to identify the pixels that are not "painted" after warping, and can be used as a weight when blending the images into a mosaic.
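The linear falloff can be sketched as follows (a hypothetical helper; the exact ramp shape is my assumption):

```python
import numpy as np

def add_alpha(im):
    """Append an alpha channel that is 1 at the image center and
    falls off linearly to 0 at the edges."""
    h, w = im.shape[:2]
    # Normalized distance from the center along each axis, mapped so that
    # the center gets 1 and the borders get 0
    ys = 1 - np.abs(np.linspace(-1, 1, h))
    xs = 1 - np.abs(np.linspace(-1, 1, w))
    alpha = np.minimum.outer(ys, xs)          # (h, w): 1 at center, 0 at border
    return np.dstack([im, alpha])
```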

Then, I computed the output shape of the warped image. If it is not specified, I transform the corners of the original image using the homography, find the minimum and maximum coordinates in the x and y directions, and take their difference as the output window size.

Finally, I used inverse warping to paint the warped image. I take all pixels in the output window, add the offsets (min_x and min_y if offsets are not specified), and compute their corresponding locations in the original image. Then I use scipy.interpolate.griddata to interpolate all 4 channels (R, G, B, and alpha) and paint the pixels in a vectorized way. The function returns the warped image with 4 channels.
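The corner transform and inverse warp together can be sketched like this (a simplified stand-in: scipy.ndimage.map_coordinates replaces the griddata call, which is equivalent bilinear sampling when the source pixels lie on a regular grid):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warpImage(im, H, out_shape=None, offset=None):
    """Inverse-warp an image (any number of channels) by homography H."""
    h, w = im.shape[:2]
    # Forward-map the corners to find the output bounding box
    corners = np.array([[0, 0, 1], [w, 0, 1], [0, h, 1], [w, h, 1]], float).T
    mapped = H @ corners
    mapped = mapped[:2] / mapped[2]
    if offset is None:
        offset = mapped.min(axis=1)            # (min_x, min_y)
    if out_shape is None:
        size = np.ceil(mapped.max(axis=1) - offset).astype(int)
        out_shape = (size[1], size[0])         # (rows, cols)
    # Inverse-map every output pixel back into the source image
    yy, xx = np.mgrid[0:out_shape[0], 0:out_shape[1]]
    pts = np.stack([xx.ravel() + offset[0], yy.ravel() + offset[1],
                    np.ones(xx.size)])
    src = np.linalg.inv(H) @ pts
    src = src[:2] / src[2]
    out = np.zeros((*out_shape, im.shape[2]))
    for c in range(im.shape[2]):
        # Bilinearly sample each channel; points outside the source stay 0
        out[..., c] = map_coordinates(im[..., c], [src[1], src[0]],
                                      order=1, cval=0).reshape(out_shape)
    return out
```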

Part 4: Image Rectification

Approach

I used two examples, my keyboard and my notebook. I used the correspondence point tool to get the 4 corner points of each object. Then I manually made an array of the 4 corners of a square and scaled it according to the height and width of the object's "rectangle". Finally, I used the computeH and warpImage functions defined in the previous parts to warp the object in the image into a rectangle.
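Building the destination rectangle is simple; a hypothetical helper (the corner ordering must match the order in which the source corners were clicked):

```python
import numpy as np

def rect_corners(width, height):
    """Destination corners of an axis-aligned width x height rectangle,
    ordered top-left, top-right, bottom-right, bottom-left."""
    return np.array([[0, 0], [width, 0], [width, height], [0, height]], float)

# Hypothetical usage with the earlier functions:
#   H = computeH(clicked_corners, rect_corners(w, h))
#   rectified = warpImage(im, H)
```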

Result:

Original Notebook

Rectified Notebook

Original Keyboard

Rectified Keyboard

Part 5: Blend the Images into a Mosaic

Approach

(1) Compute the output shape

First, I need to compute the size of the mosaic. To do this, I iterate over all images, transform their corners, and find the maximum and minimum x and y coordinates over all corners. If min_x or min_y falls below 0, a universal translation must be applied to all images to avoid cutting off those parts. The difference between the maximum and minimum, combined with this universal translation, gives the size of the resulting mosaic.
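This bookkeeping can be sketched as follows (a hypothetical helper with names of my own choosing):

```python
import numpy as np

def mosaic_extent(shapes, Hs):
    """Bounding box of all warped images.

    shapes: list of (h, w); Hs: matching list of 3x3 homographies into the
    reference frame. Returns ((rows, cols), T) where the translation T
    shifts every image so the mosaic starts at (0, 0).
    """
    pts = []
    for (h, w), H in zip(shapes, Hs):
        corners = np.array([[0, 0, 1], [w, 0, 1], [0, h, 1], [w, h, 1]], float).T
        m = H @ corners
        pts.append(m[:2] / m[2])
    pts = np.hstack(pts)
    mins, maxs = pts.min(axis=1), pts.max(axis=1)
    # Translate so negative coordinates are not cut off
    T = np.array([[1, 0, -mins[0]], [0, 1, -mins[1]], [0, 0, 1]])
    size = np.ceil(maxs - mins).astype(int)
    return (size[1], size[0]), T
```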

(2) Warp each image to the target image

I used computeH to compute the homography for each image from its correspondence points. One image needs no transform; its homography is the identity matrix. Then I compose the universal translation computed above with each homography so that the correspondence points align in the final mosaic, and warp each image using warpImage with the computed mosaic shape.

(3) Blend the warped images together

I used the extra alpha channel returned by warpImage to blend the warped images: each pixel's warped alpha value serves as its weight, and I record the sum of alpha values to compute a weighted average over the R, G, and B channels. Finally, I remove the alpha channel after blending.
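The weighted average can be sketched as follows (assuming all warped RGBA images already share the same mosaic canvas):

```python
import numpy as np

def blend(warped):
    """Alpha-weighted average of RGBA images painted on one shared canvas."""
    acc = np.zeros(warped[0].shape[:2] + (3,))
    wsum = np.zeros(warped[0].shape[:2])
    for im in warped:
        acc += im[..., :3] * im[..., 3:4]    # RGB weighted by alpha
        wsum += im[..., 3]                   # running sum of weights
    # Avoid division by zero where no image contributed
    return acc / np.maximum(wsum, 1e-8)[..., None]
```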

Result for buildings

Original images of buildings

Warped Left

Middle image (Unwarped)

Warped Right

Result for the door (Vertical mosaic)

Original images of the door

Warped Up

Middle image (Unwarped)

Warped Down

Result for Minecraft Images

Original images from Minecraft

Warped Left

Middle image (Unwarped)

Warped Right

Part B: FEATURE MATCHING for AUTOSTITCHING

Overview

In the second part of Project 4, I find Harris corners, compute descriptors, and match features automatically to find the best homography. This enables automatic keypoint detection and matching, which can be used to produce mosaics.

Step 1: Harris Interest Point Detector

Approach

I reused the resized images from Part A. I read the images in with cv2, saving a grayscale version for Harris interest point detection and an RGB version for the final mosaic. Using the provided get_harris_corners function, I set sigma and min_dis to 2 to avoid getting too many corner points (my images are large), and got the following results:

Image 1

Image 2

Image 1

Image 2

Image 1

Image 2
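Since get_harris_corners was provided by the course, the following is only a rough from-scratch sketch of what it computes (Harris response plus local maxima; the function names, the det/trace corner measure, and the threshold are my own assumptions):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel, maximum_filter

def harris_response(gray, sigma=2):
    """Harris corner strength at every pixel of a float grayscale image."""
    Ix = sobel(gray, axis=1)
    Iy = sobel(gray, axis=0)
    # Entries of the second-moment matrix, smoothed with a Gaussian window
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det / (trace + 1e-8)              # "det/trace" corner measure

def harris_corners(gray, sigma=2, min_dist=2, thresh=1e-4):
    h = harris_response(gray, sigma)
    # Keep pixels that are local maxima within a (2*min_dist+1) window
    peaks = (h == maximum_filter(h, size=2 * min_dist + 1)) & (h > thresh)
    return h, np.argwhere(peaks)             # (N, 2) rows of (y, x)
```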

Step 2: Adaptive Non-Maximal Suppression

Approach

To filter out points that are not strong corners, I used the adaptive non-maximal suppression (ANMS) algorithm described in the paper. First, I precompute the pairwise distances between all corners to make the function efficient. Then, I initialize each r[i] to infinity and iterate over each corner to find its minimum distance to a stronger corner; that is, I update r[i] whenever a stronger corner is closer than the current r[i]. Finally, I sort the points by r in descending order and keep the top 500.
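The steps above can be sketched as follows (a hypothetical helper; this simple version compares raw strengths directly, without the robustness constant from the paper):

```python
import numpy as np

def anms(coords, strengths, n_keep=500):
    """Adaptive non-maximal suppression.

    coords: (N, 2) corner coordinates; strengths: (N,) Harris responses.
    Keeps the n_keep corners with the largest suppression radius r, where
    r[i] is the distance from corner i to its nearest stronger corner.
    """
    # Precompute all pairwise distances once
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    r = np.full(len(coords), np.inf)
    for i in range(len(coords)):
        stronger = strengths > strengths[i]
        if stronger.any():
            r[i] = d[i, stronger].min()
    keep = np.argsort(-r)[:n_keep]           # descending radius
    return coords[keep]
```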

Here's my results:

Image 1

Image 2

Image 1

Image 2

Image 1

Image 2

Step 3: Feature Descriptor Extraction and Matching

Feature Descriptor Extraction

For each selected keypoint, I extract a 40×40 patch around it and downsample it to 8×8. I then normalize the 8×8 subimage and flatten it into a descriptor.
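A sketch of the extraction (my assumptions: block averaging stands in for an explicit resize call, normalization is bias/gain — subtract the mean, divide by the standard deviation — and border keypoints are simply skipped):

```python
import numpy as np

def extract_descriptors(gray, coords, patch=40, out=8):
    """40x40 patches downsampled to 8x8, normalized, and flattened."""
    half, step = patch // 2, patch // out
    descs = []
    for y, x in coords:
        if (y < half or x < half or
                y + half > gray.shape[0] or x + half > gray.shape[1]):
            continue                          # skip keypoints too close to the border
        p = gray[y - half:y + half, x - half:x + half]
        # 5x5 block average: (40, 40) -> (8, 8)
        small = p.reshape(out, step, out, step).mean(axis=(1, 3))
        small = (small - small.mean()) / (small.std() + 1e-8)
        descs.append(small.ravel())
    return np.array(descs)
```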

Feature Descriptor Matching

To make the function more efficient, I precompute the matrix of pairwise Euclidean distances between descriptors. Then, for each descriptor of image 1, I find its two nearest neighbors among the descriptors of image 2 and compute Lowe's ratio: the distance to the 1-NN divided by the distance to the 2-NN. If the ratio is below the threshold, the descriptor and its 1-NN are added as a match.
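A sketch of the matcher (the threshold value is an assumption; a lower threshold keeps only the most unambiguous matches):

```python
import numpy as np

def match_descriptors(d1, d2, thresh=0.6):
    """Lowe's-ratio matching between two sets of descriptors."""
    # Pairwise Euclidean distance matrix, computed once up front
    dists = np.linalg.norm(d1[:, None, :] - d2[None, :, :], axis=-1)
    matches = []
    for i, row in enumerate(dists):
        nn = np.argsort(row)[:2]              # two nearest neighbors in d2
        if row[nn[0]] / (row[nn[1]] + 1e-8) < thresh:
            matches.append((i, nn[0]))        # keep only unambiguous matches
    return matches
```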

Results:

Step 4: RANSAC

Approach

After matching, there are still some outliers. So I implemented RANSAC to further filter out incorrect matchings.

In each RANSAC iteration, I randomly pick 4 matching pairs and compute a homography from them. Then I transform all of keypoints1 with that H and compare the results with keypoints2: for each point, if the difference is small, it is an inlier under this homography. Across iterations, I keep the largest set of inliers. Finally, I compute the best homography from that largest inlier set.

Results:

Step 5: Automatic Mosaic

With the automatically computed homography, I can warp images and make mosaic just like part A. Here are my results:

Auto

Manual

Auto

Manual

Auto

Manual

What have I learned?

The coolest thing I learned from the project is that sometimes we can think about things in reverse. When warping images, inverse warping is more efficient because we get exactly the pixels we need; when filtering outliers from the automatic matches, instead of hunting for outliers we loop to keep the most inliers. It is very cool to look at the same problem from different directions.