Applications of Seam Carving

CS 585 Project Report
Aaron Jacob Varghese
Rahul Ramachandran
Shivam Satwah
Raj Vipani


Background

Seam carving (or liquid rescaling) is an algorithm for image retargeting: the problem of displaying images without distortion on media of various sizes (cell phones, projection screens). Document standards like HTML already support dynamic changes in page layout and text, but not in images.


Problem Definition

We implement the seam carving algorithm and use it to perform the following image manipulation tasks: image resizing (reduction and enlargement), object deletion, and video object deletion.


Method and Implementation

We divided the implementation of this project into ordered steps; the method for each is described below:

Energy Function

The algorithm uses an energy function to identify optimal paths of low-importance pixels, which it can remove to resize the image. A pixel's energy depends on how much the color changes among its surrounding pixels. The traditional energy of the pixel at position (x, y) is simply the difference between the pixel values at (x-1, y) and (x+1, y).
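The traditional energy described above can be sketched as follows. This is a minimal illustration, not our project code: it assumes a grayscale image stored as a 2-D NumPy array, takes the absolute value of the difference, and replicates the border pixels (all assumptions). The name `simple_energy` is hypothetical.

```python
import numpy as np

def simple_energy(gray):
    """Energy at (x, y) = |I(x+1, y) - I(x-1, y)|.

    Assumes a 2-D grayscale array; border columns are replicated so the
    output has the same shape as the input.
    """
    gray = np.asarray(gray, dtype=np.float64)
    # Pad one column on each side by repeating the edge pixels.
    padded = np.pad(gray, ((0, 0), (1, 1)), mode="edge")
    # Right neighbor minus left neighbor for every pixel.
    return np.abs(padded[:, 2:] - padded[:, :-2])
```

For a row with a single step edge, such as `[0, 0, 10, 10]`, only the two pixels next to the edge receive non-zero energy.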

We used the forward energy algorithm, which predicts which pixels will become adjacent after a seam is removed and uses that to suggest the best seam to remove. We have to consider which pixels are brought together by the removal of a particular pixel. This depends on whether the current pixel is connected to the seam from the top-left, top, or top-right.
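One way to write the forward energy accumulation is sketched below, following the cost terms from the video retargeting paper listed in the bibliography. It assumes a 2-D grayscale array and replicated borders; the function names are hypothetical, and this is an illustration under those assumptions, not necessarily the exact code we ran.

```python
import numpy as np

def forward_energy_costs(gray):
    """Costs of the pixel pairs that become adjacent when the seam enters
    pixel (i, j) from the top-left, top, or top-right."""
    g = np.asarray(gray, dtype=np.float64)
    padded = np.pad(g, 1, mode="edge")   # assumption: replicate borders
    left = padded[1:-1, :-2]             # I(i, j-1)
    right = padded[1:-1, 2:]             # I(i, j+1)
    up = padded[:-2, 1:-1]               # I(i-1, j)
    c_up = np.abs(right - left)
    c_left = c_up + np.abs(up - left)
    c_right = c_up + np.abs(up - right)
    return c_left, c_up, c_right

def forward_energy_map(gray):
    """Accumulate the minimum forward cost row by row (vertical seams)."""
    c_left, c_up, c_right = forward_energy_costs(gray)
    h, w = c_up.shape
    M = np.zeros((h, w))
    M[0] = c_up[0]
    for i in range(1, h):
        prev = M[i - 1]
        # Shift the previous row so each column sees its three parents;
        # out-of-range parents get infinite cost.
        from_left = np.concatenate(([np.inf], prev[:-1])) + c_left[i]
        from_up = prev + c_up[i]
        from_right = np.concatenate((prev[1:], [np.inf])) + c_right[i]
        M[i] = np.minimum(np.minimum(from_left, from_up), from_right)
    return M
```

The last row of the returned map holds the total cost of the best seam ending at each column, so the seam to remove ends at its minimum.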


Seam Carving Algorithm

Once the energy matrix has been calculated for the entire image, our goal in each iteration is to find the lowest-energy seam spanning the entire height or width of the image. A greedy approach might seem sensible at first, but it can get stuck in a high-energy region of the image in later stages, as shown in the image below.

This problem can be solved efficiently with a dynamic programming approach, with the recurrence relation shown below.

At each pixel, we look at the minimum energy seams encountered so far, ending at the three neighbor pixels in the row just before the target pixel. We then associate the target pixel with the least energy seam among these. At the end of the algorithm, the minimum energy seam in the entire image is represented by the least energy pixel in the last row. The rest of the seam can be constructed using the stored pointers.
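The dynamic programming step can be sketched as below for a vertical seam. The recurrence used is M(i, j) = e(i, j) + min(M(i-1, j-1), M(i-1, j), M(i-1, j+1)), with a back-pointer stored per pixel. The function names and the explicit per-pixel loop are illustrative assumptions; a production version would vectorize the inner loop.

```python
import numpy as np

def find_vertical_seam(energy):
    """Return, for each row, the column index of the minimum-energy seam."""
    energy = np.asarray(energy, dtype=np.float64)
    h, w = energy.shape
    cost = energy.copy()                      # cumulative seam cost M
    back = np.zeros((h, w), dtype=np.int64)   # column of the chosen parent
    for i in range(1, h):
        for j in range(w):
            lo = max(j - 1, 0)
            hi = min(j + 1, w - 1)
            # Cheapest of the (up to) three neighbors in the previous row.
            k = lo + int(np.argmin(cost[i - 1, lo:hi + 1]))
            back[i, j] = k
            cost[i, j] += cost[i - 1, k]
    # Start from the cheapest pixel in the last row and follow the pointers.
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(h - 1, 0, -1):
        seam[i - 1] = back[i, seam[i]]
    return seam

def remove_vertical_seam(image, seam):
    """Delete one pixel per row; works for H x W and H x W x C arrays."""
    image = np.asarray(image)
    h, w = image.shape[:2]
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    return image[keep].reshape((h, w - 1) + image.shape[2:])
```

Reducing the width by n pixels then amounts to recomputing the energy, finding a seam, and removing it, n times. Horizontal seams can be handled by transposing the image first.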

Image Enlargement

We can apply the same technique of finding low-energy seams to expand the image as well. It relies on the same principle: the human eye is unlikely to notice low-energy content in an image, so adding a small number of such seams is imperceptible.

  1. Run the original seam carving algorithm on a duplicate of the image to find n seams, and store them.
  2. Iterate through the stored seams, inserting a new pixel at the location of each seam element.
  3. Set each new pixel's value to the average of its left and right neighbors.
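Steps 2 and 3 can be sketched for a single seam as follows. As an assumption of this sketch, the new pixel is placed immediately to the right of the seam pixel, so its neighbors are the seam pixel and the pixel that used to follow it (clamped at the right border). The name `insert_vertical_seam` is hypothetical, and the index shifting needed when inserting several stored seams in sequence is omitted.

```python
import numpy as np

def insert_vertical_seam(image, seam):
    """Widen the image by one column along the given vertical seam.

    seam[i] is the seam's column in row i. Works for H x W and
    H x W x C arrays; the result is returned as float64.
    """
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape[:2]
    out = np.empty((h, w + 1) + image.shape[2:], dtype=np.float64)
    for i in range(h):
        j = int(seam[i])
        r = min(j + 1, w - 1)   # right neighbor, clamped at the border
        new_pixel = (image[i, j] + image[i, r]) / 2.0
        out[i, :j + 1] = image[i, :j + 1]   # pixels up to the seam
        out[i, j + 1] = new_pixel           # averaged pixel
        out[i, j + 2:] = image[i, j + 1:]   # remaining pixels, shifted right
    return out
```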

Object Deletion

For object deletion, we first used a Mask R-CNN model pre-trained on the COCO dataset. The model takes an image as input, produces regions of interest, classifies those regions and finally generates masks for them.

Using this, we can generate masks for target objects in the given image. We superimpose the mask on top of the calculated energy matrix, giving negative weights to the area corresponding to the objects in the image. By doing this, we force the seam to pass through the object we would like to remove.
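Superimposing the mask on the energy matrix can be sketched as below. The weight of -1000 is an arbitrary illustrative value, not the one from our experiments, and `apply_removal_mask` is a hypothetical name; the mask is assumed to be a boolean array with the same height and width as the energy matrix.

```python
import numpy as np

def apply_removal_mask(energy, mask, weight=-1000.0):
    """Return a copy of the energy matrix with masked pixels set to a
    large negative weight so that minimum-energy seams pass through them."""
    energy = np.asarray(energy, dtype=np.float64).copy()
    energy[np.asarray(mask, dtype=bool)] = weight
    return energy
```

Seams are then removed repeatedly, deleting the same seam from the mask each time, until no masked pixels remain.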

Video Object Deletion

This was accomplished by automating mask generation with Mask R-CNN. Given our video and the target class of the object to be removed, the system automatically runs the network on each frame and generates a mask, which is then applied to the energy matrix and used to delete the object.
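The per-frame automation can be sketched as a simple loop. Both callables are hypothetical placeholders: `predict_mask` stands for running Mask R-CNN on a frame and keeping the masks of the target class, and `remove_masked` stands for the seam-based object deletion described above. Reading and writing the video file is left out.

```python
def delete_object_from_video(frames, predict_mask, remove_masked):
    """Run mask prediction and object removal independently on each frame.

    frames        : iterable of frames (for example, NumPy arrays)
    predict_mask  : hypothetical callable, frame -> mask for the target class
    remove_masked : hypothetical callable, (frame, mask) -> processed frame
    """
    results = []
    for frame in frames:
        mask = predict_mask(frame)                  # one network pass per frame
        results.append(remove_masked(frame, mask))  # seam-based deletion
    return results
```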


Experiments

We experimented to find the best possible energy function, since we were also going to use the algorithm for a task as complex as video object deletion. Of all the energy functions we tried, the forward energy function performed best. The others were the traditional energy function and a convolution filter energy function.

The traditional energy function produced many visual artifacts, especially between a pixel and its top layer where the color changes sharply. Horizontal distortions were also fairly visible. This function had the fastest processing time of the three.

The convolution filter energy function, as implemented in our code, convolves a 3 x 3 filter over the whole image; the resulting values form the energy matrix. The filter is quite similar to those used for edge detection, since both rest on the same idea of detecting large changes. This filter gave better results: the horizontal distortions were no longer visible, although some visual artifacts remained, especially with the top layer. Computing the energy matrix took longer than with the traditional energy function.
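A 3 x 3 filter energy of this kind can be sketched as below. The Sobel kernels are an assumption standing in for the filter we used, which is not reproduced here. The image is assumed to be a 2-D grayscale array with replicated borders, and the function names are hypothetical.

```python
import numpy as np

# Assumption: Sobel kernels as a stand-in edge-detection filter.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def filter3x3(gray, kernel):
    """Slide a 3 x 3 kernel over the image (cross-correlation). For the
    Sobel kernels this differs from true convolution only in sign."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")
    out = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            # Add the contribution of one kernel tap to every pixel.
            out += kernel[di, dj] * padded[di:di + h, dj:dj + w]
    return out

def convolution_energy(gray):
    """Energy = |horizontal response| + |vertical response|."""
    return np.abs(filter3x3(gray, SOBEL_X)) + np.abs(filter3x3(gray, SOBEL_Y))
```

Each pixel needs nine multiply-adds per kernel instead of one subtraction, which is consistent with the longer running time noted above.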


Results

We ran our model on multiple images, resizing them and deleting objects. Some of our results are listed below:

Resizing

Trials (source image and result image for each):
  1. Reduce width
  2. Reduce height
  3. Increase width

Object Deletion

Trials (source image and result image for each):
  1. Remove person
  2. Remove person, handbag
  3. Remove person, motorcycle
  4. Remove car

Video Object Deletion

We removed all objects shaped like human beings from the Computer Vision Homework 3 video using Mask R-CNN, and then resized the frames by inserting low-energy seams.


Discussion


Conclusions

We have built an automated system for content-aware image resizing that can also function as an object removal mechanism and content enhancer. With further improvement in the mask that we create, it would be possible to enhance/remove content with good precision. This project helped us appreciate the potential of computer vision systems.


Credits and Bibliography

  1. http://www.faculty.idc.ac.il/arik/SCWeb/imret/imret.pdf
  2. http://www.faculty.idc.ac.il/arik/SCWeb/vidret/vidret.pdf