Computational Photography Research Essay

Marc Levoy
Summary of current research

Last updated June 2009
(As of July 2014 I am Emeritus and no longer taking new PhD students.)


One of the great pleasures of being at Stanford is interacting with its talented and enthusiastic faculty, staff, and students. Described below are some of the research projects we are working on together. Aside from publishing technical papers about these projects, we often package and release our software and our data.

Measurement technologies for graphics. I like building things. Over the past 15 years, my students and I have built devices for measuring 3D shape, light fields, and reflectance functions. These particular devices all included a mechanical component. However, the trend is towards optoelectronic solutions. With this in mind, we've developed a real-time range scanner based on video projectors, a handheld camera for capturing instantaneous light fields, a multi-camera array for acquiring video light fields, and a light field microscope for capturing light fields of tiny biological specimens. The latter two projects were collaborations with Pat Hanrahan, Mark Horowitz, and their students.

Light fields and related ideas. A light field is a 2D array of (2D) images, each taken from a different viewpoint. Here's a gentle introduction to the idea. The resulting 4D array completely characterizes the passage of light through unoccluded space. By assembling pixels from several images, new views can be constructed from observer positions not present in the original array. This idea was described by Pat Hanrahan and me in a 1996 Siggraph paper. The Stanford multi-camera array (mentioned above) has given us a unique device for exploring applications of light fields, including high-speed video using a dense camera array. Starting with the experiments leading up to our 1996 paper (see the historical note on this web page), we began playing with the optical effects of shearing the 4D light field array. This led us to discover two (digital) optical effects: synthetic aperture photography (SAP), in which one combines closely spaced views of an object, thereby letting us see through partially occluding objects like foliage, and synthetic aperture illumination (SAI), in which one physically projects multiple images onto a common plane in space, thereby creating a synthetic image with an extremely shallow depth of field. Here's a paper about calibrating our camera array for synthetic aperture photography, here are some examples of seeing through people, and here's an array of miniature video projectors we once built to further explore synthetic aperture illumination. Using these two optical effects, we have implemented a discrete approximation of confocal imaging, a technique borrowed from microscopy. This approximation lets us selectively image or illuminate one object in a complex scene, and it lets us see further through scattering environments (such as turbid water) than is otherwise possible. For a while we also explored techniques called dual photography and symmetric photography, which use Helmholtz reciprocity to swap the cameras and light sources in a scene. These techniques allow us to produce images from viewpoints in the scene where a camera never stood. As you can see, we enjoy playing with the theoretical aspects of light fields. Our latest forays in this vein are a paper on general linear cameras with finite aperture and a paper on Wigner distributions (familiar to the radar community) and how they relate to light fields. The latter effort won Best Paper at the First International Conference on Computational Photography (ICCP 2009). Finally, we worked for a while on multi-perspective imaging, with an eye towards visualizing urban landscapes, and animating light fields by interactively deforming their defining ray space. The CityBlock project has since been taken over by Google, where it is better known as StreetView.
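
As a rough illustration (not code from any of the projects above), the shift-and-add idea behind synthetic aperture photography can be sketched as follows, assuming a planar camera array with known positions and a single-parameter parallax model; the function name and parameters are hypothetical:

```python
import numpy as np

def synthetic_aperture_refocus(images, offsets, depth_shift):
    """Illustrative shift-and-add refocusing over a planar camera array.

    images      : list of HxWx3 float arrays, one per camera.
    offsets     : list of (dx, dy) camera positions on the array plane,
                  in baseline units relative to a reference camera.
    depth_shift : pixels of parallax per unit baseline; choosing this value
                  selects which plane in the scene comes into focus.
    """
    acc = np.zeros_like(images[0], dtype=np.float64)
    for img, (dx, dy) in zip(images, offsets):
        # Translate each view so points on the chosen focal plane align;
        # objects off that plane (e.g. occluding foliage) smear out and fade.
        shift_x = int(round(dx * depth_shift))
        shift_y = int(round(dy * depth_shift))
        # np.roll wraps around the border; a real implementation would crop or pad.
        acc += np.roll(img, (shift_y, shift_x), axis=(0, 1))
    return acc / len(images)
```

Sweeping depth_shift over a range of values focuses the synthetic aperture on nearer or farther planes, which is what produces the see-through-foliage effect described above.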

Digital refocusing and its applications. Synthetic aperture photography (SAP) and synthetic aperture illumination (SAI), introduced in the foregoing paragraph (and in concurrent work by other groups in the early 2000s), are now more commonly known as digital refocusing (of views and light). In this guise, they have become hot topics in the computational photography community. In our lab, one of the outcomes of this research was the handheld light field camera (mentioned above) built by PhD student Ren Ng, which gives a photographer the ability to refocus an ordinary snapshot after it has been taken. This dramatic effect must be seen to be believed; check out the video on this web page, or the web page of Ren's startup company, Refocus Imaging. By the way, Ren's PhD thesis won the 2006 ACM Doctoral Dissertation Award. Recently, we adapted this idea to microscopy by inserting a microlens array into a conventional microscope. The resulting light field microscope (LFM) can produce perspective flybys, focal stacks, and volume datasets from a single photograph, and therefore at a single instant in time. By inserting a similar microlens array into the illumination path of the microscope, one creates a light field illumination (LFI) system, which can be used to reproduce exotic microscope illumination modalities or to create focused spots or shapes of light anywhere in 3D inside a specimen. Combining the LFM and LFI, we can measure and correct for the aberrations that arise when imaging through optically uncooperative specimens (as many of them are).
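
As a similarly rough illustration (not code from Ren's camera), here is how the raw sensor image behind a microlens array can be rearranged into sub-aperture views, assuming an idealized sensor in which each microlens covers an exact block of pixels; real cameras require calibration, resampling, and vignetting correction that are omitted here. The resulting views can then be refocused with the same shift-and-add scheme sketched above.

```python
import numpy as np

def decode_subapertures(raw, n_u, n_v):
    """Rearrange an idealized plenoptic raw image into sub-aperture views.

    raw : (H*n_v, W*n_u) array; each microlens covers an n_v x n_u pixel block.
    Returns an array of shape (n_v, n_u, H, W): one low-resolution view of the
    scene per position on the main-lens aperture.
    """
    H = raw.shape[0] // n_v
    W = raw.shape[1] // n_u
    blocks = raw.reshape(H, n_v, W, n_u)   # group pixels by microlens
    return blocks.transpose(1, 3, 0, 2)    # reorder to (v, u, h, w)
```

Each sub-aperture view sees the scene through a different part of the main-lens aperture, which is why shifting the views relative to one another and summing moves the plane of focus after the picture has been taken.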

Computational photography. Computational photography refers broadly to sensing strategies and algorithmic techniques that enhance or extend the capabilities of digital photography. The output of these techniques is an ordinary photograph, but one that could not have been taken by a traditional camera. Our light field camera is a kind of computational photography, but so is high dynamic range imaging, flash/no-flash imaging, coded aperture and coded exposure imaging, photography under structured illumination, multi-perspective and panoramic stitching, digital photomontage, and all-focus imaging. In our lab, aside from our work on light field imaging, we've worked on reducing veiling glare in cameras by inserting a mask between the camera and the scene, reducing the deer-in-the-headlights look of flash pictures by shaping the flash spatially using a video projector, and playing with high-dimensional bilateral filtering using Gaussian KD-Trees. Finally, in response to a growing feeling that progress in some aspects of computational photography has been hampered by the lack of a portable, programmable camera platform with enough image quality and computing power to be used for everyday photography, we have been looking into computational photography on cell phones. At the same time, we are building an open-source camera platform that runs Linux, is programmable and connected to the Internet, and accommodates SLR lenses and SLR-quality sensors. Our goal is to distribute this platform, which we call Frankencamera, at minimal cost to computational photography researchers and courses worldwide. We've lumped these last two projects under the banner Camera 2.0. By the way, the phrase "computational photography" has been re-invented several times over the last 20 years. Its current incarnation arose from a course by that name I first offered at Stanford in 2004 (here's the most recent version of that course) and from a symposium I co-organized with Fredo Durand and Rick Szeliski at MIT in 2005.
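
As one more illustrative sketch (not code from the Gaussian KD-Trees paper), a brute-force bilateral filter shows the kind of edge-preserving weighted average that the kd-tree work accelerates in higher dimensions:

```python
import numpy as np

def bilateral_filter(img, sigma_s=3.0, sigma_r=0.1, radius=6):
    """Brute-force bilateral filter on a grayscale float image in [0, 1].

    Each output pixel is a weighted mean of its neighbors, where the weight
    falls off with both spatial distance (sigma_s) and intensity difference
    (sigma_r), so edges are preserved while smooth regions are denoised.
    Illustrative only; practical implementations accelerate this step.
    """
    H, W = img.shape
    out = np.zeros_like(img)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))
    padded = np.pad(img, radius, mode='reflect')
    for y in range(H):
        for x in range(W):
            patch = padded[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            range_w = np.exp(-((patch - img[y, x])**2) / (2 * sigma_r**2))
            w = spatial * range_w
            out[y, x] = np.sum(w * patch) / np.sum(w)
    return out
```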

Graphics in the service of the humanities. My original training is in architecture (buildings, not computers), so whenever I can I try to blend technology and the humanities in my research. One such effort was the Digital Michelangelo Project, a 5-year effort to create a three-dimensional digital archive of the statues of Michelangelo. Although the project is now dormant, we managed to create reasonably good computer models of 2 of the 10 statues we scanned. These models, and our raw scan data for the other 8, are available to scholars through our online archive. Another project at the juncture of technology and the humanities was our digitization of the 1,186 fragments of the Forma Urbis Romae, a giant marble map of ancient Rome. With guidance from Jennifer Trimble in Classics, we created a geometric, photographic, and textual database of the map, and with help from Leo Guibas and his students, we tried to solve the jigsaw puzzle algorithmically. In the two years we worked on this problem, we found about 20-40 matches (depending on how many you believe). That may not sound like much, but it's more than human archaeologists have found in 30 years. (Here's a blurb on our first match, and here's an article from the Stanford Report describing the whole project.) Papers describing the project in more detail, and enumerating all the matches we found, appear in the Journal of Roman Archaeology and the Bullettino della Commissione Archeologica Comunale di Roma. Another project with an archaeological flavor was the Cuneiform Tablet Visualization Project, in which we scanned, unwrapped, and non-photorealistically shaded the tablets' curved inscribed surfaces. Finally, the task of assembling archives of Michelangelo's statues and the Forma Urbis Romae led Hector Garcia-Molina and me to think about the problems of creating digital archives of 3D artworks. Our efforts in this area focused on real-time display of large models on low-cost PCs, efficient streaming of these models over networks of limited bandwidth, and protected viewing for non-licensed users via a remote rendering system, which is available for download (currently only for Windows PCs, I'm afraid).

Other fun follow-on projects. The ability to create high-resolution computer models of statues has presented us with some unexpected opportunities. For example, the Galleria dell'Accademia in Florence invited us to install a computer kiosk near Michelangelo's giant figure of David. Between November of 2002 and the present (2009), about 7 million visitors have seen this kiosk. Our hope is that by allowing museum visitors to interactively rotate (and relight) our model of the statue, they can examine parts of the statue that are hard to see from the ground, like his head and hands. We've also begun making physical replicas of the David; I have one sitting in my display case, right next to the notorious Stanford bunny. Something else we've looked at is projecting images onto scanned objects using powerful (and carefully aligned) video projectors. Possible applications include visually canceling dirt as a planning aid for art conservators, recoloring ancient statues (that were originally painted), and, inspired by the Son et Lumiere shows of France, non-photorealistically coloring a statue to look like a 3D drawing or painting.

Volume rendering, point-based rendering, and visualization. In a previous life, I worked on volume rendering. (We still maintain Phil Lacroute's popular Volpack package and an archive of volume datasets.) Volume rendering is now a mature research area and a core part of the billion-dollar data visualization software industry. (Too bad I didn't start up a company in this area!) Among the many books on the subject, my favorite is by Hadwiger, Kniss, Rezk-Salama, and Engel (2006). Although I am no longer actively working in this area, I've become interested in alternative ways of visualizing the three-dimensional structure of the natural world. Inspired by landmark books like Robert Hooke's Micrographia (1665) and Harold Edgerton's Stopping Time (1964), I have begun working on a book of volume renderings. The book will be called Volumegraphica. Another inactive project is my spreadsheet for images. Many people have asked me for this software; one day I might clean it up for distribution. Finally, I continue to think about the use of points as a display primitive. Although this approach proved impractical in 1985, the highly successful QSplat system is based on it, and the first Symposium on Point-Based Graphics was held in June 2004. There's now a book on the subject, edited by Gross and Pfister (2007), to which I contributed a small chapter.


Index of research projects I'm working on

Click here for a list of my publications.
Click here for a list of all the publications from our laboratory.
Click here for a list of all the research projects going on in our laboratory.


© 1994-2009 Marc Levoy
Last update: July 1, 2014 06:41:09 PM

INSTRUCTOR: Alexei (Alyosha) Efros (Office hours: Wednesdays 2-3pm, at 724 Sutardja Dai Hall)
GSI: Shiry Ginosar  (Office hours: Fridays 2-4PM Soda 651, starting 9/19)
GSI: Shubham Tulsiani  (Office hours: Mondays 2:30-4PM Soda 651)
UNIVERSITY UNITS: 4
SEMESTER: Fall 2014
WEB PAGE: http://inst.eecs.berkeley.edu/~cs194-26/fa14/
Q&A: Piazza Course Website
HW SUBMISSIONS: how to submit
LOCATION: 306 Soda
TIME: M F 4:00-5:30 PM

COURSE OVERVIEW:
Computational Photography is an emerging field created by the convergence of computer graphics, computer vision and photography. Its role is to overcome the limitations of the traditional camera by using computational techniques to produce a richer, more vivid, perhaps more perceptually meaningful representation of our visual world.

The aim of this advanced undergraduate course is to study ways in which samples from the real world (images and video) can be used to generate compelling computer graphics imagery. We will learn how to acquire, represent, and render scenes from digitized photographs. Several popular image-based algorithms will be presented, with an emphasis on using these techniques to build practical systems. This hands-on emphasis will be reflected in the programming assignments, in which students will have the opportunity to acquire their own images of indoor and outdoor scenes and develop the image analysis and synthesis tools needed to render and view the scenes on the computer.

TOPICS TO BE COVERED:

  • Cameras, Image Formation
  • Visual Perception
  • Image and Video Processing (filtering, anti-aliasing, pyramids)
  • Image Manipulation (warping, morphing, mosaicing, matting, compositing)
  • Modeling and Synthesis with Visual Big Data
  • Imaging and Tone Mapping
  • Image-Based Lighting
  • Image-Based Rendering
  • Non-photorealistic Rendering

PREREQUISITES:
Programming experience and familiarity with linear algebra and calculus are assumed.  Some background in computer graphics, computer vision, or image processing is helpful.  This class does not significantly overlap with CS184 (Computer Graphics) and can be taken concurrently.
PhD Students: a small number of PhD students will be allowed to take the graduate version of this course (CS294-26) with the permission of the instructor. Students taking CS294-26 will be required to do more substantial assignments as well as a research-level final paper.
Note: if the system doesn't let you sign up, or puts you on the waitlist, do talk to me.

PROGRAMMING ASSIGNMENTS:      

TEXT:
There is now a textbook that covers most (if not all) of the topics related to Computational Photography.  This will be the primary reference for the course:

            Computer Vision: Algorithms and Applications, Richard Szeliski, 2010

There are a number of other fine texts that you can use for general reference:

Photography (8th edition), London and Upton (a great general guide to taking pictures)
Vision Science: Photons to Phenomenology, Stephen Palmer (a great book on human visual perception)
Digital Image Processing (2nd edition), Gonzalez and Woods (a good general image processing text)
The Art and Science of Digital Compositing, Ron Brinkmann (everything about compositing)
Multiple View Geometry in Computer Vision, Hartley & Zisserman (a bible on recovering 3D geometry)
The Computer Image, Watt and Policarpo (a nice "vision for graphics" text, somewhat dated)
3D Computer Graphics (3rd edition), Watt (a good general graphics text)
Fundamentals of Computer Graphics, Peter Shirley (another good general graphics text)
Linear Algebra and its Applications, Gilbert Strang (a truly wonderful book on linear algebra)

CLASS NOTES
The instructor is extremely grateful to a large number of researchers for making their slides available for use in this course.  Steve Seitz and Rick Szeliski have been particularly kind in letting me use their wonderful lecture notes.  In addition, I would like to thank Paul Debevec, Stephen Palmer, Paul Heckbert, David Forsyth, Steve Marschner and others, as noted in the slides.  The instructor gladly gives permission to use and modify any of the slides for academic and research purposes. However, please do also acknowledge the original sources where appropriate.

   

TENTATIVE CLASS SCHEDULE:

Fri Aug 29: Introduction

Fri Sept 5: Capturing Light... in man and machine

Mon Sept 8: The Camera
  • Szeliski Ch. 2
  • Slides: pdf, ppt

Fri Sept 12: Sampling and Reconstruction

Mon & Fri Sept 15/19: Sampling and Reconstruction
  • Continue Szeliski Ch 3
  • Slides: pdf, ppt

Mon Sept 22: The Frequency Domain and Filtering
  • Continue Szeliski Ch 3
  • Slides: pdf, pptx

Fri Sept 26: Image Blending and Compositing
  • Continue Szeliski Ch 3 + Sec 9.3.4
  • Slides: pdf, ppt
  • Additional Reading:
    • Burt and Adelson, A Multiresolution Spline with Application to Image Mosaics, ACM ToG, 1983
    • McCann & Pollard, Real-Time Gradient-Domain Painting, SIGGRAPH 2008
    • Perez et al., Poisson Image Editing, SIGGRAPH 2003
    • Bhat et al., GradientShop: A Gradient-Domain Optimization Framework for Image and Video Filtering, SIGGRAPH 2010
    • Avidan and Shamir, Seam Carving for Content-Aware Image Resizing, SIGGRAPH 2007
    • Kwatra et al., Graphcut Textures: Image and Video Synthesis Using Graph Cuts, SIGGRAPH 2003
    • Agarwala et al., Interactive Digital Photomontage, SIGGRAPH 2004

Mon Sept 29: Point Processing and Image Warping
  • Continue Szeliski Ch 3, 2.1.2
  • Slides: pdf, ppt

Fri Oct 3 & Mon Oct 6: Image Morphing
  • Continue Szeliski Ch 3
  • Slides: pdf, ppt

Fri Oct 10: Data-driven Methods: Faces
  • Slides: pdf, ppt
  • Galton, "Composite portraits made by combining those of many different persons into a single figure", Nature, 1878
  • Rowland and Perrett, "Manipulating Facial Appearance through Shape and Color", CG&A, 1995
  • Blanz and Vetter, "A Morphable Model for the Synthesis of 3D Faces", SIGGRAPH 1999
  • Cootes, Edwards, and Taylor, "Active Appearance Models", ECCV 1998

Mon Oct 13: Data-driven Methods: Visual Data on the Internet (part 1)

Fri Oct 17: Data-driven Methods: Visual Data on the Internet (part 2)

Mon Oct 20: Data-driven Methods: Visual Data on the Internet (part 3)

Fri Oct 24: Modeling Light

Mon Oct 27: Homographies and Mosaics

Fri Oct 31: More Mosaic Madness

Mon Nov 3: Guest Lecture - Mark Lescroart

Fri Nov 7: Automatic Alignment

Mon Nov 10: Single View Reconstruction

Fri Nov 14: Multi-perspective Panoramas
  • L. Zelnik-Manor et al., "Squaring the Circle in Panoramas", ICCV 2005
  • L. Zelnik-Manor and P. Perona, "Automating Joiners", NPAR 2007
  • A. Agarwala et al., "Photographing Long Scenes With Multi-Viewpoint Panoramas", SIGGRAPH 2006
  • Y. Nomura et al., "Scene Collages and Flexible Camera Arrays", Proc. Eurographics Symposium on Rendering, 2007
  • Flickr Joiners Group
  • Slides: pdf, ppt

Mon Nov 17: High Dynamic Range Images

Fri Nov 21: Image-based Lighting

Mon Nov 24: Midterm exam!

Mon Dec 1: Image-based Lighting II

Fri Dec 5: What makes a great picture?

CAMERAS:
Although it is not required, students are highly encouraged to obtain a digital camera for use in the course.

METHOD OF EVALUATION:
Grading will be based on a set of programming and written assignments (60%), an exam (20%) and a final project (20%).  For the programming assignments, students will be allowed a total of 5 (five) late days per semester; each additional late day will incur a 10% penalty.

Students taking CS294-26 will also be required to submit a conference-style paper describing their final project.

MATLAB:
Students will be encouraged to use Matlab (with the Image Processing Toolbox) as their primary computing platform.  Besides being a great prototyping environment, Matlab is particularly well-suited for working with image data and offers tons of built-in image processing functions.  Here is a link to some useful Matlab resources.

PREVIOUS OFFERINGS OF THIS COURSE:
Previous offerings of this course can be found here.

SIMILAR COURSES IN OTHER UNIVERSITIES:
