Assistant Professor Philipp Krähenbühl

Philipp Krähenbühl teaches computers how to see.

Krähenbühl, an assistant professor who joined UTCS last fall, studies a subfield of machine learning called computer vision. With machine learning techniques such as deep networks, computers can learn to tell images apart and to manipulate them by training on a large data set of labeled images.

"One of the biggest issues there is that the amount of data you need to teach the computer about even tiny little things is enormous," he said.

Krähenbühl's research specifically focuses on how to reduce the number of labels needed.

"The issue is not getting the images, they're for free," he explained. "[The issue is] getting the labels, or getting the supervision to teach the computer what you want it to see."

In October, he and researchers from the University of California, Berkeley, where Krähenbühl completed his postdoctoral work, presented their machine learning research at the European Conference on Computer Vision in Amsterdam. The research explored using machine learning to edit images so that the results still look realistic.

"For example, you want to edit the image of a shoe and you want to make it white," he demonstrated. "If you make parts of the shoe white, it will automatically adjust the entire shoe to be white, because that’s a realistic change. If you deform parts of the shoe, it will automatically adjust the shape of the shoe so that it looks like a realistic-looking shoe."

Using programs like Photoshop to edit and create realistic images is a "pretty painful process," according to Krähenbühl, but using a computer with a machine learning program makes the process easier.

Krähenbühl also researches using these machine learning techniques to create entirely new images. For example, if a user indicates they want a landscape by drawing a few simple lines that represent a mountain, the algorithm will create an image of a mountain that resembles the sketch.

A machine learning algorithm can also fill in missing pieces from an image.

"For example, if I take this bottle and I hide it from you and I ask you, what might be behind my hands? If you are able to say that this is a bottle, you actually understand a lot about the visual world. You understand that bottles need to be on tables, you understand what bottles look like.

"It actually teaches you something about the visual world if you’re able to do this task very, very well," he added.

Krähenbühl believes computer vision is the most important field of machine learning. He hopes computers will someday gain enough understanding through computer vision to observe and understand human actions.

"All of what machine learning does, in some sense, is trying to understand certain aspects of our world or certain patterns in our world," he explained. "To me, the visual signal is the most interesting one, because we humans heavily rely on it and it's how we define our world."

Krähenbühl has noticed that the field of computer vision has been growing over the past five years.

"The funniest indicator of this is the size of the conferences and the amount of industry interest in the kind of work we're doing," he said. "About five years ago, nobody was really interested in computer vision, because it was something that people in academia [did], it was never going to work in the real world for anything. I think machine learning now has become a center of attention for computer science in the past few years because things are starting to work."
