Abe Davis: New video technology that reveals an object's hidden properties
Computer vision expert Abe Davis pioneers methods to extract audio from silent digital videos, even footage shot on ordinary consumer cameras.
as a very visual thing.
motion is something clearly visible.
or gesture with my hands while I speak,
or gesture as I talk,
that's too subtle for the human eye,
is hard for the naked eye to detect,
even when humans can't.
of a person's wrist,
of a sleeping infant,
that these were videos,
these are just two ordinary images,
at two regular images,
almost completely still.
of subtle motion going on here,
there is a lot of subtle motion going on,
the wrist on the left,
the infant on the right,
and fall of her chest
a lot of significance,
too subtle for us to see,
is hard for us to notice,
what they call a motion microscope,
developed software they call a “motion microscope,”
these subtle motions in video
become large enough for us to see.
on the left video,
this person's heart rate.
on the right video,
that this infant takes,
to monitor her breathing.
because it takes these phenomena
because it helps us see
to experience through touch
and non-invasively.
with the folks that created that software,
working with the people who wrote this software,
that we can use software
as a way to extend our sense of touch.
a good way to extend the human sense of touch.
with our ability to hear?
to enhance our hearing?
to capture the vibrations of sound,
into a microphone?
in perspective for you.
work by converting the motion
into an electrical signal,
to move readily with sound
and interpreted as audio.
and turned back into sound.
causes any object to vibrate.
too subtle and too fast for us to see.
are usually very subtle and fleeting.
with a high-speed camera
record these vibrations,
to extract tiny motions
what sounds created them?
and work out the source of the sound, what would happen?
into visual microphones from a distance.
turn visible objects into visual microphones.
that you see on the right
played this sound.
of frames per second,
just sitting there doing nothing,
by about a micrometer.
moves the leaves by only one micrometer,
a hundredth and a thousandth
perceptually invisible.
is perceptually invisible.
can be perceptually invisible
seemingly still video
from a seemingly still video
out of so little motion?
get such rich information?
move by just a single micrometer,
moves by only one micrometer,
by just a thousandth of a pixel.
of pixels in it,
of the tiny motions that we see
all of the tiny motions
to something pretty significant.
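The point about many pixels adding up can be illustrated with a small sketch. This is not the visual microphone algorithm itself. It is a toy 1-D example that assumes a synthetic texture, Gaussian pixel noise, and a gradient-based least-squares shift estimate, all of which are illustrative choices (the names `estimate_shift`, `make_pair`, and `texture` are hypothetical). It suggests why a shift of a thousandth of a pixel is lost in noise when only a few pixels are used, yet becomes measurable once many pixels are combined.

```python
import math
import random


def texture(x):
    """Toy 1-D image brightness at position x (an assumed, smooth pattern)."""
    return math.sin(0.3 * x) + 0.5 * math.sin(0.11 * x + 1.0)


def make_pair(n_pixels, shift, noise, rng):
    """Return a clean reference row and a shifted copy with per-pixel noise."""
    reference = [texture(i) for i in range(n_pixels)]
    shifted = [texture(i - shift) + rng.gauss(0.0, noise) for i in range(n_pixels)]
    return reference, shifted


def estimate_shift(reference, shifted):
    """Least-squares estimate of a tiny shift, pooling evidence from all pixels."""
    num = 0.0
    den = 0.0
    for i in range(1, len(reference) - 1):
        # Central-difference approximation of the spatial brightness gradient.
        grad = (reference[i + 1] - reference[i - 1]) / 2.0
        diff = shifted[i] - reference[i]
        # For a small shift d: shifted[i] is roughly reference[i] - d * grad.
        num += -grad * diff
        den += grad * grad
    return num / den


rng = random.Random(0)
true_shift = 0.001  # a thousandth of a pixel, the figure quoted in the talk
for n_pixels in (100, 200_000):
    ref, moved = make_pair(n_pixels, true_shift, noise=0.005, rng=rng)
    print(n_pixels, "pixels -> estimated shift:", estimate_shift(ref, moved))
```

With the assumed noise level, the estimate from 100 pixels is typically off by more than the shift itself, while the estimate from 200,000 pixels lands close to 0.001. The estimator's error shrinks roughly with the square root of the number of pixels, which is the sense in which tiny motions "add up."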
when we figured this out.
we were thrilled when we worked this out.
a pretty important piece of the puzzle.
that affect when and how well
and the lens that you use;
and how loud your sound is.
whether the sound is loud enough, and so on.
with our early experiments,
we still had to be extremely careful,
any of these factors wrong,
what the problem was.
and could not find the cause.
experiments looked like this.
see our high-speed camera,
by these bright lamps.
very careful in these early experiments,
we had to be very careful in these early experiments,
Little lamb! Little lamb!
Little lamb! Little lamb!)
looks completely ridiculous.
we tried this on. (Laughter)
(Laughter)
to recover this sound.
Little lamb! Little lamb!
Little lamb! Little lamb!)
we recovered intelligible human speech
from a silent video
to modify the experiment,
or moving the object further away,
the limits of our technique,
to a bag of chips,
about 15 feet away,
(4.572 meters) away, outside,
by only natural sunlight.
from inside, next to the bag of chips.
the original sound spoken next to the bag of chips.
whose fleece was white as snow,
its fleece was white and fine,
that lamb was sure to go.
the lamb would always follow.)
to recover from our silent video
silent footage captured from behind soundproof glass
whose fleece was white as snow,
its fleece was white and fine,
that lamb was sure to go.
the lamb would always follow.)
that we can push these limits as well.
plugged into a laptop computer,
the music that was playing on that laptop
of this pair of plastic earphones
to identify this piece of music.
by changing the hardware that we use.
to improve our results.
I've shown you so far
a high-speed camera,
about 100 times faster
to use this technique
of what's called a rolling shutter.
record images one row at a time,
during the recording of a single image,
the object moves,
between each row,
is that by analyzing these artifacts,
using a modified version of our algorithm.
we can still recover sound.
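The rolling-shutter idea, that rows are recorded one at a time with a slight delay between them, can be sketched as a timing calculation. This is only an illustration of why row-by-row capture carries finer timing information than whole frames. It is not the modified algorithm mentioned in the talk, and the function name `row_timestamps` and the camera figures (60 frames per second, 1080 rows, rows read 1/72000 of a second apart) are assumptions chosen for the example.

```python
def row_timestamps(n_frames, rows_per_frame, frame_rate, row_delay):
    """Capture time in seconds of every row, assuming a constant delay per row."""
    frame_period = 1.0 / frame_rate
    if rows_per_frame * row_delay > frame_period:
        raise ValueError("rows of one frame would overlap the next frame")
    return [
        frame * frame_period + row * row_delay
        for frame in range(n_frames)
        for row in range(rows_per_frame)
    ]


# Hypothetical consumer camera: these numbers are assumptions, not from the talk.
frame_rate = 60.0
rows_per_frame = 1080
row_delay = 1.0 / 72000.0

times = row_timestamps(2, rows_per_frame, frame_rate, row_delay)
print("samples per second if each frame is one sample:", frame_rate)
print("row samples per second while a frame is being read:", round(1.0 / row_delay))
print("idle gap between the two frames (s):", times[rows_per_frame] - times[rows_per_frame - 1])
```

Under these assumptions each row is a separate moment in time, so a vibrating object leaves slightly different traces in neighbouring rows. The idle gap between frames is one reason a method built on this would need more than a simple resampling step.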
music from before,
store-bought camera,
an ordinary camera that you can buy in a store,
the sound that we recovered,
distorted this time,
recognize the music.
see whether you can recognize the music.
is that we were able to do this
this time we used an ordinary camera,
that you could literally run out
an electronics store like this
about surveillance.
this technology to spy on someone.
really would not be hard.
a lot of very mature technology
there has long been plenty of mature technology
from a distance for decades.
has been around for decades.
to picture the vibrations of an object,
a way to picture an object's vibrations,
through which to look at the world,
to look at the world.
that cause an object to vibrate,
such as sound,
the ways that we use video,
the ways we use video,
to look at things,
that we learn about the world:
still won't let us do,
video cannot capture,
just a few months ago,
I've shown it to a public audience.
to use the vibrations in a video
we will use the vibrations in a video
that will let us interact with them
in the shape of a human,
with just a regular camera.
about this camera.
with my cell phone before.
on the surface where it's resting
tapped a few times on the surface,
of regular video,
five seconds of regular video,
the vibrations in that video
and material properties of our object,
structural and material properties,
to create something new and interactive.
create a new kind of interactive thing.
and it's not a video,
nor is it a video,
with the object.
that we've never seen before,
even if the force is being applied for the first time,
five seconds of regular video.
just five short seconds of regular video.
way to look at the world,
how objects will respond
looking at an old bridge
how would that bridge hold up
that you probably want to answer
ideally before you drive onto the bridge
across that bridge.
limitations to this technique,
with the visual microphone,
in a lot of situations
it works in many situations,
here's a video that I captured
to create this simulation.
to complete this simulation.
to a film director,
if a film director had this technology,
in a shot after it's been recorded.
at a hanging curtain,
we filmed a hanging curtain,
any motion in this video,
you cannot even see the curtain moving,
imperceptible motions and vibrations
to create this simulation.
information to complete this simulation.
this kind of interactivity
and 3D models,
from real objects in the real world
regular video
sample real objects,
a lot of potential.
has broad potential for applications.
who worked with me on these projects.
the outstanding colleagues behind this technology.
is only the beginning.
is only an early form of the technology.
with this kind of imaging,
with common, accessible technology.
a new way to record the things around us.
really exciting to explore
ABOUT THE SPEAKER
Abe Davis - Computer scientist
Computer vision expert Abe Davis pioneers methods to extract audio from silent digital videos, even footage shot on ordinary consumer cameras.
Why you should listen
MIT PhD student, computer vision wizard and rap artist Abe Davis has co-created the world’s most improbable audio instrument. In 2014, Davis and his collaborators debuted the “visual microphone,” an algorithm that samples the sympathetic vibrations of ordinary objects (such as a potato chip bag) from ordinary high-speed video footage and transduces them into intelligible audio tracks.
Davis is also the author of Caperture, a 3D-imaging app designed to create and share 3D images on any compatible smartphone.