Blaise Agüera y Arcas: How computers are learning to be creative
Blaise Agüera y Arcas works on machine learning at Google. Previously a Distinguished Engineer at Microsoft, he has worked on augmented reality, mapping, wearable computing and natural user interfaces.
a team that works on machine intelligence;
of making computers and devices
able to accomplish some of the tasks that brains do.
interested in real brains and neuroscience,
in the things that our brains do
to the performance of computers.
one of those areas has been perception,
out there in the world --
for example, that our team makes,
enable your pictures on Google Photos to become searchable,
out there into the world.
our work on machine perception
with the world of machine creativity
had a penetrating insight
between perception and creativity.
has a statue inside of it,
is to discover it."
Michelangelo was getting at
is an act of imagination
and perceiving and imagining,
with a brief bit of history
the heart or the intestines,
tell much about a brain by just looking at it,
of this thing all kinds of fanciful names,
doesn't tell us very much
developed some kind of insight
the great neuroanatomist Santiago Ramón y Cajal,
fill in or render in very high contrast the individual cells
their morphologies.
that he made of neurons
of different sorts of cells,
itself was quite new at this point.
very, very long distances --
obvious to some people in the 19th century;
were just getting underway.
drawings of Ramón y Cajal's, like this one,
that Ramón y Cajal started.
collaborators at the Institute of Neuroscience.
is about one cubic millimeter in size,
a very, very small piece of it here.
tiny block of tissue.
of hair is about 100 microns.
much, much smaller
electron microscopy slices,
in 3D of neurons that look like these.
in the same style as Ramón y Cajal.
be able to see anything here.
criss-crossing pathways connecting one neuron to another.
a little bit ahead of his time,
over the next few decades.
neurons carry information using electricity,
had come a long way and was advanced enough
electrical experiments on live neurons
when computers were being invented,
of modeling the brain --
of "intelligent machinery," as Alan Turing called it,
Walter Pitts looked at Ramón y Cajal's drawing
of the visual cortex,
imagery that comes from the eye.
like a circuit diagram.
in McCulloch and Pitts's circuit diagram
like a series of computational elements
one to the next in a cascade,
visual information would need to do.
for us to do with our brains.
that for a computer,
just a few years ago.
this task is easy to do.
and the word "bird,"
connected to each other
inside our visual cortices,
to have the capability
on the computer.
that actually looks like.
about as a first layer of neurons,
how it works in the eye --
after another layer of neurons,
of different weights.
of all of those synapses.
properties of this network.
or a small group of neurons
those three things --
in the neural network,
these synapses in the neural network.
only four letters, right?
is just a simple formula,
going on there, of course,
of mathematical operations.
that if you have one equation,
you can solve for the unknown by knowing the other two things.
that the picture of a bird is a bird,
and w and x are known.
you know the pixels.
a relatively straightforward problem.
and you're done.
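A minimal sketch of this forward case, where w and x are known and y is computed directly. It assumes a toy network that is just one weighted sum; the name `forward` and the numbers are hypothetical, chosen only for illustration, and a real network would chain many such steps with non-linear operations in between.

```python
# Toy stand-in for the "multiplication" in the formula: a single weighted sum.
# A real network stacks many layers with non-linear steps between them.

def forward(w, x):
    """Return y for known weights w and known pixels x."""
    return sum(wi * xi for wi, xi in zip(w, x))

pixels = [0.5, 1.0, 0.25]    # hypothetical x: three pixel brightnesses
weights = [2.0, -1.0, 4.0]   # hypothetical w: one synapse weight per pixel

y = forward(weights, pixels)  # inference: w and x known, y unknown
print(y)
```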
a trained artificial neural network doing exactly that.
a neural network running in real time on a mobile phone,
amazing in its own right,
billions and trillions of operations
picture of a bird,
"Yes, it's a bird,"
identifying the species of bird with a network of this sort.
and the y is the unknown.
difficult part, of course,
do we figure out the w,
of solving for w,
with the simple equation
it's the inverse to multiplication,
complicated, very non-linear operation;
to solve the equation
is fairly straightforward.
a little algebra trick,
to the right-hand side of the equation.
about it as an error.
solved for w the right way,
take guesses to minimize the error,
computers are very good at.
sort of play Marco Polo,
successive approximations to w.
but after about a dozen steps,
which is close enough.
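The guessing procedure described here can be sketched as follows. This is only an illustration under stated assumptions: the equation is taken literally as y = w * x with single numbers, and the function name `solve_for_w`, the step size `rate`, and the values x = 2 and y = 6 are hypothetical. Each step measures the error and nudges w to shrink it, so no division is needed.

```python
def solve_for_w(x, y, w=0.0, rate=0.1, steps=12):
    """Approximate w in y = w * x by repeatedly shrinking the error."""
    for _ in range(steps):
        error = y - w * x        # zero only when w is exactly right
        w += rate * error * x    # nudge w in the direction that reduces the error
    return w

# Hypothetical known x and y; the exact answer would be w = 3.
w = solve_for_w(x=2.0, y=6.0)
print(w)
```

With these assumed settings, a dozen steps land near the exact answer without ever reaching it, which matches the "close enough" behavior described.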
a lot of known x's and known y's
through an iterative process.
that we do our own learning.
see many, many images,
"This is a bird; this is not a bird."
for those neural connections.
x and w fixed to solve for y;
Alex Mordvintsev, on our team,
with what happens if we try solving for x, given a known w and a known y,
neural network that you've trained on birds,
the same error-minimization procedure,
the network trained to recognize birds,
generated entirely by a neural network trained to recognize birds,
rather than solving for y,
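A minimal sketch of solving for x instead, assuming the same error-minimization idea applied to a toy linear network. The names `forward` and `solve_for_x`, the weights, the target output, and the step size are all hypothetical; the networks described in the talk are deep and non-linear, so this shows only the shape of the procedure: w and y stay fixed while x is adjusted step by step.

```python
def forward(w, x):
    """Toy network: y is a weighted sum of the inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))

def solve_for_x(w, y_target, x, rate=0.05, steps=200):
    """Adjust the input x until the fixed network w outputs y_target."""
    x = list(x)  # work on a copy of the starting image
    for _ in range(steps):
        error = y_target - forward(w, x)
        # Move each input in proportion to its weight to reduce the error.
        x = [xi + rate * error * wi for xi, wi in zip(x, w)]
    return x

w = [2.0, -1.0, 4.0]                       # hypothetical trained weights, held fixed
blank = [0.0, 0.0, 0.0]                    # start from a blank "canvas"
x = solve_for_x(w, y_target=1.0, x=blank)  # the generated input
print(x, forward(w, x))
```

Passing a different starting list in place of `blank` corresponds to beginning from an existing image rather than an empty one.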
by Mike Tyka in our group,
of William Kentridge's artworks,
over the space of different animals,
to recognize and distinguish
Escher-like morph from one animal to another.
have tried reducing
out of the space of all things
over that entire surface,
you make a kind of map --
a visual map of all the things the network knows how to recognize.
"armadillo" is right in that spot.
something similar with other kinds of networks as well.
to recognize faces,
in a y that says, "me,"
rather incredible, psychedelic picture of me
multiple points of view at once
to get rid of the ambiguity
or another pose,
another kind of lighting.
this sort of reconstruction,
a sort of confusion of different points of view,
his own face as a guide image
during the optimization process
to reconstruct my own face.
that optimization process.
more like a coherent face,
rendered using a guide.
with a blank canvas
begin with an x that is itself already some other image.
that is designed to categorize
man-made structures, animals ...
with just a picture of clouds,
what it sees in the clouds.
you spend looking at this,
will see in the clouds.
neural network to hallucinate into this,
hallucinates, zooms, hallucinates, zooms.
a fugue state of the network, I suppose,
is eating its own tail.
the basis for the next image:
What do I think I see next?"
at a lecture called "Higher Education" --
marijuana was legalized.
this technology is not constrained.
because they're really fun to look at.
a camera that takes a picture,
writes a poem using neural networks,
based on the contents of the picture.
has been trained
are very intimately connected.
things in the world,
Michelangelo really did see
any being, any alien
perceptual acts of that sort
machinery that's used in both cases.
and creativity are by no means
computer models that can do exactly these sorts of things.
the brain is computational.
an exercise in designing intelligent machinery.
of those early pioneers,
is not just about accounting
we modeled them after our minds.
to understand our own minds better
ABOUT THE SPEAKER
Blaise Agüera y Arcas - Software architect
Why you should listen
Blaise Agüera y Arcas is principal scientist at Google, where he leads a team working on machine intelligence for mobile devices. His group works extensively with deep neural nets for machine perception and distributed learning, and it also investigates so-called "connectomics" research, assessing maps of connections within the brain.
Agüera y Arcas' background is as multidimensional as the visions he helps create. In the 1990s, he authored patents on both video compression and 3D visualization techniques, and in 2001, he made an influential computational discovery that cast doubt on Gutenberg's role as the father of movable type.
He also created Seadragon (acquired by Microsoft in 2006), the visualization technology that gives Photosynth its amazingly smooth digital rendering and zoom capabilities. Photosynth itself is a vastly powerful piece of software capable of taking a wide variety of images, analyzing them for similarities, and grafting them together into an interactive three-dimensional space. This seamless patchwork of images can be viewed via multiple angles and magnifications, allowing us to look around corners or “fly” in for a (much) closer look. Simply put, it could utterly transform the way we experience digital images.
He joined Microsoft when Seadragon was acquired by Live Labs in 2006. Shortly after the acquisition of Seadragon, Agüera y Arcas directed his team in a collaboration with Microsoft Research and the University of Washington, leading to the first public previews of Photosynth several months later. His TED Talk on Seadragon and Photosynth in 2007 is rated one of TED's "most jaw-dropping." He returned to TED in 2010 to demo Bing’s augmented reality maps.
Fun fact: According to the author, Agüera y Arcas is the inspiration for the character Elgin in the 2012 best-selling novel Where'd You Go, Bernadette?