Fei-Fei Li: How we're teaching computers to understand pictures
As Director of Stanford’s Artificial Intelligence Lab and Vision Lab, Fei-Fei Li is working to solve AI’s trickiest problems -- including image recognition, learning and language processing.
sitting in a bed.
that are going on an airplane.
a three-year-old child
in a series of photos.
to learn about this world,
at one very important task:
technologically advanced than ever.
we make phones that talk to us
that can play only music we like.
machines and computers
to give you a progress report
in our research in computer vision,
and potentially revolutionary
that can drive by themselves,
they cannot really tell the difference
it cannot tell, when both appear on the road,
on the road, which can be run over,
which should be avoided.
how the two differ.
sight to the blind.
the changes of the rainforests.
is drowning in a swimming pool.
to alert us.
an integral part of global life.
that's far beyond what any human,
are all driving forces behind this TED event.
to that at this TED.
is still struggling at understanding
collectively as a society,
machines are still blind.
a two-dimensional array of numbers
the same as to listen,
the same as to see,
we really mean understanding.
540 million years of hard work
took nature 540 million years
processing apparatus of our brains,
from my Ph.D. at Caltech
collaborators and students
computer vision and machine learning.
of artificial intelligence.
the machines to see just like we do:
inferring 3D geometry of things,
inferring the geometry of objects,
actions and intentions.
of people, places and things
is to teach a computer to see objects,
imagine this teaching process
some training images
from these training images.
a collection of shapes and colors,
in the early days of object modeling.
in a mathematical language
a chubby body,
and viewpoint to the object model.
adding new shapes and different viewpoints.
as a household pet
of variations to the object model,
changed my thinking.
real-world experiences and examples.
about every 200 milliseconds,
hundreds of millions of pictures
children have already seen, in the real world,
on better and better algorithms,
should follow the way children learn from experience,
the kind of training data
than we have ever had before,
Kai Li at Princeton University,
a project we call ImageNet.
a camera on our head
that humans have ever created.
like the Amazon Mechanical Turk platform
a crowdsourcing platform like this,
the biggest employers
of the imagery
in the early developmental years.
may seem obvious now,
for quite a while.
we walked this journey alone,
to do something more useful for my tenure,
rather than struggling to raise research funding,
for research funding.
my dry cleaner's shop to fund ImageNet.
my college years.
of objects and things
of domestic and wild cats.
to have put together ImageNet,
to benefit from it,
we opened up the entire data set
research community for free.
to nourish our computer brain,
to the algorithms themselves.
of information provided by ImageNet
of machine learning algorithms
happened to coincide,
Geoff Hinton, and Yann LeCun
of billions of highly connected neurons,
or even millions of nodes
to train our object recognition model,
the object recognition model we trained
to train such a humongous model,
this enormous thing,
in object recognition.
a boy and a teddy bear;
in the background;
railings, a lamppost, and so on.
is not so confident about what it sees,
instead of committing too much,
is remarkable at telling us
of Google Street View images
really interesting:
also correlate well
or even surpassed human capabilities?
or even surpassed humans?
the computer to see objects.
learning to utter a few nouns.
milestone will be hit,
to communicate in sentences.
this is a cat in the picture,
to describe the picture,
telling us this is a cat lying on a bed.
to see a picture and generate sentences,
and machine learning algorithm
from both pictures
vision and language,
that connects parts of visual things
it can connect different visual objects,
computer vision models
a human-like sentence
what the computer says
at the beginning of this talk.
and how it makes sense of it.
next to an elephant.
of an airport runway.
to improve our algorithms,
on a bed in a blanket.
too many cats,
might look like a cat.
is holding a baseball bat.
it would confuse it with a baseball bat.
it confuses it with a baseball bat.
down a street next to a building.
to the computers.
in a field of grass.
the stunning beauty of nature
from three to 13 and far beyond.
and even to stages far beyond.
of the boy and the cake again.
the computer to see objects
when seeing a picture.
at a table with a cake.
to this picture
is that this is a special Italian cake
after a trip to Sydney,
at that moment.
extra pairs of tireless eyes
and take care of patients.
and safer on the road.
to save the trapped and wounded.
better materials,
and better materials,
with the help of the machines.
all of this can rely on the help of machines.
to the machines.
won't be the only ones
for their intelligence,
in ways that we cannot even imagine.
for Leo and for the world.
ABOUT THE SPEAKER
Fei-Fei Li - Computer scientist
As Director of Stanford’s Artificial Intelligence Lab and Vision Lab, Fei-Fei Li is working to solve AI’s trickiest problems -- including image recognition, learning and language processing.
Why you should listen
Using algorithms built on machine learning methods such as neural network models, the Stanford Artificial Intelligence Lab led by Fei-Fei Li has created software capable of recognizing scenes in still photographs and accurately describing them using natural language.
Li’s work with neural networks and computer vision (with Stanford’s Vision Lab) marks a significant step forward for AI research, and could lead to applications ranging from more intuitive image searches to robots able to make autonomous decisions in unfamiliar situations.
Fei-Fei was honored as one of Foreign Policy's 2015 Global Thinkers.