Fei-Fei Li: How we're teaching computers to understand pictures
Fei-Fei Li: Kako učimo računala da razumiju slike
As Director of Stanford’s Artificial Intelligence Lab and Vision Lab, Fei-Fei Li is working to solve AI’s trickiest problems -- including image recognition, learning and language processing.
sitting in a bed.
koja sjedi na krevetu.
that are going on an airplane.
koji idu u avion.
trogodišnje dijete
a three-year-old child
in a series of photos.
na ovim slikama.
to learn about this world,
što mora naučiti o svijetu
at one very important task:
u nečemu važnom:
technologically advanced than ever.
naprednije no ikada.
we make phones that talk to us
izrađujemo telefone koji pričaju s nama
that can play only music we like.
koje puštaju samo glazbu koju volimo.
machines and computers
uređaji i računala
to give you a progress report
kako bi vas izvijestila
in our research in computer vision,
u istraživanju računalnog vida,
and potentially revolutionary
i potencijalno revolucionarnih
that can drive by themselves,
koji se sami voze,
they cannot really tell the difference
ne mogu zapravo vidjeti razliku
on the road, which can be run over,
na putu, koju mogu pregaziti,
which should be avoided.
koji treba izbjeći.
sight to the blind.
vid slijepima.
the changes of the rainforests.
promjene u kišnim šumama
is drowning in a swimming pool.
dijete utaplja u bazenu.
an integral part of global life.
integralni dio globalnog života.
that's far beyond what any human,
daleko veća od one koju bi čovjek
to that at this TED.
ovdje na TED-u.
is still struggling at understanding
se i dalje muči oko razumijevanja
collectively as a society,
zajedno kao društvo,
uređaji i dalje slijepi.
machines are still blind.
možda se pitate.
dvodimenzionalne redove brojeva
a two-dimensional array of numbers
the same as to listen,
ne znači isto što i čuti,
the same as to see,
što i vidjeti,
we really mean understanding.
mislimo na razumijevanje.
540 million years of hard work
540 milijuna godina teškog posla
processing apparatus of our brains,
za obradu vida u našem mozgu,
from my Ph.D. at Caltech
mog doktorata u Caltech-u
laboratorij za vid,
collaborators and students
suradnicima i studentima
computer vision and machine learning.
računalni vid i strojno učenje.
of artificial intelligence.
umjetne inteligencije.
the machines to see just like we do:
uređaje da vide kao što mi vidimo:
inferring 3D geometry of things,
razumijevanje trodimenzionalnosti objekata,
actions and intentions.
akcija i namjera.
of people, places and things
ljudi, mjesta i stvari
is to teach a computer to see objects,
naučiti računala da vide objekte,
imagine this teaching process
zamislite ovaj proces učenja
some training images
raznih prizora za trening
from these training images.
iz ovih prikaza za trening.
a collection of shapes and colors,
skup oblika i boja,
in the early days of object modeling.
u početcima modeliranja objekta.
in a mathematical language
u matematičkom jeziku
a chubby body,
debeljuškasto tijelo,
and viewpoint to the object model.
i pogled modelnom objektu.
as a household pet
poput kućnog ljubimca
of variations to the object model,
varijacija modelnog objekta,
changed my thinking.
promijenilo mi je razmišljanje.
real-world experiences and examples.
iskustvo i primjere iz stvarnog svijeta.
about every 200 milliseconds,
svakih 200 milisekundi,
za pokret oka.
hundreds of millions of pictures
stotine milijuna slika
on better and better algorithms,
sve bolje i bolje algoritme,
the kind of training data
nekakve podatke za vježbu
than we have ever had before,
no što smo mi imali ikad prije,
Kai Li at Princeton University,
Kai Li na sveučilištu Princeton,
a camera on our head
kamere na naše glave
that humans have ever created.
čovječanstvo stvorilo.
milijardu slika i
like the Amazon Mechanical Turk platform
poput platforme Amazon Mechanical Turk
the biggest employers
najvećih poslodavaca
sortiramo i označimo
of the imagery
in the early developmental years.
u ranim godinama razvoja.
korištenja mnogo podataka
may seem obvious now,
se možda sada čini očiglednim,
for quite a while.
poprilično sami na tom putu.
to do something more useful for my tenure,
me savjetovale da radim nešto korisnije,
for research funding.
za financiranje istraživanja.
my dry cleaner's shop to fund ImageNet.
kako bih mogla financirati ImageNet.
my college years.
svoj studij.
of objects and things
objekata i stvari
engleske riječi.
of domestic and wild cats.
domaćih i divljih mačaka.
to have put together ImageNet,
što smo sastavili ImageNet,
to benefit from it,
ima koristi od njega,
we opened up the entire data set
otvorili cijeli skup podataka
research community for free.
to nourish our computer brain,
da opskrbimo mozgove naših računala,
to the algorithms themselves.
same algoritme.
of information provided by ImageNet
s ImageNet-a
of machine learning algorithms
algoritama za strojno učenje
Geoff Hinton, and Yann LeCun
Geoff Hintona i Yann LeCuna
of billions of highly connected neurons,
od milijardi vrlo povezanih neurona,
neuronskih mreža
čak milijuni čvorova
or even millions of nodes
hijerarhijskim slojevima
to train our object recognition model,
u učenju prepoznavanja modela,
mnoštvom podataka s ImageNet-a
to train such a humongous model,
kako bi istrenirao ove ogromne modele,
koji nitko nije očekivao.
in object recognition.
rezultata u prepoznavanju objekata.
a boy and a teddy bear;
dječaka i medvjedića;
in the background;
u pozadini;
railings, a lamppost, and so on.
ograde, lampe itd.
is not so confident about what it sees,
nije sigurno što vidi,
instead of committing too much,
is remarkable at telling us
algoritam nam besprijekorno kaže
of Google Street View images
Google Street View prikaza
really interesting:
also correlate well
koreliraju također sa
Je li to, to?
or even surpassed human capabilities?
ili čak prestiglo u našim sposobnostima?
the computer to see objects.
računalo da vidi objekte.
learning to utter a few nouns.
učite reći nekoliko imenica.
milestone will be hit,
biti dosegnuto,
to communicate in sentences.
this is a cat in the picture,
kako je mačka na slici,
telling us this is a cat lying on a bed.
koja govori da mačka leži na krevetu.
to see a picture and generate sentences,
da vidi sliku i stvori rečenice,
and machine learning algorithm
i algoritama strojnog učenja
from both pictures
učiti i iz slika
vision and language,
vid i jezik,
that connects parts of visual things
koji spaja vidljive dijelove
computer vision models
modela računalnog vida
a human-like sentence
rečenicu sličnu ljudskoj
what the computer says
što računalo kaže
at the beginning of this talk.
na početku govora.
next to an elephant.
pored slona.
of an airport runway.
avionske piste.
to improve our algorithms,
unaprijediti naše algoritme,
na krevetu u deci.
on a bed in a blanket.
previše mačaka,
too many cats,
might look like a cat.
izgledati kao mačka.
is holding a baseball bat.
drži bejzbolsku palicu.
it confuses it with a baseball bat.
pomiješat će je s bejzbolskom palicom.
down a street next to a building.
niz ulicu pored zgrade.
to the computers.
neke osnove umjetnosti.
in a field of grass.
u polju trave.
the stunning beauty of nature
prekrasnoj ljepoti prirode
from three to 13 and far beyond.
od treće do trinaeste godine, i dalje.
of the boy and the cake again.
slikom dječaka i kolača.
the computer to see objects
da vidi objekte
when seeing a picture.
onoga što je na slici.
at a table with a cake.
za stolom s kolačem.
na ovoj slici
to this picture
da je to poseban talijanski kolač
is that this is a special Italian cake
nakon putovanja u Sydney,
after a trip to Sydney,
at that moment.
u ovom trenu.
extra pairs of tireless eyes
dodatan par neumornih očiju
and take care of patients.
i pobrinuti se za pacijenta.
and safer on the road.
i sigurnije na putu.
to save the trapped and wounded.
kako bi spasili zatočene i ozlijeđene.
bolje materijale,
better materials,
with the help of the machines.
uz pomoć uređaja.
to the machines.
vid uređajima.
won't be the only ones
neće biti jedino
for their intelligence,
zbog njihove inteligencije,
in ways that we cannot even imagine.
na načine koje ne možemo zamisliti.
for Leo and for the world.
za Lea i za svijet.
ABOUT THE SPEAKER
Fei-Fei Li - Computer scientist
As Director of Stanford’s Artificial Intelligence Lab and Vision Lab, Fei-Fei Li is working to solve AI’s trickiest problems -- including image recognition, learning and language processing.
Why you should listen
Using algorithms built on machine learning methods such as neural network models, the Stanford Artificial Intelligence Lab led by Fei-Fei Li has created software capable of recognizing scenes in still photographs -- and accurately describing them using natural language.
Li’s work with neural networks and computer vision (with Stanford’s Vision Lab) marks a significant step forward for AI research, and could lead to applications ranging from more intuitive image searches to robots able to make autonomous decisions in unfamiliar situations.
Fei-Fei was honored as one of Foreign Policy's 2015 Global Thinkers.