ABOUT THE SPEAKER

Fei-Fei Li - Computer scientist
As Director of Stanford’s Artificial Intelligence Lab and Vision Lab, Fei-Fei Li is working to solve AI’s trickiest problems -- including image recognition, learning and language processing.

Why you should listen

Using algorithms built on machine learning methods such as neural network models, the Stanford Artificial Intelligence Lab led by Fei-Fei Li has created software capable of recognizing scenes in still photographs -- and accurately describing them using natural language.

Li’s work with neural networks and computer vision (with Stanford’s Vision Lab) marks a significant step forward for AI research, and could lead to applications ranging from more intuitive image searches to robots able to make autonomous decisions in unfamiliar situations.

Fei-Fei was honored as one of Foreign Policy's 2015 Global Thinkers.

More profile about the speaker
Fei-Fei Li | Speaker | TED.com

TED2015

Fei-Fei Li: How we're teaching computers to understand pictures


Filmed: 2015-03-17

Readability: 4.5

2,702,344 views

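A small child can look at a photo and identify simple elements such as "cat," "book," or "chair." Computers have now become smart enough to do the same. What comes next? In this thrilling talk, computer vision expert Fei-Fei Li describes the state of the art and the road ahead, including the database of 15 million images built to "teach" computers to understand pictures.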


00:14

Let me show you something.

00:18

(Video) Girl: Okay, that's a cat sitting in a bed.

00:22

The boy is petting the elephant.

00:26

Those are people that are going on an airplane.

00:30

That's a big airplane.

00:33

Fei-Fei Li: This is a three-year-old child

00:35

describing what she sees in a series of photos.

00:39

She might still have a lot to learn about this world,

00:42

but she's already an expert at one very important task:

00:46

to make sense of what she sees.

00:50

Our society is more technologically advanced than ever.

00:54

We send people to the moon, we make phones that talk to us

00:58

or customize radio stations that can play only music we like.

01:03

Yet, our most advanced machines and computers

01:07

still struggle at this task.

01:09

So I'm here today to give you a progress report

01:13

on the latest advances in our research in computer vision,

01:17

one of the most frontier and potentially revolutionary

01:21

technologies in computer science.

01:24

Yes, we have prototyped cars that can drive by themselves,

01:29

but without smart vision, they cannot really tell the difference

01:33

between a crumpled paper bag on the road, which can be run over,

01:37

and a rock that size, which should be avoided.

01:41

We have made fabulous megapixel cameras,

01:44

but we have not delivered sight to the blind.

01:48

Drones can fly over massive land,

01:51

but don't have enough vision technology

01:53

to help us to track the changes of the rainforests.

01:57

Security cameras are everywhere,

02:00

but they do not alert us when a child is drowning in a swimming pool.

02:06

Photos and videos are becoming an integral part of global life.

02:11

They're being generated at a pace that's far beyond what any human,

02:15

or teams of humans, could hope to view,

02:18

and you and I are contributing to that at this TED.

02:22

Yet our most advanced software is still struggling at understanding

02:27

and managing this enormous content.

02:31

So in other words, collectively as a society,

02:36

we're very much blind,

02:38

because our smartest machines are still blind.

02:43

"Why is this so hard?" you may ask.

02:46

Cameras can take pictures like this one

02:49

by converting lights into a two-dimensional array of numbers

02:53

known as pixels,

02:54

but these are just lifeless numbers.

02:57

They do not carry meaning in themselves.
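The point about pixels can be made concrete with a minimal Python sketch. It is not from the talk: the 4x4 grayscale "photo" and the helper `mean_brightness` are made up for illustration. A program can compute statistics over the array, but nothing in the numbers says what the picture shows.

```python
# A hypothetical 4x4 grayscale "photo": each number is one pixel's
# brightness, from 0 (black) to 255 (white). The values are invented.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]


def mean_brightness(pixels):
    """Average of all pixel values: a statistic, not an interpretation."""
    values = [value for row in pixels for value in row]
    return sum(values) / len(values)


height, width = len(image), len(image[0])
print(f"{height}x{width} array of numbers, mean brightness {mean_brightness(image)}")
```

The camera's job ends at producing `image`; deciding that the array depicts a cat is the understanding step the rest of the talk is about.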

03:00

Just like to hear is not the same as to listen,

03:04

to take pictures is not the same as to see,

03:08

and by seeing, we really mean understanding.

03:13

In fact, it took Mother Nature 540 million years of hard work

03:19

to do this task,

03:21

and much of that effort

03:23

went into developing the visual processing apparatus of our brains,

03:28

not the eyes themselves.

03:31

So vision begins with the eyes,

03:33

but it truly takes place in the brain.

03:38

So for 15 years now, starting from my Ph.D. at Caltech

03:43

and then leading Stanford's Vision Lab,

03:46

I've been working with my mentors, collaborators and students

03:50

to teach computers to see.

03:54

Our research field is called computer vision and machine learning.

03:57

It's part of the general field of artificial intelligence.

04:03

So ultimately, we want to teach the machines to see just like we do:

04:08

naming objects, identifying people, inferring 3D geometry of things,

04:13

understanding relations, emotions, actions and intentions.

04:19

You and I weave together entire stories of people, places and things

04:25

the moment we lay our gaze on them.

04:28

The first step towards this goal is to teach a computer to see objects,

04:34

the building block of the visual world.

04:37

In its simplest terms, imagine this teaching process

04:42

as showing the computers some training images

04:45

of a particular object, let's say cats,

04:48

and designing a model that learns from these training images.

04:53

How hard can this be?

04:55

After all, a cat is just a collection of shapes and colors,

04:59

and this is what we did in the early days of object modeling.

05:03

We'd tell the computer algorithm in a mathematical language

05:07

that a cat has a round face, a chubby body,

05:10

two pointy ears, and a long tail,

05:12

and that looked all fine.

05:14

But what about this cat?

05:16

(Laughter)

05:18

It's all curled up.

05:19

Now you have to add another shape and viewpoint to the object model.

05:24

But what if cats are hidden?

05:27

What about these silly cats?

05:31

Now you get my point.

05:33

Even something as simple as a household pet

05:36

can present an infinite number of variations to the object model,

05:41

and that's just one object.

05:44

So about eight years ago,

05:47

a very simple and profound observation changed my thinking.

05:53

No one tells a child how to see,

05:56

especially in the early years.

05:58

They learn this through real-world experiences and examples.

06:03

If you consider a child's eyes

06:06

as a pair of biological cameras,

06:08

they take one picture about every 200 milliseconds,

06:12

the average time an eye movement is made.

06:15

So by age three, a child would have seen hundreds of millions of pictures

06:21

of the real world.

06:23

That's a lot of training examples.
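The "hundreds of millions" figure can be checked with a back-of-the-envelope calculation. The sketch below uses the 200-millisecond interval from the talk; the 12 waking hours per day is an assumption added here, not something the speaker states, and the function name `pictures_seen` is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
GLANCE_INTERVAL_S = 0.2  # one "picture" about every 200 milliseconds (from the talk)


def pictures_seen(years, waking_hours_per_day):
    """Rough count of eye-movement 'snapshots' taken while awake."""
    waking_fraction = waking_hours_per_day / 24
    return years * SECONDS_PER_YEAR * waking_fraction / GLANCE_INTERVAL_S


# Assumption: the child is awake about 12 hours a day.
estimate = pictures_seen(years=3, waking_hours_per_day=12)
print(f"about {estimate / 1e6:.0f} million pictures by age three")
```

Under that assumption the estimate comes to roughly 237 million; even with eyes open around the clock it would stay under half a billion, so "hundreds of millions" holds either way.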

06:26

So instead of focusing solely on better and better algorithms,

06:32

my insight was to give the algorithms the kind of training data

06:37

that a child was given through experiences

06:40

in both quantity and quality.

06:44

Once we know this,

06:46

we knew we needed to collect a data set

06:49

that has far more images than we have ever had before,

06:54

perhaps thousands of times more,

06:56

and together with Professor Kai Li at Princeton University,

07:00

we launched the ImageNet project in 2007.

07:05

Luckily, we didn't have to mount a camera on our head

07:09

and wait for many years.

07:11

We went to the Internet,

07:12

the biggest treasure trove of pictures that humans have ever created.

07:17

We downloaded nearly a billion images

07:20

and used crowdsourcing technology like the Amazon Mechanical Turk platform

07:25

to help us to label these images.

07:28

At its peak, ImageNet was one of the biggest employers

07:33

of the Amazon Mechanical Turk workers:

07:36

together, almost 50,000 workers

07:40

from 167 countries around the world

07:44

helped us to clean, sort and label

07:48

nearly a billion candidate images.

07:52

That was how much effort it took

07:55

to capture even a fraction of the imagery

07:59

a child's mind takes in in the early developmental years.

08:04

In hindsight, this idea of using big data

08:08

to train computer algorithms may seem obvious now,

08:12

but back in 2007, it was not so obvious.

08:16

We were fairly alone on this journey for quite a while.

08:20

Some very friendly colleagues advised me to do something more useful for my tenure,

08:25

and we were constantly struggling for research funding.

08:29

Once, I even joked to my graduate students

08:32

that I would just reopen my dry cleaner's shop to fund ImageNet.

08:36

After all, that's how I funded my college years.

08:41

So we carried on.

08:43

In 2009, the ImageNet project delivered

08:46

a database of 15 million images

08:50

across 22,000 classes of objects and things

08:55

organized by everyday English words.

08:58

In both quantity and quality,

09:01

this was an unprecedented scale.

09:04

As an example, in the case of cats,

09:08

we have more than 62,000 cats

09:11

of all kinds of looks and poses

09:15

and across all species of domestic and wild cats.

09:20

We were thrilled to have put together ImageNet,

09:23

and we wanted the whole research world to benefit from it,

09:27

so in the TED fashion, we opened up the entire data set

09:31

to the worldwide research community for free.

09:36

(Applause)

09:41

Now that we have the data to nourish our computer brain,

09:45

we're ready to come back to the algorithms themselves.

09:49

As it turned out, the wealth of information provided by ImageNet

09:54

was a perfect match to a particular class of machine learning algorithms

09:59

called convolutional neural network,

10:02

pioneered by Kunihiko Fukushima, Geoff Hinton, and Yann LeCun

10:07

back in the 1970s and '80s.

10:10

Just like the brain consists of billions of highly connected neurons,

10:16

a basic operating unit in a neural network

10:20

is a neuron-like node.

10:22

It takes input from other nodes

10:25

and sends output to others.

10:28

Moreover, these hundreds of thousands or even millions of nodes

10:32

are organized in hierarchical layers,

10:36

also similar to the brain.
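The description of nodes and layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is a tiny fully connected network rather than the convolutional network the talk refers to, the sigmoid activation is one common choice among several, and every weight, bias, and input value is made up.

```python
import math


def node(inputs, weights, bias):
    """One neuron-like node: a weighted sum of its inputs, squashed to (0, 1)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def layer(inputs, weight_rows, biases):
    """A layer is a group of nodes that all read the same inputs."""
    return [node(inputs, weights, bias) for weights, bias in zip(weight_rows, biases)]


# Hypothetical numbers: three input values feed two hidden nodes,
# whose outputs feed a single output node in the next layer.
pixels = [0.0, 0.5, 1.0]
hidden = layer(pixels, [[0.2, -0.4, 0.6], [-0.3, 0.8, 0.1]], [0.0, 0.1])
output = layer(hidden, [[1.0, -1.0]], [0.0])
print(hidden, output)
```

Each node takes input from the layer before it and sends its output to the layer after it; training consists of adjusting the weights and biases, which are the "parameters" counted in the model sizes quoted next.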

10:38

In a typical neural network we use to train our object recognition model,

10:43

it has 24 million nodes,

10:46

140 million parameters,

10:49

and 15 billion connections.

10:52

That's an enormous model.

10:55

Powered by the massive data from ImageNet

10:58

and the modern CPUs and GPUs to train such a humongous model,

11:04

the convolutional neural network

11:06

blossomed in a way that no one expected.

11:10

It became the winning architecture

11:12

to generate exciting new results in object recognition.

11:18

This is a computer telling us

11:20

this picture contains a cat

11:23

and where the cat is.

11:25

Of course there are more things than cats,

11:27

so here's a computer algorithm telling us

11:29

the picture contains a boy and a teddy bear;

11:32

a dog, a person, and a small kite in the background;

11:37

or a picture of very busy things

11:40

like a man, a skateboard, railings, a lamppost, and so on.

11:45

Sometimes, when the computer is not so confident about what it sees,

11:51

we have taught it to be smart enough

11:53

to give us a safe answer instead of committing too much,

11:57

just like we would do,

12:00

but other times our computer algorithm is remarkable at telling us

12:05

what exactly the objects are,

12:07

like the make, model, year of the cars.
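One way to picture the "safe answer" behavior is a back-off through a label hierarchy: if no specific label is confident enough, report a more general one. The sketch below is an assumption about how such a rule could look, not the method used in the research; the `PARENT` hierarchy, the scores, and the 0.8 threshold are all hypothetical.

```python
# Hypothetical hierarchy: each specific label points to a more general one.
PARENT = {
    "tabby cat": "cat",
    "siamese cat": "cat",
    "cat": "animal",
    "zebra": "animal",
}


def ancestors(label):
    """Yield the label itself, then each more general label above it."""
    while label is not None:
        yield label
        label = PARENT.get(label)


def safe_label(scores, threshold=0.8):
    """Return the most specific label whose pooled confidence clears the threshold."""
    best = max(scores, key=scores.get)
    for candidate in ancestors(best):
        # Confidence in a general label pools every specific label beneath it.
        pooled = sum(s for label, s in scores.items() if candidate in ancestors(label))
        if pooled >= threshold:
            return candidate
    return "unknown"


print(safe_label({"tabby cat": 0.9, "siamese cat": 0.05, "zebra": 0.05}))
print(safe_label({"tabby cat": 0.5, "siamese cat": 0.35, "zebra": 0.15}))
print(safe_label({"tabby cat": 0.4, "siamese cat": 0.2, "zebra": 0.4}))
```

With these made-up scores, the first call commits to the specific breed, the second retreats to "cat", and the third can only say "animal".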

12:10

We applied this algorithm to millions of Google Street View images

12:16

across hundreds of American cities,

12:19

and we have learned something really interesting:

12:22

first, it confirmed our common wisdom

12:25

that car prices correlate very well

12:28

with household incomes.

12:31

But surprisingly, car prices also correlate well

12:35

with crime rates in cities,

12:39

or voting patterns by zip codes.
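"Correlate" here can be read as a statistical correlation coefficient. The talk does not say which measure was used, so the sketch below assumes the standard Pearson coefficient, and the five data points are invented purely to show the computation; they are not results from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up values for five hypothetical zip codes (US dollars).
avg_car_price = [12_000, 18_000, 25_000, 31_000, 40_000]
median_income = [35_000, 48_000, 61_000, 70_000, 95_000]

r = pearson(avg_car_price, median_income)
print(f"r = {r:.3f}")
```

A value near 1 means the two quantities rise together, near -1 that one falls as the other rises, and near 0 that there is no linear relationship. A high correlation does not by itself say that one quantity causes the other.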

12:44

So wait a minute. Is that it?

12:46

Has the computer already matched or even surpassed human capabilities?

12:51

Not so fast.

12:53

So far, we have just taught the computer to see objects.

12:58

This is like a small child learning to utter a few nouns.

13:03

It's an incredible accomplishment,

13:05

but it's only the first step.

13:08

Soon, another developmental milestone will be hit,

13:12

and children begin to communicate in sentences.

13:15

So instead of saying this is a cat in the picture,

13:19

you already heard the little girl telling us this is a cat lying on a bed.

13:24

So to teach a computer to see a picture and generate sentences,

13:30

the marriage between big data and machine learning algorithm

13:34

has to take another step.

13:36

Now, the computer has to learn from both pictures

13:40

as well as natural language sentences

13:43

generated by humans.

13:47

Just like the brain integrates vision and language,

13:50

we developed a model that connects parts of visual things

13:56

like visual snippets

13:58

with words and phrases in sentences.

14:02

About four months ago,

14:04

we finally tied all this together

14:07

and produced one of the first computer vision models

14:11

that is capable of generating a human-like sentence

14:15

when it sees a picture for the first time.

14:18

Now, I'm ready to show you what the computer says

14:23

when it sees the picture

14:25

that the little girl saw at the beginning of this talk.

14:31

(Video) Computer: A man is standing next to an elephant.

14:36

A large airplane sitting on top of an airport runway.

14:41

FFL: Of course, we're still working hard to improve our algorithms,

14:45

and it still has a lot to learn.

14:47

(Applause)

14:51

And the computer still makes mistakes.

14:54

(Video) Computer: A cat lying on a bed in a blanket.

14:58

FFL: So of course, when it sees too many cats,

15:00

it thinks everything might look like a cat.

15:05

(Video) Computer: A young boy is holding a baseball bat.

15:08

(Laughter)

15:09

FFL: Or, if it hasn't seen a toothbrush, it confuses it with a baseball bat.

15:15

(Video) Computer: A man riding a horse down a street next to a building.

15:18

(Laughter)

15:20

FFL: We haven't taught Art 101 to the computers.

15:25

(Video) Computer: A zebra standing in a field of grass.

15:28

FFL: And it hasn't learned to appreciate the stunning beauty of nature

15:32

like you and I do.

15:34

So it has been a long journey.

15:37

To get from age zero to three was hard.

15:41

The real challenge is to go from three to 13 and far beyond.

15:47

Let me remind you with this picture of the boy and the cake again.

15:51

So far, we have taught the computer to see objects

15:55

or even tell us a simple story when seeing a picture.

15:59

(Video) Computer: A person sitting at a table with a cake.

16:03

FFL: But there's so much more to this picture

16:06

than just a person and a cake.

16:08

What the computer doesn't see is that this is a special Italian cake

16:12

that's only served during Easter time.

16:16

The boy is wearing his favorite t-shirt

16:19

given to him as a gift by his father after a trip to Sydney,

16:23

and you and I can all tell how happy he is

16:27

and what's exactly on his mind at that moment.

16:31

This is my son Leo.

16:34

On my quest for visual intelligence,

16:36

I think of Leo constantly

16:39

and the future world he will live in.

16:42

When machines can see,

16:44

doctors and nurses will have extra pairs of tireless eyes

16:48

to help them to diagnose and take care of patients.

16:53

Cars will run smarter and safer on the road.

16:57

Robots, not just humans,

17:00

will help us to brave the disaster zones to save the trapped and wounded.

17:05

We will discover new species, better materials,

17:09

and explore unseen frontiers with the help of the machines.

17:15

Little by little, we're giving sight to the machines.

17:19

First, we teach them to see.

17:22

Then, they help us to see better.

17:24

For the first time, human eyes won't be the only ones

17:29

pondering and exploring our world.

17:31

We will not only use the machines for their intelligence,

17:35

we will also collaborate with them in ways that we cannot even imagine.

17:41

This is my quest:

17:43

to give computers visual intelligence

17:46

and to create a better future for Leo and for the world.

17:51

Thank you.

17:53

(Applause)

Translated by Yasushi Aoki
Reviewed by Tadashi Koyama


THE ORIGINAL VIDEO ON TED.COM

Fei-Fei Li: How we're teaching computers to understand pictures | TED Talk | TED.com