TEDxCERN

Sinan Aral: How we can protect truth in the age of misinformation


Fake news can sway elections, tank economies and sow discord in everyday life. Data scientist Sinan Aral demystifies how and why it spreads so quickly -- citing one of the largest studies on misinformation -- and identifies five strategies to help us unweave the tangled web between true and false.

00:13
So, on April 23 of 2013, the Associated Press put out the following tweet on Twitter. It said, "Breaking news: Two explosions at the White House and Barack Obama has been injured." This tweet was retweeted 4,000 times in less than five minutes, and it went viral thereafter.

00:40
Now, this tweet wasn't real news put out by the Associated Press. In fact, it was false news, or fake news, that was propagated by Syrian hackers who had infiltrated the Associated Press Twitter handle. Their purpose was to disrupt society, but they disrupted much more, because automated trading algorithms immediately seized on the sentiment of this tweet and began trading based on the potential that the president of the United States had been injured or killed in this explosion. And as they started tweeting, they immediately sent the stock market crashing, wiping out 140 billion dollars in equity value in a single day.

01:25
Robert Mueller, special counsel prosecutor in the United States, issued indictments against three Russian companies and 13 Russian individuals on a conspiracy to defraud the United States by meddling in the 2016 presidential election. And the story this indictment tells is the story of the Internet Research Agency, the shadowy arm of the Kremlin on social media. During the presidential election alone, the Internet Research Agency's efforts reached 126 million people on Facebook in the United States, issued three million individual tweets and 43 hours' worth of YouTube content. All of which was fake -- misinformation designed to sow discord in the US presidential election.

02:20
A recent study by Oxford University showed that in the recent Swedish elections, one third of all of the information spreading on social media about the election was fake or misinformation. In addition, these types of social-media misinformation campaigns can spread what has been called "genocidal propaganda," for instance against the Rohingya in Burma, triggering mob killings in India.

02:49
We studied fake news and began studying it before it was a popular term. And we recently published the largest-ever longitudinal study of the spread of fake news online on the cover of "Science" in March of this year.

03:06
We studied all of the verified true and false news stories that ever spread on Twitter, from its inception in 2006 to 2017. And when we studied this information, we studied news stories that had been verified by six independent fact-checking organizations, so we knew which stories were true and which stories were false. We could measure their diffusion, the speed of their diffusion, the depth and breadth of their diffusion, how many people became entangled in this information cascade and so on.

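A minimal sketch, not the study's actual code, of how the diffusion measures just described -- the size, depth and breadth of a single retweet cascade -- could be computed. It assumes the cascade is given as (parent, child) retweet pairs rooted at the original poster, and uses networkx for the graph bookkeeping.

```python
# Sketch: diffusion metrics for one retweet cascade.
import networkx as nx
from collections import Counter

def cascade_metrics(root, edges):
    g = nx.DiGraph(edges)
    g.add_node(root)  # handles the case of a tweet with no retweets
    # distance of every reached user from the original tweet
    depths = nx.single_source_shortest_path_length(g, root)
    size = len(depths)                                 # users entangled in the cascade
    depth = max(depths.values())                       # longest unbroken retweet chain
    breadth = max(Counter(depths.values()).values())   # widest single level of the tree
    return {"size": size, "depth": depth, "breadth": breadth}

# Example: A is retweeted by B and C; D retweets B.
print(cascade_metrics("A", [("A", "B"), ("A", "C"), ("B", "D")]))
# -> {'size': 4, 'depth': 2, 'breadth': 2}
```
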
03:40
And what we did in this paper was we compared the spread of true news to the spread of false news. And here's what we found. We found that false news diffused further, faster, deeper and more broadly than the truth in every category of information that we studied, sometimes by an order of magnitude. And in fact, false political news was the most viral. It diffused further, faster, deeper and more broadly than any other type of false news.

04:09
When we saw this, we were at once worried but also curious. Why? Why does false news travel so much further, faster, deeper and more broadly than the truth?

04:20
The first hypothesis that we came up with was, "Well, maybe people who spread false news have more followers or follow more people, or tweet more often, or maybe they're more often 'verified' users of Twitter, with more credibility, or maybe they've been on Twitter longer." So we checked each one of these in turn. And what we found was exactly the opposite. False-news spreaders had fewer followers, followed fewer people, were less active, less often "verified" and had been on Twitter for a shorter period of time. And yet, false news was 70 percent more likely to be retweeted than the truth, controlling for all of these and many other factors.

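A hedged sketch of the kind of model behind a claim like "more likely to be retweeted, controlling for all of these and many other factors": a logistic regression of whether a story is retweeted on its falsehood plus the spreader covariates just listed. The file name and column names are illustrative assumptions, not the paper's.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical table: one row per story, with the covariates named in the talk.
df = pd.read_csv("cascades.csv")

model = smf.logit(
    "retweeted ~ is_false + followers + followees + tweets_per_day"
    " + verified + account_age_days",
    data=df,
).fit()

# exp(coefficient) on is_false is the odds ratio for falsehood;
# a value near 1.7 would correspond to "70 percent more likely."
print(np.exp(model.params["is_false"]))
```
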
05:00
So we had to come up with other explanations. And we devised what we called a "novelty hypothesis." So if you read the literature, it is well known that human attention is drawn to novelty, things that are new in the environment. And if you read the sociology literature, you know that we like to share novel information. It makes us seem like we have access to inside information, and we gain in status by spreading this kind of information.

05:29
So what we did was we measured the novelty of an incoming true or false tweet, compared to the corpus of what that individual had seen in the 60 days prior on Twitter. But that wasn't enough, because we thought to ourselves, "Well, maybe false news is more novel in an information-theoretic sense, but maybe people don't perceive it as more novel."

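One way to make "novel in an information-theoretic sense" concrete, offered as a minimal sketch rather than the study's actual method: score an incoming tweet by the KL divergence between its word distribution and the distribution of everything the user saw in the prior 60 days.

```python
# Sketch: unigram KL-divergence novelty score with additive smoothing.
from collections import Counter
import math

def kl_novelty(tweet, prior_corpus, alpha=1e-6):
    tweet_counts = Counter(tweet.lower().split())
    prior_counts = Counter(" ".join(prior_corpus).lower().split())
    vocab = set(tweet_counts) | set(prior_counts)
    t_total = sum(tweet_counts.values()) + alpha * len(vocab)
    p_total = sum(prior_counts.values()) + alpha * len(vocab)
    kl = 0.0
    for w in vocab:
        t = (tweet_counts[w] + alpha) / t_total
        p = (prior_counts[w] + alpha) / p_total
        kl += t * math.log(t / p)
    return kl  # higher = more novel relative to what the user has already seen

print(kl_novelty("explosions at the white house",
                 ["the market rallied today", "earnings beat estimates"]))
```
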
05:53
So to understand people's perceptions of false news, we looked at the information and the sentiment contained in the replies to true and false tweets. And what we found was that across a bunch of different measures of sentiment -- surprise, disgust, fear, sadness, anticipation, joy and trust -- false news exhibited significantly more surprise and disgust in the replies to false tweets. And true news exhibited significantly more anticipation, joy and trust in reply to true tweets. The surprise corroborates our novelty hypothesis. This is new and surprising, and so we're more likely to share it.

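A minimal, lexicon-based sketch of how the emotions named above could be scored in reply text. `EMOTION_LEXICON` stands in for a word-to-emotion mapping such as the NRC emotion lexicon; the three entries shown are illustrative only.

```python
from collections import Counter

EMOTION_LEXICON = {
    "explosion": {"fear", "surprise"},
    "injured": {"fear", "sadness"},
    "hoax": {"disgust", "surprise"},
}

def emotion_profile(replies):
    """Return the share of each emotion among lexicon hits in the replies."""
    counts = Counter()
    for reply in replies:
        for word in reply.lower().split():
            for emotion in EMOTION_LEXICON.get(word, ()):
                counts[emotion] += 1
    total = sum(counts.values()) or 1
    return {e: c / total for e, c in counts.items()}

print(emotion_profile(["What an explosion, is Obama injured?",
                       "This is a hoax."]))
```
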
06:43
At the same time, there was congressional testimony in front of both houses of Congress in the United States, looking at the role of bots in the spread of misinformation. So we looked at this too -- we used multiple sophisticated bot-detection algorithms to find the bots in our data and to pull them out. So we pulled them out, we put them back in and we compared what happens to our measurement. And what we found was that, yes indeed, bots were accelerating the spread of false news online, but they were accelerating the spread of true news at approximately the same rate. Which means bots are not responsible for the differential diffusion of truth and falsity online. We can't abdicate that responsibility, because we, humans, are responsible for that spread.

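A sketch of the robustness check described here: recompute the true-versus-false comparison with suspected bots removed, then with them included, and see whether the gap changes. The input files, column names and the 0.5 threshold are assumptions, and the bot detector itself is treated as a black box.

```python
import pandas as pd

cascades = pd.read_csv("cascades.csv")       # hypothetical: one row per cascade
bot_scores = pd.read_csv("bot_scores.csv")   # hypothetical: user_id, bot_probability

bots = set(bot_scores.loc[bot_scores.bot_probability > 0.5, "user_id"])

def mean_size_by_veracity(df):
    """Average cascade size, split by whether the story was false."""
    return df.groupby("is_false")["cascade_size"].mean()

with_bots = mean_size_by_veracity(cascades)
without_bots = mean_size_by_veracity(cascades[~cascades.origin_user.isin(bots)])

# If the true-vs-false gap is similar in both tables, bots are not
# driving the differential diffusion.
print(with_bots, without_bots, sep="\n")
```
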
07:34
Now, everything that I have told you so far, unfortunately for all of us, is the good news. The reason is that it's about to get a whole lot worse. And two specific technologies are going to make it worse.

07:52
We are going to see the rise of a tremendous wave of synthetic media: fake video, fake audio that is very convincing to the human eye. And this will be powered by two technologies.

08:06
The first of these is known as "generative adversarial networks." This is a machine-learning model with two networks: a discriminator, whose job it is to determine whether something is true or false, and a generator, whose job it is to generate synthetic media. So the generator generates synthetic video or audio, and the discriminator tries to tell, "Is this real or is this fake?" And in fact, it is the job of the generator to maximize the likelihood that it will fool the discriminator into thinking the synthetic video and audio that it is creating is actually true. Imagine a machine in a hyperloop, trying to get better and better at fooling us.

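A minimal GAN training loop, mirroring the two-network description above: a generator tries to produce samples the discriminator mistakes for real, while the discriminator learns to tell real from generated. PyTorch is an assumed choice, and the dimensions and data are toy placeholders, nothing that would produce convincing video or audio.

```python
import torch
import torch.nn as nn

noise_dim, data_dim = 16, 64
G = nn.Sequential(nn.Linear(noise_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, data_dim)          # stand-in for features of real media
    fake = G(torch.randn(32, noise_dim))      # generator's synthetic samples

    # Discriminator step: label real as 1, generated as 0.
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: maximize the chance the discriminator calls its output real.
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```
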
08:51
This, combined with the second technology -- essentially the democratization of artificial intelligence, the ability for anyone without any background in artificial intelligence or machine learning to deploy these kinds of algorithms to generate synthetic media -- makes it ultimately so much easier to create videos.

09:14
The White House issued a false, doctored video of a journalist interacting with an intern who was trying to take his microphone. They removed frames from this video in order to make his actions seem more punchy. And when videographers and stunt men and women were interviewed about this type of technique, they said, "Yes, we use this in the movies all the time to make our punches and kicks look more choppy and more aggressive." They then put out this video and partly used it as justification to revoke the press pass of Jim Acosta, the reporter, from the White House. And CNN had to sue to have that press pass reinstated.

10:00
There are about five different paths that I can think of that we can follow to try and address some of these very difficult problems today. Each one of them has promise, but each one of them has its own challenges.

10:15
The first one is labeling. Think about it this way: when you go to the grocery store to buy food to consume, it's extensively labeled. You know how many calories it has, how much fat it contains -- and yet when we consume information, we have no labels whatsoever. What is contained in this information? Is the source credible? Where is this information gathered from? We have none of that information when we are consuming information. That is a potential avenue, but it comes with its challenges. For instance, who gets to decide, in society, what's true and what's false? Is it the governments? Is it Facebook? Is it an independent consortium of fact-checkers? And who's checking the fact-checkers?

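Purely as an illustration of what a "nutrition label" for information might record, here is a hypothetical data structure answering the questions above (what is in it, is the source credible, where was it gathered from). None of these fields come from an existing standard.

```python
from dataclasses import dataclass, field

@dataclass
class ContentLabel:
    source: str                                           # who published it
    source_credibility: float                             # e.g., a 0-to-1 rating
    provenance: list[str] = field(default_factory=list)   # where it was gathered from
    fact_check_verdict: str = "unreviewed"                 # "true", "false", "mixed", ...
    fact_checkers: list[str] = field(default_factory=list) # who checked it

label = ContentLabel(source="@AP", source_credibility=0.97,
                     provenance=["wire report"],
                     fact_check_verdict="true",
                     fact_checkers=["snopes.com", "politifact.com"])
print(label)
```
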
11:02
Another potential avenue is incentives. We know that during the US presidential election there was a wave of misinformation that came from Macedonia that didn't have any political motive but instead had an economic motive. And this economic motive existed because false news travels so much farther, faster and more deeply than the truth, and you can earn advertising dollars as you garner eyeballs and attention with this type of information. But if we can depress the spread of this information, perhaps it would reduce the economic incentive to produce it at all in the first place.

11:40
Third, we can think about regulation, and certainly, we should think about this option. In the United States, currently, we are exploring what might happen if Facebook and others are regulated. While we should consider things like regulating political speech, labeling the fact that it's political speech, making sure foreign actors can't fund political speech, it also has its own dangers. For instance, Malaysia just instituted a six-year prison sentence for anyone found spreading misinformation. And in authoritarian regimes, these kinds of policies can be used to suppress minority opinions and to continue to extend repression.

12:24
The fourth possible option is transparency. We want to know how Facebook's algorithms work. How does the data combine with the algorithms to produce the outcomes that we see? We want them to open the kimono and show us exactly the inner workings of how Facebook is working. And if we want to know social media's effect on society, we need scientists, researchers and others to have access to this kind of information. But at the same time, we are asking Facebook to lock everything down, to keep all of the data secure. So, Facebook and the other social media platforms are facing what I call a transparency paradox. We are asking them, at the same time, to be open and transparent and, simultaneously, secure. This is a very difficult needle to thread, but they will need to thread this needle if we are to achieve the promise of social technologies while avoiding their peril.

13:24
The final thing that we could think about is algorithms and machine learning: technology devised to root out and understand fake news, how it spreads, and to try and dampen its flow.

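A minimal sketch of this last path -- a machine-learning classifier trained on stories already labeled true or false -- with the caveat the talk goes on to make: those labels ultimately come from humans in the loop. The two training examples and their labels are toy placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["Two explosions at the White House and Barack Obama is injured",
         "The Associated Press confirms its Twitter account was hacked"]
labels = [1, 0]  # 1 = false story, 0 = true story (toy labels from human fact-checkers)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

# Probability that a new headline looks like the "false" class.
print(clf.predict_proba(["Breaking: explosions reported near the White House"]))
```
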
13:37
Humans have to be in the loop of this technology, because we can never escape that underlying any technological solution or approach is a fundamental ethical and philosophical question about how we define truth and falsity, to whom we give the power to define truth and falsity, which opinions are legitimate, which type of speech should be allowed and so on. Technology is not a solution for that. Ethics and philosophy are a solution for that.

14:10
Nearly every theory of human decision making, human cooperation and human coordination has some sense of the truth at its core. But with the rise of fake news, the rise of fake video, the rise of fake audio, we are teetering on the brink of the end of reality, where we cannot tell what is real from what is fake. And that's potentially incredibly dangerous.

14:38
We have to be vigilant in defending the truth against misinformation -- with our technologies, with our policies and, perhaps most importantly, with our own individual responsibilities, decisions, behaviors and actions.

14:57
Thank you very much.

14:59
(Applause)

Translated by Ivana Korom
Reviewed by Krystian Aparta
