[LIVEWORX THEME]

ANDIA WINSLOW: Welcome back to LiveWorx 2020. My name is Andia Winslow and I'm your host for today's livestream event. Already we've been hearing from some of the top technology thought leaders from the industrial world. Now, if you weren't able to attend those earlier sessions, be sure to check them out on the LiveWorx on-demand catalog, which is available to you from today through June 19th. This online library has over 100 additional sessions of great content. Also, keep an eye out for an upcoming email to gain access to our content archive for an entire year.

Now, let's kick off our next session: artificial emotional intelligence, or emotion AI; what it is and why it matters. Dr. Rana el Kaliouby is the CEO and co-founder of Affectiva. An MIT Media Lab spinoff, Affectiva created and defined the emotion AI category.
Software that can detect nuanced human emotions and complex cognitive states from the face and voice. Dr. el Kaliouby is a pioneer in emotion AI and the author of Girl Decoded: A Scientist's Quest to Reclaim Our Humanity by Bringing Emotional Intelligence to Technology. I am super excited to welcome Dr. Rana el Kaliouby.

RANA EL KALIOUBY: Thank you, Andia. It's a pleasure to be with you all today. I'm on a mission to humanize technology before it dehumanizes us, by building emotional intelligence and empathy into our devices and our technologies. And in doing so, my goal is to re-imagine human-computer interfaces as well as human-to-human connection.

We've all been catapulted into this universe where we're working virtually. That's how we're connecting with our team members. We're learning online (that's how my kids are learning). And we're connecting with friends and families virtually.
However, something is missing. It's not quite the same. All of our nonverbal signals are missing from these virtual environments.

AI is taking on roles that were traditionally done by humans, such as assisting with driving our cars, assisting with our health care, helping us be more productive, and perhaps even hiring your next co-worker. The problem is, we need a new social contract between humans and AI, one that is based on mutual and reciprocal trust. Of course, we need to trust in the AI, and there's a lot of conversation around that. But more importantly, AI needs to trust in us humans. After all, we don't always have a perfect track record of doing the right thing.

Unfortunately, there are already numerous examples where this trust has gone wrong.
A Twitter chatbot that turned racist overnight, self-driving cars getting into fatal accidents, and facial recognition technology that discriminates against certain populations, especially women of color.

To rebuild this trust, let's look at how humans do it. Every day we make thousands of decisions that involve trusting each other, both in our personal and our professional relationships. Sometimes this trust is based on legalese and terms and conditions. But more often, it's based on the implicit, nonverbal, subtle cues that we exchange with one another, with empathy at the core of building this trust.

Technology today has a lot of IQ (a lot of cognitive intelligence) but no EQ, no emotional intelligence. It's the missing component. So my entire career, I've been asking: what if technology could identify human emotions just as we can?
What if your computer could tell the difference between a smile and a smirk? Both involve the lower half of the face, but they have very different meanings. So how can we build AI that understands humans? The way to do this is to look at how humans do it.

Only 7% of how we communicate our mental states is based on the actual choice of words we use. 93% is nonverbal: 55% is split between our facial expressions and our gestures (I do a lot of these), and 38% is vocal intonation; how fast are you speaking, how much energy is in your voice?

A lot of my career has been focused on the face. It's a very powerful canvas for communicating human emotion. The science of facial emotions has existed for over 200 years. This guy, Duchenne, used to electrically stimulate our facial muscles to map how these facial muscles move. We don't do that anymore, thankfully.
And then in the late 1970s, Paul Ekman and his team published the Facial Action Coding System (FACS), an objective method of mapping each facial muscle movement to an action unit, a code. So, for example, when you smile (and try this with me), you're pulling the zygomaticus muscle, and that's the lip corner pull, or action unit 12. When you furrow or frown, that's action unit 4, the brow furrow. And it's typically an indicator of a negative emotion like confusion or anger.

There are about 45 of these facial muscles, and it takes about 100 hours of training to become a certified FACS coder, or face reader. Very laborious, very time intensive. And coding every minute of video takes about five minutes of watching the video in slow motion and saying, aha, I see an eyebrow raise or a squint. We don't need to do that anymore.
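To make the coding idea concrete, here is a minimal sketch. Only AU 12 (lip corner pull) and AU 4 (brow furrow) come from the talk; the toy valence rule and the table format are editorial simplifications, not Affectiva's actual mapping:

```python
# FACS-style lookup: action units (AUs) are numeric codes for individual
# facial muscle movements. AU 12 and AU 4 are the two named in the talk;
# the naive rule below is an illustrative simplification.
ACTION_UNITS = {
    4: "brow furrower",       # often signals confusion or anger
    12: "lip corner puller",  # the core of a smile (zygomaticus major)
}

def naive_valence(active_aus):
    """Toy rule: AU 12 alone suggests positive affect, AU 4 negative."""
    if 4 in active_aus:
        return "negative"
    if 12 in active_aus:
        return "positive"
    return "neutral"
```

A real coder (human or machine) works from dozens of AUs and their combinations; this only shows the shape of the code-to-meaning mapping.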
Instead, we use computer vision, machine learning, and deep learning to automatically train algorithms to detect these facial expressions. We use hundreds of thousands of examples of people smiling and smirking and furrowing their eyebrows to train these algorithms. The deep learning network is able to distill what is common between all these smiles and what's common between all these frowns, and that's how it learns.

To simplify it, the first step of the process is to triangulate where the face is and find facial landmarks like your eyebrows, your mouth, or your nose. Then you feed that region into a deep neural network, which distills what expressions are happening on the face and maps those onto a number of emotional and cognitive states: everything from joy, surprise, anger, and disgust to more complex states like fatigue, attention, cognitive overload, confusion, and more.
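The three stages described above can be sketched as follows. Every function here is a stub standing in for a trained model; the function names and the dummy outputs are assumptions for illustration, not Affectiva's SDK:

```python
# Sketch of the pipeline from the talk: locate the face, find landmarks,
# then classify the face region into emotional and cognitive states.
# All three stages are placeholders for trained models.

def detect_face(frame):
    """Stage 1: return a bounding box (x, y, w, h) for the face.
    Stub: pretends one face fills the whole frame."""
    h, w = len(frame), len(frame[0])
    return (0, 0, w, h)

def find_landmarks(frame, box):
    """Stage 2: locate key points such as eyebrows, nose, and mouth.
    Stub: returns fixed landmark names at the box center."""
    x, y, w, h = box
    center = (x + w // 2, y + h // 2)
    return {name: center
            for name in ("left_eyebrow", "right_eyebrow", "nose", "mouth")}

def classify_states(frame, box, landmarks):
    """Stage 3: a trained deep network would score the face crop.
    Stub: zeroed scores for the states named in the talk."""
    states = ("joy", "surprise", "anger", "disgust",
              "fatigue", "attention", "cognitive_overload", "confusion")
    return {s: 0.0 for s in states}

def analyze_frame(frame):
    box = detect_face(frame)
    landmarks = find_landmarks(frame, box)
    return classify_states(frame, box, landmarks)

# A tiny 2x2 "frame" of grayscale pixels is enough to exercise the stubs.
scores = analyze_frame([[0, 0], [0, 0]])
```

In a production system, each stub would be a model (face detector, landmark localizer, expression network), but the data flow between stages is the part the talk describes.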
In our work over the past number of years, we have amassed the world's largest emotion repository: 9.5 million face videos that we've collected, with everybody's opt-in and consent, in 90 countries around the world. This roughly translates to about 5 billion facial frames. It's by far the largest repository of actual emotional responses out there. And we use that data to train, but also validate, our algorithms.

There are so many applications of this technology that transform industries. My big vision is that in the next few years, we're going to see emotion AI become the de facto human-machine interface. Basically, we would be interacting with our devices just the way we interact with one another. Through conversation: we're already seeing that with conversational devices like Alexa and Siri. Through perception.
Again, we're already starting to see devices that have cameras on them. But perhaps most importantly, through empathy and emotional intelligence.

There are a lot of applications; I'm going to focus on a few. The first application is around quantifying how consumers emotionally engage with the products and brands around them. The way this works is we send out surveys to people and ask them to watch a piece of content. It could be an online video ad, a movie trailer, an actual TV show, or learning content. The idea is that we want to capture the emotional engagement and the emotional response people have to this content. We ask people to turn their cameras on (consent and opt-in are really important), and then we're able to capture these moment-by-moment responses.
Our technology is being used by 25% of the Fortune Global 500, as well as by leading market research firms, to quantify the emotional response that consumers and viewers have to their content. I thought it would be fun to actually show you one of the video ads that we have tested and have permission to share publicly. So let's watch together.

[MUSIC - TONY DALLARA, "COME PRIMA"]

[ITALIAN SINGING]

Uh, mom got there first. So, to date we have tested over 50,000 ads worldwide, and this particular ad scores in the ninetieth percentile, the top 10% of all of those ads. It garners very strong emotional engagement, which is our expressiveness score, and it garners a lot of smiles. You can see the moment-by-moment smile curve for everybody who watched that ad and from whom we were able to record data.
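The moment-by-moment smile curve described here is, in essence, the per-frame smile score averaged across all opted-in viewers. A minimal sketch with invented numbers (each row is one viewer, each column one time step; not Affectiva's actual metric code):

```python
# Each row: one viewer's smile probability over time, as a classifier
# might emit it. Values are made up for illustration.
viewers = [
    [0.1, 0.2, 0.8, 0.9],
    [0.0, 0.3, 0.7, 1.0],
    [0.2, 0.1, 0.6, 0.8],
]

def smile_curve(rows):
    """Average the smile score across viewers at each time step."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

curve = smile_curve(viewers)
# The rise toward the end of this toy curve is the kind of shape the talk
# calls a positive emotional journey culminating at the brand reveal.
```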
What is really fascinating about this particular ad is that if you compare the first time people saw it (the solid green line) with the second time people saw it (the dotted green line), you can actually see that people have memory of the ad. They're anticipating when it's really funny, and they're laughing even before that scene starts, which is exactly what you want. And, more importantly, we're taking the viewers on an emotional journey which culminates with a very positive response at the end, coinciding with the brand reveal. And, again, we know from our research that these kinds of metrics are very positive signals.

I love this particular example, because it shows how brands like Coca-Cola and Unilever are using this technology to really push online video advertising and become more inclusive and more progressive. But they have to do that in a very thoughtful way.
And they're able to capture these subconscious, visceral responses to their ads. We also know, from a lot of work we've done with these brands where we've correlated the emotional responses to actual consumer behavior, that this kind of emotional journey and these positive emotional responses correlate very highly with things like sales lift, purchase intent, and virality.

And because we've tested 50,000 ads worldwide, we're able to create benchmarks. So, for example, we know that in the US, pet care and baby care ads elicit the most enjoyment. Interestingly, in Canada, it's the cereal ads that elicit the most enjoyment. And, unfortunately, telecom ads elicit almost no enjoyment at all. They're pretty boring to watch.
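Benchmarking against a norms database, as described above, boils down to a percentile: what fraction of previously tested ads did this one outscore? A sketch with invented scores (not the actual benchmark methodology):

```python
def percentile_rank(score, norm_scores):
    """Percent of benchmark scores strictly below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)

# Ten made-up expressiveness scores standing in for the norms database.
norms = [0.2, 0.4, 0.5, 0.55, 0.6, 0.7, 0.75, 0.8, 0.85, 0.95]

rank = percentile_rank(0.9, norms)  # this ad beats 9 of the 10 norm ads
```

With real data, the norms would also be sliced by country and product category, which is how country-level findings like the pet-care and cereal comparisons are produced.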
00:12:32.010 --> 00:12:34.470 align:middle line:84% So this data is really, really insightful, 00:12:34.470 --> 00:12:36.510 align:middle line:84% and provides brands and marketers 00:12:36.510 --> 00:12:39.120 align:middle line:84% with novel insights that helps them 00:12:39.120 --> 00:12:41.400 align:middle line:84% make decisions around media spend 00:12:41.400 --> 00:12:44.510 align:middle line:84% as well as how to optimize their advertising content. 00:12:44.510 --> 00:12:46.140 align:middle line:84% And it's not just advertising, you 00:12:46.140 --> 00:12:50.803 align:middle line:84% can generate that same type of data with TV shows as well. 00:12:50.803 --> 00:12:52.470 align:middle line:84% And this is one of my favorite examples. 00:12:52.470 --> 00:12:56.430 align:middle line:84% It's a sitcom that we tested a while back for CBS. 00:12:56.430 --> 00:12:58.260 align:middle line:84% It was called Friends With Better Lives. 00:12:58.260 --> 00:12:59.610 align:middle line:90% I don't think it aired for long. 00:12:59.610 --> 00:13:01.120 align:middle line:90% It didn't do very well. 00:13:01.120 --> 00:13:03.810 align:middle line:84% But here, again, you're seeing the smile curve. 00:13:03.810 --> 00:13:06.690 align:middle line:84% And we superimpose the characters 00:13:06.690 --> 00:13:08.850 align:middle line:90% of the scene on that curve. 00:13:08.850 --> 00:13:11.490 align:middle line:84% And you can see that there are two particular characters, 00:13:11.490 --> 00:13:14.770 align:middle line:84% every time they show up they're just not funny. 00:13:14.770 --> 00:13:19.120 align:middle line:84% They're basically the trough in this smile curve. 00:13:19.120 --> 00:13:21.210 align:middle line:84% And, again, very, very interesting data 00:13:21.210 --> 00:13:23.730 align:middle line:84% that the producers were able to use 00:13:23.730 --> 00:13:27.780 align:middle line:90% to switch out these characters. 
Moving on from understanding the emotional engagement people have with content, to another area we're spending a lot of time in: the future of mobility and transportation.

When we first started working in this space, we were approached by a number of automakers around the world who wanted to repurpose our emotion-sensing technology for the car. To do that, we wanted to see how people behave in the car anyway, especially drivers. And so we set out to collect some data. A lot of us in the company were very skeptical: perhaps, because people knew there was going to be a camera on their dashboard, there wouldn't be any interesting emotional or cognitive responses. And we were wrong. So I'm going to show you some examples of video clips.
Again, these people knew that the camera was in their vehicle. They installed it, they consented, and yet we saw very interesting driving behaviors. In this particular case, a dad is driving the vehicle, and he's extremely drowsy. There are four levels of drowsiness, and he's basically asleep. You can also see that his toddler daughter is in the back seat. So very, very unsafe driving.

And I can certainly empathize with that. There were many times, back when we actually traveled, when I'd be back from a long trip, jet-lagged and exhausted, while driving my kids. So I can certainly relate to that. But that's an easy example for the technology to pick up. You can see here that the technology was able to detect things like eye closure, mouth open, and levels of drowsiness. And you can imagine how the car could interject in that case.
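One common drowsiness proxy is the fraction of time the eyes are closed over a rolling window (a PERCLOS-style measure). The talk mentions four levels of drowsiness; the threshold bands below are invented for illustration and are not Affectiva's actual calibration:

```python
def drowsiness_level(eye_closed_flags):
    """Map per-frame eye-closure flags (True = closed) to a level 1-4,
    based on the fraction of the window spent with eyes closed.
    Thresholds are illustrative assumptions."""
    perclos = sum(eye_closed_flags) / len(eye_closed_flags)
    if perclos < 0.10:
        return 1  # alert
    if perclos < 0.25:
        return 2  # mildly drowsy
    if perclos < 0.50:
        return 3  # drowsy
    return 4      # basically asleep; time for the car to interject

alert_driver = drowsiness_level([False] * 20)
sleeping_driver = drowsiness_level([True] * 15 + [False] * 5)
```

A deployed system would also fold in cues like mouth-open (yawning) and head nodding, as the transcript's eye-closure and mouth-open detections suggest.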
Here's another example. She's driving, and she's also texting while driving. But, oh, she has two phones in her hands, so she's very distracted. Her eyes are off the road. Her hands are off the wheel. You do not want to be driving next to this woman.

Again, that's an example the technology can pick up just by looking at her glance behavior and her head pose information, but also by combining those with things like object detection. We can detect that she has not only one, but two cell phones in her hands, and flag her as a very highly distracted driver. And, again, the car can intervene one way or another. You can imagine, if this were a Tesla that could jump into semi-autonomous mode, it could basically say: hang on a second, I'm going to be a better driver at this moment in time than you are; I'm taking control.
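The transcript names three cues that get fused here: glance behavior, head pose, and object detection (phones in hand). A toy fusion is sketched below; the weights, the cap on glance duration, and the intervention threshold are all invented for illustration:

```python
def distraction_score(eyes_off_road_s, head_turned, phones_detected):
    """Fuse three cues into a 0-1 distraction score.
    Weights and caps are illustrative assumptions."""
    score = 0.0
    score += min(eyes_off_road_s, 4.0) / 4.0 * 0.5  # sustained glance away
    score += 0.2 if head_turned else 0.0            # head pose off-axis
    score += 0.15 * min(phones_detected, 2)         # one, or even two, phones
    return round(score, 2)

def should_intervene(score, threshold=0.7):
    """E.g., hand control to a semi-autonomous mode."""
    return score >= threshold

# Eyes off the road, head turned, two phones in hand, as in the clip.
risky = distraction_score(4.0, True, 2)
```

The point of the sketch is the fusion: no single cue decides; the combination of glances, pose, and detected objects does.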
We have done a lot of research in collaboration with MIT's Advanced Vehicle Technology group, where we ran the first-ever large-scale study looking at driver behavior over time. In this particular case, you're looking at participant number 66. She's female, she's 22 years old, and all her Monday drives are concatenated in the Monday row, all her Tuesday drives in the Tuesday row, and so on. You can see that she's pretty tired during the week: there's a lot of yawning happening at the beginning of the day. And during the weekend, and also towards the end of the day, there are a lot more positive expressions.

They say a picture is worth a thousand words; I believe a video is worth even more. So I'll just show you examples of her AM versus PM driving behaviors. Again, these were all mined using our algorithms.
00:17:09.990 --> 00:17:12.930 align:middle line:84% So we basically invoked the yawn classifier 00:17:12.930 --> 00:17:15.060 align:middle line:84% to detect all of her yawning behaviors 00:17:15.060 --> 00:17:20.000 align:middle line:84% across all of her driving sessions. 00:17:20.000 --> 00:17:23.619 align:middle line:84% So you can see that she's often very exhausted in the morning. 00:17:23.619 --> 00:17:25.888 align:middle line:84% And you can imagine, if that's a recurring pattern, 00:17:25.888 --> 00:17:27.430 align:middle line:84% then the car already knows that she's 00:17:27.430 --> 00:17:30.340 align:middle line:90% going to come in exhausted. 00:17:30.340 --> 00:17:32.770 align:middle line:84% Maybe there are suggestions proactively 00:17:32.770 --> 00:17:37.690 align:middle line:84% that the car can make to the driver given her profile. 00:17:37.690 --> 00:17:42.720 align:middle line:84% Now contrast that to her PM driving behaviors. 00:17:42.720 --> 00:17:46.140 align:middle line:84% You can see that she's already looking much more awake. 00:17:46.140 --> 00:17:49.110 align:middle line:84% She's a lot more animated, a lot more smiles, a lot more 00:17:49.110 --> 00:17:50.520 align:middle line:90% positive expressions. 00:17:50.520 --> 00:17:54.300 align:middle line:84% And once again, if the car had her driver profile, 00:17:54.300 --> 00:17:56.940 align:middle line:84% it could customize and personalize 00:17:56.940 --> 00:18:02.170 align:middle line:84% the driving experience based on her emotional experiences. 00:18:02.170 --> 00:18:05.310 align:middle line:84% So we're working very closely with a number of automakers 00:18:05.310 --> 00:18:08.400 align:middle line:84% to re-imagine this transportation experience. 
00:18:08.400 --> 00:18:10.260 align:middle line:84% Not only in cars today where we can 00:18:10.260 --> 00:18:13.890 align:middle line:84% focus on the driver, but also vehicles of the future where 00:18:13.890 --> 00:18:17.910 align:middle line:84% we really have this vision of an in-cabin sensing solution that 00:18:17.910 --> 00:18:20.670 align:middle line:84% looks at the driver but also the other occupants in the vehicle 00:18:20.670 --> 00:18:21.510 align:middle line:90% as well. 00:18:21.510 --> 00:18:23.820 align:middle line:84% As well as other objects in the car. 00:18:23.820 --> 00:18:25.680 align:middle line:90% Is there a child left behind? 00:18:25.680 --> 00:18:27.600 align:middle line:90% Did you leave your phone behind? 00:18:27.600 --> 00:18:29.557 align:middle line:84% How many occupants are in the vehicle? 00:18:29.557 --> 00:18:30.390 align:middle line:90% What is their state? 00:18:30.390 --> 00:18:34.300 align:middle line:84% Can you personalize the music, the content, the lighting, 00:18:34.300 --> 00:18:37.260 align:middle line:90% and so on? 00:18:37.260 --> 00:18:39.110 align:middle line:84% But, of course, a lot of us are not 00:18:39.110 --> 00:18:40.610 align:middle line:84% spending a lot of time in our car 00:18:40.610 --> 00:18:43.310 align:middle line:84% anymore during this global pandemic. 00:18:43.310 --> 00:18:47.870 align:middle line:84% And, in fact, a lot of us are spending much more of our time 00:18:47.870 --> 00:18:52.250 align:middle line:84% on video conferences and on virtual events. 00:18:52.250 --> 00:18:55.700 align:middle line:84% Making human connection via technology is really hard. 00:18:55.700 --> 00:19:00.140 align:middle line:84% And it's primarily because the main method of communication 00:19:00.140 --> 00:19:01.430 align:middle line:90% is nonverbal. 
00:19:01.430 --> 00:19:05.070 align:middle line:84% And sometimes when we are on these virtual conferences 00:19:05.070 --> 00:19:09.140 align:middle line:84% and virtual events and video events, 00:19:09.140 --> 00:19:13.310 align:middle line:84% we're not really tapping into the energy and the expressions 00:19:13.310 --> 00:19:14.000 align:middle line:90% of the audience. 00:19:14.000 --> 00:19:16.220 align:middle line:90% I certainly feel that myself. 00:19:16.220 --> 00:19:19.220 align:middle line:84% I just launched my book Girl Decoded as I mentioned. 00:19:19.220 --> 00:19:21.890 align:middle line:84% And I had to pivot from a book tour 00:19:21.890 --> 00:19:24.650 align:middle line:84% where I was supposed to be traveling nonstop March, April, 00:19:24.650 --> 00:19:28.220 align:middle line:84% and May, to all of these virtual book tours and virtual book 00:19:28.220 --> 00:19:28.910 align:middle line:90% talks. 00:19:28.910 --> 00:19:32.840 align:middle line:84% And it's so different because in a live environment 00:19:32.840 --> 00:19:34.370 align:middle line:90% I can see the audience. 00:19:34.370 --> 00:19:36.710 align:middle line:84% I can see you all and I can riff off of your energy. 00:19:36.710 --> 00:19:38.870 align:middle line:84% And I can customize and personalize 00:19:38.870 --> 00:19:43.010 align:middle line:84% and adapt my content based on how you're engaging with me. 00:19:43.010 --> 00:19:46.190 align:middle line:84% When I do this virtually, it's often a one-way conversation 00:19:46.190 --> 00:19:49.010 align:middle line:84% or at least it feels to me like it's a one-way conversation. 00:19:49.010 --> 00:19:51.750 align:middle line:84% And I find it really painful and unsettling. 00:19:51.750 --> 00:19:55.820 align:middle line:84% But I think there's an opportunity to change that. 
00:19:55.820 --> 00:19:59.390 align:middle line:84% What if we integrated emotion AI 00:19:59.390 --> 00:20:02.300 align:middle line:84% as a way to aggregate audience response, 00:20:02.300 --> 00:20:05.360 align:middle line:84% capturing all of your facial expressions anonymously? 00:20:05.360 --> 00:20:08.070 align:middle line:84% I don't need to see everybody's faces. 00:20:08.070 --> 00:20:11.120 align:middle line:84% In fact, that would probably be overwhelming as I'm presenting. 00:20:11.120 --> 00:20:14.510 align:middle line:84% But if I was able to see a moment-by-moment trace of how 00:20:14.510 --> 00:20:17.042 align:middle line:90% positively or negatively 00:20:17.042 --> 00:20:18.500 align:middle line:84% you are engaging with this content, 00:20:18.500 --> 00:20:20.420 align:middle line:84% that would be very powerful information 00:20:20.420 --> 00:20:22.670 align:middle line:84% that would allow me to customize and get 00:20:22.670 --> 00:20:27.030 align:middle line:84% a sense of the level of engagement of the audience. 00:20:27.030 --> 00:20:28.900 align:middle line:84% And so I wanted to show you one example. 00:20:28.900 --> 00:20:31.390 align:middle line:84% Just from an internal team meeting on Zoom where 00:20:31.390 --> 00:20:34.240 align:middle line:90% we recorded the team's responses. 00:20:34.240 --> 00:20:37.360 align:middle line:84% And you're able to see here the aggregated response. 00:20:37.360 --> 00:20:40.180 align:middle line:84% Of course, there's different ways to visualize this data, 00:20:40.180 --> 00:20:42.070 align:middle line:84% but we're able to track everybody 00:20:42.070 --> 00:20:44.405 align:middle line:84% and we can aggregate people's smile responses. 00:20:44.405 --> 00:20:46.030 align:middle line:84% They were actually talking about-- this 00:20:46.030 --> 00:20:49.960 align:middle line:84% was early in March, just when the pandemic broke. 
00:20:49.960 --> 00:20:51.970 align:middle line:84% And we were talking about what's going 00:20:51.970 --> 00:20:54.287 align:middle line:90% to happen and work from home. 00:20:54.287 --> 00:20:55.870 align:middle line:84% And people were sharing their stories. 00:20:55.870 --> 00:20:58.190 align:middle line:84% And there was a lot of empathy and smiling behavior. 00:20:58.190 --> 00:21:00.190 align:middle line:90% So we're able to track that. 00:21:00.190 --> 00:21:02.890 align:middle line:84% Now compare that to a really flat, boring meeting 00:21:02.890 --> 00:21:04.600 align:middle line:84% where everybody's just not engaging. 00:21:04.600 --> 00:21:06.230 align:middle line:90% You can tell the difference. 00:21:06.230 --> 00:21:09.790 align:middle line:84% And we're able to quantify that data for real-time engagement, 00:21:09.790 --> 00:21:16.050 align:middle line:84% but also do that after the fact as post analytics as well. 00:21:16.050 --> 00:21:18.800 align:middle line:84% Similarly, in an online learning environment, 00:21:18.800 --> 00:21:21.650 align:middle line:84% what if a teacher could measure the emotional engagement 00:21:21.650 --> 00:21:24.920 align:middle line:84% of his or her students just as he or she would 00:21:24.920 --> 00:21:26.150 align:middle line:90% in a live classroom? 00:21:26.150 --> 00:21:28.070 align:middle line:84% That's what an awesome teacher does, right? 00:21:28.070 --> 00:21:30.710 align:middle line:84% You riff off of the engagement of the students 00:21:30.710 --> 00:21:33.710 align:middle line:84% and then you can personalize the learning experience 00:21:33.710 --> 00:21:37.620 align:middle line:84% and maximize learning outcomes as well. 00:21:37.620 --> 00:21:41.390 align:middle line:84% Similarly, there's applications in mental health as well. 
00:21:41.390 --> 00:21:44.090 align:middle line:84% Today when you walk into a doctor's office 00:21:44.090 --> 00:21:46.403 align:middle line:84% they don't ask you for your temperature 00:21:46.403 --> 00:21:48.320 align:middle line:84% or for your blood pressure; 00:21:48.320 --> 00:21:49.970 align:middle line:90% they just measure them. 00:21:49.970 --> 00:21:51.890 align:middle line:84% So what if doctors could objectively 00:21:51.890 --> 00:21:53.930 align:middle line:84% measure how you are feeling the way they 00:21:53.930 --> 00:21:56.090 align:middle line:90% measure other vital signs? 00:21:56.090 --> 00:21:58.430 align:middle line:84% Unfortunately, the gold standard in mental health 00:21:58.430 --> 00:22:01.230 align:middle line:84% is still on a scale from 1 to 10-- 00:22:01.230 --> 00:22:02.840 align:middle line:90% how depressed are you? 00:22:02.840 --> 00:22:04.160 align:middle line:90% How suicidal are you? 00:22:04.160 --> 00:22:05.810 align:middle line:90% How much pain are you in? 00:22:05.810 --> 00:22:08.780 align:middle line:84% And we can bring emotion AI and apply it 00:22:08.780 --> 00:22:12.620 align:middle line:84% in a way that brings objective measures of mental health 00:22:12.620 --> 00:22:15.190 align:middle line:90% conditions. 00:22:15.190 --> 00:22:17.760 align:middle line:84% One area that is very near and dear to my heart, 00:22:17.760 --> 00:22:21.010 align:middle line:84% and the very first application of emotion AI that I explored, 00:22:21.010 --> 00:22:22.300 align:middle line:90% is autism. 00:22:22.300 --> 00:22:24.760 align:middle line:84% Individuals on the autism spectrum struggle 00:22:24.760 --> 00:22:28.780 align:middle line:84% with reading and understanding non-verbal signals. 00:22:28.780 --> 00:22:32.620 align:middle line:84% They find the face, in particular, very overwhelming. 00:22:32.620 --> 00:22:34.300 align:middle line:84% And sometimes they avoid it altogether. 
00:22:34.300 --> 00:22:37.390 align:middle line:84% They avoid face and eye contact altogether. 00:22:37.390 --> 00:22:40.690 align:middle line:84% So we are partnered with a company called Brain Power. 00:22:40.690 --> 00:22:45.910 align:middle line:84% They use Google Glass and our technology to help individuals 00:22:45.910 --> 00:22:49.180 align:middle line:84% on the autism spectrum learn about these non-verbal signals 00:22:49.180 --> 00:22:51.550 align:middle line:90% in a very fun and gamified way. 00:22:51.550 --> 00:22:56.200 align:middle line:84% So I'm going to show you a short video demonstrating that. 00:22:56.200 --> 00:22:57.850 align:middle line:90% What do you see on-screen? 00:22:57.850 --> 00:22:59.090 align:middle line:90% Mom. 00:22:59.090 --> 00:23:02.128 align:middle line:84% 8-year-old Matthew Krieger has been diagnosed with autism. 00:23:02.128 --> 00:23:04.170 align:middle line:84% A lot of the trouble he gets into with other kids 00:23:04.170 --> 00:23:05.820 align:middle line:90% is he thinks he's funny. 00:23:05.820 --> 00:23:07.950 align:middle line:84% And doesn't read at all that he's not 00:23:07.950 --> 00:23:10.020 align:middle line:84% or that they're annoyed or angry. 00:23:10.020 --> 00:23:11.550 align:middle line:84% Matthew's mother, Laura, signed him 00:23:11.550 --> 00:23:15.570 align:middle line:84% up for a clinical trial being conducted by Ned Sahin. 00:23:15.570 --> 00:23:20.010 align:middle line:84% I want to know what's going on inside the brain of someone 00:23:20.010 --> 00:23:21.270 align:middle line:90% with autism. 00:23:21.270 --> 00:23:24.600 align:middle line:84% And it turns out parents want to know that too. 00:23:24.600 --> 00:23:26.700 align:middle line:84% You get points for looking for a while 00:23:26.700 --> 00:23:29.660 align:middle line:84% and then even for looking away and then looking back. 
00:23:29.660 --> 00:23:32.910 align:middle line:84% Sahin's company, Brain Power, uses Affectiva's software 00:23:32.910 --> 00:23:35.940 align:middle line:84% in programs Matthew sees through Google Glass. 00:23:35.940 --> 00:23:37.950 align:middle line:84% These games are trying to help him understand 00:23:37.950 --> 00:23:40.860 align:middle line:84% how facial expressions correspond to emotions 00:23:40.860 --> 00:23:42.840 align:middle line:90% and learn social cues. 00:23:42.840 --> 00:23:45.300 align:middle line:84% One of the key life skills is understanding 00:23:45.300 --> 00:23:47.000 align:middle line:90% the emotions of others. 00:23:47.000 --> 00:23:50.190 align:middle line:84% And another is looking in their direction 00:23:50.190 --> 00:23:51.510 align:middle line:90% when they're speaking. 00:23:51.510 --> 00:23:54.750 align:middle line:84% Looking at your mom and while it's green you're getting 00:23:54.750 --> 00:23:58.020 align:middle line:84% points when it starts to get orange and red you're-- 00:23:58.020 --> 00:24:00.660 align:middle line:90% you slow down with the points. 00:24:00.660 --> 00:24:01.698 align:middle line:90% I am looking at you. 00:24:01.698 --> 00:24:02.490 align:middle line:90% You are looking me. 00:24:02.490 --> 00:24:05.250 align:middle line:84% Just a few minutes later, the difference in Matthew's gaze 00:24:05.250 --> 00:24:06.780 align:middle line:90% overwhelmed his mother. 00:24:06.780 --> 00:24:09.200 align:middle line:90% I want to cry. 00:24:09.200 --> 00:24:09.700 align:middle line:90% Why? 00:24:09.700 --> 00:24:13.540 align:middle line:90% 00:24:13.540 --> 00:24:19.340 align:middle line:84% Because when you look at me it makes 00:24:19.340 --> 00:24:22.020 align:middle line:84% me think you haven't really before because you're 00:24:22.020 --> 00:24:25.230 align:middle line:90% looking at me differently. 
00:24:25.230 --> 00:24:29.260 align:middle line:84% So Brain Power has about 400 of these Google Glass systems 00:24:29.260 --> 00:24:33.400 align:middle line:84% deployed in families across the United States. 00:24:33.400 --> 00:24:35.650 align:middle line:84% And the main question they're trying to answer-- we're 00:24:35.650 --> 00:24:38.290 align:middle line:84% already seeing improvement in terms 00:24:38.290 --> 00:24:41.763 align:middle line:84% of the kids' social and non-verbal understanding, 00:24:41.763 --> 00:24:43.180 align:middle line:84% while they're wearing the glasses. 00:24:43.180 --> 00:24:47.410 align:middle line:84% The key question is what happens when they take off the glasses. 00:24:47.410 --> 00:24:50.580 align:middle line:90% Does this learning generalize? 00:24:50.580 --> 00:24:53.592 align:middle line:84% There's also other applications of this technology 00:24:53.592 --> 00:24:54.300 align:middle line:90% in mental health. 00:24:54.300 --> 00:24:58.050 align:middle line:84% For example, a system for early detection of Parkinson's. 00:24:58.050 --> 00:25:02.070 align:middle line:84% Erin Smith-- she's now a student at Stanford. 00:25:02.070 --> 00:25:05.370 align:middle line:84% Way back when she was a junior in high school, 00:25:05.370 --> 00:25:08.550 align:middle line:84% she emailed us and she said I've been watching 00:25:08.550 --> 00:25:12.240 align:middle line:84% this documentary about Parkinson's and I 00:25:12.240 --> 00:25:14.320 align:middle line:90% want to use your technology. 00:25:14.320 --> 00:25:15.450 align:middle line:90% How much does it cost? 00:25:15.450 --> 00:25:20.187 align:middle line:84% And I remember our head of sales asked me what do I tell her? 00:25:20.187 --> 00:25:21.520 align:middle line:90% She can't afford our technology. 00:25:21.520 --> 00:25:24.390 align:middle line:84% And I said just give it to her for free. 
00:25:24.390 --> 00:25:26.250 align:middle line:84% What is she going to do with it anyways? 00:25:26.250 --> 00:25:29.808 align:middle line:84% And Erin disappeared for a couple of months and came back. 00:25:29.808 --> 00:25:31.350 align:middle line:84% She had partnered with the Michael J. 00:25:31.350 --> 00:25:36.120 align:middle line:84% Fox Foundation and built this system to identify facial 00:25:36.120 --> 00:25:38.250 align:middle line:90% biomarkers of Parkinson's. 00:25:38.250 --> 00:25:40.590 align:middle line:84% And she's continued to work on this research. 00:25:40.590 --> 00:25:43.060 align:middle line:90% Very inspiring young woman. 00:25:43.060 --> 00:25:45.090 align:middle line:84% And I feel very proud that we are playing 00:25:45.090 --> 00:25:47.190 align:middle line:90% a small part of her journey. 00:25:47.190 --> 00:25:50.490 align:middle line:84% We also know that there are facial and vocal biomarkers 00:25:50.490 --> 00:25:51.780 align:middle line:90% of depression. 00:25:51.780 --> 00:25:56.040 align:middle line:84% And there's a lot of work being done to flag suicidal intent 00:25:56.040 --> 00:25:57.630 align:middle line:90% based on these signals. 00:25:57.630 --> 00:26:00.570 align:middle line:84% This is work that we are doing in collaboration 00:26:00.570 --> 00:26:05.960 align:middle line:84% with Professor Steven Benoit and others as well. 00:26:05.960 --> 00:26:08.260 align:middle line:84% But this brings up a very important topic, 00:26:08.260 --> 00:26:10.410 align:middle line:84% which is, OK, there's a lot of applications 00:26:10.410 --> 00:26:13.320 align:middle line:84% of this technology, where do we draw the line? 00:26:13.320 --> 00:26:16.470 align:middle line:84% And I feel very passionately about this concept 00:26:16.470 --> 00:26:20.460 align:middle line:84% of the ethical development and deployment of AI. 
00:26:20.460 --> 00:26:24.990 align:middle line:84% It's not just about acknowledging that there's 00:26:24.990 --> 00:26:27.630 align:middle line:84% so much potential for good, but it's also 00:26:27.630 --> 00:26:29.940 align:middle line:84% acknowledging where this can be abused 00:26:29.940 --> 00:26:33.560 align:middle line:84% and the unintended consequences of this technology. 00:26:33.560 --> 00:26:35.310 align:middle line:84% A number of years ago when we were raising 00:26:35.310 --> 00:26:37.380 align:middle line:84% money for the company, we got approached 00:26:37.380 --> 00:26:41.670 align:middle line:84% by an agency that wanted to give us a lot of money-- 00:26:41.670 --> 00:26:44.700 align:middle line:84% $40 million at the time, which was a lot of money 00:26:44.700 --> 00:26:47.150 align:middle line:90% for our little startup-- 00:26:47.150 --> 00:26:49.680 align:middle line:84% on the condition that they could use the technology for lie 00:26:49.680 --> 00:26:52.500 align:middle line:90% detection and surveillance. 00:26:52.500 --> 00:26:54.690 align:middle line:84% And that was really not in line with our core values 00:26:54.690 --> 00:26:58.020 align:middle line:84% of respecting people's privacy and consent. 00:26:58.020 --> 00:27:02.020 align:middle line:84% And acknowledging that, as a user, 00:27:02.020 --> 00:27:03.730 align:middle line:90% this data is very personal. 00:27:03.730 --> 00:27:05.940 align:middle line:84% And if I'm going to be sharing it, 00:27:05.940 --> 00:27:07.890 align:middle line:84% then I need to know exactly who's using it, 00:27:07.890 --> 00:27:12.370 align:middle line:84% how is it being used, and also what's in it for me? 
00:27:12.370 --> 00:27:14.040 align:middle line:84% So we spent a lot of time thinking 00:27:14.040 --> 00:27:15.780 align:middle line:90% about this power asymmetry-- 00:27:15.780 --> 00:27:17.580 align:middle line:84% what value am I getting in return 00:27:17.580 --> 00:27:20.307 align:middle line:84% for sharing this very personal data. 00:27:20.307 --> 00:27:22.390 align:middle line:84% What's really cool about this is that the industry 00:27:22.390 --> 00:27:25.440 align:middle line:84% is taking a lead in defining these best 00:27:25.440 --> 00:27:26.820 align:middle line:90% practices and guidelines. 00:27:26.820 --> 00:27:30.600 align:middle line:84% So we are part of a consortium called the Partnership on AI. 00:27:30.600 --> 00:27:32.460 align:middle line:84% It was started by the tech giants-- 00:27:32.460 --> 00:27:35.430 align:middle line:84% Amazon, Google, Facebook, Microsoft. 00:27:35.430 --> 00:27:39.120 align:middle line:84% And they have since invited a number of startups 00:27:39.120 --> 00:27:40.440 align:middle line:90% like Affectiva. 00:27:40.440 --> 00:27:44.880 align:middle line:84% But also other stakeholders like ACLU and Amnesty International. 00:27:44.880 --> 00:27:47.760 align:middle line:84% And I'm part of the FATE committee, 00:27:47.760 --> 00:27:51.090 align:middle line:84% which is fair, accountable, transparent, and equitable AI. 00:27:51.090 --> 00:27:55.290 align:middle line:84% And our task is to come up with these guidelines 00:27:55.290 --> 00:27:57.227 align:middle line:90% around thoughtful regulation. 00:27:57.227 --> 00:27:59.310 align:middle line:84% We need regulation, but it needs to be thoughtful. 
00:27:59.310 --> 00:28:02.100 align:middle line:84% We don't want to completely stifle innovation, 00:28:02.100 --> 00:28:03.600 align:middle line:84% but at the same time, we really need 00:28:03.600 --> 00:28:06.030 align:middle line:84% to think through where do we draw the line 00:28:06.030 --> 00:28:11.910 align:middle line:84% and what does this thoughtful regulation look like. 00:28:11.910 --> 00:28:14.240 align:middle line:84% And, at the end of the day, my mission 00:28:14.240 --> 00:28:17.570 align:middle line:84% is to humanize technology before it dehumanizes us. 00:28:17.570 --> 00:28:21.080 align:middle line:84% And I want us to put the emphasis back on the human, 00:28:21.080 --> 00:28:23.200 align:middle line:90% not on the artificial. 00:28:23.200 --> 00:28:27.290 align:middle line:84% And to wrap up, if this made you a little bit more curious 00:28:27.290 --> 00:28:29.750 align:middle line:84% about emotion AI and its applications 00:28:29.750 --> 00:28:34.280 align:middle line:84% and its implications, I am giving away a signed book 00:28:34.280 --> 00:28:37.520 align:middle line:84% to the first three people who post on my social media-- 00:28:37.520 --> 00:28:41.350 align:middle line:84% LinkedIn, Facebook, Twitter, Instagram-- using the hashtags 00:28:41.350 --> 00:28:45.860 align:middle line:84% #LIVEWORX and #GIRLDECODED with a comment or a question 00:28:45.860 --> 00:28:47.970 align:middle line:90% about this presentation. 00:28:47.970 --> 00:28:50.970 align:middle line:90% Thank you. 00:28:50.970 --> 00:28:52.470 align:middle line:90% [LIVEWORX THEME] 00:28:52.470 --> 00:28:54.500 align:middle line:84% Thank you for joining us Dr. el Kaliouby. 00:28:54.500 --> 00:28:56.690 align:middle line:90% What a compelling presentation. 00:28:56.690 --> 00:29:00.000 align:middle line:84% So that concludes our third session of the day. 
00:29:00.000 --> 00:29:01.460 align:middle line:84% Next up at the top of the hour, we 00:29:01.460 --> 00:29:05.240 align:middle line:84% will hear how companies are leveraging SaaS-based CAD, PLM, 00:29:05.240 --> 00:29:07.550 align:middle line:84% and augmented reality as key tools for them 00:29:07.550 --> 00:29:11.270 align:middle line:84% to not only survive disruption but embrace this new normal. 00:29:11.270 --> 00:29:13.160 align:middle line:84% We'll be right back after a short break. 00:29:13.160 --> 00:29:16.510 align:middle line:90% [LIVEWORX THEME] 00:29:16.510 --> 00:29:35.000 align:middle line:90%