Livestream Day 3: Amphitheater (Google I/O ’18)


August 17, 2019


AI Adventures: Art, Science, and Tools of Machine Learning
May 10, 2018, 8:30 a.m. PT
(music)
>>YUFENG GUO: Good morning.
(Applause.)
>>YUFENG GUO: My name is Yufeng. I host a video series on YouTube called AI Adventures, where we explore the art, science, and tools of
Machine Learning. I wanted to bring a little piece of that here today in person, so
let’s go on an AI adventure together.
For the purposes of this session, let’s start with just a
short definition of Machine Learning and use that to build
up kind of a workflow and get into the tools.
We’ll then use that today to help us see how we can solve Machine
Learning problems and see how we can use that to apply to your problems outside
of I/O. And so we’ll say that Machine Learning is programming with
data, and we’ll use that data to create a system or model which is trained to do
some task, and we train this model,
very importantly, using data so everything comes back to the
data. So that’s really the first step
of Machine Learning, gathering your data. It all centers and starts from
there. Next, there is typically some kind of preparation needed
of that data. The raw data is often not so suitable for use,
and so we need to prepare it a little before it’s ready for
Machine Learning. Third, we’ll need to choose a
model we want to use, and then train
it up and evaluate its performance, do
some fine tuning, and finally make those predictions and do
those tasks. So we’ll go through each of
these seven steps in detail over the course of this session. So
first, let’s see what question we are trying to use our model
to address in the first place. We’ll walk through kind of a
simple example dataset and go through these seven steps using the task to
categorize some data about animals. Okay. We’re going to
say, what kind of animal is this based on some input data, is it
a bird, mammal, reptile, fish, amphibian, bug, or invertebrate, one of these seven
types based on a variety of stats about a given animal.
So our model will take in this data, this structured data, and
use that to try to make a prediction, one of these seven
types of animals. Now that we know what kind of
situation that we’re modeling, let’s dive into each of these
steps in more detail. First up is gathering data.
Just because we know what problem we are trying to solve
for doesn’t always mean that we have the data needed to solve it, and without data your
model is going nowhere. So ask yourself this, where does
my data live and in what format,
and how can I get to it? These are oftentimes the early stumbling blocks as we try to use data to address
our problems. And perhaps you’re actually in a
situation where you don’t have all the data that you need, and
you need to collect your own data. This is also a common use
case and you can get creative in solving this problem by building
some systems for data collection, perhaps you make it
a game. One really interesting example of this is Quick Draw. Quick Draw is an online game
that lets you draw a specific picture and tries to predict
what you’re drawing. So it tells you to draw a
basketball or draw a boat. It’s kind of fun and people like it,
and as a result the game has generated over 1 billion doodles of
hand-drawn images from around the world and this dataset is
Open Sourced on GitHub and there is actually an AI
Adventures episode that we just put out if you want to learn
more about using this dataset and playing with that.
Back to our steps, we’ve gathered our data, so what’s
next? Well, we got to prepare that data. Data preparation could be its
own separate talk entirely, so we’ll just touch on a few
aspects here. Exploring your data before you
do any kind of Machine Learning can really help you understand
your data better, and this is fundamental because it gives you
this intuition about your dataset and can save you a ton
of time and headache later on in the
process, and so I really encourage folks to clean their data, to look at the data and
kind of get some intuition around it. It helps you identify gaps and
you’ll learn a lot about what you’re working with along the
way. Maybe you'll learn you need more data, or that certain pieces of data you collected weren't so useful after all.
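As a rough illustration of that kind of early look at the data, here is a minimal pandas sketch; the file name and column names are assumptions based on the zoo dataset used later in this session, not code from the talk.

```python
import pandas as pd

# Load the raw data; "zoo.csv" and its column names are assumptions
# based on the zoo dataset used later in this session.
df = pd.read_csv("zoo.csv")

print(df.shape)            # how many rows and columns do we have?
print(df.head())           # eyeball a few rows
print(df.describe())       # basic statistics per numeric column
print(df.isnull().sum())   # any gaps (missing values) to worry about?

# Spot odd values, e.g. an unexpected number of legs
print(df["legs"].value_counts())
```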
One tool I like to use for this is called Facets; we'll take a brief detour and talk about that. Facets is an open-source data visualization tool from Google Research, and it can help you get a better sense of your data distribution.
Yes, there are lots of tools that enable you to do very
similar things, you can write out commands and
call functions, but this one kind of puts it all in one place
and because it’s a UI, you won’t forget to compute a particular
statistic or look into some corner of the data because it’s
all computed for you. It just shows up, and so you can just
kind of browse it. And so here we can see Facets applied to our dataset of animals from the zoo. Most of the features are 1s and 0s (does it breathe, does it have a tail, things like that), and so that's why we see this kind of funny distribution from 0 to 1, but a lot of times with numerical features you'll see something far more interesting. Now, let's move on to our
third step of Machine Learning, choosing a model. And we’ll
discuss a little bit about model selection, how we might go
about doing this using the least effort possible, and first,
let’s build an intuition around what is happening when a Machine Learning model is
working well. So here we see how the model is
trying to divide up a region, trying to say which side is blue
and which side is orange to match the data, which are the
dots. It’s not the hardest example in the world, right,
everything is kind of already separated out, but let’s look at
something a little more interesting. Here we see the
model struggle for a little bit as it tries to find a way to
draw that enclosing circle around the blue dots, and it’s
really neat to see that the tools available to you today are sophisticated enough to handle all of the minutiae for you, so you don't really need to worry about the details of how it's going about it; it's just a matter of whether it's achieving the task, but it is fun to see. Different models can model different kinds of data and different degrees of complexity, and that's what model selection is really about: seeing
what model is the right fit for your particular dataset and the
particular problem you’re trying to solve. Now, concretely, what we’re
going to use is TensorFlow to achieve this, right. There are lots of talks about TensorFlow here at I/O this year, so I won't rehash those bits, but we'll see an example, perhaps, of what it might look like to
use TensorFlow to kind of do model selection to actually
write that code, especially if you haven’t had too much
hands-on experience with TensorFlow, this may be kind of
a useful look at just how accessible it
can be. So this is an example of a
linear classifier. It’s similar to the first animation we saw.
It's good for drawing a straight boundary between two kinds of
spaces. We can see it’s pretty straightforward, but what if we
had more complex data, right? This one-line call, well it kind
of stays a one-line call when you try to try it out on a
different model, so we’re going to replace this with
a Deep Neural Network and it only takes a few small tweaks,
and in this case we just have to add one additional argument
called hidden_units, and this defines the structure of our
network. In this case, we have three
layers with 30, 20, and 10 neurons in them. And so all we
needed to do is literally just change the name: just replace that text and add one argument. You don't have to re-plumb your entire system, and you don't have to delete everything and start from scratch. That's really neat, and it's similar for a lot of models you can try; a lot of them just involve one flip of the method call, and you can basically play around with this menu of choices.
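As a rough sketch of the swap being described, in the TF 1.x estimator style this talk uses (the `feature_columns` list is assumed to exist already; it's shown later in the demo):

```python
import tensorflow as tf

# Linear classifier: one line to create the model.
model = tf.estimator.LinearClassifier(
    feature_columns=feature_columns,  # assumed to be defined as shown later
    n_classes=7)

# Swapping in a deep neural network is the same call with a new name
# plus one extra argument describing the layer sizes.
model = tf.estimator.DNNClassifier(
    hidden_units=[30, 20, 10],        # three layers: 30, 20, and 10 neurons
    feature_columns=feature_columns,
    n_classes=7)
```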
And so in that paradigm, you don't have to commit to a model beforehand. You can try one out, see how it does, and then come back; with these seven steps, you can work through them and iterate, you can always go back and cycle through. And so that brings us to our
next step. We’ve gathered our data, we’ve prepared it, we’ve chosen a
model for now at least, and now we can do the training.
Training gets a lot of attention because in some ways
that’s where the magic happens, but in practice it’s often very
straightforward. You just let it run and you wait. So let’s
take a look at what happens during the training and then we
can see it in code. Conceptually when training a
model we pass in our data and we ask the model right away to try
to make a prediction. But the prediction will be pretty bad, because our model has been initialized randomly. Still, we make this prediction, take it, and compare it to the true answer, the correct answer that we know. We use that information to
update the model, and so each time we go
through and update the model with a new
piece of data as it works its way through the training set, it
allows the model to get better and better over time with each training loop, so that’s kind of
what the training process looks like.
But in practice, when you actually need to code it up, we take the model we've created and we just call .train() on it; it's literally one line. You just need to provide it a function that gives it the training data inputs, and that's it. It's one line.
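A minimal sketch of that one-line training call, assuming the estimator created above and a `train_input_fn` like the one sketched later in the demo:

```python
# Train the model; train_input_fn is assumed to return batches of
# (features, labels), as sketched later in the session.
model.train(input_fn=train_input_fn, steps=1000)
```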
And so once the training completes (you run this function, you go off and grab a sandwich, and you come back), it's time to do evaluation. We've got to evaluate our model to see how accurate it is, to see if it's doing a good job of performing its primary task. So we can show it some examples
of things that we know the correct answers to, and the
model did not see those questions in training. It
didn’t see that data in training, so we use this
evaluation data that we set aside to measure the performance
of our model, and conceptually it
looks very similar to training, right. It’s still this same
shape, but the important difference is that there is no
arrow at the top, it’s not a closed loop. The model does not get updated
once we check its prediction results, and so if we did update
the model, then it would be no different than training and we’d
be essentially cheating, right. So the evaluation data is
kind of precious in that way. We don’t use it to update the
model. We hold it aside and we reserve
it for evaluating the performance of
our model. In code it’s just as simple as
everything else we've seen: you want to evaluate, you call .evaluate(). Notice we're passing in the evaluation data, right, not the training data.
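A sketch of that call, again assuming the estimator `model` and an `eval_input_fn` built from the held-out data:

```python
# Evaluate on the held-out data, never on the training data.
results = model.evaluate(input_fn=eval_input_fn)
print(results["accuracy"])  # estimators report accuracy among other metrics
```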
Now, if we were using the training data (say we copied that line down, swapped train for evaluate, and forgot to change the data source), we'd be evaluating the performance of our model using the training data, and that would be a big mistake. The model has optimized itself for the training data, so if you evaluate using that same data you've misrepresented the true performance of your model, and it will likely perform poorly on real-world data once you deploy it, since it has never seen, and you've never measured how it does on, previously unseen data. So this brings us to our next
point. How can we actually improve our model then? We ran
the training and did the thing and ran evaluation, now what?
Do we run training some more? How can we tweak it further? We
could always swap out for a new model, but what if we decided on
this model but we want to improve it as is? This is where
hyperparameter tuning comes in. These models have parameters of
their own, but what value should we use? How do we pick them?
Much like how you can try out different models, you can
also try out lots of different model parameters. This process
called hyperparameter tuning, in some ways, still remains a
active area of research, but that doesn’t mean that you can’t
take advantage of it. Conceptually, you would take our
training and instead of just training one model, we’ll tweak
that model and create a couple of different
variants, and so we’ll run training on different variants
of the same underlying model but with different model parameters
and see how that affects our accuracy. So we’ll train and
evaluate all of these different variants of the same or similar
models and then see which one performs best. That’s how it
informs our parameter choice. But you see, a lot of that is
experimental. You've got to try them to see what works, so you might have to break out your for loop.
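As a rough sketch of that for loop, trying a few hidden-layer configurations and comparing held-out accuracy (the candidate values here are just illustrative assumptions, and `feature_columns` and the input functions come from the other sketches):

```python
import tensorflow as tf

# Try a few network shapes and keep the accuracy for each.
candidate_hidden_units = [[10], [30, 20, 10], [64, 64]]  # illustrative choices

results = {}
for hidden_units in candidate_hidden_units:
    model = tf.estimator.DNNClassifier(
        hidden_units=hidden_units,
        feature_columns=feature_columns,
        n_classes=7)
    model.train(input_fn=train_input_fn, steps=1000)
    metrics = model.evaluate(input_fn=eval_input_fn)
    results[tuple(hidden_units)] = metrics["accuracy"]

print(results)  # pick the variant that does best on the held-out data
```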
And so, yeah, the task of choosing good hyperparameters is something, perhaps you've intuited this, that you could probably turn into a machine learning problem itself: optimizing the parameters you're using for machine learning. But we'll have to save that for another talk.
All right. So we’ve gathered the data, we’ve
prepared it, chosen a model and trained it, evaluated it, and we
tuned the hyperparameters. At long last we’ve reached our last
step, making those predictions. This is the whole point of
training out the model in the first place, right. It wasn’t
just to get a really good accuracy or to create heat on
our server. It was to actually do something useful with it. And so making predictions, we
take the model and we deploy it somewhere or kind of isolate it out, and we can
show the model some data that it hasn’t
been trained on and then see what kind of outputs it
predicts. And so those are our seven steps, our conceptual
seven steps. Let’s see what happens in practice, look at the
tooling side and figure out what you use besides TensorFlow and
how you might actually achieve the seven steps. So we’re going
to go and look at some code, and in particular I’ll feature a
couple of tools to get hands on with and I’ll point you to some
additional resources as well. First up is a tool called
Colab. If you've ever worked with any sort of notebook environment or REPL on the web before, this is basically a way to run Python in the browser, but not just run Python. It's a whole notebook environment. You can put Markdown in it. The code actually executes, and because
it’s hosted essentially on Google Drive, you can share it
with other people. So let's switch over to the demo and I can show you what it looks like to use Colab. So here we see I've loaded up Colab, and we can run various commands, right, all the
things that we would expect. So let’s see, we’re still
connected. In Colab, because it's a notebook environment, you have to upload your own files, and it connects to the Internet so you can also authenticate and pull things down, but in my case I have the dataset loaded on the machine, so we can download that. I've got pandas loaded here, so we can take a look at our dataset: as promised, it's a bunch of animals and statistics, and at the very end there is a class type. And so Colab really lets you
just execute the code on the screen
in, well, in realtime. And moreover, because it’s all
hosted on Google Drive I don’t have to
spin up any machines in the background,
it just works seamlessly. You can put comments in and share
your work with others, and — there
we go, and do collaborative research or just work in general, so that’s
really neat. And so let’s see, I think we
have some time so let’s actually walk through a little bit of
kind of what I’ve ended up doing here. We took our data and what
I’m doing is shuffling it and then splitting
the data. We had a whole dataset, right; this whole dataset happens to be just 101 different animals, so it's probably the smallest dataset I've ever come across, but you still need to follow the same kinds of best practices. In this case I've taken it and
split it into training and evaluation
data because if I use all 101 to do training, what am I going to
do to evaluate my model? So we’ll split it, and in this
case I chose a fraction of 60% and
40%, but that ratio can be adjusted, and so we can see we have our training
data, all 60 values, and then we have our evaluation data below.
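A minimal sketch of that shuffle-and-split step with pandas; the file name, the `class_type` label column, and the exact 60/40 handling are assumptions based on the description here and just after:

```python
import pandas as pd

df = pd.read_csv("zoo.csv")           # assumed file name for the zoo dataset

# Shuffle the rows, then split roughly 60% / 40%.
df = df.sample(frac=1, random_state=42).reset_index(drop=True)
split = int(len(df) * 0.6)
train_df = df[:split]
eval_df = df[split:]

# The labels run 1-7 in the raw data; shift them to 0-6 for the estimator,
# per the preprocessing step described next.
train_df = train_df.assign(class_type=train_df["class_type"] - 1)
eval_df = eval_df.assign(class_type=eval_df["class_type"] - 1)
```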
And I do a little bit of preprocessing so this is the
processing your data bit, and notably the
last column of the dataset is the label, it is the answer one through seven
of what the correct type of animal it is, reptile,
amphibian, mammal, et cetera. The problem is the labels
go from one to seven and I need them to be zero through six in
this particular case, so I shift it by one by subtracting one and
that's it. So, pretty simple preprocessing, and I would expect that a bigger dataset would perhaps need more work. And so we can see our data... apparently I didn't make this cell run. Always run your code. And our input function here is
pretty straightforward. It takes our raw data, and if we want it to be shuffled we can call shuffle on our dataset; I just let it repeat and batch into chunks as needed, and that will feed into the training.
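A sketch of what that input function might look like with the tf.data API, assuming the pandas DataFrames from the split above and a `class_type` label column:

```python
import tensorflow as tf

def make_input_fn(df, shuffle=True, repeat=True, batch_size=32):
    """Build an input_fn that feeds a DataFrame to an estimator."""
    def input_fn():
        features = dict(df.drop(columns=["class_type"]))
        labels = df["class_type"]
        dataset = tf.data.Dataset.from_tensor_slices((features, labels))
        if shuffle:
            dataset = dataset.shuffle(buffer_size=len(df))
        if repeat:
            dataset = dataset.repeat()  # training can draw as many batches as it needs
        return dataset.batch(batch_size)
    return input_fn

train_input_fn = make_input_fn(train_df)
eval_input_fn = make_input_fn(eval_df, shuffle=False, repeat=False)
```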
I believe there was a talk on this specifically earlier at I/O, so if you missed it you can catch the recording or roll back the livestream. So we have our dataset, or rather our input function. I think I forgot to run this cell again.
And this is a little cell I have here to just kind of try out my input
function, make sure it’s working, always good to test
things. We can see my input function is indeed returning all
of the data for each feature type: the feathers, whether it lays eggs, whether it's airborne, does it have a backbone, things like
that. So each kind of these arrays
represent a batch of data. I was also playing around with
this to check out the unique values for each column, just to make sure. Most are 1s and 0s, but there are a couple that are strange; for example, there are apparently animals with five legs, so not all datasets are perfect.
TensorFlow uses this notion of feature columns to represent
the incoming data. Models are pretty generic, and so by using
feature columns it allows you to customize it for your particular
dataset. In our case, I just loop over all the columns, all 17 of them, which happen to be numeric, and set each one as numeric, so it's really just a configuration to let TensorFlow know how many columns, and of what type, are coming in.
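Roughly, that loop might look like this (the exact column list is an assumption; the label, and any non-numeric name column, would be excluded):

```python
import tensorflow as tf

# Declare every feature column as numeric; this is just configuration that
# tells TensorFlow how many columns, and of what type, are coming in.
# (Assumes any non-numeric name column was dropped during preparation.)
feature_columns = [
    tf.feature_column.numeric_column(name)
    for name in train_df.columns
    if name != "class_type"   # the label is not a feature
]
```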
So we'll run that, and if things break it's because I forgot to run the cell, so yell at me if that happens. And so here is that line that we saw before,
right, with the creating a model, so here we’ll create a
linear classifier and pull in those feature columns and say
that there are seven different classes, seven different possible values for the animals. And I've combined the train and evaluate calls into a function for convenience, so they both run together. I
want to run the training on the training data and evaluation on
the evaluation data, so wrapping this together helps make sure
you don’t introduce bugs as you re-run cells as you go along. So let’s say we run the linear
model and we’ll just let that churn, and so this is all backed by Google
servers, and apparently it is taking its sweet time. While it's doing that... well, this one is done, so we got 90%, which is, like, okay, right.
There were only 100 rows, we took 60 for training and used 40
for evaluation, so it’s not the most reliable metric. The idea
here is the tooling is there, the code is there, and you
should replace this with your own data which will be much
better than this dataset and get awesome results.
So as promised, the Deep Neural Network is very similar.
It's literally the same call as before, but with hidden_units added. I guess there is one notable difference: I take the old feature columns that we had from before and wrap them in an indicator column, so that's just a way to put the linear model's columns into a form the deep neural network can work with. It's a little bit out of scope for this talk, but it's an adjustment just for deep neural networks that linear networks don't have to deal with. So we'll let that get created.
Oops, just created a new cell. So I guess that’s also a good
thing to show off here, so we can very
easily create, you know, code cells and text cells in between, and you have all sorts of great editing abilities here; you execute the cell and it will be there. So with the deep network, I
forgot if I ran this. Maybe we’ll run it again. We can also like push the model
further, right. The percentages here are
sometimes wonky because there are only 40 values in the evaluation data, and the accuracy is effectively some number out of 40 right with the rest wrong, so in this case that roughly 10% means three or four are incorrect. So let's make some predictions and see what got missed. TensorFlow also has a .predict() function, so you can pass
in, I just took evaluation data and sliced out a couple of
examples to take a look at. We can see here that when we do
that you get, you know, you get the prediction and the correct
answer, so in this case those five that I arbitrarily chose
happened to work out but what if we wanted to see the exact ones
that we got wrong? This was a little experiment I ran just
because I was curious, you know, which ones were the linear
network getting wrong and which were the deep network getting
wrong? In this case they’re different, right. They got the same number of
incorrect predictions. They all had four wrong, but they’re
actually different examples. So this would be an opportunity to dig in further and play around with that.
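A sketch of the kind of .predict() call described above, slicing a few rows out of the evaluation data and comparing predictions with the true labels (again assuming the estimator and the `make_input_fn` helper from the earlier sketches):

```python
# Predict on a handful of evaluation rows and compare with the true labels.
sample = eval_df[:5]
predict_input_fn = make_input_fn(sample, shuffle=False, repeat=False)

for row, pred in zip(sample.itertuples(), model.predict(input_fn=predict_input_fn)):
    predicted_class = pred["class_ids"][0]   # estimators return class_ids per example
    print(f"predicted {predicted_class}, actual {row.class_type}")
```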
One final thing I'll add here is that Colab, or Colaboratory, has GPU support, and so you can toggle on a GPU
if you have big datasets and fancy models and you want to
access that kind of stuff, go and get that GPU.
So let’s switch back to the slides briefly and take a look at
another tool. Aside from Colab there is
another tool that we have that is very similar and it’s also
notebook based. You might have heard of it. It’s called
Kaggle. And while Kaggle is most known
for its competitions or discussion
forums and datasets, it also has a feature called kernels and
kernels is really just a fancy name for notebook, and the
kernels look something like this; it might start to look familiar from Colab, except it's blue. Kernels are different in a couple of subtle ways. Let's just switch over to the demo for Kaggle kernels and we'll see what that looks like. How is that? So I took the notebook that we had earlier in Colab and downloaded it as a notebook file. Since
Kaggle has datasets we actually have a zoo animal classification
dataset, how convenient that I chose the exact dataset that
already exists on Kaggle, right. And so we can click new
kernel, and I'm going to choose notebook. Kaggle kernels not only run in Python; you can also choose to run them in R, for those of you who prefer R. And what I can do is actually upload a notebook... well, first I need to download that notebook, so let's do that. So from Colab we will download — this is what
I get for zooming in a lot. We download the notebook and
then once it’s downloaded we can upload
that notebook back into Kaggle kernels and so boom, we have
that same notebook now in kernels.
There is one small tweak. Because we had to upload a file
from the local drive for Colab,
we’re going to get rid of that. And the only other tweak is that
in Kaggle kernels the data lives in one directory up called
Input. So if I run the first cell and then we’ll work our way
down, we’ll see that the data lives in input and it’s the same
kind of thing. And since we already see the rest of the
notebook, it’s literally the same thing.
What is interesting though is that in the kernels, we can — let’s
see. Let’s give it a name. We can commit that notebook, and
what Kaggle kernels does is it will run your notebook in a new fresh
environment, separate from the session that you’re in. This
will generate a notebook with all the outputs that we’ve been
seeing in a kind of nice view-only format,
which is really useful for sharing because what if you want
to share your notebook as well as how it was executed to
others? You want reproducible notebooks so that’s kind of what
we see here. We ran the notebook, it’s done,
and we can view that snapshot, so it’s kind of like a GitHub
model of like you can commit versions of your notebook.
And so by default, your notebooks are private and we can
see here that things ran. The outputs are shown, and I didn’t have to go through and run these
one by one, right. If I had run these cells out of order manually earlier to get my results, running them top to bottom like this would help catch those kinds of bugs that would make it hard to reproduce my results down the road, when I come back to this next week or next month.
And if you share your notebooks, you can then fork
them, you can fork them and you can also get others to fork your
notebooks as well as add collaborators to your
notebooks, so you can add users to join you on the same notebook
rather than just forking them. So there are a lot of great collaboration models across Colaboratory and Kaggle, depending on your particular use case. And I guess it would be good to also briefly mention that, like Colab, we can also enable GPUs
in Kaggle kernels as well, so don’t let that be a deciding
factor for you. So that’s Kaggle kernels, and
let’s switch back over to the slides and talk about just a few
other kind of little tools and tips and tricks before we kind
of wrap things up. And if we could switch back to
the slides. Awesome. So let’s say you don’t want
notebooks. Let’s say you try Notebooks and they just weren’t
for you or you have a bigger workload. You need to run a long running
job, it’s got to run for hours, or you have a really big dataset
and it doesn’t fit in Kaggle kernels or doesn’t fit on your
local directory so you’re not going to upload it to Drive
manually. Then you can use something like
Cloud Machine Learning Engine where you can kick off a job to run a long
running job that maybe runs across a set of distributed
machines, all of them potentially with GPUs attached.
And once you’re done, you might want to serve your model,
right. You want to make those
predictions, but perhaps you want to do it at scale. You're building an app, you've trained a model, and you want a REST endpoint that can serve predictions to the world, and so Machine Learning Engine also includes an auto-scaling prediction service. You can literally just take your TensorFlow model, give it a name, and be done. Literally, point to the file and give it a name. There aren't any other steps you need to do in terms of creating a model to serve, which
is really neat. And if you’re working with
things like scikit-learn or XGBoost
and you want to use those as well, we’ll take those too, and
then you don’t have to deal with the ops aspect of
Machine Learning, play with the data, tweak your model, train
it, and deploy with ease. And I think most — not most —
but many of you are also perhaps aware of the Machine Learning
APIs that are also available. These are pre-trained APIs for
doing various tasks. They’re a little more canned, right. You
don’t provide your own data, but it does mean that everything
just works out of the box. You need something — you need a
picture to be turned into a description, you need audio to
be turned into text, it just works so that’s kind of nice,
but the limitations, of course, are that you can’t customize it
for your specific use case. That is still something that
you’ll have to just wait a little bit longer for, with the
one notable exception of Vision, so AutoML Vision is
available right now in alpha so you can apply for that and
supply your own datasets to train and customize your own
Vision API, so that’s kind of like a neat side tangent.
The animations that we saw earlier that I showed with the orange
and blue dots came from the TensorFlow Playground. So if
you want to play around with a neural network in your browser
and just like toggle things on and off and mess around, head
over to playground.tensorflow.org and you can do that right
there. And don’t worry, you can’t break
it. It’s just through your browser. You can only break
your own machine. (Laughing). And so what’s next? The code that we just saw, I
made it public and it's on Kaggle at kaggle.com/yufengg/zoo-demo. If you want to learn more about
TensorFlow, head to tensorflow.org, and there is a Machine
Learning crash course Google released recently, so if you
want to really dive into the concepts of Machine Learning and
go further than what we’ve talked about today, head over
there because there is basically a whole curriculum of videos and
kind of assignments that you can do and really build up your
Machine Learning knowledge. Finally, if you’re interested in
doing Machine Learning in the Cloud you can head over to the Cloud Dome,
the white tent next to the Google
Assistant Dome, or head to cloud.google.com/ml. I host a video series by the same name as this session, AI Adventures, where we explore some interesting nugget about Machine Learning in each episode and try to do some hands-on demos once in a while and do
some interviews with interesting folks.
So hopefully you’ll check it out and subscribe.
So I want to thank you for joining me in this session, and
we really — you know, I really appreciate feedback on the
session, the information, so please head on over to the
schedule and you can log in and give some feedback. Thanks a
lot. (Applause). >>Thank you for joining
this session. Grand ambassadors will assist
with directing you through the designated exits. We’ll be
making room for those who have registered for the next session.
If you’ve registered for the next session in this room, we
ask that you please clear the room and return via the
registration line outside. Thank you.

How to Kotlin
May 10, 2018, 9:30 a.m. PT

>>At this time please find
your seat. Our session will begin soon.
>>Hello and good morning! It
seems like the mic is not on. Hello? Hello?
(Applause) >>Hello. The mic is not
on. Hello! This is better. Thank
you! Thanks for being here this morning. My name is James and I’m part of
the Kotlin team at Google. Today I have the pleasure of introducing a very special guest
from Jetbrains who really requires no introduction. Now, all of you know that Kotlin
is now one of the most loved
programming languages in the world.
(Applause) And at Google I/O it’s very rare
for us to have external speakers, but this person was here last year and we
invited him back because we couldn't think of anybody better to teach Kotlin than one of the people who invented it, so please help me
welcome the Lead Language Designer for
Kotlin, Andrey Breslav.
(Applause.)
>>ANDREY BRESLAV: Thank you,
James.
>>Hello, everybody. I'm very glad to be here. Today I'm going to talk about how to Kotlin, I guess, and I'm really going to do a live demo,
so please bring my demo on. So the reason why I have this horrible code in the slides is
that we are all learning and our old
habits sometimes get in the way, so I’ll be presenting today on the topic of
how you get out of your Java habits and get to your Kotlin
habits. We all come from different backgrounds, of course,
and many of us started with the Java
Programming Language and built up our knowledge of programming
through this, and so we remember many things. The thing is
Kotlin has been inspired by many languages, including the Java Programming Language, so many of
the constructs in Java will work in
Kotlin and you can get your job done in this way, but in many
cases it can be improved dramatically.
This particular example is about declaring classes, and can
you see here that I have a Kotlin class on the left and a
Java class on the right and they look very similar, but this is definitely not how we write
Kotlin code. What you’re actually supposed to do is like
remove all the unnecessary stuff. What I have to say here is one
class, that’s it, I can try to transform by hand, but I
actually want to show off a nice tool and simply copy and
paste the code from the Java side to the Kotlin side, and it will use the built-in Java-to-Kotlin converter and do it for me. So, boom, there it is: a single line, and that's actually all you needed —
(Applause). — to declare one class and two
properties, and that's it. All I have here is a class with a primary constructor, so it has
two parameters and both of them are properties and that’s all we
wanted to say. So this is one of the things
that demonstrates how cheap declaring
classes is in Kotlin, and there is a consequence to this. So
look at this code. Here it’s obviously not how you’re
supposed to write code in any language, actually. I wanted to
parse a full name into a first name and last name, and so
that’s what I’m doing here, but how do I pack the results for the function? I
don’t have a way of returning two things from a function. I
have to put it into one object, and I’m using a list here and then
awkwardly taking out one and the other to make a first name and
last name. Don’t do this in any language.
But there is a kind of psychological reason to doing
there, at least in our old habits because declaring classes
is expensive, right. You have to create a new file, put a lot
of code in it, it’s kind of awkward. But in Kotlin you
don't have to do this. All you need to say is my class FullName with first and last
names as properties, and then all I need
to do here is just return that, right. So my full name, here it goes,
and now instead of indices I can say
first and last right here so that’s
the idea. Classes being cheap is not only saving you time at
the declaration site, but it’s saving you mental effort. You
can represent your multiple return values as a class, and it doesn't cost you anything. So around this you see that my
equals doesn’t work, obviously, because that’s a single line class and so now
I'd go and declare equals there and then hashCode there, but really I don't need to do this in Kotlin
because you probably know there is something called data
classes, right, who knows data classes? Many people. Good. So you know that I simply put
this single keyword there and it generates many things for me,
equals, hashCode, toString, and many other convenient methods,
so that's it. Change your mind about how expensive a class is; you can use it easily in all of your abstractions.
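Roughly what the refactored example described here might look like (the names are reconstructed from the description, not copied from the live demo):

```kotlin
// A cheap class for a "multiple return": equals, hashCode, toString,
// and more are generated by the data modifier.
data class FullName(val first: String, val last: String)

fun parseFullName(fullName: String): FullName {
    val parts = fullName.split(" ")
    return FullName(parts[0], parts[1])
}

fun main() {
    val (first, last) = parseFullName("Grace Hopper")  // destructuring works too
    println("$first $last")
}
```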
So, more or less done with the warmup; let's look at something else: properties.
So we talked about classes, we’ll go through properties and
then go over to functions. So here is a property done the
way you shouldn’t do it in Kotlin, again, and so the
properties I showed you before were kind of one-liners, where both accessors are trivial. If you want a custom setter, you definitely don't define functions for that. You have setter syntax, as you probably know, and inside it you have the field keyword to write to the backing field, but that's it. You
don’t need to introduce extra names and anything else, so
that’s straightforward, right. But then look at this code. So here is already some sensible
logic. I have two properties and one of them is private and nullable and
mutable, and on first access I'm checking if that's null; then I compute a value, write it in, and return it from my getter. So what is it? It's a lazy
property, right. I personally wrote dozens and
dozens of those in Java and many other languages so I got kind of bored
by that, and that's why Kotlin has property delegation. Delegated properties let you get rid of all the repetition of this lazy logic. All you care about is this
expression here, so let’s just do it. Implement my property by just
lazy of all this. This is it. So what
I'm having now, I'm saying that my property is not simply initialized by
something but delegated to this lazy thing here and upon first
access this lambda will be executed and the rest will be
stored by the library, so lazy is not a language construct, but
it’s just a library function. You can define your own, and the library provides you with many
other things. The takeaway here is that if you have a common kind of property
like observable, for example, when you need to be notified
that something was modified, use a library or write your own. So here Delegates.observable does the job from the
library. But you don't have to write code like this, where you have one property and then another property and another all doing the same thing over and over again. All you need to do is this,
actually, declare a single class that
encapsulates the logic of your property like your generic getter and generic
setter and that’s it. Now you can simply refer to this
class in many properties and get your business logic: database access, all kinds
of validation, anything you like can be extracted as a library
and then reused across your project. Does that make sense?
Who uses this already? So many people, you actually
should. I’m sure you can benefit from this.
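A small sketch of the delegation patterns being described (lazy, observable, and a hand-rolled delegate); the names here are illustrative, not the ones from the live demo:

```kotlin
import kotlin.properties.Delegates
import kotlin.reflect.KProperty

class Config {
    // Computed once, on first access; the library stores the result for us.
    val settings: Map<String, String> by lazy { loadSettings() }

    // Get notified whenever the value changes.
    var title: String by Delegates.observable("untitled") { _, old, new ->
        println("title changed from $old to $new")
    }

    private fun loadSettings(): Map<String, String> = mapOf("mode" to "demo")
}

// A custom delegate encapsulating reusable get/set logic (logging here,
// but it could be validation, database access, and so on).
class Logged<T>(private var value: T) {
    operator fun getValue(thisRef: Any?, property: KProperty<*>): T {
        println("reading ${property.name}")
        return value
    }
    operator fun setValue(thisRef: Any?, property: KProperty<*>, newValue: T) {
        println("writing ${property.name}")
        value = newValue
    }
}

class Session {
    var user: String by Logged("anonymous")
}
```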
Okay. So this is more or less it about properties and now
let’s get to functions. Functions are very important,
right. So again, this is very horrible
code. Don’t try code like this in Kotlin, please. This is very
much inspired by our habits in the Java programming language.
When you have to put everything into a class, right. So string, does your project has
its own string class, if it doesn’t
it’s just a very new project, right?
(laughter). So any of my projects have them,
but the thing is in Kotlin it’s a little different. You don’t
have to use a class. First of all, a Kotlin class, they don’t
have statics so to use these functions from this class you
have to say string into parentheses which makes a new
object, right. I don’t want a new object every time, I want it
like this, so I turn this class into an object. It’s a little
bit of an improvement in my insanity, right, so I was
creating an object every time I wanted to call a function,
that’s crazy. But really in Kotlin, I don’t
need any enclosing container at all because I have top-level
functions. So this may seem obvious, like
functions, what are they? They’re just declarations,
right? But some languages, you know, have them all in classes
and many people learned this and rely on this. So this is a lot
more of a Kotlin way, but it’s still not great in
terms of what you can achieve with Kotlin because here you
have two overloads, right. So getFirstWord is supposed to
parse a string, find a space and take the first word returned,
right. But what if the separator is not a space but a comma or something,
so here is a more full featured version and this is how you’ll
call it actually in most contexts, right.
So what I wanted to express here is just a default value.
In Java we are used to using overloads for this, and also
some people use nullable parameters: like, pass a null here and I'll give you a default value. Don't do this
in Kotlin. You don’t need to. All you need to do, actually, is
simply specify your default. My default is space here, and
that’s it. All right. So there is no need to emulate defaults. They are built into the
language. And same for when you have many,
many default parameters with different values like multiple Boolean and
so on and so forth, you can just use named parameter syntax to
express which you actually need and all the rest will be used by
default. So this makes functions fewer in the first
place, and then a lot more expressive.
Okay. Good with functions, right? Well, actually, this is kind of
— this function is kind of midway between like the Kotlin style and the
Java style, because it's actually working on strings, right. It would very much be a good idea to put this into the String class... oh, it's not, because the String class is not controlled by you, and we can't put everything into the String class if we really want to keep the String API minimal. So what I would really like to do is something
like this, where I can say myString.getFirstWord(), and that's it. Right. So it looks like a
method, it’s called an extension function, actually. It’s not
sitting in a string class. I didn’t go into the JDK and
alter a class I can't control, but
still it works like this. So this is a mechanism you can
use. I’ll do it manually to illustrate how it works here.
So I have a receiver of type String; now I don't need this parameter anymore, and I can write "this" here and use my "this" here, or omit "this" on the left-hand side, so now I'll be able to use it this way. Make sense?
I can do the same with a property, actually it would be
very nice to do it this way. Just have first word as a
property name, and you can have an extension “property,” but of
course there will be no customization for the separator,
but otherwise you’re good to go. Yep. I’ll just need to put a
space here. And that’s it. So extension functions,
extension properties: it's actually a very important idea, and it's not just a convenience. It allows you to keep your classes very minimal.
Look at the string class in Kotlin. It’s only five methods.
If you can compare that to Java it’s screens and screens of
declarations, so you can keep your API minimal
and all the utility functions can be
extensions, and certain different libraries can be
modular like this and that’s a very important tool for
designing APIs. Do you have questions? Okay. I
couldn’t take them anyway. (laughter).
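A sketch of the extension function and extension property described above, together with the default-parameter idea from a bit earlier; `getFirstWord` is reconstructed from the description rather than copied from the demo:

```kotlin
// A top-level extension function with a default separator: no wrapper
// class, and no overloads needed for the common case.
fun String.getFirstWord(separator: String = " "): String =
    substringBefore(separator)

// An extension property works too (no room for a separator parameter here).
val String.firstWord: String
    get() = substringBefore(" ")

fun main() {
    println("hello world".getFirstWord())       // default separator
    println("hello,world".getFirstWord(","))    // override the default
    println("hello world".firstWord)
}
```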
Okay. Now, let’s have a look at this. Here I’m doing
something very typical, traversing a hierarchy of
containers and leaf elements, and I want to extract all the
text from this hierarchy. Pretty straightforward. So my classes are three lines of
code. Not much: there is an element, there is a container with a list of children, and there is text.
using extensions function, top-level functions, everything
as I told you, and so all right but I don’t like this code.
Why don’t I like it? Here to traverse the hierarchy I
need recursion so I need to pass the string builder down the
stack and add to it as I’m going down the tree. But then I end
up with a top-level function that’s only needed by this one here, right, so this one is not
really needed anywhere but inside this function, so what I'd really like to do is just put it inside; just go here and make it a
local function. So again, it’s just expressing
that nobody else needs this. You don’t need private helpers
anymore or look for local helpers. It can be improved a
little bit and you can actually make use of closure, so I can create my string builder
right here and get rid of all of this so I don’t need to return or take
parameters here. All I need here is to use
whatever is declared above. And then I just do extract text
of e right here and return string
builder to string. I’m sorry, that’s an extension function,
right. No, sorry. Yeah. So here is
how it goes. Like, you can turn something into a local function
and leverage a closure, so this variable is declared outside my function; it's not accessible to anyone outside the outer function, and I'm using it
here and that’s it. Now, local functions, extension
functions, top-level functions, default parameters: use these,
they will make your code nicer. Now, let’s look at what’s
still there. You see gray code? Gray code is useless. The IDE and the compiler show you that something is not needed there,
and it actually isn’t. This is redundant because we have this
extract here, right, so you simply can remove this. I don’t
know if you see it — oh, yeah you do, but the text variable
has gone green. Why is it green? Because the compiler can figure
out the cast for you. It’s actually much safer. It’s not
only convenient, but I’m really annoyed at my casts all over
the place, right. I know it’s text, why don’t you
know? Well, now it knows, and actually
you don’t need this variable either because it’s the only usage, right. The
same thing here and then my container can be in line as
well, and so here it is. Smart cast, makes your code
safer and more concise, and actually it makes all the casts
that are still in your program meaningful, so when you see an
as operator in Kotlin now you know it means something, it’s
not just a useless complement to the is check
above. Also, this thing here is kind of stupid because what I’m doing is
applying the same function for everything as a single function,
so what I want to do is something like this. It’s a little bit nicer looking. And then let’s look at what we
have. We’re traversing hierarchy, I have my leaves, I
have my containers, and that’s what I want to express, right?
I'm checking different cases, so it's a lot nicer to use a "when" statement; "when" can switch on types right here. But there is an annoying thing about it, and it's again coming from my old habits: I'm declaring a closed hierarchy, I have only containers and text and I don't
have anything else, right. Now I have this pretty annoying
“else case” right here. The compiler has no idea. I don’t
have anything but containers and text; it's just an abstract class and I have some cases of it. You can actually express this in Kotlin with sealed. I can have a sealed class, which means all the subclasses are known (you can't declare them outside of this file), and this way the IDE and compiler
know this “else” is useless. We went from almost two screens
of code to less than one, simply by applying the idioms of
Kotlin to this code. Do you have questions?
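A compact sketch of where that refactoring ends up: a sealed hierarchy, a local function closing over a StringBuilder, and an exhaustive when with no else. The class names follow the description of the example rather than the exact demo code.

```kotlin
sealed class Element
class Container(val children: List<Element>) : Element()
class Text(val text: String) : Element()

fun Element.extractText(): String {
    val sb = StringBuilder()
    // A local function: nobody outside needs it, and it can use sb
    // from the enclosing scope instead of passing it down the stack.
    fun extract(e: Element) {
        when (e) {                       // sealed hierarchy: no else branch needed
            is Text -> sb.append(e.text)
            is Container -> e.children.forEach { extract(it) }
        }
    }
    extract(this)
    return sb.toString()
}
```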
I’m sorry. (Laughing). All right. So now let’s just continue with
this exercise and look at some more
examples of expressions that are written like with old habits in
mind and we’ll try to transform them into something
better. So the first thing that really stands out here is var. I can't say never use var; vars are useful mutable variables and can be used for many nice things, but it's kind of discouraged. If you need a var, you need a very good reason. Here there is no good reason, so use a val, definitely. Let's look at these three. It's repetition. Repetition is ugly, repetition is error-prone,
especially if this was not a single name but many things
chained, so I would like to get rid of this repetition. What I can do is say with ex —
does anybody remember Pascal? Anyone? Good. Good. I started
in Pascal, almost. Pascal had this as a built-in construct; in Kotlin it's a function, and we
can use it and here we can get rid of all the ex things here.
Just like this. And now it looks even more stupid, right.
I’m just assigning to the same variables, don’t do that.
Okay. So now I have a println with
string plus some things, string plus some things, string plus
some things. It’s awkward. Most languages now have string
interpolation and Kotlin has that as well, so what you
actually need here is this. Okay. Done with this one. Import
things into your scope with “with” use string interpolation,
it’s nice. Now here I’m creating a map, the
old way. I can make it a little nicer like this by using operators, but it's really much nicer if I just use a builder function,
so what I can do here is replace my things with pairs — oh —
not pairs, but pair. I’m sorry, typing when talking is
difficult. Yeah, so a map can be
constructed of pairs, right. A map is only a set of pairs from
key to value, but actually writing Pair is kind of redundant here, so we usually use the to function. It's not a built-in operator but
just a library function, and this is how you create a map. When you want to traverse the map, you can pull out key and value variables right in the loop header, which makes for loops a lot more concise.
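A small sketch pulling together the idioms just mentioned (with, string templates, mapOf with to, and destructuring in a for loop); the names are illustrative:

```kotlin
data class Point(val x: Int, val y: Int)

fun main() {
    val p = Point(3, 4)

    // with() brings p's members into scope, and string templates replace
    // the chain of string concatenations.
    with(p) {
        println("point at $x, $y")
    }

    // Build a map from pairs using the `to` function...
    val ages = mapOf("Ada" to 36, "Grace" to 85)

    // ...and destructure each entry right in the for loop header.
    for ((name, age) in ages) {
        println("$name is $age")
    }
}
```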
This example with my "if" statement is something I really hate about my code in Java, because these assignments here all fall apart so easily, so I really
like to do things like this in Kotlin — so
if — and many other things are actually expressions. This is
something pretty unfamiliar for this C language family. We’re
used to dividing our code into statements and expresses,
right. Statements are things that have effects and
expressions are things that have value, so you assign expressions
to variables and write statements to assign things to
things. So Kotlin is halfway between
this procedural tradition and
functional tradition but has a lot more expressions than you’re
used to in other languages. You can do this here and of course
you don't have to use a var, you don't have to make it a separate line; you can assign the if expression directly, make it nicer, and the result of the expression is the last thing in the block. So the same for when. When is
not simply switch-case on steroids; importantly, it's an expression, so you can also do it like this, right. So not many returns here, but one return here will be a lot nicer. Also, you're not going to want to repeat yourself this much, of course, and you can say even this.
By the way, if you want to check if something is odd and
even, don’t do it like me. It’s only for demo purposes. Don’t
try this at home. It will hurt. (laughter).
Yeah, so this one can be further simplified like this, so again
you’re trying to remove the noise. When you see code like
this, just try to get rid of the noise. Noise is harmful for
your brain. One last quick thing: what you do with nullables. These question marks are
nullable types in Kotlin. How many people, I’ll go really,
really quick: you can have nullable types, and the compiler makes you do things like this, so it's an error now; the string is nullable, so you can't dereference it directly. You can either do this, which
says just safely reference me, which you by the way can do here
as well, right, so you don’t have to write an “if”
around it and can simplify it like
this. Another nice thing is you can
use the Elvis operator like this, to simplify your longer if statements into something shorter,
because this is definitely the expression position, right, so
how Elvis works. Elvis takes an expression on the left-hand side
of string and asks, are you a null? Really nicely, and if it’s a
null it evaluates the right-hand side. But the right-hand side has to
be an expression, right. Basically, it’s supposed to be a
default, so like if you are null on the left-hand side, like use
a default on the right-hand side, but your
default can’t be just a return, which means you don’t compute
any value there, you just jump out of the function.
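A short sketch of these expression and null-safety idioms (if and when as expressions, safe calls, and Elvis with a default or an early return); this is illustrative code, not the demo's:

```kotlin
fun describe(n: Int): String =
    when {                       // when is an expression: one return, no repetition
        n == 0 -> "zero"
        n % 2 == 0 -> "even"     // demo-style parity check only, as the speaker warns
        else -> "odd"
    }

fun firstWordLength(s: String?): Int {
    val parity = if (s != null && s.length % 2 == 0) "even" else "odd"  // if as expression
    println("length parity: $parity")

    println(s?.length)           // safe call: prints null instead of throwing

    // Elvis: supply a default, or jump out of the function entirely.
    val text = s ?: return 0
    return text.substringBefore(" ").length
}
```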
That's quite an interesting thing from the type
system standpoint but I’m not giving a lecture here, I’m doing
a demo. Okay. We’re good with
expressions. Let’s look at some functional style. So people very often refer to
Kotlin as a functional language. I don’t think it is, actually. I think Kotlin is a
multiparadigm language that supports functional style. You
don’t have to write functional in Kotlin, but it’s often times
very nice to do it, so let’s have a look
at this. So in my Java old days in mind,
I wrote this code which just goes
over a list of numbers, picks those divisible by 16, and converts them to hex. So what it actually does is filter and map, right: map
is this one and filter is this one. So what I can do even with the
help of my ID, I can do this. So newer versions of all
programming languages have something like this. I can
definitely leverage this, and so this filter is a function, this
lambda is a function value. You don’t have, by the way, to
declare it as a variable. You can get rid of it. And so
that's a lambda parameter. Kotlin has some nice semi
functional things: like, anywhere in the code you can say also, and have this value also do this for me,
please, like print this list for me and then
proceed with what you were doing, like nevermind this just
debug output or some side effect I want to insert. Side effects
are not functional on the one hand, on the other hand this is
handy for debugging, you don’t have to break your clain apart
and so on and so forth. Also use let, use run, and so on and
is forth. There is one very deep thing
about functional abstractions in non-functional languages. When
I do something like this, I have my repeat function right here,
right. So what it does, it takes a number of times I want
to repeat something and this something is a function, and by
the way you don’t have to invent your own function interface
every time, just use the function types here. It’s a
function that takes an INT returns a unit, and unit is
something that you don’t care about. Then it simply repeats
it, right. So when I say repeat, I’m also
very much conscious about like what’s
it going to cost me? Right? So it’s a function, it takes a
lambda as a parameter, and so it’s actually just another
parameter. The Kotlin custom is to write the lambda outside of the parentheses, because it looks more like a language construct like this,
but then, okay, I’m running this and
I have to create a lambda object, right. I have to create
a lambda object every time I do anything like this, and so there
is a cost to this abstraction. It’s nice code, I can reuse
things, I can raise the level of abstraction in my code but there
is a toll on that. Actually, in Kotlin you can very often get rid of that cost; the compiler gets rid of the lambda objects for you when you use inline functions.
When I say inline, my code doesn’t change, right. So here
nothing happened at the call site that I can see. But if I say Show Kotlin Bytecode and just decompile this into Java, just to scare you a little bit... yeah, it's been an easy talk so far. So if I do this, here it goes. It's a simple for loop. Where
did my lambda go? Well, the compiler simply
optimized it away. You don’t need lambda, right.
So if you simply have your loop here and you inline everything, you
end up with a loop and that’s it. So the big difference in the
mindset when you go from the Java Programming Language to the
Kotlin Programming Language is you still use lambdas but some
of the lambdas are really free, and by the way these are all
free, too. Many, many lambdas in the
library are free abstractions, you don’t have to pay for
calling them. It’s just code generated for you.
So functional in Kotlin is not only convenient but also quite cheap.
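A sketch of the kind of inline higher-order function being described (named myRepeat here to avoid clashing with the standard library's own repeat):

```kotlin
// Because this is inline, the compiler copies the body and the lambda
// into the call site: no lambda object is allocated, just a plain loop.
inline fun myRepeat(times: Int, action: (Int) -> Unit) {
    for (i in 0 until times) {
        action(i)
    }
}

fun main() {
    myRepeat(3) { i ->
        println("iteration $i")   // compiles down to a simple for loop
    }
}
```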
Speaking of cheap, by the way, let’s look at this example. So here I’m trying to do a
parallel computation. Well, it’s a stupid sample and
nobody does parallel computation like this, but I want to
illustrate a point. What I’m doing here is again
with my old habits in mind, I’m
creating 100,000 threads. 100,000 threads each of which
does some work. Actually it sleeps for one second and then
prints a number, and then I have to join all of these threads to
my main thread. So if I run this — oh, oh.
There was an exception. What was that? A Java out-of-memory error.
Basically, what it’s telling me is hey, you cannot create 100,000
threads. Are you crazy there? It’s 100,000 stacks, it doesn’t
fit into memory, just get reasonable. And that’s fair, like OS threads
are not cheap, you have to allocate resources for threads, so you
don't do such silly things with threads. But I have this same example done with
coroutines, who knows about them in Kotlin? Oh, good. Who uses
them in production? Okay. Soon enough you will all be using
them, I’m sure. So have a look here, it’s very much the same code so I’ll just put
it side by side here. It’s very much the same code,
but instead of threads here I’m creating async tasks that are
using coroutines underneath, so I'm still waiting for one second and printing, and if I
run this there is no out of memory. It’s printing all the
numbers and I’m good. So again, Kotlin introduced
coroutines as a means of making your asynchronous computations nicer, and that works. But what's the cost? Well, it's at least cheaper than having a thread per computation; of course nobody does exactly that, but still, coroutines are very cheap. You can spin off 100,000 coroutines and it doesn't cost nearly as much as threads.
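A sketch of the comparison being described, using kotlinx.coroutines (which the project would need as a dependency); the thread version will typically die with an out-of-memory error long before 100,000 threads, while the coroutine version completes:

```kotlin
import kotlin.concurrent.thread
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

fun withThreads() {
    // 100,000 OS threads: each needs its own stack, so this usually
    // fails with an OutOfMemoryError.
    val threads = List(100_000) {
        thread { Thread.sleep(1000); print(".") }
    }
    threads.forEach { it.join() }
}

fun withCoroutines() = runBlocking {
    // 100,000 coroutines are cheap: delay() suspends without blocking a thread.
    val jobs = List(100_000) {
        launch { delay(1000); print(".") }
    }
    jobs.forEach { it.join() }
}
```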
Let me illustrate something that coroutines are really good for
right here. So here is a legacy interface, or, I don't know, a modern interface. Very often our services and our dependencies are asynchronous, and so on and so forth: ask me to do something and I'll do it and let you know
when I’m done. Here is the mock service request and a callback
function that’s passed to it so when the work in the comments
are done, I’m calling the callback and just passing my
answer there. All right, so that’s all right.
It’s working for everyone, right. But this is what the
code looks like when I want to exchange messages between two
services, and so I just want to basically send two messages in
sequence, and here is what I have to do. First, request and then a
callback. This is a result of the request I printed, and then
next request inside that callback and then print
inside, so you see the staircase right here, right. One step, two step, three step,
and you can actually get quite deep down this staircase which
is not nice. And so what I would really love
to do is something a lot more straightforward, but so this is
kind of tolerable but what if, just imagine, what if you needed to do like N
calls? Just a number of like make a
list of calls. So this is the code I came up with, which isn’t
nice at all, so I definitely need that there because I need
to nest a callback inside a callback. This is the shortest
code I came up with, and it copies arrays; don't do that. It's wasteful in terms of
memory and time, it's quadratic.
Basically I have to come up with something like this, so you
can’t just say repeat this five times, right. So what I really want to do to
be able to do is something like this where I just say, okay,
send one request, wait for results, send the other
request, and thenfy then if I want to repeat
something, just repeat it with a for loop, right.
So this code here is actually using the same
callbacks. Only the coroutine abstractions
are abstracting this away from me. So actually you can take any callback-based
API that you have now and turn it into this, make it straightforward with
just a few lines of code, and I'll show you. So this is
calling the same services because I have this function
right here. So what I’m doing is I’m just turning the request into a
suspension function through this simple construct. That’s an
extension function of my callback service. I say, the
first thing I say there is suspend my coroutine, so I'm
assuming I'm in a coroutine and I
suspend it right away, I get the continuation, I do my request,
and that's it, I'm suspended and waiting for the request,
and when the request is
done, I just say resume to my coroutine, and that's it. So these few simple lines of code
turn your callback-based API into a
coroutine API, and it makes this code into this,
which is a lot more readable to my sense. How do you like it?
(Applause). I see some nods in the
audience. Thank you. Yeah, well, actually if you want
to be a lot more prudent here, and I'm
sure you want to be, you need to catch exceptions.
Handling your exceptions is very important, and that's as easy as this: just
catch your exception, and whatever happens with your
request, just catch it and resume with
the exception. This will propagate exceptions through
your coroutines very nicely and you'll be able to
surround this with a try/catch
and catch exceptions there as if it was sequential
code but underneath it’s all asynchronous. You can do HTTP requests like
this, async I/O file systems, background threads, everything you need.
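To make the pattern concrete, here is a minimal sketch of wrapping a callback-based call in a suspending function, including the exception path; the `Service` interface and its `requestAsync` method are hypothetical stand-ins for whatever API you already have, and the imports follow current Kotlin (1.3+) package names:

```kotlin
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.suspendCoroutine

// A callback-style API you might already have.
interface Service {
    fun requestAsync(query: String, callback: (result: String?, error: Throwable?) -> Unit)
}

// Extension function that suspends the coroutine, fires the request, and resumes
// (or resumes with an exception) when the callback comes back.
suspend fun Service.request(query: String): String =
    suspendCoroutine { continuation ->
        requestAsync(query) { result, error ->
            if (error != null) continuation.resumeWithException(error)  // surfaces in a try/catch at the call site
            else continuation.resume(result ?: "")
        }
    }

// Call sites now read sequentially even though everything underneath is asynchronous:
//   val first = service.request("one")
//   val second = service.request("two after $first")
```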
Isn’t it nice? And I guess the last example
I’ll be showing you today is this one. It’s just another showoff for
how coroutines can help you. Take a look. So what I wanted to do is to
create an infinite stream of numbers. Who likes infinite
streams of numbers? I eat them for breakfast. Okay. So I want just a sequence to be
generated here and then I can take 20 of them, the sequence of 20,
and I can take 200, 2,000, filter, map,
slice, whatever. And this buildSequence
function is a library function in the Kotlin standard
library, and it's actually based on the same mechanism as
coroutines. It doesn’t do any background processing. It’s all
in the same thread. What it does, it takes all the
yield statements from here and just puts them in a sequence,
and so if I want to yield something here, I just do it,
like inserting 2 into my sequence. If I want to say, if tmp is
greater than 10, continue, I can skip pieces of my logic, so it's as
straightforward as any sequential code, and because it's a coroutine it gives you a lazy sequence.
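A small sketch of that lazy-sequence mechanism (in current Kotlin the function is called `sequence`; in the 1.2-era standard library it was `buildSequence`, but the yield mechanism is the same):

```kotlin
// An infinite stream of Fibonacci numbers; nothing is computed until someone asks.
fun fibonacci(): Sequence<Long> = sequence {
    var a = 0L
    var b = 1L
    while (true) {
        yield(a)               // hand one value to the consumer, then suspend here
        val next = a + b
        a = b
        b = next
    }
}

fun main() {
    // Only the first 20 values are ever generated.
    println(fibonacci().take(20).toList())
}
```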
Okay. So takeaways: classes are cheap,
functions can be top level or local, there's no overloading needed to emulate
default values, use properties, use delegated properties, use
coroutines, and have a nice Kotlin. And I want to advertise some
more activities today. So if you still have questions that I
couldn’t take, you can come over to an office hour we’re having at
12:30. You can come over to the sandbox
area C where we’re at the Kotlin booth some of the day at least,
and right after my talk there will be a talk by
Jake Warden about Android KDX, very exciting
on Stage 2, so you’re welcome there. Thank you very much for
your attention. (Applause). >>Thank you for joining
this session. Grand ambassadors will assist with directing you
through the designated exits. We’ll be making room for those
who have registered for the next session. If you’ve registered
for the next session in this room, we ask that you please
clear the room and return via the registration line outside.
Thank you. May 10, 2018
10:30 a.m. PT. TensorFlow Lite for Mobile Developers.
>>SARAH SIRAJUDDIN: Good morning, everyone. Thank you so
much for coming to our session this morning. I'm Sarah
Sirajuddin. I'm on the TensorFlow Lite
Team, and we work on bringing Machine
Learning to mobile and small devices. Later on I will
introduce my colleague Andrew Selle who will be doing
the second half of this talk. So the last couple of days
have been really fun for me. I’ve gotten to meet and speak
with many of you, and it’s been
really nice to see the excitement around TensorFlow
Lite. Today I’m happy to be here and
talk to you about all the work that our team is doing to make
Machine Learning on small devices possible and easy. So in today’s talk we’ll cover
three areas. First, we’ll talk about why Machine Learning
directly on device is important and how it’s different than what
you may do on the server. Second, we’ll walk you through
what we have built with TensorFlow Lite. And lastly, we’ll show you how
you can use TensorFlow Lite in your own
apps. First, let’s talk about devices
for a bit. What do we mean when we say a
device? Usually our mobile devices are basically our
phones. Our phones are with us all the time, we interact with
them so many times during the day. And modern phones come
with a large number of sensors on them which
give us really rich data about the physical world around us. Another category of devices is
what we call edge devices and this industry has seen a huge
explosion in the last few years, so some examples are Smart Speakers, Smart Watches, Smart
Cameras, and as this market has grown, we see the technology
which only used to be available on more expensive devices is now available on far cheaper
ones. So now we see that there is this
massive growth in devices, they’re becoming increasingly
capable, both mobile and edge, and this is
opening up many opportunities for novel
applications of Machine Learning.
I expect that many of you are already familiar with the
basic idea of Machine Learning, but for those that aren’t I’m going to quickly
cover the core concept. Let’s start with an example of
something we may want to do, let’s say classification of
images. So how do we do this? In the past what we would
have done was to write a lot of rules that
were hard coded, very specific about some characteristics that
we expected to see in parts of the image.
This was time consuming, hard to do, and frankly didn’t work all
that well. This is where Machine Learning comes in.
With Machine Learning we learn based on examples, and so
a simple way to think about Machine Learning is that we use algorithms to learn from
data and then we make predictions about similar data
that has not been seen before. It’s a two-step process, first
the model learns and then we use it to make predictions.
The process of model learning is what we typically call training
and when the model is making predictions about data is what
we call inference. This is a high-level view of
what’s happening during training. The model is passed
in labeled data, that is input data along with the associated
prediction. And since in this case, we know
what the right answer is, we’re able to calculate the error,
that is how many times is the model getting it wrong and by
how much? We use these errors to improve the model. This
process is repeated many, many times until we reach the point
that we think the model is good enough or that this is the best
fit we can do. This involves a lot of steps and coordination and that is why we
need a framework to make this easier,
and this is where TensorFlow comes in. It’s Google’s
framework for Machine Learning, it makes it easy to train and
build neural networks, and it is cross-platform, works on CPUs,
GPU, TPUs, as well as mobile and embedded platforms.
The mobile and embedded piece of TensorFlow, which we
call TensorFlow Lite is what we are going to be focusing on in
our talk today. So now I want to talk about, why
would you consider doing Machine Learning directly on device? And there are several reasons
that you may consider, but probably the most important one
is latency. If the processing is happening on the device, then
you’re not sending data back and forth to the server, so if your
use case involves realtime processing of data such as audio or video,
then it’s quite likely that you would consider doing this. Other reasons are that your
processing can happen even when your device is not connected to the
Internet, the data stays on-device and this is really
useful if you’re working with sensitive user data which you
don’t want to put on servers, it’s more
power efficient because your device is not spending power
transmitting data back and forth, and lastly, we are in
a position to take advantage of all the sensor data that’s already
available and accessible on the device. So this is all great but there
is a catch like there always is, and the catch is that doing on-device ML
is hard. Many of these devices have some pretty tight constraints, they have
small batteries, tight memory, and very little computation
power. TensorFlow was built for processing on the server and it
wasn’t a great fit for these use cases, and that is the
reason that we built TensorFlow Lite. It’s a lightweight
Machine Learning library for mobile and embedded platforms.
So this is a high-level overview of the system. It
consists of a converter where we convert models from TensorFlow
format to TensorFlow Lite format, and for efficiency
reasons we use a format which is different. It consists of an
interpreter which runs on-device, there is a library of
ops and kernels, and then we have APIs that allow us to take
advantage of hardware acceleration whenever it is
available. TensorFlow Lite is
cross-platform so it works on Android, iOS,
Linux, and a high-level developer workflow here would be
to take a trained TensorFlow model, convert it to TensorFlow
Lite format and then update your apps to use the TensorFlow Lite
interpreter using the appropriate API. On iOS, developers also have the
option of using CoreML instead and what they would do here is to take
that trained TensorFlow model and convert it to CoreML using the TensorFlow
to CoreML converter and use the converted model with the CoreML runtime.
The common questions we get when we talk to developers is it small
and is it fast? Let’s talk about the first question. One
of the design goals with TensorFlow Lite was to keep the
memory and binary size small, and I’m happy to say that the
size of our core interpreter is only 75 kilobytes and when you include all the
supported ops, the size is 400 kilobytes.
How did we do this? First of all, we’ve been really
careful about which dependencies we include. Secondly, TensorFlow Lite uses
FlatBuffers, which are far more
memory efficient than protocol buffers are. Another thing is
selective registration and that allows developers to only use
the ops that their model needs and thus they can keep the
footprint small. Now, moving on to the second
question which is of speed. We made several design choices
throughout the system to enable fast startup, low latency, and
high throughput. Let’s start with the model
format. TensorFlow Lite uses FlatBuffers
like I said, and it's a cross-platform, efficient
serialization library. It was originally created at Google for
game development and is now being used for other performance
sensitive applications. The advantage of using
FlatBuffers is we can directly access the data without
doing parsing or unpacking of the large files which contain
weights. Another thing that we do at the time of conversion is that we
pre-fuse the activations and biases, and that leads to faster
execution later at runtime. The TensorFlow Lite interpreter
uses static memory and a static execution plan. This leads to faster load
times. Many of the kernels that TensorFlow Lite comes with have
been especially optimized to run fast on ARM CPUs
using NEON. Let's talk about hardware
acceleration. As Machine Learning has grown in
prominence, it has sparked quite a bit of innovation at the
silicon layer as well and many hardware companies
are investing in building custom chips which can accelerate
neural network processing. GPUs and DSPs, which have been
around for some time are also now being increasingly used to
do Machine Learning tasks. TensorFlow Lite was designed to
take advantage of hardware acceleration whether it is through GPUs, DSPs
or custom AI chips. On Android, the recently
released Android Neural Network API is an abstraction layer that
makes it easy for TensorFlow Lite to take advantage of the
underlying acceleration. And the way this works is that hardware vendors write
specialized drivers or custom acceleration code for their
hardware platforms and integrate with the Android NN API. TensorFlow Lite integrates with the Android
NN API via its internal delegation API.
A point to note here is that developers only need to
integrate their apps with TensorFlow Lite. TensorFlow
Lite will take care of abstracting away the details of
hardware acceleration from them. In addition to the Android NN
API we’re also working on building direct GPU acceleration
in TensorFlow Lite. GPUs are widely available and
used and like I said before, they’re now being increasingly
used for doing Machine Learning tasks. Similar to NN API, developers
only integrate with TensorFlow Lite if they want to take
advantage of the GPU acceleration.
So the last bit on performance that I want to talk about is called quantization and this cuts
across several components in our system. First of all, what is
quantization, a simple way to think about it is that it refers
to techniques to store numbers and to perform calculations on
numbers in formats that are more compact
than 32-bit floating point representations. Why is this
important? For two reasons.
First, model size is a concern for small devices, so
the smaller the model the better it is. Secondly, there are many
processors which have specialized SIMD instructions
that process these more compact formats faster than floating point numbers.
So the next question here is how much accuracy do we lose if we're
using 8 bits or 16 bits instead of the 32 bits used for
representing floating point numbers? The answer obviously
depends on which model we're using, but in general the learning process is robust
to noise, and quantization can be thought of as a form of noise, so what we find
is that the accuracies tend to be within acceptable thresholds.
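As a rough sketch of what such a scheme looks like, a common affine 8-bit quantization maps a real value $x$ to an integer $q$ using a scale $s$ and a zero point $z$ (the exact scheme TensorFlow Lite uses may differ in its details):

$$q = \operatorname{clamp}\!\left(\operatorname{round}\!\left(\tfrac{x}{s}\right) + z,\ 0,\ 255\right), \qquad x \approx s\,(q - z)$$

The rounding error per value is at most about $s/2$, which is the "noise" that the training process turns out to tolerate.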
A simple way of doing quantization is to shrink the weights and
biases after training and we are shortly going to be releasing a
tool which developers can use to shrink the size of
their models. In addition to that, we have been actively
working on doing quantization at training time and this is an
active area of ongoing research, and what we find here is that we
are able to get accuracies which are comparable to the floating
point models for architectures like MobileNet as well as
Inception and we recently released a tool which allows
developers to use this, and we are working on adding support
for more models in this. Okay. So I talked about a bunch
of performance optimizations, now let’s talk about what does
it translate to in terms of numbers. So we benchmarked two models, MobileNet and Inception V3, on
the Pixel 2, and we're getting speedups
of more than three times when we compare quantized models to float models, and
these numbers do not include any hardware acceleration. We've
done some initial benchmarking with hardware acceleration and
we see additional speedups of 3 to 4
times with that, which is really promising and exciting so stay
tuned in the next few months to hear more on that. Now that I’ve talked about the
design of TensorFlow and performance, I want to show you
what TensorFlow Lite can do in practice. Let's please roll the video. So this is a demo
application which is running the MobileNet classification model
which we trained on common office objects, and as you can
see it’s doing a good job detecting them, even this
TensorFlow logo that we trained this model on.
Like I said, it’s cross-platform so it’s running on iOS as well as
Android and we also are running it here
on Android Things. This was a simple demo. We have
more exciting demos for you later on in the talk. Now,
let’s talk about production use cases. I’m happy to say that
we’ve been working with partner teams inside Google to bring TensorFlow Lite to
Google Apps, so portrait mode on Android
camera, Hey Google in Google Assistant and SmartReply will be on TensorFlow
Lite in the next few months. It's the version powering the
custom model functionality in the newly
announced ML Kit, and for those of you that may have missed the announcement, ML Kit
is a Machine Learning SDK which offers Cloud-powered
APIs for Machine Learning, as well as the ability to bring your own
custom models and use them. These are some examples of apps
that are already using TensorFlow Lite via ML Kit: PicsArt, a photo
editing and collage making app, and VSCO,
a cool photography app. So back to TensorFlow Lite and
what is currently supported, so we have support for 50 commonly
used operations which developers can use in
their own models. I will point out here that if you need an Op
which is not currently supported, you do have the
option of using what we call a custom Op and using that, and
later on in this talk, Andrew will show you how you can do
that. Op support is currently limited
to inference. We will be working on adding training
support in the future. We support several common
popular Open Source models as well as the
quantized counterparts for some of them.
With this I’m going to invite my colleague, Andrew, to
talk to you about how you can use TensorFlow Lite in your own apps.
>>ANDREW SELLE: Thanks, Sarah.
(Applause). So now that you know what
TensorFlow Lite is and what it can do and where it can be run,
I’m sure you know — you want to know how to use it. So we can
break that up into four important steps. The first one
and probably the most important is get a model. You need to
decide what you want to do. It could be image classification,
it could be object detection, or it could be even speech
recognition. Whatever that model is, you can train it
yourself, and you can do that with TensorFlow just like you
trained any other TensorFlow model. Or you can download a
pre-trained model if you're not ready to make your own model
yet, or if an existing model
satisfies your needs. Second, you need to convert your
model from TensorFlow to TensorFlow Lite and we’ll show
some examples of how to do that in a second.
Third, if there are any custom ops that you need to
write, this could be because you want to spot optimize something
with some special instructions you know about, or it could be
that you’re using a piece of functionality that we do not yet
support like a specialized piece of signal
processing. Whatever it might be, you can write your ops.
This may not be necessary, of course.
The next step is to write an app and you use whatever
client API is appropriate for the target platform, so let’s
dive into some code. Converting your TensorFlow model, once
you’re done with the TensorFlow training, you typically have a
SavedModel, or you might have a
GraphDef. What you need to do first is put this through the
converter, so here I'm showing how to do this within Python. If
you download the normal TensorFlow tooling that's
pre-compiled, like the pip package, you're able to run the converter, and it
just takes a SavedModel directory or a frozen GraphDef in, you specify
a file name for the TFLite file you
want, and it outputs a FlatBuffer on disk that you can
now ship to whatever device you want. How might you get it to
the device? You could put it into the package, or you could
also say distribute it through a Cloud service where you can
update your model on the fly without updating your core
application. Whatever you want to do is
possible. So next, once you’ve converted,
you might actually run into some issues during the conversion
because there is a couple of things that can go wrong. So
the first one is you need to make sure that you have a frozen
GraphDef or SavedModel. Both of these are able to get rid of the
parts of the graph that are used for training, these are
typically things like variable assignment, variable
initialization, optimization passes. These are not strictly
necessary if you’re doing inference that is prediction, so
you want to get rid of those out of the graph because we don’t
want to support those operations right now because we want to
have the smallest version of the runtime that can be distributed
to keep your binary size small. The second thing that you
need to do is make sure that you write any custom operators that
you need, and now I’ll go into a little bit of an example of
doing that. Well before that, let me tell you one more thing, which is we also
have some visualizers to let you understand the model that you’ve
transformed and the transformation process, so take
a look at those. They’re linked off of the documentation. So let’s get into writing a
custom op. What might we do? Here I have an example that is
silly, but it just returns pi. The important thing when you
write an op is to conform to our C interface, so we have a C definition
for writing operations, and the reason we do that is all of our
operations are implemented this way so they can run on devices
that only support C, but
you can write kernels in C++. In this case what I'm doing
is ignoring the input tensors and
producing an output tensor which is pi.
If you have input tensors and you want an output tensor you
could also read the input tensors and say multiply by
three, and now I have a multiply-by-three operation. This is
going to be application dependent, and as I said
before, you don't always need to do this. I'm laying it out
because if there is some functionality you need we are
extensible. Okay, once you’ve converted your
model you need to use a client API. Let me start with the C++
API, but we have other language bindings as well that I’ll get
to. But in any of the binding, it’s
going to follow the same basic pattern. The pattern is create an
interpreter and load the model, fill in your
data, execute the model and read back the data. Very simple. In the C++ API, the first thing you do is
create a model object. This is given the file name of the TensorFlow
Lite file and creates an object that is going to hold that model
and mmap it. We use FlatBuffers, and the reason why
is because we can mmap the buffers, which means there is
effectively zero latency to start running the model.
Second, if you have any custom operations, you can
register them, so basically at this phase you’re deciding which operations to include into
your runtime. By default we provide a built in
Op resolver that includes all of our default operations. You
might also use selective registration that we alluded to
before where you include only a subset of the operations, and in
this case you might provide a minimal resolver and if you
wanted to use the custom operation that we had before,
you would create a custom resolver that would tell
TensorFlow Lite how to find your custom operation. So now we know what our Ops are
and where to get the code for them, and we know our model.
Now we need to create an interpreter object, so we take
the pair of model and resolver and put it together and it
returns an interpreter. This interpreter is going to be our
handle for doing our execution, so the next step is we’re going
to perform an execution, but before we can do that we need to
fill the buffer. So if you have a model like a classification
model, that is something that takes an image in. Where are
you going to get the image? The obvious place you might get it
is maybe from your device's storage
if it's an image file name, or also commonly it might be a
camera. Whatever it might be, you produce a buffer, and in this case it's going
to be a float* or int* buffer, and you fill it into your
buffer. Once you fill this buffer, you’re ready to run. So
we filled our buffer, TensorFlow Lite has all the information it
needs to run the execution, and we just
call invoke. Now it’s going to block until the execution is
done and then we’re going to be able to read the output of it in an analogous way to our input
and so that is we can get a float* buffer out which
could represent the class numbers. And then you’re free
to do with that data whatever you want, so for example in an
image classification app that we showed before, you would read
that index out, map it back to a string, and then put it into your GUI display.
Great, so now we know how to use C++. What if you're using another
platform, for example, a Raspberry Pi? On a Raspberry Pi the most common
thing to use is probably Python, and again it's going to
follow the same basic pattern. First we create an interpreter
object, the interpreter object is now our handle, how do we
feed data? Since it's Python, we can use
NumPy arrays, and this is convenient because if you need
pre-processing or post-processing you can do it with
primitives you're familiar with. This is kind of a theme that
goes on: we want to keep our bindings as idiomatic in the
language that they're in and also keep the performance, so in this
case we put in some NumPy arrays and take out
some, so that's Python. What if you're writing an
Android app or want to write an Android Things application, then
you might use the Java API. So in this case, it’s the same
thing. You take and build an interpreter, you give it the
file name of the model, which might be from a resource if
you’re doing an Android application, and then finally
you’re going to fill the inputs in and call run.
So one of the things that we did for the Java API is that
we know that many Java programmers don’t really want to
deal with building their own native library, so in that case
you can just use our Gradle file here which will include our
precompiled version of TensorFlow Lite. You don’t have
to download our source code, and even for the tooling parts where
you do the conversion from TensorFlow to TensorFlow Lite,
you can download the pre-compiled version of
TensorFlow as I alluded to before.
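For reference, the Gradle setup being described looks roughly like this in a module build file (shown here in Kotlin DSL; the open-ended version and the noCompress line follow the TensorFlow Lite docs of the time, so check the current documentation for exact values):

```kotlin
android {
    aaptOptions {
        noCompress("tflite")   // keep the model uncompressed in the APK so it can be memory-mapped
    }
}

dependencies {
    implementation("org.tensorflow:tensorflow-lite:+")   // precompiled TensorFlow Lite AAR
}
```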
Great. So what if you're doing iOS? In that case you can use the C++ API,
and you can also use the Objective-C API, but again we provide a precompiled
binary in the form of a CocoaPod.
I want to tell you what’s coming up in TensorFlow Lite.
One thing we’re asked for is more operation, the more
operations we have the more models that can be run from
TensorFlow out of the box. The other thing that happens with
Machine Learning that is often difficult is researchers come up
with new techniques all the time, and that means that
TensorFlow is always adding operations and that means that
we’re going to continue to follow TensorFlow as it adds
important operations and add them into TensorFlow Lite as
well. Okay. The second thing we’re
going to do is we’re going to improve the tooling, provide
better documentation, and tutorials and try to focus on
ease of use, so it’s really easy for you to understand on
end-to-end examples how to integrate TensorFlow Lite.
And the third thing, which Sarah already mentioned but I’ll
mention again, is that we’re excited about on-device
training. On-device training is really exciting because it
allows us to refine a model based on a user’s experience.
It allows us to decouple that experience from going to the
Cloud, so if they’re disconnected, we can continue
improving the model so there is a lot of requests for this.
This will of course require more and more computation on the
device, but we’re excited about upcoming
hardware accelerators that will make this more and more
possible. Okay. One more question before we get
into some exciting demos. When should I use TensorFlow
Lite? As we alluded to before, we’re starting to use TensorFlow
Lite for first-party applications and
third-party applications are also using it. That means that
what we’re doing moving forward is we’re going to make
TensorFlow Lite our standard solution for running ML on small
devices and mobile devices. TensorFlow Lite currently
supports a subset of TensorFlow ops and this means that our
recommendation is that you should use TensorFlow Lite if
you can, and let us know
of any missing functionality you might need, because it's probably not all
there yet. With that I want to show you a video of retrained models.
We showed you the TensorFlow logo being
recognized, and this is a common theme that we get which is that
people like our pre-trained examples like MobileNet but they
may not have an application where they need to tell the difference
between five dog breeds and many zoo animals, they might have an
office area where they have markers and whiteboards and as
we were testing the app we had this issue too. It’s like we
don’t have the classes that are in these pre-trained models.
So one of the great things one of our other TensorFlow team
members created is something called TensorFlow for Poets, and
there was a codelab about that which is available online as
well, and it basically allows you to take a pretrained image
model with really good detection ability and put your own classes
into it. I want to show you a demo app
that we created that runs on the PC and creates TensorFlow Lite
models for you, so can we go to the video?
Okay, so we showed you before, can we recognize scissors and
post-it notes? Try it out. You always want to try the models.
The scissors look good. Okay. Great. Post-it notes also look
good. What if we had another object? An object that is more common,
more important, like this metal
TensorFlow logo? It happens in everyday life, right? Let's
take a look at how this does. It’s labeled as other, that’s
not very good, but the great thing about Machine Learning is
we can fix it. The way we fix it is to add data. We have the
application and have gone to the training tab. Now we’re going
to define a class called TensorFlow, and this is
basically short for TensorFlow logo and
now from our webcam we're going to
click capture from a couple of different
perspectives, as many as we can, and ideally on different
backgrounds so it doesn’t associate the background with it
being a TensorFlow logo and then I click Train, and now it’s
using TensorFlow to train the model and it’s converging to a
good validation accuracy, it’s going to reload the model, we’re
testing in TensorFlow Lite running on the PC right now and
we see it’s recognizing TensorFlow correctly, so it’s
that fast and easy, but also we can take that model and we can move to Android
and iOS and use the exact same model and update it. Thanks.
Now, let’s move to a live demo, so I’m going to go over
here to the podium. (Applause). All right. So classification,
what I just showed you is kind of this idea that you have an
image in and you put an image out and you put classifications
out. What if you have multiple objects in the scene or
something in the corner of an object? You also want to
know where in the scene that object is, and that's
where a model called single-shot detection (SSD) comes in, a type of
model, and it turns out our friends in TensorFlow released a package
called object detection as part of the TensorFlow models repository, and it
allows us to use the pre-trained model to recognize many classes.
What I’ve done is I want to load it on a small device. We’ve shown you a lot of things
with small devices. I’m going to show you another thing. This
is a RaspberryPi. It’s another cool example of a device because
it’s cheap and easy to get so any high school student can have
one of these, you can have many of these and just use them for a
dedicated project, but the other great thing about them is not
only are they relatively powerful, but they’re also able
to interface with other hardware and have GPIO
pins, and this can be capitalized on in a number of different ways,
but one way is to run Linux, and that's what we're doing here.
But you can also use Android Things, and you can see that the
sandbox has many examples doing that, so you can also do this with
Android Things. In this case I have the system
board right here and it's connected to a motor controller, and this motor
controller is just a microcontroller that interfaces to servo
motors, and the servo motors can go left and right and up and
down and aim the camera. Now we’re going to load the object
detection model on to this device and we’re actually going
to run it in order to recognize different objects,
so let’s go to the demo feed, please.
We can see my app, you can tell by the beautiful nature of
this app that I’m not a great app developer, but this is what
I can do on a weekend, so give me a little bit of slack.
Okay. So here if we hold up the apple, it's recognizing the apple and it's
telling us the probability and where the object is. Now,
that's all good and fine, but we can couple it with the
ability to move, so I'm going to turn on the motors now and
bring back the apple. What I'm going to do is move the
apple in the screen and it’s going to try to keep it
centered, so as I move this apple, it’s basically going to
try to keep it centered, it’s like a little virtual camera
person, and this works on other objects like this banana
here, hopefully. Oh, there we go, and it’s going to keep that
centered. If you put two objects in the
screen, it’s going to try to keep them both in. Okay, so we have a little bit of
false detection, but it’s basically going to try to keep
them both centered. So this is really a fun
application, and I bet you can come up with many different
applications that are really exciting to do with this type of
application, so can I go back to the slides again? I’ll get my
clicker. (Applause).
So like I said, this is basically what I can do on a
weekend, but I imagine great app developers and people with a lot of creativity about
connecting devices and connecting software can do many
interesting things. So what I want to do now is I
want to tell you in summary,
TensorFlow Lite, you’ve seen a lot about it. Basically, we
feel TensorFlow Lite makes on-device ML small, fast, and
easy, and we think that you’re going to find it very useful in
all of your applications and we’re excited to see what you
build. Come talk to us. I’m going to
be in office hours at 12:30. You can come talk to me. In
addition, you can come to our sandbox if you haven’t already,
and we have, of course the examples that I showed you here, we have the
tracking camera, we also have the object classification on
mobile device, but another cool thing that we have
is the donkey cars and this was done by a group outside of
Google and we converted them over to TensorFlow Lite and we
are excited to see that their
application works really well with TensorFlow Lite as well.
So with that, I hope you check these things out. I want
to tell you that if you want to get started, you can go to our
documentation page, you can go to the TensorFlow.org page and there is a TFLite
page that you can find more information. Our code is all
Open Source, get it on GitHub, download it, modify it, submit a
pull request, and of course file any issues that you have while
using it. In addition, if you want to talk
about TensorFlow Lite, talk about your applications, ask us
about feature requests, please send them to our mailing list. This community is
exciting. In open sourcing TensorFlow, people got really
excited and we made it better for people inside and
outside of Google, and we hope you'll engage with TensorFlow Lite in
the same way TensorFlow has been engaged with.
With that I want to thank you for your attention, for coming to
I/O, for listening to our talk about TensorFlow Lite, and I
also want to thank you — thank our Google partners. This
product didn’t come out of isolation. It came from all of
our experience building mobile apps with machine intelligence,
and as we gained experience we found that there was a common
need and that was the genesis of TensorFlow Lite, so all of our
partners provided applications,
provided feedback, even provided code and help with our model, so
thank you so much to you and to them and enjoy the rest of I/O.
(Applause). >>Thank you for joining
this session. Grand ambassadors will assist with directing you
through the designated exits. We’ll be making room for those
who have registered for the next session. If you’ve registered
for the next session in this room, we ask that you please
clear the room and return via the registration line outside.
Thank you. May 10, 2018 11:30 a.m. PT. Don't Let Your App Drain Your
User's Battery. >>It's lunchish, so welcome everybody and
thank you for coming to talk about battery. It's another
year and I’m here again to talk to you about power. It’s an ongoing project and as
you’ve probably saw from the keynote, Dave talked about Maslow’s or his own
version of the pyramid and I think
actually a lot is in this image. Battery is foundational to
everything you’re doing on your phone, right. If you don’t have
power, you don’t really have any of the cool feature, you don’t
have the camera, you don’t have the assistant. Battery exists
to power all the other things and it’s a struggle for
us and a struggle for all of you as well. You want to do awesome
things for users and you also want to save power so they can
do those things for longer, and battery is actually a really
easy problem to solve if you just kill everything: fixed,
I'll take my promotion, it's great.
The problem is it's obviously much more nuanced than
that and that is a struggle that we have internally every day
talking about power and what are the right tradeoffs for users,
and that’s always a really hard question and we’re going to talk
a little bit about that and some of the things that we’re building into Android P to make
it a lot easier, and so the first thing I want to talk about is where the power is going.
This is a horrifically simplified view, and I'm sure some of the
engineers out there are like oh, my God, he's showing this!
You have hardware: things like how much RAM you have on your device
cause a persistent drain on what's happening there. What hardware your OEM selected,
what you're using, how big your screen is, how bright your
screen is all play into that effect. You have the OS itself: is the
kernel being efficient, is it waking up a lot, does it have a lot of
overhead? Do you have disk encryption, is it in software or
hardware? Those things all affect your
overall power profile happening under the hood.
The next thing is apps and services, and this is where we're
going to spend a bit of time talking about today. Obviously,
what apps you have on your phone and the services running on that phone drive power substantially,
largely through CPU and network activity.
The last is the user's interaction with the device
itself, how they behave, what applications they use, how
quickly they respond to notifications or don’t, all
impacts the overall picture of where battery ends up. A really
complicated topic, we’re going to talk largely about the bottom
left corner today, but the thing I want to say before we get into
that is what other things have we done and how do the things
we're going to talk about in P affect the strategy? We started in Lollipop with
JobScheduler, and that was a powerful tool to avoid apps
having to hang out in services all the time processing in the
background; instead you could schedule a piece of work and
have it done in a finite amount of time and the OS could
have variability on when to schedule it. Next was Doze, and that was
targeted at you have the tablet or phone, you put it down, not
using it, you put it in a drawer and forget about it. Hopefully
when you pull it out and use it again a few days later it still
has a battery. Doze was trying to target that problem and was
largely successful. We had to go further with Doze on
the Go, or Doze Light as we call it
internally, because it's lighter. That was targeted at phones: I
have my phone in my pocket, I'm not using it, but I'm moving
around, so it's not in a drawer. That is the
next step, how do we handle that next scenario, and Doze Light was looking
at that. You still don't want full Doze,
but a little backed off. We still needed to do more and
go deeper, so the next was background limits, largely
targeted at background services as a whole. Now we've had Jobs
around for quite a while, and we're also looking at background
limits for location, trying to figure out what's the right
balance of applications looking at location in the background
versus the foreground, and ensuring there would be different
thresholds of what is appropriate.
We had a really good effect with that, but the only downside to what we did
with Oreo is that we needed apps to target the Oreo SDK, and I'll talk
about that in a moment, in order to support this. With P, since we're
here, we're going to talk about some more stuff that we're
working on. I wanted to take a moment on one thing, because we're
mostly going to talk about really cool ML and cool things.
There was a lot of brute force optimization that the
engineering team has also done and this was
looking at f2fs filesystem work and how do we choose between
whether a process should be on a small core or big core, and looking at when to do CPU boosts
so you have buttery smooth scrolling but you're not just
burning power when the screen is on in anticipation that the user
might interact with the screen. These are all things done over
the years. This is a set we just did on P, and there is
probably still more we’re trying to get done.
And then the thing is there is still some of those big
challenges that are remaining and this is what we’re going to
talk about today. What we noticed was battery
drain was roughly proportional to the number of apps you have
on the phone, not the number of apps you are using,
but how many you had installed. And obviously that’s not ideal,
and so we wanted to try to remedy that. We also saw a lot
of scenarios where apps are accidentally making mistakes; if you're working with
wakelocks you're kind of playing with fire, and what happens when
the mistake happens? Usually the battery suffers. Also some
developers are really aggressive with the use case and trying to
be like oh, I must be there all the time ready to go at a
moment’s notice and sometimes that might take a awful lot of battery to achieve that
and is that really in the interest of the user, and how
can we help balance that back out?
The last part is, even when an app was aggressive or app had
a mistake, it wasn’t obvious to the user as to what to do. They didn’t know is the battery
menu going to help me, does it tell me everything, does it
catch all the cases, none of the cases, some of the cases?
Even when you saw a high battery value, now what? Do I
uninstall, live with it, call up the developer and send a
nastygram? That's something else we wanted to look into.
The next thing I mentioned before is the Oreo SDK, having apps
targeting the SDK to say I understand how the background
limits work, I understand how I can use Jobs and alarms
instead of using services. As you probably saw there was an
announcement late last year talking about targeting
requirements for applications on the Play Store, if you haven’t
seen that Google that and find out about it, you should be
looking at it because those requirements are kicking in
later this year, and that's going to ensure that on P the majority
of apps on your phone are now running the Oreo SDK and all of
these features we’re talking about work. So the high-level thing I want
to talk about is a feature called
Adaptive Battery. Battery Saver and trying to make it better,
and additional restrictions around background restrictions
to help users understand when can I make it stop, can I have
another choice other than uninstall, something in the
middle between live with it and r
remove it This pyramid is trying to
articulate, we want power efficiency, we like batteries that last a long
time, we want cool apps, and we don't want the user to have to
manage between these things. Cool apps or power efficiency? No
one wants to make that choice, it’s difficult, I don’t want to
make it, no one wants to make it, we want it just to work and
that’s what this whole project has been about, so with that I’m
going to invite James up here to talk
about Adaptive Battery. Thanks. (Applause)>>JAMES SMITH: Thank you very
much. Hi, everyone. Good morning. I'm James. I'm representing the DeepMind
team today and we’re pleased to be here and working with Android
and excited about what we’ve built together.
So as a user, you shouldn’t have to closely manage how your
apps behave on your device. A modern intelligent operating
system should just take care of it, and that’s where Machine
Learning can help. We’ve co-developed this feature
with Android called Adaptive Battery. It intelligently
aligns app power consumption with app usage.
Apps can still run in the background when they need to and
users don’t need to micromanage. So Adaptive Battery uses the concept of app App Standby
Buckets. Every app is assigned to the buckets. Each bucket has
different limits on background activity and there are four buckets ranging from active to
rare. Apps in the rare bucket have the
most restrictions and we use the ML model to assign apps to buckets based on
their predicted usage. So for example, apps that are
predicted to be open in the next few hours, I’m here at I/O, so I
might be using the I/O app and so the I/O app is going to be in
the active bucket. But in a few days time when I’m back at home, Adaptive Battery is
going to automatically determine that I’m unlikely to be using
this app and put it into the rare bucket so it’s not
unnecessarily consuming resources like battery. So what are the restrictions
that are applied? Jobs, Alarms, Firebase Cloud
Messaging, and Network are all restricted in Android P
depending on which bucket the app is in. Apps in the active bucket have
no restrictions, just like it is
today. Working set has restrictions on jobs and
alarms, and frequent and rare buckets introduce restrictions
on the number of high priority Firebase
Cloud Messages. Messages beyond that cap will be treated
as normal priority so that they are batched together with other
messages to save battery. Generally, as developers you
should assume the background activity will be deferred and
ensure that your app can work under those conditions. But one
thing to remember is once this device is plugged into power,
all of the restrictions are lifted. There are ADB commands you can
use to put apps into each bucket and test that they perform as
expected. You can also use new framework
APIs to get your app's current bucket at runtime. Most apps should be fine if
they’re already following best practices such as using Jobs for
background work and targeting recent API levels like
Android Oreo. As Ben just said, this is going to be a requirement for app updates on
the Play Store later this year, so please make sure you target
the latest SDK versions. So let’s talk a little bit about
Machine Learning. The model was built by Android
and DeepMind to predict which apps
go into which buckets. We could have done this with
simple heuristics, or we could have even just used past
behavior on the device, but we found that Machine Learning
allows us to capture the nuance of how users behave with their
app, and I’ll go into some detail on how we built the model
in a second. For those of you who are
interested in the architecture of the model, we're using a two-layer deep
convolutional neural net with a feed-forward
neural net on top, and this is used to predict the probability
that an app will be opened in a given interval.
You might have heard of convolutional neural nets
before; they're particularly common in image classifiers, but
we've used them here to capture
higher-level patterns of app usage over time.
It turns out they're pretty good at that. Now all of this is happening
on-device using TensorFlow, and that's actually a first for DeepMind.
We've never deployed production models on the compute power of a single
device before, and a single mobile device with the limited compute
power that's available presents a particular set of challenges.
And of course we're building this Machine Learning model with
the intention of saving power, and if
we’re doing Machine Learning on-device
we have to be careful we’re not spending more power than we’re
saving. We’re going to be making this
model available to all device
manufacturers so they can take it if they wish for their
devices on Android P or they can implement their own or they can
take an Android Open Source version of it as well.
So the model, we trained the model on millions of sequences of app
opens and transitions to discover the patterns of how users behave, so an
example would be if you use an app at 8:00 in the morning every
morning, you’re probably quite likely to open it tomorrow at
8:00 as well. But if you only use a certain app on weekends
like a game or a travel app, then you’re probably not going
to open it on Monday morning, and the model is able to capture this nuance.
The model looks at the user’s behavior and outputs a
probability of when you’re next going to open a particular app.
The model compares the usage to the patterns we’ve observed in
the data and the behavior on-device will
influence the predictions given by the model. So we’ve also built this model
with certain principles in mind. We tried to make it fair. There is no favoritism of one
app over another. If you use one particular app in
exactly the same way that I use a different app, then the model
will output the same predictions and they will be assigned to the same
buckets. It’s sensitive, there is no
personal identifiable information used by the model,
both when it’s trained and when it runs on the device. The model only compares the app
usage on-device to our known sequences of millions of app
transitions and it predicts when you’re next going to use that
app. So if you combine the App
Standby Buckets and their restrictions with this model
that predicts when you’re next going to open the app, that’s
the Adaptive Battery feature. It’s a system that adapts the
phone to you. The output of the model is used to personalize the
Os’ behavior to adapt it to how you use the phone, so the apps
that you use get to run in the background and the apps that you
don’t, don’t. We hope this will create a more
consistent battery experience for users. I’m now going to pass it over to
Madan a product manager on Android to talk about the Battery Saver future.
Thanks very much. (Applause).
>>MADAN ANKAPURA: Thank you, James. So let’s talk about Battery
Saver. Many of us run through the day,
and as we get toward the end of the day we realize, oh, I
won't actually be able to make it.
Battery Saver is there to save you. Remember the red bars with
our animations that were janky in Oreo? Who liked those? So we got rid of them. We also actually turn off
Location and that helps battery. In addition to that, many device-specific features like
display for example are also turned off. All of these add up
in order to save battery and prolong the device's
battery life so that you can go back and charge the device.
We have also made some changes in the UI so that you can live in
this mode for longer. Now you have a slider that you can
actually increase all the way to the right, which allows
you to stay in this mode if you choose to.
But what’s also interesting is what we found is that if an app is being
used by the user in this mode, having a
dark theme really helps. Our Internet tests show that apps using dark theme save power
compared with the ones that don’t. So if you are a
developer that already supports Dark Theme, please do think
about switching to Dark Theme when you detect that the device is in
Battery Saver. There are a few commands that
you can use through ADB to force the device
into Battery Saver and test your app, and for those of you that
want to support that mode, there are APIs like the power
save mode API and a broadcast that you can use to switch your app
theme to dark. So let's jump into the battery
settings itself. We think that when the user actually
goes into the battery settings, they are probably having a bad
battery day. You want to make it very
easy for the user to take action. You want to keep the UI
simplified, and we will tell you exactly
what applications might be causing the battery drain, so it
will be much more opinionated about these things.
In addition to that, as you scroll down, you'll also be able
to see how long your battery will last, and that
number is also powered by an ML
model. So let's jump into that
opinionated background restrictions. So we have some
principles upon which we have built this particular feature.
Users ask for controls; they want to know which app is
doing what so that they can take quick action on them. We also want to make sure that
apps that don't necessarily target Oreo are also being well managed by the
user, and lastly our goal is to make sure
that apps don't get restricted in the first place, so that
apps fundamentally are better battery citizens. So with P, we will actually
launch this feature with two specific
criteria. The first one is if your app is
still not targeting Oreo and has a background service; the second one is if an
app holds wakelocks, or what we call
stuck wakelocks, for more than an hour in the background. Both of
these are very well understood causes of battery
drain. There are more such reasons. Today we present those reasons
within the Play Developer Console and we
call them Android Vitals, and in the future we are going to be looking at many such
signals in order to come up with new rules for
background restrictions. So what does it mean for your
app in case the user decides to restrict it? As James talked about, jobs,
alarms, services, and network: the app will not be able to do any of those in
the background. Some explicit intents, for example
location, will not be delivered to those applications, so we're
trying to minimize the reasons why an
app should be able to use device
resources when it is in the background.
And lastly, if an app is in the background and restricted,
it won't be able to start foreground services.
So as I said earlier, I hope many of you are already familiar with
Android Vitals, and if not, I would strongly recommend you
go take a look at what signals we actually show
in there and how your app is behaving.
We have heard many success stories of application
developers using this data in order to improve their
applications, and we in fact use some of these signals internally to
make the OS better, trying to figure out what kind of
changes we could be making in order to have the right guardrails within the
OS to make sure that apps don't
accidentally drain battery. There was a session about it, if
you haven’t had a chance to listen to those, please take a
look at the recording. So we made the UI simplified,
but there might be several of you
who would count as power users, so you do like graphs and
you do like percentage numbers, so we do have those in
the overflow menu. Now the graph is improved with
better predictions, and the user can take action from this screen
if they see any app that is
surprising them by consuming more battery. So from here you can directly preemptively restrict
applications or unrestrict them. This is a good user control for
users to have. So how do you test for
this? Again, there is ADB to put them in the
state and test them while they’re in the state. Note that
even an app who has been restricted could still be used
by the user by launching them
explicitly so you want to make sure that they still work.
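A minimal sketch of that restriction test, again wrapping ADB from Python; the package name is a placeholder, and the `appops` command shown is the one commonly used to simulate a user-applied background restriction, so adapt it to your own setup.

```python
# Minimal sketch: toggle background restriction for a package over ADB so you
# can verify the app still works when the user launches it explicitly.
import subprocess

PACKAGE = "com.example.myapp"   # placeholder package name

def set_background_restricted(package: str, restricted: bool):
    mode = "ignore" if restricted else "allow"
    subprocess.run(
        ["adb", "shell", "cmd", "appops", "set", package,
         "RUN_ANY_IN_BACKGROUND", mode],
        check=True)

if __name__ == "__main__":
    set_background_restricted(PACKAGE, True)    # simulate the user restricting the app
    # ... launch the app and confirm foreground behavior is intact ...
    set_background_restricted(PACKAGE, False)   # undo for normal testing
```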
So we have gone through some of the key features in P. What does it mean for you as an app developer to continue to deliver your features using the various things that are offered, like jobs and so on; what does it mean for you to do background work? You might have actually seen this flowchart before. Ben presented it last year. It literally went through what it is that you want to do and offered some choices. We want to simplify our recommendation, and thanks to Jetpack, we now have a great tool for it.
So hopefully it's simpler now. If you want to think about doing anything in the background, we highly recommend you evaluate WorkManager as the go-to option. And if you think it's something that is really important and must happen now, of course you have the ability to use foreground services. But note that if you use foreground services, you have to tell the user what you are using them for, because there will be a notification and you'll have to justify why you think it is important. And if, because of that, you change your mind, go back to WorkManager. If none of this works, think about doing the work when the app is actually launched and being actively used by the user. So the choices are hopefully very simplified. All of this really helps us manage battery better, and that will lead to better battery life.
There have been some sessions that have already happened, so I want to give a shout-out to the Jetpack session, which talks about WorkManager, and there was also a session on Android Vitals, and there is a video. There is detailed documentation of the features that we talked about; please head over to d.android.com/power. And finally, since we won't be taking any questions, please do find us. All three of us will be in office hours from 2:30 to 3:30 in the office hours space, so look for the Android Framework office hours. And finally, if you have any feedback, here is our URL, and
thank you. (Applause). >>Thank you for joining
this session. Brand ambassadors will assist with directing you
through the designated exits. We’ll be making room for those
who have registered for the next session. If you’ve registered
for the next session in this room, we ask that you please
clear the room and return via the registration line outside.
Thank you.
Advances in Machine Learning and TensorFlow
Breakthroughs in Machine Learning
>>At this time, please find
your seat. Our session will begin soon. >>Hi, everybody, and welcome
to this session, where we’re going to talk about
breakthroughs in machine learning. I’m Laurence Moroney,
a developer advocate working on TensorFlow. We’re here today to
talk about the revolution that’s going on in machine learning and
how that revolution is transformative. Now, I come from
a software development background. Any software
developers here? Given that it's I/O, sure. This transformation, particularly from a developer's perspective, is really, really cool, because it's giving us a whole new set
of tools that we can use to build scenarios and to build
solutions for problems that may have been too complex to even
consider prior to this. It’s leading to massive advances
in our understanding of things like the universe around us.
It’s opening up new fields in art. And it’s impacting and
revolutionizing things like healthcare and so many more
things. Should we take a look at some of these? First of all,
astronomy. At school I studied physics. I’m
a physics and astronomy geek. It wasn’t that long ago when we
learned about new planets. The way we discovered it is,
sometimes we would observe a wobble in the star. That meant there was a large
planet orbiting the star closely and causing a wobble because of
the gravitational attraction. The kinds of planets
we want to find are like Earth, where there’s a chance of
finding life. And finding those and discovering those was very,
very difficult to do because small ones, close to a star, you
wouldn’t see. But, with research that’s been
going on, they’ve recently discovered
Kepler 90I by sifting through data and
building models for using machine learning and TensorFlow. And Kepler 90I is much closer to
its home star than Earth. The orbit is 14 days instead of
365. And not only that, which I find
really cool, they didn’t just find this as a single planet
around that star. They’ve mapped and modeled the entire solar
system of eight planets that are there. So these are some of the
advances. To me, I find this a wonderful time to be alive, because technology’s
enabling us to discover these great new things. Even closer to
home, we’ve also discovered that looking at scans of the
human eye, as you would have seen in the keynote, with
machine learning models trained on this, we've been able to do things such as blood pressure prediction, or being
able to assess a person’s risk of a heart attack or a stroke.
Now, just imagine if this screening can be done on a small
mobile phone. How profound is the effect going
to be? The world will be able to access easy, rapid, affordable, and
noninvasive screening for things such as heart diseases. It will
be saving many lives and improving the quality of many,
many more lives. Now, these are just a few of the breakthroughs
and advances that have been made because of TensorFlow. And
TensorFlow, we’ve been working hard with the community with all
of you to make this a machine learning platform for everybody. So today we want to share a few of the new advances that have been built on this. We'll be looking at robots, and Vincent will come out to show us robots that learn and some of the work that they've been doing to improve how robots learn. And then Debbie, from NERSC, is going to show us cosmology advancements, including how
building a simulation of the universe will help us understand
the nature of the unknowns in our universe like dark matter.
First of all, I would love to welcome from the Magenta team, Doug, a
principal scientist. Doug.>>DOUGLAS ECK: Thanks,
Laurence. (Applause)
>>DOUGLAS ECK: Thank you very much. All right. Day three.
We’re getting there. Hi, everybody, I’m Doug, a
scientist at Google working on a project called Magenta. Before
we talk about modeling the entire known universe, before we
talk about robots, I want to talk about music and art, and
how to use machine learning for expressive purposes. So, I want
to talk first about a drawing project called sketch
RNN, where we trained a neural network to do something as
important as draw the pig that you see on the right there.
And I want to use this as an example to highlight a few important
machine learning concepts that we’re finding to be crucial for
using machine learning in the context of art and music. So, let’s dive in. It’s going to
get a little technical, hopefully it will be fun. We’re
going to try to learn to draw not by generating pixels, but
pen strokes. This is a very interesting representation to
use because it’s close do what we do when we draw. Specifically
we’re going to take the data from the popular quick draw
game,. That was captured as delta X
movements of the pen. We know when the pen is put down
and lifted up. That’s our training domain. One thing I would observe is we
didn’t need a lot of this data. It fits the creative process. It’s closer to drawing, I argue,
than pixels are. It’s modeling the movement of
the pen. Now, what we’re going to do with
these drawings is push them through an
auto-encoder. On the left, the encoder network’s job is to take
the strokes and encode them in some way so that they can be
stored as a latent vector. The job of
the decoder is to decode that into a generated sketch.
And the only point that you really need to take away from
this talk is that that latent vector is worth everything to
us. First, it’s smaller in size than
the encoded or decoded drawing. It can’t memorize everything.
Because of that, we get nice effects. For example, you might
notice if you look carefully that the cat on the left, which is actual data and has been pushed through the trained model and decoded, is not the same as the cat on the right,
right? The cat on the left has five
whiskers. The model regenerated the sketch with six because
that’s what it usually sees. Six is general. It’s normal to the
model, whereas five is hard for the model to make sense of. So with this idea of a tight, low-dimensional representation, the latent vector trained on lots of data, the goal is that the model learns to find generalities in a drawing, to learn general strategies.
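A much-simplified sketch of that encoder, bottleneck, and decoder idea, with illustrative sizes; the real sketch-rnn model is a sequence-to-sequence VAE with a mixture-density output, so treat this deterministic toy as showing only the bottleneck.

```python
# Toy stroke autoencoder: compress a (dx, dy, pen-lifted) sequence into a small
# latent vector, then reconstruct a sequence from that vector.
import tensorflow as tf
from tensorflow.keras import layers

MAX_LEN, STROKE_DIM, LATENT_DIM = 250, 3, 128

# Encoder: stroke sequence -> latent vector z
inputs = layers.Input(shape=(MAX_LEN, STROKE_DIM))
h = layers.Bidirectional(layers.LSTM(256))(inputs)
z = layers.Dense(LATENT_DIM, name="latent")(h)

# Decoder: latent vector z -> reconstructed stroke sequence
h_dec = layers.RepeatVector(MAX_LEN)(z)
h_dec = layers.LSTM(512, return_sequences=True)(h_dec)
outputs = layers.TimeDistributed(layers.Dense(STROKE_DIM))(h_dec)

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.summary()
# autoencoder.fit(strokes, strokes, ...)  # train to reconstruct its own input
```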
So here’s an example of starting each of the four
corners with a drawing done by a human, David, the first author.
And those are encoded in the corners. Now we move linearly
around the space — not the space of the strokes, but the
space of the latent vector. If you look closely, I think you’ll
see that the movements and the changes from these faces say
from left to right, are actually quite smooth. The model has
dreamt up all of those faces in the middle. To my eye they
really do fill the space of possible drawings.
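Walking that latent space is just arithmetic on the latent vectors. A toy illustration follows; `decoder` stands in for the trained half of a model like the one sketched above, and the vectors here are random placeholders.

```python
# Linearly blend two latent codes and (with a trained model) decode each blend.
import numpy as np

rng = np.random.default_rng(0)
z_a = rng.normal(size=128)   # latent code of drawing A (one corner face)
z_b = rng.normal(size=128)   # latent code of drawing B (another corner)

for t in np.linspace(0.0, 1.0, 8):
    z = (1.0 - t) * z_a + t * z_b          # point between A and B in latent space
    # sketch = decoder.predict(z[None, :]) # decode to strokes with a trained model
    print(t, z[:3])                         # the code varies smoothly with t
```

The latent analogies Doug describes a little later, such as cat head plus pig body minus pig head, are the same arithmetic with addition and subtraction instead of blending.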
Finally, as I pointed out with the cat whiskers, the
models generalize, not memorize. It’s not interesting to memorize
a drawing. It’s more interesting to learn general strategies for
drawing. We see this with the cat. I think more interestingly —
it’s also suggestive — we see this with doing something like
taking a model that’s only seen pigs and giving it a picture of
a truck. What’s that model going to do? It’s going to find a pig
truck, because that’s all it knows about. And if that seems
silly, which I grant it is, in your own mind think about how hard it would be, for me, if
someone says draw a truck that looks like a pig.
It’s hard to make that transformation. And these models do it. Finally, by the way, I get paid
to do this. I just want to point that out as an aside. It’s kind
of nice. I said that last year. It’s still true. These latent
space analogies, another example. If you add and subtract
pen strokes, you’re not going to get far with making something
that’s recognizable. But if you have a look at these analogies,
we take the latent vector for a cat head and we add a pig body.
And we subtract the pig head. And of course it stands to
reason that you should get a cat body. We can do the same thing
in reverse. This is real data. This works. I mention it because
it shows that these models are learning some of the geometric
relations between the forms that people draw. I'm going to switch gears and move from drawing to music, and talk about a model called NSynth, which generates audio. You may have seen from the beginning of I/O that this has been put into a hardware unit. How many people have heard of NSynth Super? How many people want an NSynth Super? Good. That's possible, as you know.
For those of you that didn’t see the opening, I have a short
version of the making of the NSynth to give you
an idea of what the model is up to. Let’s roll it.
>>Here’s a flute. Here’s a snare. I guess in the
middle, this is what it sounds like. ♫ >>It does feel like what
could be a new possibility. It could generate a sound that
might inspire.>>The fun part is even though
you think you know what you’re doing, there’s some weird
interaction happening that can give you something totally
unexpected.>>Wait, why did that happen
that way? ♫
>>DOUGLAS ECK: Okay. So, what you see here — by the
way, the last person was Jesse, the main scientist on the NSynth project.
This grid that you’re seeing, this square where you can move
around the space, is exactly the same idea as we saw with those
faces. You’re moving around the latent space, able to discover sounds that
have some similarity, because they're made up of learning what makes humans — how sound works for us, in the same way that a pig truck gives us new ideas about how sound works. As you probably know, you can make these yourself, which is my favorite part about it. This is open source, on GitHub. For those of you who like to tinker, give it a shot. If not, we'll see some coming from tons of people who are building them on their own. So, I want to keep going with
music. I want to move away from audio to musical scores, musical notes, something like what you might have seen last night driving the sequencer, and talk about basically the same idea. Can we learn a latent space where we can move around what's possible
in a musical score? So what you see here is some
three-part musical thing on the top, and some one-part musical
thing on the bottom, and then finding in a latent space something that’s in between,
okay? And now, I put the faces
underneath this. What you’re looking at now is a
representation of a musical drum score where time is passing left to
right. I’m going to play this for you. It’s a little bit long. We’re going to start with a drum
beat, one measure of drums, and we’re going to end with one
measure of drums. You’re going to hear those first, A and B.
Then you’re going to hear this latent space model try to figure
out how to get from A to B. And everything in between is made up
by the model in exactly the same way that the faces in the middle
are made up by the model. So as you’re listening,
basically listen for whether it makes musical sense or not, the
intermediate drums. Let’s give it a roll.
♫ >>DOUGLAS ECK: There you
have it. (Applause)
>>DOUGLAS ECK: Moving right along, take a look at this
command. This may make sense to some of
you. We were surprised to learn that this is not the right way
to work with musicians and artists. I laughed, too. We
thought this is a great idea, guys. Paste this into Terminal.
They’re like what’s terminal? You know you’re in trouble.
We’ve moved quite a bit towards trying to build tools that
musicians can use. This is a drum machine that you can play with online built around
TensorFlow.JS. I have a short clip of this being used.
You’re going to see all the red is from you. You can play
around with it. The blue is generated by the model. Let’s
give this a roll. This one is quite a bit shorter. ♫ >>DOUGLAS ECK: So this is
available for you as a CodePen, which allows you to play around with it, and, really amazing, a huge shoutout to Tero, who did this. He grabbed one of our trained models, used TensorFlow, hacked up code, and put it on Twitter. We had no idea this was happening. And then we reached out to him on Twitter. I said, you're my hero. He said, you care about this? I'm like, of course, this is our dream, to have people playing with this technology. I
love that we’ve gotten there. Part of what I want to talk
about today — actually close with, we’ve cleaned up a lot of
the code. Tero helped. We’re able to introduce Magenta.JS,
very tightly integrated with TensorFlow, and it allows you to
grab a checkpointed model and set up a
player, and start sampling. In three lines of code you can set
up a drum machine. We have the art side as well. And we’ve seen a lot of demos
driven by this, a lot of interesting work by Googlers
and people from the outside. And I think it highly aligns
with what we’re doing. We’re working to engage with musicians
and artists, very happy to see the JavaScript stuff come along,
which seems to be the language for that. Hoping to see better
tools come, and heavy engagement with the open source community.
If you want to learn more, please visit the website. Also,
you can follow my Twitter account. I post regular updates
and try to be a connector for that. So that’s what I have for
you. Now I’d like to switch gears and go to robots, with my colleague,
Vincent Vanhoucke. Thank you very much. (Applause)
>>VINCENT VANHOUCKE: Thanks, Doug. So, my name is Vincent and I
lead the Brain robotics research team, the robotics research team at
Google. When you think about robots, you
may think about precision and control. You may think about robots, you
know, that live in factories. They’ve got one very specific
job to do and they’ve got to do it over and over again. But as you saw in the keynote
earlier, more and more robots are about
people. Right? They’re self-driving cars that are
driving in our streets, interacting with people.
They essentially now live in our world, not their world. And so they really have to adapt
and perceive the world around them, and learn how to operate in this
human-centric environment. How do we get robots to learn
instead of having to program them? This is what we’ve been
embarking on. And it turns out we can get
robots to learn. It takes a lot of robots. It takes a lot of time. But we can actually improve on
this if we teach robots how to behave collaboratively.
So this is an example of a team of robots that are learning
together how to do a very simple task like
grasping objects, right? At the beginning, they have no
idea what they’re doing. They try, and try, and try. And
sometimes they will grasp something. Every time they grasp
something, we give them a reward. And over time, they get
better and better at it. Of course, we use deep learning for
this. We basically have a network that maps those images that the robots see of the workspace in front of them to actions, to possible actions.
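A rough sketch of that mapping: a network scores candidate grasp actions given a camera image, trained on whether the grasp succeeded, which is the reward. The shapes and the random placeholder data are illustrative assumptions; the real system collected real grasp attempts across many robots.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def build_grasp_scorer():
    image = layers.Input(shape=(64, 64, 3))        # workspace camera image
    action = layers.Input(shape=(4,))              # candidate motion (x, y, z, rotation)
    x = layers.Conv2D(32, 5, strides=2, activation="relu")(image)
    x = layers.Conv2D(64, 3, strides=2, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Concatenate()([x, action])
    x = layers.Dense(64, activation="relu")(x)
    success = layers.Dense(1, activation="sigmoid")(x)   # P(grasp succeeds)
    model = tf.keras.Model([image, action], success)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

model = build_grasp_scorer()

# Placeholder experience: (image, attempted action, did it grasp something?)
images = np.random.rand(256, 64, 64, 3).astype("float32")
actions = np.random.rand(256, 4).astype("float32")
rewards = np.random.randint(0, 2, size=(256, 1)).astype("float32")
model.fit([images, actions], rewards, epochs=1, batch_size=32)

# At run time: sample candidate actions and pick the one the network scores highest.
candidates = np.random.rand(16, 4).astype("float32")
scores = model.predict([np.repeat(images[:1], 16, axis=0), candidates])
best = candidates[int(np.argmax(scores))]
```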
And this collective learning of robots enables us to get to
levels of performance that we haven’t seen before. But it
takes a lot of robots. And in fact, you know, this is Google.
We would much rather use lots of computers if we could instead of
lots of robots. And so the question becomes, could we actually use a lot of simulated
robots, virtual robots, to do this kind
of task and teach those robots to perform tasks?
And would it actually matter in the real world? Would what
they learn in simulation actually apply to real tasks?
And it turns out the key to making this work is to learn simulations
that are more and more faithful to reality. So on the right here you see
what a typical simulation of a robot would look like. This is a virtual robot trying
to grasp objects in simulation. What you see on the other side
here may look like a real robot doing the same task, but in
fact, it is completely simulated as well. We’ve learned a machine learning
model that maps those simulated images to real images, to real-looking
images that are essentially indistinguishable from what a
real robot would see in the real world. And using this kind of
data and training a simulated model to
accomplish tasks using those images, we can transfer that
information and make it work in the real world as well. So there’s lots of things we can
do with these kinds of simulated robots.
This is Rainbow Dash, our favorite little pony. And what
you see here is him taking his very first steps. Or very first hops, I should
say. He’s really good for somebody who’s just starting to
learn how to walk. And the way we accomplished this is by having a virtual Rainbow Dash
running in simulation. We train it using deep
reinforcement learning to run around in the simulator. And
then we can basically download the model that we’ve run in the
simulation onto the real robot and actually make it work in the real world as well.
There are many ways we can scale up robotics and robotic
learning in this way. One of the key ingredients turns out to be learning by itself,
self-supervision, self-learning. Here is an example: what you see at the top is somebody driving a car. And what we're trying to learn
in this instance is the 3D structure of the world, the
geometry of everything. What you see at the bottom here
is a representation of how far things are from the car. You
probably are looking at avoiding obstacles and looking at other
cars to not collide with them. And so you want to learn
about the 3D geometry based on those videos. The traditional way that you
would do this is by involving, for example, a 3D camera, or
something that gives you a sense of depth. Here we’re going to do
none of that. We’re going to simply look at
the video and learn directly from the video the 3D structure
of the world. And the way to do this is to
look at the video and try to predict the future of this
video. You can imagine that if you actually understand the 3D
geometry of the world, you can do a pretty good job at
predicting what’s going to happen next in the video.
So we’re going to use that signal that tells us how well
we’re doing at predicting the future to learn what the 3D
geometry of the world looks like. So at the end of the day,
what we end up with is yet another big
convolutional network that maps what you see at the top to what
you see at the bottom without involving any 3D camera or
anything like that. This idea of self-learning or
just learning without any supervision directly from the
data is really, really powerful. Another problem that we have
when we’re trying to teach robots how to do things is that
we have to communicate to them what we want, what we care
about, right? And the best way you can do that is by simply
showing them what you want them to perform. So here is an example of one of
my colleagues basically doing the robot dance. And the robot is just looking at him performing those tasks and trying to imitate
visually what he is doing. And what’s remarkable here is that,
you know, even though the robot, for example, doesn’t have legs,
it tries to do this crouching motion as best
it can given the degrees of freedom
that it has available. And all of this is entirely
self-supervised. The way we go about this is that
if you think about imitating somebody
else, for example, somebody pouring a glass of water, or a can of Coke, it all
relies on you being able to look at
them from a third-party view and picturing yourself doing the same thing from your
point of view, what it would look like if you did the same
thing yourself, right? So we collected some of this data that
looks like that where you have somebody looking at somebody
else do a task, and you end up with those two videos, one taken by the person doing the task and another one taken
by another person. And what we want to teach the
robots is that those two things are actually the same thing. So we’re going to use, again,
machine learning to perform this matchup. We’re going to have a
machine learning model that is going to tell us, okay, this
image on the left is actually of the same task as this image on
the right. And once we’ve learned that correspondence,
there are lots of things we can do with this. One of them is just
imitation like this. Imagine you have somebody pouring a glass of water, the robot sees
them, they try to picture themselves
doing the same task, and try as best they can to imitate what
they are doing. And so using, again, deep
reinforcement learning, we can train robots to learn those kinds of activities completely based on
visual observation without any programming of any kind. So I won’t let that robot pour
quite yet, but it’s encouraging that we can look at robots that
understand essentially what the nature, what the fundamentals of
the task is regardless of whether they’re pouring a
liquid, or they’re pouring beads, or
whatever the glasses look like or the containers. All of that
is abstracted and the robot understands deeply what the task
is about. So I’m very excited about this
whole perspective on teaching robots
how to learn instead of having to program them, right. At some point I would want to be
able to tell my Google Assistant, hey, okay, please go fold my
laundry, right. And for that to happen, we’re
going to have to rebuild the science of robotics from the
ground up. We’re going to have to base it
on understanding and machine learning, and perception, and of
course we’re going to have to do that at Google scale.
With that, I’m going to give the stage to Debbie, who is
going to talk to us about cosmology. Thank you.
(Applause)>>DEBORAH BARD: Thank you. Good afternoon, everyone. My name is Debbie Bard, and I’m
going to be talking about something a little bit
different. So I lead the data science engagement group at
NERSC. And NERSC is the National Energy Research Scientific
Computing Center. We’re a supercomputing center at
Lawrence Berkeley National Lab just over the bay from here. We
are the mission computing center for the Department of Energy
Office of Science. We have something like 7,000 scientists
using our supercomputers to work on some of the biggest questions
in science today. And what I think is really cool
as well is that I get to work with some of the most powerful computers on
the planet. One of the things we’re noticing is we’ve seen
that scientists are increasingly turning to deep learning and
machine learning methods to solve big questions that they’re
working on. We’re seeing these questions showing up in our
workload on our supercomputers. So I want to focus on one
particular topic area that's very close to my heart, which is cosmology, because I'm a cosmologist by training.
I’ve always been interested in the nature of the universe.
Perhaps one of the most basic questions you can ask is, what is it made
of? These days, we have a fairly
good feel for how much dark energy there is in the universe,
how much dark matter, how much regular matter there is in the
universe. And there’s only about 5% of regular matter, which is
everything that you and I, and all the stars, dust, gas,
and galaxies out there are made of regular matter. That makes up
a pretty tiny proportion of the content of the universe.
The thing that I find really interesting is we just don’t
know what the rest of it is. Dark matter, we don’t know what
that’s made of. But we see indirectly the
gravitational effect it has. Dark energy, we don’t know what
that is. It was recently discovered. Dark energy is a
name we give to an observation which is the accelerated
expansion of the universe. And this is, I think, really
exciting. The fact that there is so much that we have yet to
discover means that there are tremendous possibilities for new ways for us to understand our
universe. And we are building bigger and better telescopes. We're collecting data all the time, taking images and observations
of the sky to get more data to help us understand this, because we only
have one universe to observe. We need to collect as much data as
we can and extract all the information we can from our data, from our
observations. And cosmologists are turning to deep learning to
extract meaning from our data. I’m going to talk about a couple
of different ways we’re doing that. First of all, I want to
ground this in the background of how we
actually do experiments in cosmology, because cosmology is
not an experimental science in the way that many of the
physical sciences are. There’s not a lot we can do to
experiment with the universe. We can’t really do much to change
the nature of space time, although it would be fun if we
could. Instead we have to run simulations. We run simulations on supercomputers of theoretical universes, with different physical models and the different parameters that control those physical models. And that's how we experiment. We
run the simulated universes and compare the outputs of the
simulations to our observations of the real universe around us.
So when we make this comparison, we’re typically
using some statistical measure, some kind of reduced statistic
like the power spectrum, which is illustrated in this animation
here. The power spectrum is a measure
of how matter is distributed throughout the universe, whether
it’s, kind of, distributed fairly evenly throughout space
or whether it’s clustered on small scales. And this is illustrated in the
images on the top of the slide, snapshots of a simulated universe run in the
supercomputer. And you can see that over time,
gravity is pulling matter together. And so that’s dark
matter and regular matter. Gravity is acting upon that,
collapsing the matter into very typical
cluster type structures, whereas dark energy is expanding space
itself, expanding the volume of this miniature universe. And so by looking at the
distribution of matter, we can start to learn something about
the nature of the matter itself, how gravity is acting on
that, and what dark energy is doing.
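A minimal sketch of the statistic being described: the angle-averaged power spectrum of a 2D density field, computed with an FFT and binned by the magnitude of the wavevector. The random fields below are stand-ins for simulated mass maps.

```python
import numpy as np

def power_spectrum(field, nbins=32):
    """Angle-averaged power spectrum of a square 2D field."""
    n = field.shape[0]
    fourier = np.fft.fftn(field - field.mean())
    power = np.abs(fourier) ** 2 / field.size
    k = np.fft.fftfreq(n)                       # wavenumbers along one axis
    kx, ky = np.meshgrid(k, k, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2).ravel()
    bins = np.linspace(0, kmag.max(), nbins + 1)
    which = np.digitize(kmag, bins)
    return np.array([power.ravel()[which == i].mean() for i in range(1, nbins + 1)])

field_a = np.random.lognormal(size=(256, 256))   # stand-in for one simulated mass map
field_b = np.random.lognormal(size=(256, 256))   # stand-in for another map to compare against
print(power_spectrum(field_a)[:5])
print(power_spectrum(field_b)[:5])
# Checking that spectra like these agree is how a generated or emulated map
# can be compared quantitatively to a full simulation.
```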
But as you can imagine, running these kinds of simulations is very computationally expensive. Even if you're only simulating a tiny universe, it still requires a tremendous amount of compute
power. And we spend billions of compute hours on supercomputers
around the world on these kinds of simulations,
including the supercomputers that I work with. And one of the ways that we’re
using deep learning is to reduce the
need for such expensive simulations, similar to what the previous speaker was talking about in robotics.
We’re exploring using generative networks to produce, in this
case, this example, two-dimensional maps of
the universe. These are maps of the mass concentration of the
universe. You can imagine the three dimensional volume
collapsed into a two-dimensional projection of
the mass density in the universe as you’re looking out at the
sky. And we use a fairly standard DC topology to produce new maps
based on simulations. So this is an augmentation. We’re using this network to
augment an existing simulation. And we see that it’s doing a
pretty good job. So just by looking by eye at the generated
images, they look pretty similar to the real simulated images. As
a scientist, squinting at something and saying that looks
about right is not good enough. What I want is to be able to quantify this: how the network is working and how similar our generated images are to the real ones.
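A minimal sketch of the kind of generator being described: a DCGAN-style stack of transposed convolutions that turns a random latent vector into a 2D "mass map". The sizes are illustrative assumptions, and the discriminator and training loop of the full GAN are omitted.

```python
import tensorflow as tf
from tensorflow.keras import layers

LATENT_DIM = 64

generator = tf.keras.Sequential([
    layers.Input(shape=(LATENT_DIM,)),
    layers.Dense(16 * 16 * 128, activation="relu"),
    layers.Reshape((16, 16, 128)),
    layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(32, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(1, 4, strides=2, padding="same", activation="relu"),  # 128x128 map
])

noise = tf.random.normal([4, LATENT_DIM])
fake_maps = generator(noise)          # shape (4, 128, 128, 1)
print(fake_maps.shape)
# Evaluation: compute summary statistics (e.g. the power spectrum sketched
# earlier) on these maps and on real simulated maps, and check that they agree.
```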
And this is where scientific data has a real advantage
compared to natural image data, because scientific
data usually, very often, has associated statistics with it.
So statistics that you can use to evaluate the success of your
model. So in this case, we were looking at reduced statistics
that describe the patterns in the maps, like the
power spectra and other measures of the topology of the maps. We
see that not only do the maps look about right, but the
statistics that are contained in those maps match those from the
real simulations. So we can quantify the accuracy
of our network. And this is something that potentially could
be useful for the wider deep learning community, using
scientific data that has statistics could be of real
interest to deep learning practitioners in trying to
quantify how well your networks are working. So I mentioned
before that this is an augmentation that we’ve been
working on so far. It can produce new maps based on a physics model that it’s already
seen. We’re working at producing physics models that the network
has never seen, making this into a true emulator.
This will help reduce the need for these expensive simulations and
allow cosmologists to explore space more freely. I’d like to
explore a little bit further what this network is actually
learning. I saw an interesting talk this morning here touching on this kind of
thing, how we can use machine learning to gain insight into
the nature of the data that we’re working with. So in the
work that I’m showing here, we were looking at which structures
in our mass maps are contributing to the model, most
strongly contributing, by looking at a quantity called
saliency. And so if you look at the saliency map, in black and white, you can see the peaks in the saliency map correspond to peaks in the mass map, which are concentrations of matter, and these correspond to
galaxy clusters. And this isn’t news to
cosmologists. We’ve known for decades that
galaxy clusters are a good way of exploring cosmology. The
shapes of the features that this network has learned are not
round balls, they are irregular and they’re showing some
structure. And this is really interesting
to me. And there’s also indications that some of the
smaller mass concentrations are showing up as important features
in this network. And that’s a little bit
unexpected. So by taking this kind of introspection into the
features that our network is learning we can start to learn
something about the data and get insight into some of the
physical processes that are going on in our data, and learn what
kind of structures are most sensitive to the interplay of
gravity and dark energy. This is something that’s a real
strong point of deep learning, when you are allowing the
network to learn features for itself rather than imposing
features, doing feature engineering or telling it any
particular statistics. You can allow the network to tell you
something about your data that might surprise you. So far, that was looking at two-dimensional maps, but of course the universe is not a two-dimensional place. It's at least four dimensions, perhaps many more dimensions depending on your favorite model of string theory. But we've been looking at three dimensions and scaling this up. The reason three dimensions are interesting for us is that in a three-dimensional data volume you're looking at large matrices and convolutions; it's computationally expensive, and it can run really well on a supercomputing architecture. A team recently demonstrated for the first time
that deep learning can be used to determine the physical model
of the universe from
three-dimensional simulations of the full matter distribution.
This is the full matter distribution rather than a two-dimensional projection of the matter density. And this work showed
that the network was able to make significantly better
estimates of the parameters that describe the physics of this
simulated universe compared to traditional methods, where you
might be looking at one of these statistics like the power
spectrum. And so this is a really nice example of how the network was able to learn what structures in this data were important, rather than just looking at statistics that we thought in advance were going to be useful.
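A small sketch of that 3D approach: a stack of 3D convolutions takes a simulated matter density volume and regresses the physical parameters that generated it. The volume size, the two-parameter output, and the random placeholder training set are illustrative assumptions; the real networks and volumes are far larger.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(64, 64, 64, 1)),            # 3D density volume
    layers.Conv3D(16, 3, strides=2, activation="relu"),
    layers.Conv3D(32, 3, strides=2, activation="relu"),
    layers.Conv3D(64, 3, strides=2, activation="relu"),
    layers.GlobalAveragePooling3D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2),                                # e.g. two cosmological parameters
])
model.compile(optimizer="adam", loss="mse")

# Placeholder training set: volumes paired with the parameters used to make them.
volumes = np.random.rand(8, 64, 64, 64, 1).astype("float32")
params = np.random.rand(8, 2).astype("float32")
model.fit(volumes, params, epochs=1, batch_size=2)
```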
So we’re working on scaling this up in collaboration with NERSC, UC Berkeley, Intel, and Cray. We’re using larger simulation
volumes, more data, and TensorFlow running on those of CPU nodes achieving
several petaflocks of data. We’re able to predict more
physical parameters with even greater
accuracy by scaling up the training. This is something
we’re really excited about. I think it’s worth talking a
little bit more in technical detail about how we achieved
this performance, how we are using TensorFlow to get this
kind of performance and insight into our data and our science.
Now, supercomputers are fairly specialized. We have specialized hardware to
allow the tens of thousands of compute nodes we have on these
computers to act together as one compute machine. We want to use
this machine as efficiently as possible to train our network.
We have a lot of performance available to us. We want to be
able to take advantage of that when we’re running TensorFlow.
The approach we take is using a fully synchronous data parallel
approach where each node is training on a subset of the
data. And we started off as many
people do, using gRPC for this, where each compute node is communicating with a parameter server to send its parameter updates and have them sent back and forth. But like many other
people have noted, this is not a very efficient way to run at
scale. We found that if we were running beyond a hundred nodes
or so, then we had a real communication bottleneck between
the compute nodes and the parameter servers. So instead, we use MPI, the Message Passing Interface, to allow our compute nodes to communicate with each other directly, removing the need for
parameter servers. And this also has the advantage that you can
really take advantage of our high-speed interconnect, the
specialized hardware that connects our
compute nodes. So for gradient aggregation, we use a specialized MPI collective, which is designed by Cray, our partner on our supercomputers. And this MPI collective is pretty neat. It's able to avoid imbalances in
the node performance, the straggler effect that some of
you might run into. It’s overlapping communication and
compute in a way that allows very effective scaling and we’ve
seen we’re able to run TensorFlow on thousands of nodes
with little drop in efficiency. Something I've been excited to see here is that MPI allreduce is coming soon in TensorFlow. And we're excited to see how this is going to work in
the larger community. So the three things I’d like you
to take away from this talk, first, cosmology has cool
science problems and cool deep learning problems. The second is
that scientific data is different from natural image
data. The statistics that we often have associated with
scientific data could be of real use in the deep learning
community. And the third thing is that MPI all reduce is the
optimal strategy for scaling TensorFlow up to multiple nodes,
and we’re looking forward to seeing how the rest of the
community is going to work with this. So now I’ll turn things back to
Laurence. Thank you. (Applause)
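A minimal sketch of the synchronous, allreduce-based data parallelism Debbie described, using Horovod as a stand-in for the Cray MPI plugin: every worker trains on its own shard of data and gradients are averaged with MPI allreduce instead of going through parameter servers. The model and data are placeholders.

```python
# Launch with e.g.:  mpirun -np 4 python train.py
import numpy as np
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()                                          # one process per node/rank

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(2),
])

# Wrap the optimizer: it performs the allreduce on gradients each step.
opt = hvd.DistributedOptimizer(tf.keras.optimizers.Adam(1e-3 * hvd.size()))
model.compile(optimizer=opt, loss="mse")

# Each rank reads a different shard of the training data.
x = np.random.rand(1024, 128).astype("float32")[hvd.rank()::hvd.size()]
y = np.random.rand(1024, 2).astype("float32")[hvd.rank()::hvd.size()]

callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]  # start all ranks in sync
model.fit(x, y, epochs=1, batch_size=32, callbacks=callbacks,
          verbose=1 if hvd.rank() == 0 else 0)
```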
>>LAURENCE MORONEY: Thank you, Debbie. Great stuff.
Actually simulating universes. So, we’re running very short on
time, so I just want to share, these are just three great
stories. There are countless more out there. This is a map
that I created of people who starred TensorFlow on GitHub and shared their location. We have people from Australia to Ireland, from the Arctic Circle in Norway all the way down to Deception Island. There
are countless stories being created and new things being
done with TensorFlow and machine learning.
If some of those stories are yours, we’d love to share them.
So with that, I just want to say thank you very much for
attending today. Enjoy what’s left of I/O, and have a safe
journey home. Thank you. (Applause) >>Thank you for joining this
session. Brand ambassadors will assist
with directing you through the designated exits. We’ll be
making room for those who have registered for the next session.
If you have registered for the next session in this room, we
ask that you please clear the room and return
via the registration line outside. Thank you. ♫
Android Fireside Chat
(Cheering and applause)>>Hello, and welcome to the
Android fireside chat. We have a lot of content. We’re going to
get through that first and see if we can get to questions next.
If we could get to the next slide. I guess we’re ready for
some questions. So we do have one question to start off, if we
could go to that one. Someone asked online on the Twitter
thread why is it called a fireside chat if there’s no
fire. I thought this was a pretty good question, so the answer to that
would be . . . (Laughter)
(Applause)>>MODERATOR: All right. So warm
yourselves by the fire. We’re going to ask some questions.
You’re going to ask some questions. We’re going to try to
answer those questions. So, we pre-rolled some questions
online. It was interesting to see what people were thinking.
We have some of those. We’ll interweave them with questions
from the audience. I will show you how that works in a moment. First, I'm Chet Haase, Android toolkit team.
>>DAVE BURKE: Dave, most things.>>STEPHANIE SAAD: I work on the developer experience.
>>I work on Kotlin.>>I manage the framework team.>>I’m Xavier, I work on the
tools team.>>CHET HAASE: We have a bunch
of other people from the engineering team to my stage right here. So if none of the people up here can answer it, maybe one of the people over there can. We have a couple
of questions to kick it off, and then we’ll kick it to the
audience. First of all, what’s an exciting feature that you’re
working on for the next release? Romain, do you want to take that
one?>>ROMAIN GUY: No.
>>CHET HAASE: That’s a really good example. We do not
answer questions about future development, kind of a policy
thing that we have. I would say 90% of the questions
I got on the Twitter thread were about future development. That
makes it easy. We don’t have to answer those.
>>This is going really well. >>CHET HAASE: We’ll be done
in the next two minutes. Let’s ask a more real question. I’ll
toss this to Romain. Will Android architecture components be a de facto architecture choice with continuous support, like continuous builds and support?>>ROMAIN GUY: Yes. More seriously, we listened, you
were asking us to have an opinion about architecture. For
many years we did not have one. You can do whatever you want.
But we have a solution that’s our solution. We’re going to
support it. We have tooling. We have more libraries,
documentation, tons of talks. If you like it or don’t know what
architecture you want to use, go with that one.
>>CHET HAASE: There you go. That’s an example. It was a real
question and kind of a real answer as well. I would invite
everybody in the audience to ask a question. There are two microphones. We
were curious how long it would take people to get to them. It’s going to be kind of a
Hunger Games thing. Come to the mic, ask your question. If
people in the audience don’t ask questions, I have some here. I
see someone walking up. Go ahead.
>>AUDIENCE: The mic is too tall for me. So, following up the question
about architecture components, I remember Dianne posting online saying that, like, you know, look, we gave you activities. We gave you content providers, we gave you all these things. That's the agnostic platform,
you can do whatever you want with it.
We’re not going to impose any choices on you. I’m sure there
are other people on the Android team who agreed with her. I
wonder what Dianne and those other people might think about
architecture components. Is this like a perversion of their
vision of Android? (Laughter)
>>All wrong. (Laughter)
>>AUDIENCE: Or is this something that like, I
understand why some people might need it, but if you want to do
things weirdly, you can. Was there a debate about
architecture components?>>DIANNE HACKBORN: Basically,
that all holds true as far as the core platform is concerned. You can think of — when I say
the framework team, it's the core platform. This is stuff that's shipped on the device that, like, defines what you
can do with the device. And we want to keep that still
generic, not enforcing any particular
model. We want to have flexibility for applications to
do new things and all that kind of stuff. The architecture components are basically a layer on top, and it's very different because, being in the support library, it can evolve rapidly with the applications. It can change and break because
it’s not going to break applications when those changes
happen. So we’re basically putting a new layer on top of
the platform that gives our very strong position on how we
think applications should be developed, which
