Livestream Day 3: Stage 1 (Google I/O ’18)

October 12, 2019


In-app navigation: Principles and recommendations
Navigation: Principles and Recommendations. At this time please find your seat. Our session will begin soon.
>>Good morning. Thanks for getting up early to be here with us, despite the fact that some of our thunder may have been stolen by the three or four presentations before this that demoed the navigation editor. We have time to go into a little more detail than any of the previous presentations, so there's a lot of good material here. I'm Lukas Bergstrom, and with me I have Ian and Sergey, who built the navigation component. Navigation, if you think about it, is a problem that pretty much every app on Android has to solve, but until now we haven't really given you anything to do that with other than startActivity, which for various reasons is not the best way to go about it. So if you think about our job of
making Android development easier, navigation was a common problem with no
real solution and that means that there are a lot of things
that you have to solve on your own. And those range from how to
commit fragment transactions, hopefully without throwing an exception, how to
test that navigation is happening correctly and that the right things are happening when navigation occurs, how to map deep links to various places in your app and how to keep those deep link schemes up to date as your app's navigational structure changes. Passing arguments from place to place: again, we don't give you a type-safe way to do that today. How to make sure that Up and Back take users to the right places, particularly in more difficult situations like someone deep linking deep into your app's navigation hierarchy. And so what this means is, by the time you finish solving these problems, you typically have gone one of two directions: you've either written 60% of a navigation framework, or you've got a lot of error-prone boilerplate everywhere navigation needs to happen in your app. You have a bunch of parallel logic in your code solving these problems and the whole structure is very brittle. And individually these problems are pretty tractable, but if you look at a real-world example you
can see that they can get pretty hairy. So say that I have an item screen in my app and it's accessible via deep link, but if someone navigated to that screen after opening the app from the home screen, they would have a couple of other screens on the back stack. So hitting Up, we don't want to take them out of the app from that screen; we want them to go to the category screen and then the home screen. If someone deep links into the app, we need to synthesize those screens and add them to the up stack before showing the screen. It's when you're in the middle of writing the code to do this, to synthesize these screens and add them to the up and back stack but only on a deep link, that's when you start to feel like maybe you're solving a failure of the framework. So it's to solve problems like that that we are launching navigation. What we are giving you is a visual tool that lets you edit the navigation graph of your app, which is represented in XML. And that lets you define a set of available navigation actions, arguments you can pass from place to place, things like visual transitions, and a single navigate call activates all that at run time. And so, last but not least, that means one thing you never have to worry about again is touching a fragment transaction with your bare hands. [Applause].
So the navigation graph is essentially just a set of what are the
possible destinations people can reach in my app and those
usually correspond to screens but not always. So you can have the navigation controller just changing the
contents of a smaller part of the screen,
but it’s a set of navigation destinations and the actions
that link them. And the actions really represent how you can get
from place to place. So to actually navigate from
point A to point B in an app, you’re going to call the correct
navigation action at run time. So let’s look at what this looks
like and let’s switch over to the
demo. Okay. So what we see here is a set of
navigation destinations and these are fragment destinations
although other options are possible. And the lines with arrowheads connecting them are actions. And those actions actually
generate methods you can call at run
time. This whole thing is backed by
XML as you all know and love. And the XML and the navigation
editor have the same set of capabilities so you can use
either one. We are going to add a screen
here. And what you’re seeing is a set of available activities
and fragments in my app. Okay. So we just added an option for
them to actually answer this question successfully and win
the game so now we are going to add an action to
navigate to that screen. Great. And so that navigation
action has a bunch of options that you can set, obviously. We
are going to talk about those a little bit more. For now we are
going to do one thing. We are going to say if they have gotten
to this congratulations screen, that means the game is over. So
hitting back shouldn’t take them back into a game that no longer
exists. So what that means is we want to
on that action say let’s pop to the
match screen and that means I’m going to pop off everything on
the back stack in between this destination and the
match screen. So when the user gets to the congratulations
screen, when they hit back, they’re just going to go
straight to the match screen. So a lot of other options I can
set but I’ll just talk about that for now.
Let’s go back and look at the congratulations screen again. One other thing to mention, the
key thing that is set here is the fragment class. And that’s what is actually
instantiated at run time. And we see layout previews here because the navigation editor knows what layout is associated with that destination; and because it knows, I can double-click on that fragment destination to get here into the layout editor. Great. And everything that I've
just done here, adding this new destination to the navigation graph, adding an action, changing what the pop behavior of that action is, all this stuff I can also do programmatically at run time. So navigation editor, XML or programmatic all work just fine. Great. And now I'm going to hand this off to Ian to walk you through some more detail.
>>IAN LAKE: So the first question that might come to mind is wow this
is a totally new way of structuring
the UI and it brings up this question immediately so what is my
activity actually meant to do? Every app has a different starting point from where you are right now, or maybe as of two days ago, on how you structure the UI of your app. Some may be very activity heavy, some fragment heavy, some very much a different system. All of those are certainly valid places to be, but we are moving towards a model where the activity is more just an entry point into your app; rather than the activity being the owner of the content of your app, it's actually just what is going to store that global state. So the global navigation, navigation drawers, if you're still using an action bar, those are things that activities manage, but the activity delegates to what we call a nav host for the content. So, our nice super simple
activity here. We have an action bar on the top, a bottom nav on the bottom. This is our nav host. In the world of navigation, when you navigate between different destinations in your app, we are replacing everything inside that box, so you have that global navigation outside of that, and we have hooks so that you can hook that up to stay in sync with the actual nav host. What does a super simple version of this look like? If you are using fragments as destinations, you'll want to use our NavHostFragment. You do this by including the navigation-fragment dependency. If you're using a totally different kind of destination, it will probably also have a different kind of nav host that you add here. But for NavHostFragment we set up a few convenience methods: the ability to set what navigation graph you're using in XML, so you don't need to do anything programmatically to set it up. It will just go to the start destination of your graph by default. For fragments we can hook this up to the system back button, so we offer a method to do exactly that. So you don't have to override onBackPressed specifically for navigation; we will hook up all that stuff using the magic of fragments through this defaultNavHost option. That means that our activity
actually gets to be two lines of code. It would be two lines but it
doesn’t fit horizontally on the slide. But all we need to do is just
inflate our layout, set content view and
hook up the up button. Because you told us about this structure of your graph, in this case we are using a Kotlin extension method that allows us to find the navigation controller by passing in the ID. And that gives us access to navigateUp. And navigateUp is going to do the right thing based on your navigation graph. You don't need to do a lot of extra work here; by giving us your graph, we are able to do this for you.
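As a rough sketch, that activity could look like this in Kotlin; the layout and NavHostFragment IDs here (R.layout.activity_main, R.id.nav_host_fragment) are placeholder names, not anything from the talk:

import android.os.Bundle
import androidx.appcompat.app.AppCompatActivity
import androidx.navigation.findNavController

class MainActivity : AppCompatActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        // Inflate the layout that hosts the NavHostFragment.
        setContentView(R.layout.activity_main)
    }

    // Delegate the Up button to the navigation graph.
    override fun onSupportNavigateUp() =
        findNavController(R.id.nav_host_fragment).navigateUp()
}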
But for most apps, just having a single thing is not actually what you have; maybe you have something a little bit more. So we have kind of set up a
second dependency, navigation-ui which is really just a set of
static methods that connect your navigation component with some
of the material design components. Some of the things like bottom
nav and things that are very much in
that global navigation space. But of course it's 2018, so we have a KTX version that changes those static methods into extension methods, so it's really easy to integrate into your app and have navigation feel like something that belongs there. So what does this look like? If we make our activity a little bit more complicated, add a Toolbar on the top and a bottom navigation view, we still have the same app menu that you had before on the bottom navigation view, but hooking those things up takes two parts. One, your menu here is actually going to be using the same IDs that you've had on each destination. Each destination has a unique ID and we can use those same IDs on
your menu items. So that builds an implicit link: if you click the home menu item, you're going to go to the home destination of your app. In code we are going to set up our action bar using our Toolbar. And we can do the same findNavController to get access to the nav controller object. And then we just have a Kotlin extension for Activity that allows you to call setupActionBarWithNavController. This does quite a bit of magic. What it's doing is receiving events when you navigate in your nav controller and using the labels that you set up in your navigation graph to update the title of your action bar. We also have another helper method if you're using a drawer layout to automatically change it from a hamburger button into a back arrow based on what destination you're on. Really, these are helpful patterns to make sure those things stay in sync. Similarly, for the bottom nav you call setup and we do the two-way syncing here. As you click on things in the bottom nav it will change the graph and do the correct transition based on the material guidelines, and as you navigate around your app, if you have separate buttons, it will update the selected item in the bottom navigation.
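A minimal sketch of that wiring with the navigation-ui-ktx extensions; the view IDs are placeholders and the package names reflect current AndroidX and Material releases rather than the exact artifacts shown at I/O:

import android.os.Bundle
import androidx.appcompat.app.AppCompatActivity
import androidx.navigation.findNavController
import androidx.navigation.ui.setupActionBarWithNavController
import androidx.navigation.ui.setupWithNavController
import com.google.android.material.bottomnavigation.BottomNavigationView

class MainActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        setSupportActionBar(findViewById(R.id.toolbar))

        val navController = findNavController(R.id.nav_host_fragment)
        // Keeps the action bar title in sync with destination labels.
        setupActionBarWithNavController(navController)
        // Two-way sync between the bottom nav menu items and the graph.
        findViewById<BottomNavigationView>(R.id.bottom_nav)
            .setupWithNavController(navController)
    }
}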
So this gives us a lot of power, but not everyone is using prebuilt components that another team built. You have your own custom UI. So we have to go deeper into what nav controller actually gives us. You have a button, you want
it to go somewhere. We have the convenience method, createNavigateOnClickListener. You give it the ID of where you want it to go or what action you want it to trigger, and we will do all the work for you. This is perhaps a little too magical, so you can unroll it just a little bit. In this case we are using another extension method on View. So from any view inside the nav host, you can actually get a reference to your nav controller just by calling findNavController, as you might expect, and use that nav controller to call navigate and just navigate to an ID of a destination or an action in your graph. That's it.
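For illustration, both forms in Kotlin; the button variable and the action ID are assumed names for whatever exists in your layout and graph:

import androidx.navigation.Navigation
import androidx.navigation.findNavController

// The fully "magical" version: a ready-made click listener for an action.
button.setOnClickListener(
    Navigation.createNavigateOnClickListener(R.id.action_home_to_game))

// Unrolled slightly: find the controller from any view inside the nav host.
button.setOnClickListener { view ->
    view.findNavController().navigate(R.id.action_home_to_game)
}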
Under the covers, this navigate call is actually doing a lot of work. The nav controller is talking to what we call a Navigator. So for fragments we are talking
to a fragment Navigator. It’s going to build all of the
fragment transactions for you. It’s going to do all the things
you told us to do by putting that information in your
navigation graph. So if you have set a popUpTo, it's going to do all that stuff, transitions, all of that in this one line of code. All of it can be done either programmatically or as something that you determine ahead of time as part
of your navigation graph. But for a lot of these kinds of places it's not actually just a navigate; you have some information to pass to the next screen. So for this we need to pass a bundle of information. Here we are passing a string and an int, using our nice helpful bundleOf from Android KTX, and it works.
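Something along these lines, where the keys, values and action ID are made-up examples and view is any view inside the nav host:

import androidx.core.os.bundleOf
import androidx.navigation.findNavController

view.findNavController().navigate(
    R.id.action_home_to_game,
    // Stringly-typed keys: nothing checks these at compile time.
    bundleOf("screenName" to "match", "category" to 2))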
It's fine. This is really useful for a lot of things, but at the same time it's not very type safe. If you mistype a key here, what are you going to do? We want to make this a lot easier, so we will talk about what we did here.
>>We built something called the
Safe Args Gradle plugin. First, let's see what it's trying to solve. Let's go back to our sample. Our fragment where we tried to navigate actually requires us to pass the screen name argument, plus a category which has integer type. Let's go back to the calling site. Well, in our slides we did everything correctly, we passed the screen name, but actually you can forget to pass it, and that results in a run time exception. It's super easy to fix, but it's still annoying. Okay. Now we have this navigation graph; let's put all our navigation there, including the arguments for destinations. Let's see how it looks in XML. It's super simple: we just specify your argument name and its type. And this allows us to build tooling so that once you have an action that leads to this fragment or activity, we can check and make sure you pass the proper arguments. So let's take a look at how it looks. Now we use this special object: instead of passing an ID and a bundle, of course internally this object incorporates the same ID and same arguments, but to get this object we use the HomeFragmentDirections class, which is generated for you. It's just the fragment name plus "Directions". It has static methods for the actions from this destination, and those static methods may have required arguments. In our case it makes us pass the required argument there, and later you can set your other additional arguments. And then everything is type safe. And after that it's super simple: on the receiving side you have the Args class; for our case it's the destination fragment's name plus "Args", and it has all of your arguments that you defined, in a type-safe manner.
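As a hedged sketch of what the generated code gives you: HomeFragmentDirections, GameFragmentArgs and the argument names below are hypothetical, the real names are derived from your own fragments, actions and arguments, and the receiving side is assumed to run inside the destination fragment:

// Sending side: required arguments become method parameters, so
// forgetting one is a compile error instead of a runtime crash.
val directions = HomeFragmentDirections.actionHomeToGame("match", 2)
view.findNavController().navigate(directions)

// Receiving side: the generated Args class reads the same values
// back out of the arguments Bundle in a type-safe way.
val args = GameFragmentArgs.fromBundle(requireArguments())
val screenName: String = args.screenName
val category: Int = args.category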
And for this small improvement, we go to a bigger one, which is deep links.
>>IAN LAKE: Yes. Deep links
are traditionally something that Android has supported for a long
time. You can take over a web URL, and deep linking is also super useful for notifications and things like that to link back into your app. But it gets a lot more complicated as you get a more complicated app: how you structure these things, and how you say, all right, I need to build a notification, what is all of the code that is needed to actually get into the correct place in my app and pass the right kind of information here? So for navigation we really made deep linking a first class citizen in our structure. So there's really two kinds of
deep links. The explicit kind: these are the things like notifications, app shortcuts, app widgets, and the new app actions and slices. Things that you create and that are usually PendingIntent based. These are things you're passing
to another app or system to say I
want to go to this specific place in my app. The implicit
side of things are more around the web URLs and custom
scheme URLs. These are the other apps triggering your app
to launch. We handle both of these for navigation.
For explicit deep links we have NavDeepLinkBuilder, and its sole goal in life is to deep link to a specific destination in your navigation graph by its ID. It's easy to say, but a little bit harder to make sure it all works in the system. But if we create a NavDeepLinkBuilder, you create it with a context, give it your graph, your destination, and any arguments you have, and then you can just call createPendingIntent. We are doing all the work to create it within your graph and your parent activities as well, and we are going to pass that along and create the correct intent that gets you to the right place when you trigger this intent. And then you just pass it through to your notification. You don't actually need to do more than this to get all the correct behavior.
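A rough Kotlin sketch of that flow; the graph, destination, argument and notification channel names are placeholders:

import androidx.core.app.NotificationCompat
import androidx.core.os.bundleOf
import androidx.navigation.NavDeepLinkBuilder

val pendingIntent = NavDeepLinkBuilder(context)
    .setGraph(R.navigation.nav_graph)
    .setDestination(R.id.item_screen)
    .setArguments(bundleOf("itemId" to 42))
    .createPendingIntent()

// Hand the PendingIntent to a notification; tapping it deep links into
// the destination with the synthesized parent back stack in place.
val notification = NotificationCompat.Builder(context, "updates")
    .setSmallIcon(R.drawable.ic_notification)
    .setContentTitle("Item updated")
    .setContentIntent(pendingIntent)
    .build()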
For implicit deep links, these are again links like web URLs. So in this case, instead of it being something that you create programmatically, it's information you include in your navigation graph. So here we are adding a deep link element, just like we added the arguments and actions to our graph; these are just a deep link. And of course you can edit all of this in the visual editor as part of the properties for a destination. And it's really just as simple as adding your URI. This is a static URI, which is boring and dumb, so we of course support some wildcards. If you want to do a dot star for a wildcard, totally supported. If you want to fill in an argument for your destination, you can use curly braces, and we will parse the URL for you, extract those values and give them to you. Now you can get those directly from your URL and not have to reparse things; you already know what this is supposed to be. Similarly you can combine the
two. If you want to make more complicated patterns, you can. We also have autoVerify; we wanted to make sure that you can do the same kind of thing if you're using navigation as well. And note here that we left off the http or https; here we are doing both, we are saying http and https. You can't really control the URLs that other apps use, maybe they accidentally took the S off your URL, but we still want to support both of those, just as a convenience instead of having two lines. And of course it also works with custom schemes. So if you have your own scheme that you use specifically for
your app, you can also attach those to deep links. Now the best part is we kind of
worked across the tool space so besides
just the navigation editor we also worked with the manifest merger team.
You can add a single nav graph element to your activity in your
manifest pointing to your graph and all the deep links in that
graph will then get expanded out to be the correct intent filter.
We'll build all those for you. And if you go to the manifest merger view in Android Studio, you can see the exact lines. So this means that we now have a single source of truth in your navigation graph: you know this is not going to get out of sync with what you expect, it's not going to get out of sync when you change your argument names in the XML file. This is one central place to do things, and we think it's a lot easier for basically all of the implicit deep link kind of cases. Of course we do things like making these ACTION_VIEW URLs, as they would be for web URLs. You can see it added directly to the line, and if you have multiple of them it will tell you which line; if you have multiple graphs associated with different activities, those will all work just fine.
So one of the other subjects that is really important to all of our architecture components is testing. Testing navigation is very hard
and this is something that we are going to continue to look at
over the alpha period. We really want all of your feedback
as well. But I wanted to discuss what we think testing in
a navigation world should look like. So a lot of it is if all of the
links between your destinations are
through navigation, then it’s a lot
easier to test a destination in isolation. You can test each destination by
itself and then test just the outgoing edges or the incoming arguments
and not have to deal with oh did it actually do the right
fragment transaction? Because we could test that just at the
navigation controller level. So this is something that we are
going to spend a lot more time on and in the fragment talk
yesterday we actually talked about really trying to
make fragments themselves much more testable in isolation so
it’s kind of a package deal where we are trying to build testing into navigation
controller and also trying to build testable
destinations as well. So you might be interested in doing something right now. So if you want to test, oh, when I navigate to something, does it go to the right place? We actually have an addOnNavigatedListener; you can get a callback of, did you go to the right place when I clicked this button? This is one method we found successful in testing things completely black box from the outside. If you want to inject a nav controller or use any other method, those are also valid ways of setting things up.
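For illustration, a small Kotlin sketch of listening for navigation events. Note the method was called addOnNavigatedListener in the alpha being described; current Navigation releases expose the equivalent addOnDestinationChangedListener, which is what this assumes, and navController is whatever controller your test obtains:

import android.util.Log

navController.addOnDestinationChangedListener { _, destination, _ ->
    // In a test you could assert that clicking a button landed on the
    // destination you expected, without inspecting fragment transactions.
    Log.d("NavTest", "Navigated to ${destination.label}")
}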
So what can you play with today? You must have looked at it already, right? It is in alpha right now, 1.0.0-alpha01, giving ourselves a long runway of bug fixes and improvements here. And it really comes down to two main artifacts: navigation-fragment, which includes the navigation-runtime dependency as a transitive dependency and also the NavHostFragment and fragment Navigator that you need to use fragment destinations. And for every one of the dependencies for navigation, we have a -ktx version of them. So we really tried to make Kotlin a first class citizen, especially if you are doing programmatic graph construction; say you're reading your whole navigation graph from a server, we have a Kotlin DSL. So there's more to do, and I'll have Lukas talk
about where we go.
>>LUKAS BERGSTROM: You are going to need the Android Studio 3.2 preview, Canary 14. Please do download it and try it out; there's a lot of great stuff there. This is obviously going to become a really core part of not just architecture components but Jetpack overall. We took a blank sheet of paper: what do we want the Android developer experience to be? Creating a new project in Android Studio is going to, by default, start you up in the nav editor, and it should be a pretty great world. And Jetpack is available for you to try right now. Right now it's sort of got a much nicer introduction to what all the key pieces of Android development are, so it's a much easier on-ramp, and we are looking forward to expanding this story over time.
There are more talks for you to go to: a lot more detail on Android KTX and paging. Paging in particular is a really cool library that does a lot for you by tying together different pieces of architecture components and Jetpack. If you ever have a list with more stuff than you can hold in memory at any given time, you really want to go to this paging talk. And we want your feedback. We want your feedback on this session, first of all, but more importantly we want your feedback on the navigation component. The reason that we are launching to alpha is not because we think that this is really untested and untried; we have done a lot of prerelease testing of this. But we are launching it to alpha because we want to get a lot of feedback from the community before we lock down the API and switch over to beta. This is a great time for you to try it out; tell us what works for you, tell us what doesn't. Either communicate with us directly or on our public issue tracker. We would love to hear from you. Your feedback has been critical at every point of this journey in making sure we are attacking the right problems with the right solutions, so please do try it out and tell us what you think. Thank you.
[Applause]
>>Thank you for joining this session. Grand ambassadors will assist you, guiding you through the exits. If you've registered for the next session in this room, we ask that you clear the room and return via the registration line outside. Thank you.
What's New in Android Security

>>Welcome. Please fill in the seats near the front of the room. Thank you.
>>At this time please find your seat. Our session will begin soon.
>>Hello, and welcome to the Android P edition
of what is new in Android security. My name is Dave. In
a few minutes I’ll hand over to Xiaowen who is the lead security
product manager for the Android platform. We have a lot of
ground to cover so we will start with a brief state of the union
on Android security and jump into all the really cool things
we have been working on in Android security over the past year and launching here with Android P, including secure hardware support, integrity, and privacy. State of the union: let's talk a little bit about what the
Android security strategy looks like. There are three main
pillars. First, Google Play Protect. These are the malware and security services. The second is platform engineering. These are the core operating system defenses that we build into Android to improve security, such as SELinux, encryption, and lots of other features. The third pillar is the security
development life cycle. These are all the programs we put in place to ensure a consistent
high quality level for security across the Android ecosystem. Things like testing
infrastructure, also includes our security patching programs.
We have been working really hard on that.
A couple of things: we have been trying to make Android just easier to patch. At Google, we have a pretty steady track record of delivering those patches to market, and making Android more modular with projects like Treble really helps contribute to that. We have also worked security patching into our OEM agreements. This will really lead to a massive increase in the number of devices and users receiving regular security patches, so we are excited about that. But there are a couple of philosophical principles that underlie everything we do when
it comes to security. We believe in transparency and openness because that breeds
confidence and it breeds trust. Conversely, a closed platform and secrecy breed distrust, but there's also a really important security advantage to being open. Today's mobile devices are faced with really sophisticated attack threats. When you have billions of users it's an attractive target, so it deserves the strongest possible defense. With a closed platform, the defenders are the employees of the one that owns the platform. With Android, we have thousands of people waking up every morning thinking about how to best protect users; we have partner security teams who work closely with Google on protecting Android and its users; we have the worldwide open source Linux community contributing to Android security every day; and we have the academic research community, which prefers working on open platforms. So this is a massive force
multiplier in protection. As operating systems have
matured, the power of open has become evident
to the point where today the protective capabilities of Android are now
on par with any other global platform and I believe the power
of open will accelerate those protective capabilities for our
users going forward. The other really important philosophy that
underlies our strategy is measurability. We always look
for objective, independent measurements to help not only inform the work we do, to ensure we are investing in the right directions, but also to measure progress. So one example you
see here is the incidence of malware, or potentially harmful applications, which we call PHAs, on devices. The bottom curve is devices that install only from Play, and the top curve is devices that install from sources other than Play. You can see over time it has been reducing across all users, so we are committed to protecting users regardless of where they get their applications from. This improvement is due to many things: locking down APIs and permissions over time, constantly looking at that, and investing in the malware detection engine itself. Today 60% of malware is detected through machine learning, and that's one area of big investment for us. Over the past year we had a 50% reduction of PHAs on Play. We are happy with the progress but we are not content with where we stand today, although I would say the odds of loading a PHA from Play are about the same as being struck by lightning, so it is a safe place to live your mobile life, and we are going to continue to invest tremendously in this area. Another really important
measurement is the overall ability of the operating system
to protect itself against exploitation. In any complex product there are
going to be bugs but there’s no reason why bugs have to lead to
exploitation to harm users so we worked hard on building features
and improvements that make Android much more difficult and
expensive to exploit. How do you measure how well
you’re doing? Well, lots of people want to
purchase exploits. There’s a vibrant market for
that. As exploits get more difficult the law of supply and
demand, the prices are going to go up. We watch the pricing
over time and there’s a number of different markets you can
look at. On the left side you see the
device manufacturers' rewards programs. Google is paying out the highest rewards in the industry. Another market is the elite hacking contests. You can see on the graph on the right that the price of Android has risen to the point where, at the most recent event a few months ago, the pricing for Android is on par with other platforms. If you haven't seen the results, Android performed quite well in that event. Another market is the gray market: independent researchers and brokers who will sell exploits to the highest bidder. This is a little harder to track, but we have connections to a lot of researchers, and the price of exploitation on Android is now as high or higher than any other platform.
So this is really great. We are happy with progress but we continue to invest in all these
areas. Now let’s switch gears and talk about some of the new
emerging features in Android P starting with Android protected
confirmation. So the problem here is, in today's mobile world we use mobile devices much more than we ever did before, but there's still a ceiling of trust that we haven't quite broken through. We don't vote for prime minister or president from our phones. We don't program critical medical devices from our phones. I'll talk about a few use cases, medical, financial and enterprise, but the key innovation is that protected confirmation is the first time that we have the ability to execute a high assurance transaction, a user transaction, completely within secure hardware, running in a trusted execution environment, or TEE, that runs separate from the main operating system. So how does it work? So an
application developer, say you’re a medical company that is developing a solution for people
with diabetes so you’re managing an
insulin pump. The application will enable the user to select, say, two insulin units and call the protected confirmation API to transmit that data to the secure hardware area, where a completely independent trusted user interface will execute. The interface you see here on the screen shows the two insulin units, and the user confirms it by pressing a button that is guarded in this area. And the entire transaction is signed using a key that never leaves that secure area. This provides higher assurance to the relying party that the integrity of this data was not corrupted; even if you had root level malware, it cannot corrupt the integrity of that transaction.
So in code this is really easy. We use the standard Android keystore API; we have a new method to set a flag, confirmation required. We create the confirmation dialog using the confirmation dialog API and call its method to transfer control to where the user will interact with that special screen. Really easy.
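A rough Kotlin sketch of those steps (API 28+). The key alias, prompt text and extra data are placeholders, error handling is omitted, and context is assumed to be available:

import android.security.ConfirmationCallback
import android.security.ConfirmationPrompt
import android.security.keystore.KeyGenParameterSpec
import android.security.keystore.KeyProperties
import java.security.KeyPairGenerator

// 1. Generate a key that can only sign messages the user has confirmed.
val keyGen = KeyPairGenerator.getInstance(
    KeyProperties.KEY_ALGORITHM_EC, "AndroidKeyStore")
keyGen.initialize(
    KeyGenParameterSpec.Builder("confirmation_key", KeyProperties.PURPOSE_SIGN)
        .setDigests(KeyProperties.DIGEST_SHA256)
        .setUserConfirmationRequired(true)
        .build())
keyGen.generateKeyPair()

// 2. Show the trusted confirmation UI.
ConfirmationPrompt.Builder(context)
    .setPromptText("Deliver 2 insulin units?")
    .setExtraData(byteArrayOf())
    .build()
    .presentPrompt(context.mainExecutor, object : ConfirmationCallback() {
        override fun onConfirmed(dataThatWasConfirmed: ByteArray) {
            // Sign dataThatWasConfirmed with "confirmation_key" and send
            // the signed blob to the relying party.
        }
        override fun onDismissed() { /* the user declined */ }
    })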
So we have a number of launch partners who have been working closely with us on this technology, and they have been building prototypes. Bigfoot Biomedical is a company that works on solutions for people with diabetes. The user is looking at the
glucose level deciding I want to inject
one and a half insulin units and calls the API to invoke the interface. The user confirms and only then
will the insulin pump administer that dose. On the financial side we have Royal Bank of Canada, which is working to integrate protected confirmation into their application. I don't have a video for this one, but you can track left to right: this application is doing a person-to-person financial transfer. We see we are going to send $1,500 to Ravi. The application invokes the protected confirmation API. The user confirms 1,500, and 1,500 can't be changed to 15,000. The relying party on the other end has high confidence that we intended to send Ravi $1,500, and the transaction goes through. Duo Security is a firm that is
working on strong enterprise authentication. Imagine you're logging into your Chromebook and it launches a second factor authentication to your phone. The Duo Security application comes up and asks for confirmation, but then there's a second level of confirmation using the protected confirmation API that again provides a higher level of assurance for the enterprise that it is the device, user and location that is expected for that authentication. So there are a lot of other launch partners we worked closely with on this; another is doing proximity-based authentication. I would also like to throw a shout out to Qualcomm. Protected confirmation requires a deep integration at the hardware level; it's optional for P, so it requires a supported Android P device. We are breaking through that last ceiling of assurance in mobility, so it's very exciting. There's a lot more to talk about, so I would like to call up Xiaowen to take us through the rest of the story. (Applause).
>>XIAOWEN XIN: Thanks, Dave.
Good morning, everyone. And I’m really excited to be here to
talk about a lot more of the security and privacy features that we
built into Android P. As Dave mentioned security
hardware is a huge focus area for us because it can provide
defenses against attacks that software alone is not sufficient
to handle. Another feature that we are shipping in Android P leverages secure hardware as well. So why do we need stronger protection? Google Pay is a great example here. We are working closely with them on this P feature and they're going to launch with it later this year. Consider their security goal: in the traditional transit use case, they need to make sure that your transit card, and only your card, can be used to pay for your bus ride. Your transit card has your account information, a lot of secrets in there that represent your account. Now, the transit card is made using a secure element inside of it, so it's hard to break into it and extract secrets. Google Pay transit is working
to replace that card with your phone. We need to make sure we provide the same security guarantees, which is that your secrets cannot be extracted out of your phone and put onto another phone; to pay for your bus ride you must present your phone. A great solution here is to use secure hardware. Now, Google Pay transit is one example. Payments is another. In all of these use cases you want to make sure that your phone, and only your phone, can make that transaction. There are quite a few other examples where we benefit from stronger protection for private keys. For example, if you have high value cloud data, if you're an enterprise or financial institution, you want to make sure that all requests, all data access, is coming from a known phone or a phone that you trust, and that phone is identified by a private key. Also, if you have high value local data, let's say you're a password manager storing passwords locally on disk, then you may want to protect them again with a private key. How do we provide stronger protection for private keys? There is tamper-resistant hardware, certified by professional labs against exacting standards to be resistant to hardware tampering. Phones are starting
to incorporate that exact hardware directly into the phone so your phone can replace
your transit card or credit card. With Android P we are now
exposing APIs so all applications on Android can take
advantage of this tamper resistant hardware on compatible devices.
Specifically, we are adding a new type of keystore called StrongBox. StrongBox is built using a tamper-resistant hardware security element that has isolated CPU, RAM and secure storage. That makes it resistant to shared resource attacks; for example, many of the high profile hardware attacks we heard about recently, StrongBox is resistant to those, as well as timing attacks and physical attacks. So when we look at the keystore types that are available on Android, there are now three. On older Android devices, keystore was implemented directly in software. On newer devices it was implemented using the TEE, the trusted execution environment. Now we are providing StrongBox, which can run alongside the existing TEE-backed keystore. StrongBox is resistant to the widest variety of attacks and is well suited if you have a use case that requires strong protection for your private keys. It does require new hardware.
To use StrongBox it's fairly straightforward: when you create your keystore key, set a new flag. If the device supports StrongBox, everything succeeds and it just works. If not, you will get an unavailable exception.
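Roughly, in Kotlin (API 28+); the alias and key parameters are placeholders:

import android.security.keystore.KeyGenParameterSpec
import android.security.keystore.KeyProperties
import android.security.keystore.StrongBoxUnavailableException
import javax.crypto.KeyGenerator

val keyGenerator = KeyGenerator.getInstance(
    KeyProperties.KEY_ALGORITHM_AES, "AndroidKeyStore")
try {
    keyGenerator.init(
        KeyGenParameterSpec.Builder(
            "payment_key",
            KeyProperties.PURPOSE_ENCRYPT or KeyProperties.PURPOSE_DECRYPT)
            .setBlockModes(KeyProperties.BLOCK_MODE_GCM)
            .setEncryptionPaddings(KeyProperties.ENCRYPTION_PADDING_NONE)
            // Ask for the key to live in the tamper-resistant element.
            .setIsStrongBoxBacked(true)
            .build())
    keyGenerator.generateKey()
} catch (e: StrongBoxUnavailableException) {
    // No StrongBox on this device; fall back to the TEE-backed keystore.
}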
It uses tamper-resistant hardware, and this is the first time that we are offering a generic API to access this type of secure hardware for key management. This feature, as well as the protected confirmation API, is really pushing the boundary for security hardware support on mobile, and we are really excited about the use cases this enables. So protecting your private key
is one thing that apps need to do. Another is to make sure that the
right user is present. When you look at a typical
Android device the most likely security
incident to happen to that device is not
malware, but rather getting lost or stolen, so a lock screen is very important for Android; make sure you set a lock screen. And also, as an app developer, make sure you gate sensitive access on user authentication. So in Android P we added a few different features to help app developers do that, starting with keyguard-bound keys. Keyguard-bound keys are keystore keys that are well suited for protecting data that you store on the device. Keyguard-bound keys have their functionality tied to the keyguard, so the keys can be used to encrypt data at any time and decrypt data only when the device is unlocked. The life cycle of the keys is tied to the life cycle of the lock screen. If you have very sensitive, confidential enterprise data or private health and fitness data, you might want to encrypt it with a keyguard-bound key before you store it to disk; it's now a little bit harder for an attacker to access that sensitive data. To use a keyguard-bound key is also fairly straightforward: when you create your keystore key, set a flag to require that the device be unlocked. When you create your cipher object, you can use it for encryption at any time and for decryption only when the device is unlocked.
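One way to approximate that in Kotlin (API 28+) is an asymmetric pair whose private half requires an unlocked device; the alias and parameters are placeholders:

import android.security.keystore.KeyGenParameterSpec
import android.security.keystore.KeyProperties
import java.security.KeyPairGenerator

// The public half can encrypt at any time; the private (decryption)
// half is only usable while the device is unlocked.
val keyPairGenerator = KeyPairGenerator.getInstance(
    KeyProperties.KEY_ALGORITHM_RSA, "AndroidKeyStore")
keyPairGenerator.initialize(
    KeyGenParameterSpec.Builder(
        "health_data_key",
        KeyProperties.PURPOSE_ENCRYPT or KeyProperties.PURPOSE_DECRYPT)
        .setDigests(KeyProperties.DIGEST_SHA256)
        .setEncryptionPaddings(KeyProperties.ENCRYPTION_PADDING_RSA_OAEP)
        .setUnlockedDeviceRequired(true)
        .build())
keyPairGenerator.generateKeyPair()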
What if your device has been properly unlocked, but you want to check user authentication one more time, let's say before a sensitive action like a payment happens? This is where BiometricPrompt comes in. It's our replacement for FingerprintManager. FingerprintManager has a few limitations. One is it only works for fingerprint, and a lot of devices today are starting to support face, iris and other modalities. BiometricPrompt supports more than just fingerprint, and it will automatically pick the right modality for that user on that device. Another benefit of BiometricPrompt is that it uses standard system UI, which is really nice from a user experience perspective: the user sees standard UI when they're making security relevant decisions. It also sets us up well for future advances in sensor technology; when you have an in-display fingerprint sensor, it can tell you where to put your
finger. We know that BiometricPrompt is quite different from FingerprintManager, and so to ease the pain of migration we are also providing a support library; apps will be able to call one API in that support library and it will use BiometricPrompt. To use BiometricPrompt, create the prompt with its builder, pass the title, subtitle and other properties, then call authenticate to show the authentication prompt. We ask that you use the crypto object variant so that user authorization is properly tied to your keys.
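A minimal Kotlin sketch of the framework API (API 28+); the strings and the callback body are placeholders, the framework requires a negative button, and context is assumed available:

import android.hardware.biometrics.BiometricPrompt
import android.os.CancellationSignal

val prompt = BiometricPrompt.Builder(context)
    .setTitle("Confirm payment")
    .setSubtitle("Authenticate to continue")
    .setNegativeButton("Cancel", context.mainExecutor) { _, _ -> /* cancelled */ }
    .build()

prompt.authenticate(
    CancellationSignal(),
    context.mainExecutor,
    object : BiometricPrompt.AuthenticationCallback() {
        override fun onAuthenticationSucceeded(
            result: BiometricPrompt.AuthenticationResult) {
            // Proceed with the sensitive action; with the CryptoObject
            // variant, result.cryptoObject ties this to a keystore key.
        }
    })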
Now, what if the user is going to your website in Chrome? How do you authenticate them there? Coming later this year, in Q4, Chrome on Android will support WebAuthn, which means if the user is going to your website they can use their lock screen or biometrics to authenticate to your site. If you like to buy things on the web, PayPal has a demo running where you can use your fingerprint to authenticate to PayPal and make your purchase. It's more convenient than typing in your password every time you make a purchase. To summarize the several different methods we talked about to gate access based on authentication: first, with keyguard-bound keys you can tie data access to the life cycle of the lock screen. If the user already unlocked the screen, you can use BiometricPrompt to show system UI. And if the user is going to your website instead of your native app, you can use WebAuthn. Now that you have determined it's the right user, let's talk about integrity. A lot of apps need to ensure the integrity of their data and the integrity of the device they're running on. In Android P, to help you protect
the integrity of your data in transit we are going to require TLS by default.
The system will throw an exception if the app sends data in the clear. Using TLS should be a no-brainer, because it protects the privacy of your users and protects your content from being modified in transit, whether that's unwanted ads, tracking identifiers, or exploiting a weakness in your app. You should always encrypt. You can still opt out for specific domains by updating your network security configuration. Do visit the website on the
slide. Now before you run off to change your code we have one
more piece of good news. A lot of apps care about cryptographic compliance, because it's important in regulated industries and government. So we are really happy that, for the cryptography that is used to secure SSL traffic on Android, we recently received CAVP certificates for many approved algorithms, and this means that developers have automatic FIPS compliance. Another topic in the integrity
section is how do I make sure that the device has not been
tampered with? The device itself is still
healthy? In Android O we introduced key attestation, which allows you to get a signed statement from the secure hardware itself about the state of the device and about the properties of your private keys. It can tell you whether the device passed verified boot, whether it's running a recent security patch, and whether your private keys are protected by the TEE or StrongBox.
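For illustration, requesting an attestation in Kotlin looks roughly like this; the alias and challenge are placeholders, and the attestation record itself lives in extensions of the returned certificate chain that your server verifies:

import android.security.keystore.KeyGenParameterSpec
import android.security.keystore.KeyProperties
import java.security.KeyPairGenerator
import java.security.KeyStore

val keyPairGenerator = KeyPairGenerator.getInstance(
    KeyProperties.KEY_ALGORITHM_EC, "AndroidKeyStore")
keyPairGenerator.initialize(
    KeyGenParameterSpec.Builder("attested_key", KeyProperties.PURPOSE_SIGN)
        .setDigests(KeyProperties.DIGEST_SHA256)
        // A server-provided nonce binds the attestation to this request.
        .setAttestationChallenge("server-nonce".toByteArray())
        .build())
keyPairGenerator.generateKeyPair()

// The certificate chain carries the signed attestation statement.
val keyStore = KeyStore.getInstance("AndroidKeyStore").apply { load(null) }
val attestationChain = keyStore.getCertificateChain("attested_key")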
Another thing that key attestation will return to you is the firmware hash that the device is running. Think of this as transparency-enabled verified boot. What this means is that if you're running firmware whose hash is the same as a known good version, you are running a bit-for-bit copy of that version, and that's a powerful thing to know: that the operating system you're running is an exact copy of a known good version. If you use the SafetyNet attestation API, it will call the platform key attestation under the hood, so you will be able to take advantage of this without any changes to your code. Now, in some cases, if you want to get more information than what is returned by SafetyNet, you can still call the platform API directly. Now, last but certainly not
least, privacy. Privacy is an important area to security. We
actually talked about privacy quite a bit already when we talked
about the TLS by default feature coming in
Android. But there are a few other privacy features that are in Android P
that we want to cover now. First this is probably one of my
favorite features. Sensor access only in the
foreground. In Android P, running on an Android P device, regardless of your API level, if your app is in the background and idle, you will no longer be able to access the camera, microphone or sensors. This behavior is slightly different based on the exact characteristics of the API you're using. With the microphone API, you will get silence when you try to access the microphone from the background while idle. With the camera API, it will behave as if you were preempted by a higher priority camera client. With the sensors, whether you sample continuously or via callback, if the app is in the background you can no longer access data from them. If you need to access the camera, microphone or sensors from the background, create a foreground service with a persistent user notification. That gives users more control and transparency into which apps have access to their sensors at that time. To start a foreground service, create that notification first, then call the startForeground method and pass it that notification.
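A rough Kotlin sketch of such a foreground service; the channel ID, notification text, icon and service name are placeholders:

import android.app.Notification
import android.app.NotificationChannel
import android.app.NotificationManager
import android.app.Service
import android.content.Intent
import android.os.IBinder

class AudioCaptureService : Service() {

    override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
        val channel = NotificationChannel(
            "capture", "Recording", NotificationManager.IMPORTANCE_LOW)
        getSystemService(NotificationManager::class.java)
            .createNotificationChannel(channel)

        val notification = Notification.Builder(this, "capture")
            .setContentTitle("Recording audio")
            .setSmallIcon(R.drawable.ic_mic)
            .build()

        // The persistent notification is what allows microphone access
        // to continue while the rest of the app is in the background.
        startForeground(1, notification)
        // ... start using the microphone here ...
        return START_STICKY
    }

    override fun onBind(intent: Intent?): IBinder? = null
}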
All right. Besides restricting background access to sensors, we added a lot more control over your data. Android is the first major operating system to have DNS over TLS support built right in. Your DNS queries will be passed, encrypted, to a resolver of your choice. Now, if your default DNS provider already supports it, we will automatically encrypt your queries. We did this in collaboration with the Alphabet Jigsaw team, and we are looking forward to many new developments here. Another cool feature we added in Android P in the privacy space is lockdown mode. It's useful if you're in a situation where you may temporarily lose access to your device; let's say you need to hand it over at a security checkpoint. You can put it in lockdown mode, which is a state where only your knowledge factors can be used to unlock the device. Your fingerprint and biometrics will be disabled, and notifications will no longer show on the lock screen. You have higher assurance in the security of the lock screen when the device is temporarily out of sight. So that was a quick overview of
the features that are coming with Android P. There’s a lot
more that we didn’t have time to cover. So please do give us your
feedback and send us an e-mail.
Thank you for coming and have a great day. (Applause)
>>Thank you for joining this session. Grand ambassadors will assist you, guiding you through the exits. If you've registered for the next session in this room, we ask that you clear the room and return via the registration line outside. Thank you.

Personalize Actions for the Google Assistant

>>Welcome. Please fill in the seats near the front of the room. Thank you.
>>At this time please find your seat. Our session will begin soon.
>>Hello. Welcome to personalize actions
for the Google Assistant. Thanks for being here. I’m
Silvano Luciani. I work on the Assistant and Actions on Google developer relations team.
>>ADAM DAWES: My name is Adam Dawes. We build identity tools for developers.
>>SILVANO LUCIANI: In the next half hour we are going to walk you through some of the functionality that Google provides to personalize your action, and why it is important to personalize your action. And let's start by looking at
the Google Assistant. This is the assistive technology that is at the center of an ecosystem of more than 500 million devices, and it's there to help users get things done: things like listening to music or playing some games, getting informed by watching some news, learning new recipes and so on. Actions on Google is the platform that allows you to extend the Assistant. You can add your own actions, so you can provide help for a knowledge domain where you think you are an expert and you can provide value to the user. You can connect with users across all the devices that we have seen, and you can innovate by adding a conversational interface that can make some tasks very easy, just a straight voice
command. So thinking about this there are
two aspects where it's important to personalize your action. Aspect number one: for some of the things you might want to do, like for example presenting music that a user can listen to to be entertained, if you know more about their preferences and tastes, you can provide them with music that will be more in line with their expectations and fulfill their intent in a better way. Second, we have seen that your actions can be invoked across different devices, so if the user expressed preferences, for example on a speaker, you want to honor those preferences when they start another conversation on a different device like a smartphone, where you will be able to provide a better experience because now you can show visuals to give better output to the user. So we are going to look at the three main topics that were already mentioned. The first one is: what is the platform providing
to us so that we can learn something more about the
Assistant user. Then we will see how we can store this information in the current conversation we have with the user, or across all the conversations we might have with the user, which means future conversations and conversations on different devices. Finally, Adam will talk about adding identity to your action. That can be a requirement in cases where, for example, the user wants to know when the next payment for a certain service is due; if you are the service provider, you need to know who they are on your system, and you need to authenticate them so you can retrieve the correct information for their account. So starting from learning more
about the Assistant users I’m going to
introduce helper intents. Helper intents are a core
concept of the Assistant platform. You can hand control of the conversation over to the Assistant so it can perform special tasks on your behalf. Some of these special tasks: you can ask to transfer the conversation; if you start the conversation on a smart speaker and you want to show a visual to the user, you can ask for the conversation to be transferred to a phone or another device that can show the visual. Or you can even deep link into an Android app, if for example the user has requested a task that doesn't provide a good user experience with a voice interface. In this talk we are going to look at a subset of the helpers that allow you to obtain information from the user. This information can be of two different types. One is you can ask the user for consent for the Assistant to share some information with your action, like the user's name or the device location, the location of the device where the conversation is happening. Date and time, and place and location, are ways of asking for an input from the user: I need a date from you if, for example, your action allows users to book services and you want to know when they want the service; or a place or location if you are arranging a delivery and you want to know where to deliver. So starting from the first one: how can we
get the user name? In this slide what you see on
your left is a simulation of a conversation I did on a test app that I wrote
and on the right you can see some of the code from our node.
js client library. It’s not the complete code from the action, it’s just what is
relevant to what I'm showing. And you can see how it works. First you ask for the permission, the permission to get the name of the user, and you provide the context. That is very important, because with the context you are explaining to the user why you need the name, how your service is going to be better if you obtain that information. So in this case I'm telling them, hey, I want to address you by your name. The Assistant then takes over the conversation and asks the user: can I share this information? If it's a positive reply, you get access and you can use it to greet the user by name. If not, you need to deal with it: acknowledge that they didn't want to share the information with you and try to provide the best service that you can. This next slide is showing
you the raw JSON API. If you're not using the client library and instead you're using the conversation webhook directly, this is how you request the permission. You can see the intent name is actions.intent.PERMISSION, and then there's a packet of parameters that can go with it. They are different for every helper intent, and they are defined saying this request might contain this type of field. And you can see the context explaining why I want that information. This is showing you, in the
following request that you received from the Assistant if
the user consented to give you access to the information, this
is how you receive it. Location is very similar, because it's exactly the same intent, the intent called permission. And again, the Assistant is asking the user: can I share the location of your device with the action? If the user replies yes, you get access to the location, and now in this case you could resolve the latitude and longitude to the place where they are and then suggest something that's near that location. The JSON is pretty similar; it's just a different value for the type of permission that
we are asking. And one thing that you can notice now is permission is an array which
means if you want to ask for both the name and location at
the same time, you can do it. You can just ask for multiple
permissions in that array. This is how you get the location
if the user consents to give it to you.
Now, date and time is different. In this case we are not asking the user anymore if the Assistant can share some information with our action; we are directly asking the user: can you give me a date, a date value? And the most powerful thing when you do this through the Assistant is that we can resolve ways of specifying the date that are not just the date format. So if you look at the example: when would you like to reserve the table? I reply tomorrow. The Assistant resolves what tomorrow is relative to the time we are having the conversation. Another thing you might notice is that in this case we have three prompts: an initial prompt that gives the reason why we want to ask for that information, but also additional date and time prompts that the Assistant uses if it gets only part of the information that it needs from the user. So if I only specify a date, like in the case of the example where I just said tomorrow, the Assistant will ask what time, using the prompt that you specify. Once you have that information you can access it, in this case with the Node.js client library. Again, not very different. This is the JSON. What is changing is just the parameters you are passing when you ask for this helper and the value you get; it's a little longer because you have the date part and the time part. The last helper is the place and
location helper. Again, this is not asking the user "can I share your current location with the action"; it's asking the user: give me a place, give me a location. Again, you can see that if you use this helper we can resolve things like the public name of a place. For example, when the action asks "where would you like to pick it up?" and I reply "Shoreline Amphitheatre", what you get is the whole address. Not a lot of difference again; you already know this API, it's a very familiar API by now. The only things that are changing are the type of parameters you can send along with the helper and the name of the helper itself. And in the request, in this case, you get a little more info, because you have the address, the public name and some more stuff. In this example you have, side by side, two different situations. The first one is a place. We have seen how we can resolve a public place: you ask for Shoreline Amphitheatre, you get the address. The second is showing a location. A location is something that a user defines and it's private to them, for example my home or my workplace. And so in the example on the right, when I say "work", the Assistant requests another type of consent, which is: can I share with the action what "work" is for you? And only if the user says yes,
then you can get that address. If you want to see more about
helpers we have a sample, GitHub dot
com/actions-on Google. It shows all the helpers that are
available on the platform. There’s information that can be
useful that we can find in every request without the need of
requesting any specific helper. The first one is just there and
it’s called last seen. It’s just the time stamp of the last
interaction with the user. The first time the user has an
interaction with the action it will be undefined. After that
you will always have the time stamp of the last interaction.
This means you can use it, for example, to greet them differently. You can just say "welcome" the first time, and then if the time stamp is available, you can say "welcome back". You can do more complex stuff. You can calculate how much time has passed since the last time they visited you, and depending on whether you consider that range to be a short or a long time, you can re-engage with them in a different way.
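A minimal sketch of that greeting logic in a Python webhook handler, assuming the incoming conversation-webhook request has already been parsed into a dict:

```python
from datetime import datetime, timezone

def greeting_for(request_json):
    """Pick a greeting based on the user's lastSeen timestamp, if present."""
    last_seen = request_json.get("user", {}).get("lastSeen")
    if not last_seen:
        return "Welcome! Nice to meet you."
    # lastSeen is an RFC 3339 timestamp, e.g. "2018-05-09T17:32:10Z".
    then = datetime.fromisoformat(last_seen.replace("Z", "+00:00"))
    days_away = (datetime.now(timezone.utc) - then).days
    if days_away > 30:
        return "Welcome back, it's been a while!"
    return "Welcome back!"
```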
The other one that is interesting, for all of you that have Android apps that allow users to purchase or subscribe to entitlements: you can connect your action to an Android app (it just requires verification that you actually own the Android app), and all the entitlements that the user bought on the Android app through Play will be available in every request. This is an example of how you would see it. In this case I'm just giving you a JSON example because in the client library we build the JavaScript object that contains this information. What about storing information in the
conversation? We said that it’s important that
we can do it so we can give a more personalized experience
inside a conversation but also across all the different
conversations that the user can have with your action.
The first concept that I'd like to introduce to you is the conversation token. The conversation token is available only if you're using an Actions SDK action. It's a field of the response and the request defined in the conversation webhook API. It's just a string: if you write to it, we will send back what you've written in the next request, and if you change it we will do it again. The main catch is that its value is always initialized to an empty string. It's scoped to the current conversation, and all the values you stored in there will be cleared when the conversation ends. You can use it to store values when those values make sense only in the life cycle of the current
conversation that you're having. One example of how to write the value: in the first case I'm just writing a simple string. In the second case, because I want a more structured approach, I'm serializing a JSON object to a string so I can get a representation that contains an object.
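A tiny sketch of that second case: serializing structured state into the conversationToken field of an Actions SDK webhook response (the response structure is simplified):

```python
import json

def build_response(speech, state):
    """Return a minimal conversation-webhook response that carries structured
    state in conversationToken, serialized as a JSON string."""
    return {
        "conversationToken": json.dumps(state),  # e.g. {"genre": "jazz", "count": 3}
        "expectUserResponse": True,
        "expectedInputs": [{
            "inputPrompt": {"richInitialPrompt": {"items": [
                {"simpleResponse": {"textToSpeech": speech}}
            ]}},
            "possibleIntents": [{"intent": "actions.intent.TEXT"}],
        }],
    }
```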
In Dialogflow you don't have access to the conversation token, but you get the same functionality through a more powerful abstraction called the output context. You can have
more than one output context. Each one is identified by its name, you can set the lifespan in terms of conversational turns for which that data will be available, and it provides an interface that gives you access to a structured
value. To see an example of an output context: output contexts come as an array, each identified by a very long name. The lifespan count is five turns in this case, and I'm writing an object with the values that I want to store in that context.
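Roughly, such a context inside a Dialogflow v2 webhook response looks like this, shown as a Python dict; the project and session IDs are placeholders:

```python
# Sketch of the outputContexts portion of a Dialogflow webhook response.
webhook_response = {
    "fulfillmentText": "Okay, jazz it is.",
    "outputContexts": [{
        # Contexts are identified by a long, fully-qualified name.
        "name": "projects/my-project/agent/sessions/SESSION_ID/contexts/music-prefs",
        "lifespanCount": 5,          # the data survives for the next five turns
        "parameters": {"genre": "jazz", "songsPlayed": 0},
    }],
}
```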
If you use the Node.js client library, we give you the best of both worlds. We provide you an abstraction: you write whatever you want to the object, and depending on whether your action is an Actions SDK or a Dialogflow action we use the conversation token in one case or the output context in the other. To see an example of how you
could use it, let’s say you have some music that you want to
play. The user asks I want to listen
to some music. You can ask them do you want a random genre? A
specific genre? If they give you one, you can save it for the
rest of the conversation. You can keep giving them music
coming from that genre. And you can also have a counter where you count how many songs from that genre you've given to the user, and when they reach a certain point, let's say ten songs, you ask: do you still want this genre or do you want to change? In some other cases the
conversation token doesn't work. Let's say you want to store the preferences for an action that gives weather forecasts, and you want to save the zip or area code to identify the area for which they want the forecast. If you save this in a
conversation token that value is lost at the next conversation.
So you would be asking again what is your zip? And that’s
not a good experience. And that’s when user storage
comes into play. This slide shows exactly the same API as the conversation token, so you already know how to use this. It's a field of the request and response, it's a string, and you can write just a string or serialize structured data into it. We will recirculate it across all the conversations that you're having with the user: future conversations, conversations on different devices. The main difference is that the content can be cleared only by the app itself or by the user. And we will see that in a couple of slides.
So this is exactly the same example that I had for the
conversation token. The only thing that has changed is the
name of the field. Now I’m using user storage and that means I have access to a bigger
lifespan. Going back to the example of storing the zip code: now that we store it in user storage, if I started this conversation on a smart speaker and saved the zip, the moment I ask for the forecast from my phone the action will know the zip code and won't need to ask for it again.
That's it. This is how you clear the value: if you're using the client library, you set the value of the storage to an empty object and we clear it for you.
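In the raw webhook, userStorage is just a string you echo back; a minimal sketch of reading and writing it, assuming the request has been parsed into a dict:

```python
import json

def load_user_storage(request_json):
    """Deserialize the userStorage string from the incoming request, if any."""
    raw = request_json.get("user", {}).get("userStorage", "")
    return json.loads(raw) if raw else {}

def save_user_storage(response_json, data):
    """Serialize structured data back into the response's userStorage field.
    Writing an empty dict effectively clears the stored value."""
    response_json["userStorage"] = json.dumps(data)
    return response_json
```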
The last thing I have to say about user storage is very important. This is mostly like a cookie, and so there are some countries that have strong regulations about obtaining consent from the user before you can save or read data from the user storage. So if you operate in one of these countries, make sure that you use the confirmation helper, which lets you ask the user something and get a yes or no answer, before you start reading from or writing to the user storage. Now it's time for Adam to talk
about identity. (Applause). >>ADAM DAWES: Thank you
Silvano. User storage is a terrific
feature to hold state and build continuity with the user across
multiple conversations. It works a lot like a cookie or
HTML5 storage might in a browser, but we know that isn't always enough, so
let me talk a little bit about how you can use identity to
further deepen your experience with the user.
So the first thing that you get with identity is you get to know
who the user is. You get to know her name, her
e-mail address and you get access to her profile picture. This allows you to build a
direct relationship with the user where you can engage her via e-mail outside of
the context of using your app. The next thing that identity
provides is the ability to have a consistent
experience across multiple devices and on different
platforms. So with user storage you’re able to keep applications
state and user preferences but all that data is stored in Google’s Cloud and only
available to your conversational action.
But with identity, now you can store all of the user’s data on
your own back end. And whether or not the user
comes back via your mobile app, the web or your conversational
action, you’ll be able to re-set the state for the user and get her going where she left off the
last time she used your service. Finally with identity now that
you have all her data stored on your back end identity can now
help you secure that data so only the user has access to it
and nobody else does. So the actions on Google
platform has supported identity from the very beginning with
OAuth based account linking but the problem that we found is
that when a user engages with your action via voice, the only
way to be able to do that account linking experience was to punt the user
to the phone and get the user to go to your website, log in and
complete the linking experience. These kinds of cross device
flows are super difficult and as you can
imagine the conversion rate is pretty
low. That's why we are super excited to announce the developer preview of Google Sign-In for the Assistant. Users can sign into your app completely via voice without needing to leave the context of the conversation. And like Google Sign-In on other platforms, you are able to automatically log users in if they previously used your service on a different device or another platform. It comes in two different
developer modes. In multi-platform mode this provides a seamless voice-based experience for Google users yet still maintains a path for users that registered with your service with another identity. For Assistant-only developers we are especially excited about what Google Sign-In for the Assistant is going to be able to do. We heard from you that identity and account linking is a heavy lift just to get started on the platform. Now with Google Sign-In for the Assistant you're able to rely completely on Google to manage identity. Let's take a look at how one of these Assistant-only developers is planning on using the tool. Starbutter AI was founded by
four self-described personal finance geeks who love to argue with each
other about the best way to manage their personal finances and look for financial
services products. They all found themselves with a similar
experience where their friends and families would inundate them with questions about the best credit cards or how to get a mortgage. They realized the best way to recommend financial services products was to get to know the user through a question-and-answer dialogue. That's why they invested so big in AI and building on the Assistant platform. Just last year they won an award for being one of the top actions on the Assistant platform. But to offer advice that will
keep users coming back again and again, Starbutter really needs a user’s
identity. Last week we gave them access to Google Sign-In for the Assistant, so let's look at their demo app.
>>Talk to mortgage helper demo.
>>All right. Getting the test version of mortgage helper demo.
>>Hi, I'm Morgan. I analyze millions of data
points to find you the best mortgage. Want to compare loan
offers or calculate how much you can
borrow?>>Calculate how much I can
borrow.
>>To compare home loan offers I'll need your basic info to create a new mortgage helper demo account. If you want more details say tell me more. Can I have your name, e-mail address and profile picture from Google?
>>ADAM DAWES: Yes.
>>Great, Alix, I've created a new mortgage helper demo account using your Google account.
Are you purchasing a property or refinancing a property you
already own? >>ADAM DAWES: Wasn’t that so
much better. The user was able to create an
account without leaving the context of their conversation.
Let's look behind the scenes at what is going on. Google Sign-In for the Assistant operates like all the other helpers on the platform. You ask to get information about who the user is. If the user says yes then we
generate an ID token and hand it back to your client. Now let’s take a look at one of
these ID tokens. An ID token is just a JSON web token, and that's basically a dictionary of JSON which is cryptographically signed by Google so you can be sure that the information inside the token is authentic. Inside the token you'll see all of the information that you need in order to be able to make a decision about whether to link the account or create a new account for the user. That includes the user's name, profile picture and e-mail address. You'll also notice at the top of the screen there's the AUD field, short for audience. This is also a super important field because it protects the security of the user's data.
It’s what prevents another app from getting an ID token for that
Google user and replaying that token against
your backend to get access to that user’s data.
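A common way to validate such a token server-side is the google-auth Python library; a minimal sketch, with a placeholder client ID:

```python
from google.auth.transport import requests
from google.oauth2 import id_token

CLIENT_ID = "1234567890-example.apps.googleusercontent.com"  # placeholder

def verify_assistant_token(token):
    """Verify the JWT's signature and audience, then return the user's claims."""
    claims = id_token.verify_oauth2_token(token, requests.Request(), CLIENT_ID)
    # 'sub' is the stable Google identifier; email, name and picture are also present.
    return {
        "google_id": claims["sub"],
        "email": claims.get("email"),
        "name": claims.get("name"),
        "picture": claims.get("picture"),
    }
```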
Let’s look at what Starbutter had to do to get this set up. The first thing they needed to
do was go to the actions on Google developer console and go
to the identity and account linking section. There they configure Google Sign-In as their linking type.
Next they need to update their action. So first they need to
initialize their client ID, this is what comes back in the
audience and protects the user’s data. And then they need to
find the right time to be able to ask the user to log in and
provide consent to give their account information. Now Starbutter did that
relatively early and you’re going to want to be thoughtful
about why you do this. It’s not best practice to immediately ask
the user to sign in when they open your action so think about
that carefully. The last thing that Starbutter
needed to do was to handle the
resolution. The user can agree to provide
their profile information or they can decide not to. If they agree, then we will be
able to return that whole set of JSON that comes back in the ID token and then
you’ll be able to extract the user’s e-mail address or their Google
identifier and you can start keying data on
your backend based on that user’s
identity. If the user decides not to provide that information you’re going to
want to handle that gracefully. You want to explain the benefit
of logging in. If they still don’t want to
share their information, you will want to try to move the conversation forward and be useful to the user even if you don't know who they are. The multi-platform developer
wants to be able to support two types of users: both the Google user as well as any user who may have registered with their service through another identity provider like Microsoft or Outlook. They need to have that path to be able to connect to data for both user types. I understand it may feel like a little bit of a caveat that only Google users are able to get that voice-based account linking experience, but it's still going to be super impactful for your action. We know you invested a lot of time and energy, so we want to make sure that we bring as many users through that front door. So by having Google users get a streamlined voice experience, that's going to bring many more
users to enjoy all the benefits that your
action provides. So let’s take another look at an
action that is on multiple platforms. This is Itcher. Their goal in life is to answer that age-old question: what movie or TV show should I watch tonight? They built a personalized recommendation service available on iOS, Android, web and the Assistant platform, and they offer many ways to sign in: Google, Facebook, even Twitter. Itcher needs a user's identity. They can't store a user's likes and dislikes and make recommendations without having a consistent experience across time and across devices. Here is their demo app.
>>Talk to Itcher. >>Okay. Let’s get the test version of
Itcher. >>To get started I need your
basic info to personalize your experience. If you want more
detail, say tell me more. Can I have your name, e-mail address, and profile picture from Google?
>>Yes. >>Great. So now I can check to
see if you have an account with us. Just a sec. I found your account, Alix. Your account is now linked to Google.
>>Hi, this is Itcher. I can give you movies, TV shows. How can I help?
>>ADAM DAWES: That was a much, much better experience. We were able to detect whether or not Alix had an account with Itcher and seamlessly link it. How does Itcher build that experience with Google Sign-In for the Assistant? They need to go to the Actions on Google developer console. Next they do exactly like
Starbutter, find the right time in the conversation to ask the
user to sign in. And then this is where things get a little bit
different. So in that interaction we went and did a discovery of whether or not that account already exists. In order to make that happen, Starbutter and the food bot app created a separate REST API from their conversational action fulfillment endpoint. We call this endpoint the token endpoint, and it allows us to do
account discovery and creation. So the way the logic of this end
point works is pretty straightforward. We ask the
user whether or not they want to sign in to the app. If they say yes then we generate an ID token and send it to the token endpoint. The food bot validates the ID token, extracts the user's e-mail address and does a look-up on their account database. If they find that the user already exists, they link the account by taking the Google identifier and storing that for the user, and they also generate their own credential and return it to Google. Google will store that credential and return it in every conversation back to the food bot app. If in that look-up the user doesn't exist, then the food bot will return an error saying I've never heard of that user. So then what Google does is we
look up the food bot's settings and see if they want to support account creation via voice. Not all apps want to do that. Sometimes the registration process requires that they gather more information than just the name and e-mail address, or maybe they want to make sure that the user has really evaluated their privacy policy and terms of
service. But if they do decide they want
to do voice based account creation then we will ask the
user do you want to create an account with food bot? If the
user says yes, we will then again hit the token endpoint, and the token endpoint will again validate that the request is coming from Google, extract the name, e-mail address and profile picture, and store that
information as a new user in their user database and they
will also create a token and return that to Google
to store and then respond with every turn of conversation to
keep that context. Now if the food bot decides they don't want to do voice-based account creation, or the user decides they don't want to create a new account because they want to log into their existing account that might be based on a Yahoo or Microsoft identity, then we fall back to the regular OAuth-based account linking flow. This ensures all Google users have this streamlined experience.
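A minimal sketch of that token-endpoint logic, using Flask and the verifier from the earlier snippet; the request field name and the simple in-memory account store are hypothetical placeholders, not the documented protocol:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical account store standing in for the food bot's user database.
USERS_BY_EMAIL = {}

@app.route("/token", methods=["POST"])
def token_endpoint():
    """Account discovery endpoint hit by Google during voice sign-in.
    The 'assertion' field name is an assumption for illustration."""
    claims = verify_assistant_token(request.form["assertion"])  # see earlier sketch
    user = USERS_BY_EMAIL.get(claims["email"])
    if user is None:
        # Unknown user: report an error so Google can offer voice account creation.
        return jsonify({"error": "user_not_found"}), 401
    user["google_id"] = claims["google_id"]  # link the Google identifier to the account
    return jsonify({"token_type": "bearer",
                    "access_token": "opaque-credential-for-" + claims["google_id"]})
```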
So that's pretty much it. It's really easy to take advantage of Google Sign-In for the Assistant. The developer preview is available today. Go and check out the Actions on Google developer docs and look for the identity section.
Now as a recap, Silvano talked to you about how you can build a more
personalized experience with the user by using helper intents to
get permission to get the user's name, location, and date and time. He also talked about how to use the request info to see when the user last used your action and whether or not the user has already purchased entitlements from your service via the Play Store. Next he talked about how to store information and build continuity with the user. And then I led you through how you can use Google Sign-In for the Assistant. And for Assistant-only developers, now they don't even need to have their own account system.
So we know your time is very valuable. We would love to hear what you thought of our presentation. So please navigate back to the I/O schedule, find this session and give us a rating. And if you want more information, please go online; we have got helpful docs that cover the experiences Silvano described and best practices, and you can learn more about Google Sign-In for the Assistant. We will also be across the way at the Assistant code lab and igloo tent, so please come and ask us questions. (Applause)
>>Thank you for joining this session. Grand ambassadors will assist you with moving to the next session. If you registered for a session in this room, we ask that you clear the room and return through the registration line outside. Thank you.
Intro to Machine Learning on Google Cloud Platform.
>>Welcome. Please fill in the seats near the front of the room. Thank you.
>>At this time please find your seat. Our session will begin
soon.
>>SARA ROBINSON: Hello, everyone. Welcome to Intro to Machine Learning on Google Cloud Platform. I'm Sara Robinson. I'm a developer advocate and I
focus on Machine Learning. You can find me on Twitter at @SRobTweets. Machine Learning involves teaching computers to recognize patterns in the same way that our brains do. It's easy for us to distinguish between a cat and a dog, but much more difficult for a machine to do the same thing. I'm going to focus on supervised learning. This is when, during training, you
give your model labeled input. The amount that we know about
how our model works is going to depend
on the tool we choose to use for the job and the type of Machine
Learning problem we are trying to solve. So that was a high-level overview, but how do we get from input to prediction? Again, this is going to depend on the type of Machine Learning problem we are trying to solve. On the left side, say you're solving a generic task that someone has already solved before: you don't need to start from scratch. Or let's say you're solving a custom task specific to your data set. More specifically, let's think of
this in terms of image
classification. Let’s say you want to label this
picture as a cat. We can use one of these
pretrained models. We don’t need to start from scratch. Let’s say this cat’s name is
Chloe, this is our cat, and we want to
identify her across our entire image library. We need to train a model using
our own data from scratch so it can differentiate Chloe from
other cats. Let’s say we want our model to
return a bounding box showing where she is in that picture. We need to train a model on our
own data. Let’s also think about this in
terms of a natural language processing problem. So let’s say we have this text
from one of my tweets and I want to
extract parts of speech from that text. This is a common
natural language processing task so I don’t need to start from
scratch. I can utilize an existing model. Let's say I want to take the same tweet and I want my
model to know this is a tweet about programming and more
specifically it's a tweet about Google Cloud. I'm going to need to train my model on thousands of tweets so it can generate these predictions. Many people
see the term Machine Learning and are scared off. They think
it’s something only for experts. If you look back about 60 years
ago this was definitely the case. This is a picture of the first
neural network invented in 1957 and
this was a device that demonstrated an ability to
identify different shapes. Back then if you wanted to work on Machine Learning, you needed
access to extensive academic and computing resources. If we fast
forward to today we can see in the last 5 or 10 years the
number of products at Google using Machine Learning has grown dramatically. At Google we want to put Machine
Learning into the hands of any developer and data scientist
with a computer and Machine Learning problem they want to
solve, and that's all of you. We don't think Machine Learning should be something only for experts. Maybe you're using a framework like scikit-learn, maybe you're writing your code in Jupyter notebooks, maybe you're experimenting with different models, building proofs of concept, or scaling for production. What I want you to take away is
no matter what your existing Machine Learning toolkit is, we have something for you on Google Cloud Platform. We have a whole spectrum of Machine Learning products. On the left we have products targeted toward application developers; you need little to no Machine Learning experience. On the right we have products targeted more towards data scientists and Machine Learning practitioners. First, APIs that give you access to pretrained models with a single REST API request. As we move through the middle we have a new product, which I'm super excited about, that we announced earlier this year in January, called AutoML. The first is AutoML Vision, which lets you train an image classification model on your own images without requiring you to write any of the model code. As we move further to the right,
towards more custom models: if you want to build your model in TensorFlow, we have a service called Cloud Machine Learning Engine to let you train and serve your model at scale. Then a couple of months ago we announced an open source project called Kubeflow. Let's say you have a Machine Learning framework other than TensorFlow and you want to run it on GCP: you can use it on Google Kubernetes Engine or Kubernetes. On Google Cloud Platform we have five APIs that let you analyze images, convert audio to text, and translate that text. I'm going to show you Cloud
Vision. This is everything that the Vision API lets you do. At its core the Vision API provides label detection. For this image it might return elephant, animal. Web detection will search the web for similar images and give you labels based on what it finds. OCR, optical character recognition, lets you find text, tells you where the text is and what language it's in. Then there's logo detection, landmark detection, crop hints and explicit content detection. This is what a request to the Vision API looks like.
You don't need to know anything about how that pretrained model works under the hood. You pass it an image in Google Cloud Storage or a base64-encoded image and tell it what types of feature detection you want it to run. I have an example in Python here: I create an ImageAnnotatorClient. With ML Kit for Firebase you can call the Vision API from an Android or iOS app, and this is an example of calling it in Swift.
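Her Python call might look roughly like this, using the google-cloud-vision client library (type names vary a little between library versions; the bucket path is a placeholder):

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Point the API at an image stored in Google Cloud Storage (placeholder URI).
image = vision.Image(source=vision.ImageSource(image_uri="gs://my-bucket/selfie.jpg"))

# Run label detection; other features (text, faces, landmarks...) work the same way.
response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, label.score)
```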
So I don’t like to get too far into a talk without showing a live demo
so if we can switch to the demo. What we have here is the product
page for the vision API. We can upload images and see
what the vision API responds. I’m going to upload an image.
This is a selfie I took seeing Hamilton. I live in New York. Excited to score tickets to
Hamilton. Let’s see what we get back here.
It's cranking. Live demos, you never know. There we go. We can see that it found my face
in the image so it’s able to identify where my face is,
different features in my face and detect emotion. We can see
joy is very likely here. I was super excited to be seeing
Hamilton. What I didn’t notice at first about this is it has
text in it. When I sent it to the Vision API I didn't notice there was text here, but we can see it's able to extract the Playbill text from my image. Finally in the browser
we can see the entire JSON response we get back from the
vision API. This is a great way to try out
the API. Before you start writing code, see the response
you get back and I’ll provide a link at the end. That is the
vision API. If we can go back to the slides.
Next I want to talk about the Natural Language API, which lets you analyze text with a single REST API request. First it lets you extract key entities from text. It also tells you whether your text is positive or negative. And if you want to get more into the linguistic details of your text you can use the syntax analysis method. And then there's content classification: it will classify your text into over 700 different categories we have available. Here is Python code to call the
Natural Language API. It's going to look similar to the Vision API code we saw on the previous page. We send in our text and get back the result from the model.
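That call might look roughly like the following with the google-cloud-language client (again, minor details vary by library version):

```python
from google.cloud import language_v1 as language

client = language.LanguageServiceClient()

document = language.Document(
    content="I loved Google I/O but the ML talk was just okay.",
    type_=language.Document.Type.PLAIN_TEXT,
)

# Entity-level sentiment, similar to what the browser demo shows.
response = client.analyze_entity_sentiment(document=document)
for entity in response.entities:
    print(entity.name, entity.sentiment.score)
```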
Let's jump to a demo of the Natural Language API.
So again this is our products page for the natural language
API, and here we can enter text directly in this text box and see what the Natural Language API responds. I'm going to say: I loved Google
I/O but the ML talk was just okay.
We will see what it says. This is a review I might find on
a session. Let’s say I wanted to extract key entities and see
what the sentiment was. It extracted two entities and we
get a score for each. The score is a value from negative 1 to 1
that will tell us overall how positive or negative is the
sentiment in this entity. Google I/O got 0.8. We even get
a link to the Wikipedia page. ML talk got a neutral score of zero, because it was just okay. And we can look at the
syntax and details, see which words depend on other words, get
the parts of speech for all the words in our text. And if our
text was longer than 20 words we can make use of this content
categorization feature which you can try in the browser and see a list of
all the categories for that. That is the natural language
API. If we can go back to the slides. I’ll talk briefly about
companies using these APIs in production. Giphy is a website that lets you search for GIFs; it adds search-by-text functionality to all their GIFs. Hearst is another example. Descript lets you transcribe meetings and interviews. All of these three companies are using just one API. We have also seen a lot of examples of companies combining different Machine Learning APIs. Seenit is a crowdsourced video platform; they use Video Intelligence, Speech and Natural Language. Maslo is an audio journaling app: you can enter your journal entries through audio, and they're using the Speech API to transcribe that audio, giving you insights about your journal entries, and they're storing the data. So that is the Natural
language API. So all of the products I’ve
talked about so far have abstracted that
model for you. A lot of times when I present on APIs, people ask me: those APIs sound great, but what if I want to train them on my own custom data? This is AutoML Vision. It lets you train an image classification model on your own image data. This is best seen with a demo. For this demo let's say that I'm a meteorologist. I want to predict weather trends and flight plans from images of clouds. Can we use the cloud to analyze
clouds? And as I have learned there’s many, many different
types of clouds. And they all indicate different weather
patterns. So when I first started thinking about this demo
I thought maybe I should try the vision API first and see what I
get back. So as humans if you look at
these two images it’s obvious to us these
are completely different types of clouds. We wouldn’t know spl wouldn’t
know specifically what times of clouds these images were but
nothing as specific as cloud types for these images. Even for these images of
different clouds we get back pretty similar labels, sky,
cloud, blue, et cetera. So this is where AutoML Vision
comes in handy. It provides a UI to help with every step, from importing the data, to labeling it, training it, and generating predictions. The best way to see it is by jumping to a demo. Here we have the UI for AutoML Vision. We import the data: we put our images in Google Cloud Storage and create a CSV where the first column is the URL of the image and the next column is the label of that image. Then we upload our images and
move over to the labeling tab. In this model I have five different types of clouds. You can see how many images I have for each one. AutoML Vision only requires ten images per label, but they recommend at least 100 for high-quality predictions. The next step is to review my image labels. I can see all my cloud pictures here and see what label they have. If this one is incorrect I can go in here and switch it out. Let's say that I didn't have time to label my data set, or I have a massive image data set and didn't have time to label it: I can make use of this human labeling service that will label your images for you, and in just a couple of days you'll get back a labeled image data set.
The next part is to train your model. And you can choose between a
base or advanced model. I’ll talk about that more in a
moment. To train it all you do is press this train button,
simple as that. Once your model is trained
you’ll get an e-mail and the next thing you want to do is see
how this model performed using some common Machine Learning
metrics. I'm not going to go into all of them here, but I do want to highlight the confusion matrix. If it looks confusing, that's why it's called a confusion matrix. For a good confusion matrix we want to see a strong diagonal from the top left to the bottom right. AutoML split our images for us into
training and testing. It took most of our images, used those to train the model, and reserved a subset of our images
to see how the model performed on images it had never seen
before. What this is telling us is for
all of our cloud images in our test set our model was able to identify 76%
of them correctly. On the train tab you saw the base and advanced models. I've trained both, so we are looking at the advanced model here. I can use the UI to see how different versions of my model performed and compare them. We would expect that the advanced model would perform better across the board, so let's take a look. It looks like it did indeed perform a lot better for almost all of the categories
here: a 23% increase for this one, 11% for this one. Hey, wait, if we look at our
altostratus images, it did worse on those. What this is pointing out is that there may be problems with our training images here, so the advanced model did a better job of identifying where there are problems with our training data. Our model is only as good as the training data we give it. 14% of our altostratus clouds are being mislabeled as cumulus clouds. These are pretty confusing
images. This can identify where I need to go back and improve my training data. The next part is generating predictions on new data. I'm going to take this image of a cirrus cloud and see what our model predicts. It has never seen this image before; it wasn't used during training, and we will see how it performs. So the UI is one way you can
generate predictions once your model has been trained. Chances are you probably want to build an app that's going to query your trained model, and there are a couple of different ways to do this. I want to highlight the Vision API here. If you remember the Vision API request from a couple of slides back, you'll notice this doesn't look much different. All I need to add is this custom label detection parameter, and I get an ID for that trained model that only I have access to, or anybody that I've shared my project with. If I have an app that is just detecting whether or not there's a cloud in an image, but then say I want to upgrade my app, I don't have to change much at all about my app architecture. I just need to modify the request JSON a little bit.
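The custom parameter from the slide isn't reproduced here; as an alternative illustration, the standalone AutoML prediction client can query a trained AutoML Vision model roughly like this (older v1beta1 style; project, region and model IDs are placeholders):

```python
from google.cloud import automl_v1beta1 as automl

prediction_client = automl.PredictionServiceClient()

# Fully-qualified name of the trained AutoML Vision model (placeholders).
model_name = "projects/my-project/locations/us-central1/models/ICN1234567890"

with open("cirrus.jpg", "rb") as f:
    payload = {"image": {"image_bytes": f.read()}}

response = prediction_client.predict(model_name, payload)
for result in response.payload:
    print(result.display_name, result.classification.score)
```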
So that is AutoML Vision. If we can go back to the slides: a little bit about companies that are using AutoML Vision and have been part of the alpha.
Disney is the first example. They trained a model on
different Disney characters, product categories and colors
and they’re integrating that into their search engine to give
users more accurate results. Urban Outfitters have a similar use case. They trained a model to create a set of product attributes and are using
it to give users better search results. The last example is the Zoological Society of London. They have cameras deployed all over the wild, and they built a model to automatically tag all the
wildlife they’re seeing across those cameras so they don’t need
someone to manually review it. So the two products I've talked about so far, the APIs and AutoML, have entirely abstracted the model from us. We don't know anything about how that model works under the hood, but let's say you have a more custom prediction task that is specific to your data set or use case. One example: let's say you have just launched a new product at your company, you're seeing it posted on social media, and you want to identify where in an image that product is located. You would need to train a model on your own data to do that. Or let's say you have a lot of logs coming in and you want to analyze those logs to find anomalies in your application. We have two tools to help you do this: TensorFlow
for building your models and Machine Learning engine for
training and serving those models at scale.
So from the beginning the Google Brain team wanted everyone in the industry to be able to benefit from all the Machine Learning products, so they made TensorFlow an open source project on GitHub, and uptake has been phenomenal. It has over 90,000 GitHub stars. It just crossed over 13 million downloads. Because it's open source you can train and serve your TensorFlow models anywhere.
So once you've built your TensorFlow model you need to think about training it and then generating predictions at scale, also known as serving. If your app becomes a major hit and you're getting thousands of prediction requests per minute, you need to find a way to serve that model at scale. Because TensorFlow is open
source you can train your models anywhere, but this talk is about Google Cloud Platform. On Cloud Machine Learning Engine you can run distributed training with GPUs and TPUs. You can also deploy your trained model to Machine Learning Engine and then use the ML Engine API to access scalable online and batch prediction for your model. One of the great things is there's no lock-in. Let's say I want to make use of ML Engine for training my model, but then I want to download my model and serve it someplace else: that's easy to do, and I can do the reverse as well. I'm going to talk about two different types of custom models using TensorFlow. The first is transfer learning,
which lets us update an existing trained model using our own data, and then I'll talk about training a model from scratch using only your data. So transfer learning is great if you need a custom model but don't have enough training data. It lets you utilize an existing pretrained model that does something similar to what we are trying to do, take the weights of that model, and retrain a couple of layers with our own training data. So I wanted to build an end-to-
end example showing how to train a model and build an app. I know a little bit of Swift, so I decided to build an iOS app. You upload a picture of your pet, and it's able to detect where the pet is in the image and what breed it is. There's a library on top of TensorFlow that lets you do object detection, which is identifying a bounding box. I trained the model on Machine Learning Engine, and then I used a couple of Firebase APIs to build a front end for my app. This is the full architecture diagram. The iOS client is actually a pretty thin client. What it's doing is uploading images to Cloud Storage for Firebase, and I've got a Cloud Function that listens on that bucket, so that function is going to be triggered any time an image is uploaded. It sends the image to Machine Learning Engine for prediction, and the prediction I get back is going to be a confidence value, a label, and bounding box detail for where that pet is in my image. I store the metadata in Firestore. Here is an example showing how the front end works. This is a screenshot of Firestore: we can see that whenever my image data is uploaded to Firestore, I write the new image with the box around it to a Cloud Storage bucket.
So that was an example of transfer learning. Now say we have a custom task and enough data to build a model entirely from scratch. This is a demo of a model that predicts the price of wine. Can we predict the price? This is what an example input and prediction for the model would be. One reason this is well-suited for Machine Learning is that I can't write rules to determine what the price of this wine should be. I can't say any wine with vanilla in the description is going to be a cheap wine. So I want to see if I can build a
Machine Learning model to extract insights from that data.
Because I’m training this model from scratch there’s no existing
model out there that does exactly this task. I’m going to need a lot of data.
I use Kaggle. It’s part of Google. If you’re new to
Machine Learning and looking for interesting data sets to play
around with I recommend checking out Kaggle. It has a wine
review data set. It has data on 150,000 different kinds of wine. For each wine it has a lot of data
on description, the points rating, the price, et cetera.
I’m just using the description, the variety and the price for
this model. The next step is to choose API
I’m going the use to build my model which TensorFlow API and
the type of model I want to use. So I chose to use the Tf.Keras
API. And the TensorFlow API includes
a full implementation of Keras. It makes it easy for us to
define the layers of our model. You can also use a lower level TensorFlow API. So I chose to use the tf. Keras API and chose to build a
wide and deep model. It’s a fans way of saying I’m going to
represent the inputs to my model in two different ways. A linear model, that’s the wide
part. That’s good at memorizing relationships between inputs. And a deep neural net which is
good at generalizing data it hasn’t seen before. The input to our wide model is
going to be sparse around the input to the deep is going to be dense
embedding vectors. This is all using the Keras
functional API to define the wide model. I'm going to use what is called a bag-of-words representation. What this does is take all the words that occur across all my wine descriptions, and I'm going to choose the top words; I chose to use the top 12,000, because this is a hyperparameter you can tune. Each input to my model is going to be a 12,000-element vector with ones and zeros indicating whether or not each word from my vocabulary is present in that description. The wide model just takes into account whether or not a word from my vocabulary is present in that specific description. This type of input is going to look like this: a 12,000-element bag-of-words vector.
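A small sketch of building that bag-of-words input with the Keras text tokenizer (the example descriptions are placeholders):

```python
from tensorflow import keras

VOCAB_SIZE = 12000  # top-N words: a hyperparameter you can tune

descriptions = ["Ripe aromas of fig and blackberry ...", "Bright citrus and vanilla ..."]

tokenizer = keras.preprocessing.text.Tokenizer(num_words=VOCAB_SIZE)
tokenizer.fit_on_texts(descriptions)

# Each row is a 12,000-element vector of ones and zeros: the bag-of-words input.
wide_inputs = tokenizer.texts_to_matrix(descriptions, mode="binary")
print(wide_inputs.shape)  # (num_examples, 12000)
```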
The way I'm representing the variety is a 40-element one-hot vector, with each index in the vector corresponding to a different variety of wine. The output of my model is a number indicating what the price of that wine is. If I just wanted to use the wide
model, I could take the wide model I have here and run training and evaluation on it using Keras, but I found better accuracy using wide and deep. Word embeddings let us define
the relationships between words in vector space. Words that are
similar to each other are going to be closer together in vector
space. There’s lots of reading out
there, so I'm not going to focus on the details. I can choose the dimensionality of that space; in this case you can see I used an eight-dimensional embedding space. I obviously can't feed the text directly into my model. I need to put it in a format that my model can understand, so I encoded each word as an integer. This is what the input to my deep model is going to look like. The inputs need to be the same length, but not all my descriptions are the same length. None of my descriptions are longer than 170 words, so if they're shorter I'll pad those vectors with zeros. We are still predicting the price. Using the Keras functional API it's relatively straightforward to combine the outputs of the wide and deep models into one combined model.
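A condensed sketch of that wide-and-deep combination with the Keras functional API; the layer sizes are illustrative, not the exact ones from the talk:

```python
from tensorflow import keras

VOCAB_SIZE, NUM_VARIETIES, MAX_LEN, EMBED_DIM = 12000, 40, 170, 8

# Wide part: bag-of-words text plus one-hot variety.
bow_input = keras.layers.Input(shape=(VOCAB_SIZE,), name="bag_of_words")
variety_input = keras.layers.Input(shape=(NUM_VARIETIES,), name="variety")
wide = keras.layers.concatenate([bow_input, variety_input])
wide = keras.layers.Dense(256, activation="relu")(wide)

# Deep part: integer-encoded, zero-padded description fed through an embedding.
seq_input = keras.layers.Input(shape=(MAX_LEN,), name="description_sequence")
deep = keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM, input_length=MAX_LEN)(seq_input)
deep = keras.layers.Flatten()(deep)
deep = keras.layers.Dense(64, activation="relu")(deep)

# Combine both representations and regress a single price value.
merged = keras.layers.concatenate([wide, deep])
price = keras.layers.Dense(1, name="price")(merged)

model = keras.Model(inputs=[bow_input, variety_input, seq_input], outputs=price)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
```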
I chose to train it on Cloud Machine Learning Engine. The first step is to package the code and put it in Google Cloud Storage. I put the model code in a trainer directory, my wine data in a data directory, and I have a setup file defining the dependencies that my model is going to use. To run that training job I can use gcloud, which is our Google Cloud
CLI for interacting with a number of different Google Cloud
products. I save my model file, which is a binary file, to Google Cloud Storage. Let's see a demo of generating some predictions on that model. I'm going to use a tool called Colab, which is a cloud-hosted Jupyter notebook.
What I've done is saved that model file in Google Cloud Storage, and now I'm going to generate predictions. Here we are just importing the libraries we are going to use, I'm loading my model, and the next step is to load the tokenizer for my model. This is just an index associating all the words in my vocabulary with a number, and I'm loading my variety encoder, too. I'm going to ignore that warning. Here I'm going to
load in some raw data. What I have here is data on five
different wines. I’ve got the description, the variety and the
associated price for each of these wines.
And I wanted to show you what the input looks like for each of
these. Now we need to encode each of those into the wide and
deep format that our model is expecting. We have this
vocabulary look up. I’m printing a subset of it
here. This is going to be 12,000 elements long for the top
12,000 words in our vocabulary, and Keras has utilities to help us extract those top words. And here each index corresponds to a different variety. Let's see what the input to our wide model looks like.
So the first thing is our text. This is a bag-of-words vector: a 12,000-element vector with zeros and ones indicating the presence or absence of different words from the vocabulary. I believe that first wine was a pinot noir; we can confirm that, yes, it was. So that's the input to our wide model. For our deep model we are just encoding all
the words from that first description as integers and
padding it with zeros. All I need to do to generate predictions on a Keras model is call .predict(). What I'm doing here is looping through those predictions to see how the model performed compared to the actual price.
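That prediction step boils down to something like this; the file name and the dummy inputs are placeholders standing in for the artifacts and encodings shown above:

```python
import numpy as np
from tensorflow import keras

# Load the trained model that was saved to (and downloaded from) Cloud Storage.
model = keras.models.load_model("wine_model.h5")  # placeholder filename

# Dummy encoded inputs standing in for the bag-of-words, one-hot variety,
# and padded integer sequences built earlier.
wide_inputs = np.zeros((5, 12000))
variety_inputs = np.zeros((5, 40))
deep_inputs = np.zeros((5, 170))
actual_prices = [48, 30, 20, 15, 25]

predictions = model.predict([wide_inputs, variety_inputs, deep_inputs])
for predicted, actual in zip(predictions.flatten(), actual_prices):
    print(f"predicted: ${predicted:.0f}  actual: ${actual:.0f}")
```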
We can see it did a pretty good job on the first one: 46 compared to 48. This one was about $30 off, but
it was still able to understand it was a higher priced wine and
it did pretty well on the rest of these as well. I’ll have a
link to this at the end. You can go to this URL, enter
your own wine descriptions and see how the model performs. If we can go back to the slides, I'll talk about a few companies using TensorFlow and Machine Learning Engine. The first example is Rolls-Royce. Ocado is the UK's largest grocery delivery service. They get tons of customer
support e-mails every day. They built a model to predict
whether the e-mail requires an urgent
response or no response at all. Airbus has built a custom image classification model to identify things in satellite imagery. So I
know I covered a ton of different products. I wanted to give you a summary
of what resources all these products require. These are
four resources I thought of that you need to solve a Machine
Learning problem. There’s probably more that I don’t have
on here: training data, model code, training and serving infrastructure, and overall how long is this task going to take you? So if you look at our Machine Learning APIs, the great thing about these is you don't need any training data. You can start generating predictions on one image. You don't need to write any of the model code, you don't need to provide any training or serving infrastructure, and you can probably get started with these
in less than a day. The cool thing about AutoML is
you will be providing your own training data. It will take a
little more time because you would need to spend
time processing images, uploading
them to the cloud and maybe labeling
them. And finally, if we think about a custom model with TensorFlow running on Cloud ML Engine, you would need a lot of training data, you will have to write a lot of the model code yourself, and you would think about whether you want to run your training and serving jobs on-premises, and obviously this process will
take a little bit more time. Finally if you remember only
three things from this presentation,
first thing you can use a pretrained API to accomplish
common Machine Learning tasks. Second thing, if you want to build an image classification model trained on your own data, use AutoML Vision. I'm interested to hear if you have a specific use case for AutoML Vision; find me afterwards, I'll be in the Cloud sandbox area. And the third thing: you can build a custom TensorFlow model with your own data. So here are a lot of great resources
covering everything I talked about today. I’ll let you take
a picture of that slide. The video will also be up after
so you can grab it from there as well. And that’s all I’ve got.
Thank you. (Applause).
>>Thank you for joining this session. Grand ambassadors will assist with directing you through the designated exits. We will be making room for those who registered for the next session. If you've registered for the next session in this room, we ask that you please clear the room and return via the registration line outside. Thank you.
What's New With Constraint Layout and Android Studio Design Tools
>>Welcome. Please fill in the seats near the front of the room. Thank you. At this time please find your seat. Our session will begin soon.
>>Hi. (Applause).
>>All right. So good afternoon, everyone. My name is Nicolas. You might have noticed we released something really cool; there was a session this morning, it's on YouTube, check it out. It's really cool. So we are going to talk about the work we have been doing this past year. Specifically we will focus on sample data, and we will also talk about ConstraintLayout 2.2 and what is coming up with these libraries. First, for the section on the layout editor, Vadim will present for
you. >>VADIM CAEN: Thank you. Hi,
everyone. To start with we are going to
talk about the layout editor. It's a graphical tool that lets you build layouts without having to write any XML, and it's supposed to make you more productive. What is nice with the layout editor is that it also works with your custom views. So if in your code you have a
custom view, the nice thing is that any
custom view from your project will appear in the Palette. Our goal is to make you more
productive. We added some nice features this
year to do so. The first one I want to talk about is conversion. If you right click on a view, you'll see the convert option that will make this pop-up appear, and you'll be able to select one of the ConstraintLayout components. This view is contextual, so if you right click on a view which is not a view group, it will show only non-view-group views. The next thing I want to talk about is the navigation between included layouts. If you double click on the included component it will jump to it automatically, and if you click on the arrow it will jump back. This is handy to move fast between layouts. The layout editor is really built for ConstraintLayout, and ConstraintLayout is a great layout, but you might not want to write all your XML by hand. What you should do instead is use all the menus we have in the layout editor to generate this XML for you. Again, right click on the view and then you can use align or any of the other options to do what you want. We have many of them, so feel free to check them out after this session; they are going to make you more productive. Most of you are probably populating your layouts with data from the internet. To address this, Diego will talk about sample data.
>>DIEGO PEREZ: Before we talk
about sample data, let's talk about the tools attributes. You use them to give information to Android Studio. We are not going to talk about all of them, but let's cover one example. You have probably seen this: in your strings file you get a warning saying you haven't translated a particular string. Normally you would add the translation, and that's it, the error is gone. But there might be situations where you don't want this: it could be the name of the application and you don't want to translate it into every single language, only certain languages. With a tools attribute you can basically say, look, I know about the error, but I don't want you to tell me about it because it's fine. So that is one example; there are many others. But what we want to talk about now is design-time attributes. These are the tools attributes that I talked about. They give additional information and context to the layout editor. Let's see one example. In this case, what we have is that
we created our toolbar, and we want to keep that layout in its own file, so we use a ConstraintLayout for it. This is how it will look: what you can see is the toolbar without any context, but you don't see the surrounding layout, so you don't know how it's going to look. So there is something we can do. You can tell the layout editor that this layout is going to be included in this other layout here, using the showIn tools attribute. We are letting the layout editor know that we want to embed this header into the main layout. And that way we can see the context, so now when we are editing we know if the colors match, we can tweak our design and see how every change impacts our layout. And the best thing is, again, you can edit it right there. You don't have to
use of this. When you have the text view this
is usually what you get so you have
hello world and that’s it. But you can tell to the layout
editor that you know this is going to look different and do
you that but using the tools. In in case we are replacing both
the text and the text color. And we are telling the layout
editor at run time this is going to look different. And we are good to see how this
is very useful. When you are in the recycler
view this is what you get by default. You
can see how it’s laid out but it’s not very useful. It’s just
a list of elements with a lot of empty space. We can use the tools attributes,
you basically let the layout editor know. This is the layout
that I’m going to use. You have probably seen list item many
times, very simple. You can say we can do other things. We can tell the layout editor
how is this going to look with five elements? All these things
you usually cannot do them in the layout editor because these
are things that happen at run time so usually the source of
information for a recycler view is an adapter. It
might depend on many things. So doing this you give us
context and then the layout editor will
render correctly similar to what you have in run time. So let’s now go into sample
data. Sample data was introduced in 3.
0. It helps you to actually populate data that is not
available at this time. So in this case what we are doing is
telling the layout editor that I want to use the theme property
and get the data from this data source which we
call a sample. And that data source is simply a
file with a list of colors that you see there.
So what happens in the design time is the layout editor will
get a different color for every item. So that looks better.
But we can do more. So let’s see how you create that
file. You go to your project, new,
sample data directory and that’s how we
create the directory what we are going to put every data serve
that we have. We want colors so new file and
then we name it and we just put the list
of colors and that’s it. It’s that simple. Now you can use it
in any property. So this is one of the types of
data sources that you can use in sample data. So it’s basically
a list of colors. You can have lists of any kind of that you want like regular text and
just replace it. But you can have other things more
interesting like you can have dimensions, so if you have
dimensions using the tools attributes you can to the layout
editor I want to use this dimension and I wants it to be
different for every item and you can do that using sample data.
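A small sketch of the idea, assuming a hand-made sample file (the file name and values are invented):

    <!-- sampledata/colors is a plain text file, one value per line:
         #2196F3
         #4CAF50
         #F44336
    -->
    <TextView
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        tools:textColor="@sample/colors" />

At design time each list item picks the next value from the file; at run time the attribute is ignored.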
We also have other types of data sources. You can have images. You create them in a similar way: instead of a file you create a directory and drop all your images in there, and the recycler view will get a different image for every item, so it's very simple. The last part I want to tell you about is JSON files. JSON files are similar to the lists we had before, but they allow you to do a bit more: you can have all your data together in one place. You can add some context about why a particular item looks the way it does, or you can even take a sample response from your API and put it into your sample data. Here we have a list of entries, each with a title and an album, and you can reference any of those elements. Let's see how to do that. In this case what we are telling the layout editor is: I want you to replace the text at design time, and do it with the name of the JSON file and then the node that we saw before. In this case we are doing it with title and album. Remember this will change depending on the name of the JSON file; it doesn't even need to be called JSON.
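For instance, a hypothetical sampledata/albums.json could be referenced field by field (the file name and structure are invented for this sketch):

    <!-- sampledata/albums.json:
         { "albums": [
             { "title": "First Song",  "album": "Greatest Hits" },
             { "title": "Second Song", "album": "B-Sides" }
         ] }
    -->
    <TextView
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        tools:text="@sample/albums.json/albums/title" />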
If you don't want to provide your own data sources, Android Studio already comes with a set of predefined ones, so you don't have to create your own. We give you a set of them: we give you names, and we also give you a set of images. For example, you can use avatars and you don't need to create the images, you can just use the ones that we give you. Again, the way you use them is very similar; the only difference you can see here is the prefix of the data source: instead of @sample it's now @tools:sample.
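For example, two of the built-in sets can be used like this:

    <TextView
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        tools:text="@tools:sample/full_names" />

    <ImageView
        android:layout_width="48dp"
        android:layout_height="48dp"
        tools:src="@tools:sample/avatars" />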
Using code completion you can see all the data sources we have. Everything we saw so far was already available before 3.2; in 3.2 we added new features.
>>Sample data is a great tool
so we decided to put a UI on top of it. So what does it look
like? I'm sure most of you have already done this: we have an ImageView, and we drag it in from the palette. Since you just started your project you only have the launcher icon available, but with sample data you can pick any resource from your project. So now our image looks better
and we can see it in our recycler view. But what if you want to make
some change to this sample data? What if I want to use one single image? We introduced in 3.2 a new design-time helper. So here I'm going to uncheck this option, and that gives me the ability to select only one image from my set, and I can try it out with any image from the set. But now let's say we want to give it another look and pick images from another data set. We can just use this, select another data set and, here we go, our recycler view has a brand new look, so it's nice to preview what it would look like. You can always jump back to the resource picker: there's a full set of resources available by clicking the button at the bottom of the pop-up, and the same options are available there. If you select sample data in the resource picker it will only populate the design-time attribute, so it will not show up at run time.
We have the same design-time helper for text view. So here I'm just selecting different sets for each view and, as you can see, on the recycler view each item takes a different value from its own set. And finally, we have also added this feature for recycler view. You can just jump back to the
file and preview your changes. All the design time attributes
are automatically populated so you don’t have to do anything
and take full advantage of sample data. But it’s even more helpful if
you use it with constraint layout. Nicholas is going to
come on stage and do a deep dive with constraint
layout.>>NICHOLAS ROARD: We released
constraint layout officially last year. So if you don't know about it, it's a library that allows you to create flexible user interfaces with flat view hierarchies, which comes in very handy. It's compatible essentially everywhere. It's small. It comes with a great UI builder. Please try to use it; it's something we are working on to help make your life easier. We released constraint layout 1.0 last year. You can set up your UI, there are a lot of capabilities here, and there are some helper objects like guidelines to help you set up your screen. So it's already a very full set of
features. But a month ago we released constraint layout 1.1 with a lot of fixes, improvements and new features, new ways of creating your layout. There's one, the barrier, which allows you to position an element relative to a set of elements, so whatever the position and dimension of those elements, the element on the right is going to position itself correctly. So essentially with constraint layout 1.1 we have a really flexible layout, something that should let you express any UI. And you can just add it to your project; we are now shipping it so it should be very easy for you to use. But I think a lot of you are in this room to hear me talk about constraint layout 2.0. It's a great base. It gives you the
flexibility that you need to create a UI and comes with a UI
builder. One of the concepts was the
concept of helper object. Object that even though you can manipulate them into the UI
builder, they do not appear on your screen when you run the
application but essentially they help you create your UI so
one of them is the guideline object, the designer typically specifies a
UI with horizontal lines, well you can
simply replicate those in your UI so it’s a lot easier to set
up your screen. With 1.1 we have the barrier I talked about.
And what is nice with those helper objects is that we have support for them in the UI builder, so you can manipulate them and add elements to them in the component tree by dragging elements into them. Something you might not realize is that those helper objects are not view groups. They only keep references to the views, so we still keep the view tree very flat. Because they are just references, we can have one view being referenced by multiple helpers, which is very interesting for us. You can think about a helper as a way of keeping a reference to a bunch of views: you still get a flat hierarchy and the helper gives you a way of encapsulating behavior.
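The 1.1 helpers already work this way. A rough sketch, assuming three existing view IDs (button1, button2 and button3 are placeholders):

    <android.support.constraint.Group
        android:id="@+id/controls"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        app:constraint_referenced_ids="button1,button2,button3" />

The Group sits next to the views it references rather than wrapping them, so setting its visibility hides all three buttons while the view tree stays flat.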
In 2.0 we are going to expose those helpers as an API. If you write your own helper in your project, similar to what we do with custom views, it will be able to be presented in the tool. We identified three broad categories of helpers. One category is layout manipulation: helpers that help you create a layout. Another helps you after the layout is done, so post layout. And, because helpers are views or can be views as well, we have helpers that we can use to do some specific rendering. For the virtual layouts there's
a concept that we are really keen on. The idea is that the helper is
going to set the constraint for you. The simplest example I can think
of is a linear helper. If you create a chain in constraint layout 1.x, that lets you do exactly the same thing, but here you manipulate it like a layout, for instance a LinearLayout. The only difference is that you have all the features that chains provide. The way you use them is very easy: it's a normal view. You can constrain the view itself, but the only thing you need here is the list of IDs that it references.
We also have a virtual layout, launching with 2.0, that essentially implements the flexbox layout semantics. Those are pretty useful but at the same time, I think, pretty expected. We came up with additional use
cases. So one example of post layout
object is a fly-in effect. You can reference views with a fly-in decorator and that's all. So no code, it's purely declarative: add it to your file, reference the views, and they will animate in on first launch. We have another object that's very powerful, layers, which encapsulates a lot of things. It has a set of views and applies operations to them; there's a bunch of graphic operations that you can apply and we will do the math. You can also set it up so it
takes by default the views it references and you can set up a
background very easily that way. Another use case: you can use the layer a little bit like you would use layers in graphical photo editors, grouping a bunch of elements; when you are happy with those elements' constraints you can lock that layer and be sure not to modify them. So it's a very powerful helper. Here is a quick example of how it looks by drawing a background, and we set the constraints manually. Just to prove that we have a flat tree, we can apply the operation I'm talking about and everything will just move, but the background stays where it is. Another useful decorator is circular reveal. It does what it says: the typical effect you're familiar with from material design when I press the button. The interesting thing here is that it only applies to the elements that are referenced; the ones that are not referenced, we do not apply it to them. If you have played with circular reveal you know it's a bit tricky to do that. Also interesting: we are not creating our own circular reveal, we are just using the normal circular reveal API. It's a nice way to enhance your layouts. On that same idea we have decorators that are helpers meant to draw things. Same principle: they reference views and you can draw something with them. That opens up interesting possibilities in terms of the type of effect you can get. You may want an effect like that, a lava-lamp effect, and what you see is that the result of the effect depends on the positions of those blobs, but the final rendering has to take all of them into account. That would normally be very difficult to achieve. You can think about it this way: you have a canvas that you paint on, and you have some image views essentially sitting on top of it. They're normal image views, normal views, and you can apply the usual things to them, but we set their backgrounds to be transparent and then use a decorator to draw the background. You can easily get an effect like that. A little slower, there you go. Just to show you the blobs are real views, we still
have our images. Another type of decorator is the
bottom panel decorator. Let's say you have a panel that you created by using a chain, very easy. If you would like to set a background on those objects and maybe change the colors, you can simply apply this decorator. And because we can do a little more interesting stuff here, I can show you what it may look like when you click on those buttons. So, just to give you an idea of the type of effects you can get with decorators: I'm clicking on this, I can press that, I press this and there's a circular reveal. Just to give you an idea of what
you can do. What is very interesting here is that if you look at it in Studio, and if I zoom in on the component tree, you will notice that we mostly have just views and a bunch of helpers. There's actually no code here, so it's a really useful way to separate the visuals from your actual data and your actual application. So to summarize the helpers: you can tag your views with them,
you can encapsulate behavior, and it's all declarative. You have probably been playing with constraint sets on constraint layout 1.1: if you switch between two of those states, you can animate them. That's very nice, but setting up those transitions by hand is a little cumbersome.
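Roughly, the 1.1 approach in code looks like this, a minimal Kotlin sketch assuming a second layout resource that describes the end state (R.id.root and R.layout.state_expanded are hypothetical):

    import android.app.Activity
    import android.support.constraint.ConstraintLayout
    import android.support.constraint.ConstraintSet
    import android.support.transition.TransitionManager

    // Animate a ConstraintLayout from its current constraints to an alternate set.
    fun animateToExpanded(activity: Activity) {
        val root = activity.findViewById<ConstraintLayout>(R.id.root)
        val expanded = ConstraintSet().apply {
            clone(activity, R.layout.state_expanded)   // read constraints from the other layout
        }
        TransitionManager.beginDelayedTransition(root) // schedule the animation
        expanded.applyTo(root)                         // swap in the new constraints
    }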
So in 2.0 we will have a separate file that lets you specify states, because that's basically what they are: I have a layout that has different presentations, different states. The way you use them is declarative; you reference a state by its ID and that's it. One nice thing is that you can specify a region for which a specific constraint set is going to be applied.
And by plugging that into onConfigurationChanged, and I'm showing this on a Chromebook, you'll see that depending on the size of the window it automatically switches layouts. If you want to be a little more fancy, you can plug in a transition manager on the layout change and it will animate the layout transition. With the devices we have now, Chromebooks and so on, that's the type of use case that is going to be more and more useful. So to quickly recap on constraint layout: we have the helpers, the virtual layouts and the decorators, we also have things I haven't talked about, like manipulating constraints directly, and we have motion layout. I'm happy to introduce John.
>>JOHN HOFORD: Thanks. I'm excited to show you guys motion layout. So let's get into it. It starts off with a subclass of constraint layout called motion layout. It's a layout that has all the properties of constraint layout, but the whole trick of animating between two constraint sets is done for you by motion layout. If you have two constraint sets it will switch between them and animate between them by itself. But notice that because it's a
constraint layout, you can actually use the
helpers that were shown there. But it also provides you with
the ability to edit the custom
attributes of the system. So notice the blue light there,
it’s changing because it’s a custom
attribute in the system. So constraint sets now can have
custom attributes, allows to you animate anything. One of the other things that it
does is it allows you to control on
touch directly. So it will manage your touch
events by tracking your velocity of your
finger and matching it to other velocity of views in the system and
naturally give you a smooth transition between them by a
touch. We also support helpers because
it’s a constraint layout. Nicholas just showed they’re
there, too. They can work together and apart.
Now let’s change the transition a little bit. We want to move
the eye into the middle of the screen but if we look closely
there’s a bit of a problem here. You see the arrow clashes with
the eye. We have to fix that. The way we
do that is a feature we call key frames. Essentially, if we have a beginning and an end of a path, we can distort the path by adding a key frame. Once we add the key frame, the path is adjusted so the views avoid each other. Which is pretty cool. So now we have a motion scene: it has constraint sets, on-touch handling, key frames and custom attributes.
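To make that concrete, here is a rough sketch of what a motion scene file can contain; the IDs and numbers are invented, and attribute names may vary slightly between library versions:

    <MotionScene xmlns:android="http://schemas.android.com/apk/res/android"
                 xmlns:motion="http://schemas.android.com/apk/res-auto">
        <Transition
            motion:constraintSetStart="@+id/start"
            motion:constraintSetEnd="@+id/end"
            motion:duration="1000">
            <!-- let a swipe drive the transition -->
            <OnSwipe
                motion:touchAnchorId="@+id/eye"
                motion:dragDirection="dragRight" />
            <!-- distort the path halfway through so views don't collide -->
            <KeyFrameSet>
                <KeyPosition
                    motion:motionTarget="@+id/eye"
                    motion:framePosition="50"
                    motion:keyPositionType="pathRelative"
                    motion:percentY="0.2" />
            </KeyFrameSet>
        </Transition>
        <ConstraintSet android:id="@+id/start">
            <!-- start constraints for each view -->
        </ConstraintSet>
        <ConstraintSet android:id="@+id/end">
            <!-- end constraints for each view -->
        </ConstraintSet>
    </MotionScene>

The MotionLayout in the layout file points at this file with app:layoutDescription.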
With a developer flag turned on, on the device you can see the paths of all the views you've set up, so you can understand what all your views are doing, on device. One of the nice things we support is the nesting of
motion layouts. In this particular example we have two views being animated by a third view. They're both motion layouts and they're being driven by a third motion layout. The top view is a fade to black, but this other one is an animated view that was custom written by Nicholas and is being driven by a motion layout because it implements an onProgress method. So you can drive your own custom animations directly from motion
layout. So one of the things it’s really
good at is synchronizing motion. Any complex sequence of many things moving on the screen, it will handle it, and it will also allow you to interact with it through
touch. So we added one more thing to
the system, something we call cycles. This is the ability to build oscillatory cycles into the key frames. I can take one and have it roll because it's oscillating. Or I can have a different effect: it can bounce along the edge. I got bored and had that one flying around. [Laughter]. Another typical example: you put
a little shake at the end. So now one of the cool things is
we build editors. So we are building an editor for
that thing. We call it the motion editor. It’s part of the design surface. But now you’ll be able to do
those edits directly in the design surface. So kind of to take you through
how you would do that in the design
surface: based on constraint layout, you just create the constraints for your first constraint set, switch to the second, and edit the constraints for the second set. If you hit play it
will show the animation between those two
constraint sets. There’s a check box to enable
showing you the path so now you can see the path. But how would
you add a key frame? Just position the cursor where
you want it, add the type of key frame
that you want to add, and you can just
distort the position by selecting the key frame and
moving it around on this screen. Let’s do that one more time so
you can see it. Just select a position, any
position, create a key frame, and then we
just move it. Here are some much more complex motions, all animated. This has quite a few key frames in it, designed to let you set transparency on some things and move them around. But it all works in the IDE. Thank you. (Applause)
>>So one last thing. You probably are here to try all
those things. So a lot of that is actually in 3.2: all the sample data we showed you, all the current integration with constraint layout 1.1, go try it out. We released canary 14 this Tuesday, so give it a try. The library with motion layout, we are releasing it in a few days hopefully. And the motion editor is an ongoing effort, but we are excited about it. There are codelabs as well for the ChromeOS resizing. We also want to hear from you in general. We are reachable, and for this particular session there's a feedback form you can fill out. We also have an office hour this afternoon after the session. Please come see us.
That completes our talk. Thank you, very much.
(Applause)
>>Thank you for joining this session. Grand ambassadors will assist with helping you move. If you registered for the next session in this room, we ask that you clear the room and enter via the registration line outside. Thank you.
Paging With RecyclerView: Managing Infinite Lists With RecyclerView and Paging
>>Welcome. Please fill in the seats near the front of the room. Thank you.
>>At this time please find your seat. Our session will begin soon.
>>CHRIS CRAIK: Hey, I'm Chris
Craik from the Android framework team.>>YIGIT BOYAR: My name is
Yigit. I also work on the framework team. Today we are going to talk about managing infinite lists with recycler view. Pretty much every app has a list of something it wants to display. Let's see how you would model a list like this. You'll have a view model that keeps the list of items so it survives configuration changes. You put your data into a database so that your application works offline. And of course you have some component from which you pull the data. Usually data is dynamic, it changes. So we change the database layer to return a live data list of things, an observable list, so you can receive updates. For instance, if you insert an updated version of a record into your database, it realizes there is a query observing this table, re-runs the query, forwards the result to the view model, and the UI can pull from the view model and update with the right animations. This looks cool, but there is actually a problem here, something we don't like. To
better understand the issue let’s look at the interaction
between the database and the UI. When I said the database is going to deliver the result, it is going to deliver all of the results. So if you ask for the users ordered by their last name and you have 10,000 users, it's going to create 10,000 user objects and pass them over to the UI. That's very inefficient, because the user only sees 8 to 10 items; why would you create 10,000? So we don't like it. We want to change this.
What are we looking for here? First, we like that convenience: it's so nice to tell the database, give me a live list of users. We want it to handle multiple layers, to be able to bring the data from the server, put it in the database, then into the UI, and this should be very easy to implement. We want it to be fast; we don't want to do any big chunks of work. It should be efficient. We want it to be lifecycle-aware, so if the user is not looking at the screen it shouldn't do any work. And last but not least we want it to be flexible: everyone has a different API and different data structures, and our solution should work with all of those
things. Now, going back to our first example, how do we implement these lists if we don't use paging? If you're using Room, it returns your live list of users. You hold a reference to it and serve it to your UI. In your activity you would use this list with a ListAdapter; if you're not following our releases closely you may not have heard about it yet. ListAdapter is a recycler view adapter that displays a list. So if you have the recycler view on the right, when you submit a list to the adapter, it just displays it. The nice thing is that if you then submit a different list, it's going to calculate the difference between these two lists on a background thread and update the UI with the correct animations. It's available in recyclerview 27.1, and the same functionality exists in AsyncListDiffer. Okay. Let's go back to our
example. We have our adapter. We observe the live data and send the list to the adapter. When we create the adapter we need to give it a callback. It has two functions: one checks whether two items are the same item (identity), and the other checks whether their contents are equal. Once you have that, you can call the getItem function to obtain an item in the list and do whatever you want with it. Super simple.
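A minimal Kotlin sketch of such an adapter, assuming the AndroidX artifacts (the User class and the plain TextView row are placeholders, not code from the talk):

    import android.view.View
    import android.view.ViewGroup
    import android.widget.TextView
    import androidx.recyclerview.widget.DiffUtil
    import androidx.recyclerview.widget.ListAdapter
    import androidx.recyclerview.widget.RecyclerView

    data class User(val id: Long, val name: String)   // hypothetical model

    private val USER_DIFF = object : DiffUtil.ItemCallback<User>() {
        // same item (identity)?
        override fun areItemsTheSame(oldItem: User, newItem: User) = oldItem.id == newItem.id
        // same contents?
        override fun areContentsTheSame(oldItem: User, newItem: User) = oldItem == newItem
    }

    class UserAdapter : ListAdapter<User, UserAdapter.Holder>(USER_DIFF) {
        class Holder(view: View) : RecyclerView.ViewHolder(view)

        override fun onCreateViewHolder(parent: ViewGroup, viewType: Int) =
            Holder(TextView(parent.context))           // trivial row, just for the sketch

        override fun onBindViewHolder(holder: Holder, position: Int) {
            (holder.itemView as TextView).text = getItem(position).name
        }
    }

Calling submitList(newUsers) diffs against the previous list on a background thread and dispatches only the minimal updates with the right animations.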
>>CHRIS CRAIK: We have seen what that looks like, but we also started by saying this is not good enough. Let's see what it looks like using the new paging library. The most important component of the paging library is the paged list. This is a list that loads data in pages, asynchronously. It's backed by a data source that loads pages, swaps them in, and updates the list. So this serves as a replacement
to the list in all the examples we have
done before. We have this view model. Let’s swap out the list with a
page list and go back to our Dao. We want to ask our Dao can you
give me a live data of a page list but
there’s an issue with this. So if we look over at what the data
looks like on the database side, you could imagine paging this in
one way and you could imagine paging this in a different way.
There's an interesting decision here: are you showing large items, or tiny items where you want to load a lot at once? The paging needs to be configurable to serve every app's needs. So instead, Room can produce a data source that backs the paging; it can access the database and load the data directly into the paged list. And because we may need a new one after every invalidation, what we really want is a factory that can create data sources. So instead of a live data of a paged list, what the user DAO provides us here is a data source factory keyed off of Integer, because we
are using positions. So we go back to our view model
here and define how we are going to load this data. It's not much code: we just get the data source factory and use the LivePagedListBuilder class from the library to create a live data. The minimum you need to pass is the page size; here we have 30. Now all the repository changes are done.
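Pieced together, the Room and view model side might look roughly like this Kotlin sketch (the User entity, table and query are assumptions, not code from the talk):

    import androidx.lifecycle.LiveData
    import androidx.lifecycle.ViewModel
    import androidx.paging.DataSource
    import androidx.paging.LivePagedListBuilder
    import androidx.paging.PagedList
    import androidx.room.Dao
    import androidx.room.Query

    @Dao
    interface UserDao {
        // Room generates a positional data source factory behind this query
        @Query("SELECT * FROM user ORDER BY lastName")
        fun usersByLastName(): DataSource.Factory<Int, User>
    }

    class UserViewModel(dao: UserDao) : ViewModel() {
        // 30 is the page size; everything else uses defaults
        val users: LiveData<PagedList<User>> =
            LivePagedListBuilder(dao.usersByLastName(), 30).build()
    }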
Let's go to the activity. We were using a ListAdapter before; you change this into a PagedListAdapter. It's the exact same thing as a ListAdapter, except that internally it loads the content of those paged lists as pages come in. So if we look at how we implement this adapter, we switch over to the new PagedListAdapter and use the exact same diff code. The only thing that changes is that the user object becomes nullable,
and we will get into why in just a little bit.
So let’s go through how you can do further configuration because
we don’t stop at just page size. So the code that we showed
before creating a live page list
builder you can just pass a minimum page
size and that’s the minimum amount of data you need to pass
but you can also create a configuration object where you
can declare more. You might set an initial load
size hint. This suggests the initial load is larger to make
that initial load avoid initial page stretches immediately after
you’ve fetched it. Another thing you can change is pre-Fitch prefetch distance.
You can configure this. You can also control place
You can also control placeholders, which are an important part of this library, so let's start talking about those. Placeholders off is the behavior you might expect: you load a page, the recycler
view has access to the number of items you’ve loaded and you have
a big scroll bar. The user can scroll through it. Once they get to the bottom,
there’s no more data. Once the page loads the scroll
bar jumps. We also support a completely different type of
paging. If you asked for place holders
and this is the default, we will
present the list like this. The scroll bar is smaller and
that’s because we are presenting the entire data set immediately.
As the user scrolls down you’ll see there are unloaded items and
those are presented as nulls in the adapter. As the data eventually loads, these items display and you'll get a
nice animation from that. So let’s talk about place
holders, they don’t work everywhere but we think they’re
useful in a lot of cases. The user can scroll past what is loaded; they don't have to hit a wall at the end just because you don't have any more data yet. The scroll bar looks correct and
you can use fast scrollers very easily because you have the
entire data set presented to the recycler view so you can jump
anywhere. And another nice feature is you don’t have to implement the loading
spinner at the bottom because the users can
see it and know the item is still
loading. It’s important that your items stay the same size.
If you can't guess what the item height is going to be before you have content, the crossfade animation doesn't look great. The adapter has to handle null items. And your data source has to be able to count items. If you're
using something that is loading data from say the
network, your back end may not be able to
provide a precise count. So we have been talking about
live data thus far but as Yigit
introduced we also want to support our RxJava
developers. You may want to produce an Observable, and we provide the class RxPagedListBuilder to do exactly that. You can change the return type here, use an RxPagedListBuilder, and specifically request an Observable or a Flowable out of it. Now let's go under the hood
because we have shown how small of a code change this can be to
change the paging library but how does that work underneath?
Here on the left you have the repository that represents the
data loading portion of your application and then on the
right you have the view model which is how it
communicates to the UI. Inside the repository we want to produce something that will push
updates to the UI. When you call LivePagedListBuilder.build, we create that producer side as well. Once someone starts observing
that live data we will create a new
page list because that’s the way we start passing information
down this pipeline and to do that we create a new data source
out of the factory that it’s able to
produce. To pass this paged list over we don't want to send an empty list; we would like to load data first, and we do that on a background thread: initialize the data and create a paged list out of it. The first paged list we produce has data only at the very front. We can send that to the UI thread, submit it to the adapter, and the adapter can start presenting those items. We might need to load more data
so the page list internally will trigger a data load from its
data source. And append that data directly to
the page list. Now the cross fade animation
occurs because we have signaled the recycler view that these new items are updated. But what happens if the database says this table has been invalidated, something changed? We invalidate the data source, so let's look at what happens in order to start pushing those
updates from the other side. So on the other side of things
we can see the database had an item added to it, that’s why we invalidated
the previous data source and the
producer side is listening to the signal
to say we need a new page list. We can create a new page list to
send over. When we load the initial data we
are careful to initialize that based on the loading position that was
signaled by the adapter. We send that over to the UI.
And now we submit the list again. Because these are two different lists and we really don't want to call notifyDataSetChanged, we compute an asynchronous diff.
Immediately you get a new item showing up and we only have to
do the minimal amount of UI work to bind and show the new item.
So fundamentally under the hood we do a lot of trickery to make
this work but from the outside we try and make it look as close
as possible to a live data of a list because this is a really
nice experience. It lets you keep your UI really simple, avoid all the paging details on that side, and configure and construct your flow in one place: the repository. The paged list adapter lets you handle the new paged lists as they flow in, and the internal updates of each paged list as it loads.
>>YIGIT BOYAR: Let's talk about data sources. Imagine that the paged list is a list implementation that works with a data source, and we have different types of data sources: a positional data source, an item-keyed data source and a page-keyed data source. Let's start with the positional data source. If you have the data locally but the user may want to jump to arbitrary positions, a positional data source is your best option. This is actually what Room uses behind the scenes. If you have a data source like that, you extend the PositionalDataSource class and specify the type of the items. Let's look at an example. Zoom out, go to the bird's-eye view, and we have a data source. The very first time we come to the recycler view, we call loadInitial with the requested start position, a load size, which is usually larger than the page size because you want to have more items at the beginning, and whether placeholders are enabled or not. Our data source will return the data, tell us where the data starts, and give us the total number of items in the data source. Based on that we will start displaying the data, but also display the placeholders, so the size of the paged list is equal to the total number of items in the data source. As soon as the user starts scrolling, it's going to run out of data and call loadRange on the positional data source with the start position of the first missing item and the page size, and the new data gets appended to the list. Now say the user grabs the fast scroller and jumps to a position where we don't have data yet. When this happens we call the data source for that position, load that page, and display it to the user. This is where a positional data source is really good: the user can jump to arbitrary
positions. You never block them. Let’s say something
happened in the database so we will get a new
data source. Each data source represents a snapshot of the
data. So we get this new one, we create a new paged list for it and load the initial page from this data source based on where the user was in the previous one. We do this, as always, on a background thread and calculate the difference
between these two lists and update the recycler view with
the correct animations. Second one is the item keyed
data source. Imagine you have data like a
list with some names. If you look at the page out of
this, you can identify the items before this page by using the
first item in the list and then the next page by using the last
item in the list. Basically every item can
identify a page after or before it.
If your data source is like that, you should implement the ItemKeyedDataSource class. You provide the key type, in this case we are using names so String, and the item type in the list. Let's look at an example. The first time you come to the recycler view we call the loadInitial method with the requested initial key, the requested load size, and whether placeholders are enabled. The data source will only return the data; it's not going to count it, because there are no placeholders. As soon as the user starts scrolling we
will extract a key from the last item we have and call the data source
load after method to load that other page. Similarly as user keeps
scrolling and then loads another page the
same way. Now at this point let’s say the
data is coming from the database and, similarly, something has changed, so we invalidate that data source. We get the new data source, recreate a new paged list, extract the key from one of the items that is currently visible in the recycler view, and load the page from this new data source. When we get that page, we calculate the diff and update the UI with the
correct animations. Now, we don’t have the data on
top anymore so if user tries to
scroll up we need more data. Same thing: we just take the key from the item at the top, call loadBefore, and get that page added to the front of the paged list so the user can scroll upwards. This is always lazy, always
reactive. Third one is the page keyed data
source. This is a really common way of paging especially on the
server-side APIs. Your client sends a request, and when the response comes back from the server it includes the data and also includes pointers, keys, for the next and previous pages. If you have a data source like that you should implement the PageKeyedDataSource class, specifying the type of the keys used as pointers and the item type. Let's look at an example. The first time the user comes, we call loadInitial, give it a size, and ask whether placeholders are enabled or not; we will assume disabled, it's easier. The data source then returns the data but also gives us pointers: the previous page key and the next page key. We start displaying it. If the user scrolls, we need more content, so we are going to call the loadAfter method and use the key that was returned in the previous request to get the next page from the data source. Now this next page comes with its own adjacent page keys; it's like a linked list of pages. We take that, and the user can keep scrolling. Now, the difference with this data source is how we handle invalidation. Hit the refresh button on the screen and we will get a new data source, but in the loadInitial method there's no key anymore. This is because, as I mentioned, it's like a linked list: if the previous list is invalid, the links in it don't mean anything. So we always need to load the very first page again and display it in the UI. This is usually not a problem in practice because you only do this if the user does a swipe-to-refresh, so they're already at the top of the list.
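A skeletal Kotlin version of a page-keyed source might look like this; the Api, Page and Item types are invented stand-ins for whatever your backend returns:

    import androidx.paging.PageKeyedDataSource

    data class Item(val id: String, val title: String)
    data class Page(val items: List<Item>, val previousKey: String?, val nextKey: String?)

    interface Api {   // hypothetical client; the load methods run on a background thread
        fun fetchPage(key: String?, size: Int): Page
    }

    class ItemsDataSource(private val api: Api) : PageKeyedDataSource<String, Item>() {

        override fun loadInitial(
            params: LoadInitialParams<String>,
            callback: LoadInitialCallback<String, Item>
        ) {
            val page = api.fetchPage(key = null, size = params.requestedLoadSize)
            callback.onResult(page.items, page.previousKey, page.nextKey)
        }

        override fun loadAfter(params: LoadParams<String>, callback: LoadCallback<String, Item>) {
            val page = api.fetchPage(key = params.key, size = params.requestedLoadSize)
            callback.onResult(page.items, page.nextKey)       // adjacent (next) page key
        }

        override fun loadBefore(params: LoadParams<String>, callback: LoadCallback<String, Item>) {
            val page = api.fetchPage(key = params.key, size = params.requestedLoadSize)
            callback.onResult(page.items, page.previousKey)   // adjacent (previous) page key
        }
    }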
>>CHRIS CRAIK: You talked through what we might see implementing a data source but
what if we want multiple sources at
once. You can page from your back end that you have with all
the benefits of the local cache. You can have a really nice off
line support, you can resume quickly
if the application has been killed and restarted and you can minimize
network traffic by taking advantage of data already on
device. Let’s look at how this might look compared to what we
were showing before with a single source of data. How does
the network fit into the system? One way that we could do this is
we could say well the network is my source of data when I’m connected and
the database is my source when I’m not. If I’m connected I
page data from the network and if I’m not I page
data from the database. The one problem there is that
you don’t have anything storing data in your database even when
you're loading from the network, but it's pretty easy to load into the database as a side effect. However there's a
couple problems that are important to discuss. This
switching model has the first problem. This connected state
is really over simplifying. In reality individual requests to
your servers can succeed and fail and a user that is connected like 20%, some
of the packets go through, that really doesn’t fit nicely into
this model. The other big problem is we are
not using local data when it’s present. Let’s go about looking
at a different way to do this. So what if instead we just
monitor the database and use that as our local single source of truth. What we can do is say well the
only times I need to load data is when the database tells me
it’s out of data. I can use that as a signal to load more
data from the network, store it into the database, and then I
have my entire solution built. I just load data when I need to
but I can present only the database which makes things a
lot simpler. So we get the benefits of
consistent data presentation, we have a similar process but this degrades on
failures. If your user is in that 20%
connected state you can use all the data you have locally and try and fetch or
retry when the network is around. Potentially you might
say this doesn't keep my data fresh. An easy way to work around that is to say whenever anyone starts observing the data, we kick off a fresh network load. And that's especially important
usually when you have frequently updating data.
So in that proposed model here that we saw we need an out of
data signal from the database because the rest of that we
basically already built in the first few slides. So when we have that signal we
can trigger loads from the network directly into the
database and the UI doesn’t have to enter into any of that.
So paging built exactly this signal for exactly this reason, and we call it the boundary callback. Let's look at what that might
look like. The first most important part of
the boundary callback is that you provide it two different sources of data, the database and the network, because that's its job. The important callback we have here is onItemAtEndLoaded: the last item of the database-backed paged list has been loaded, and if there's more from the network it's time to load it. So the first thing we do in response to this is hop over to the network thread and ask the service: hey, give me more data. In this particular case we are using the item at the end to figure out which data we need more of, because we are in an item-keyed case similar to the item-keyed data source that you saw before. Now if that request is successful, we simply jump over to the database thread, insert that data, and we are basically done.
We connected that signal that we needed and now we added network
to something that was purely
database. It's possible, when the database is being invalidated locally, for multiple at-end signals to trigger. You can protect against this with a simple boolean that says if I'm already loading, don't try again; and then we can reset that flag at the end.
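A rough Kotlin sketch of such a callback; Webservice, UserDao and their insert and fetch calls are stand-ins, not real APIs from the talk:

    import androidx.paging.PagedList
    import java.util.concurrent.Executor

    class UserBoundaryCallback(
        private val webservice: Webservice,     // hypothetical network client
        private val dao: UserDao,               // hypothetical Room DAO with insertAll()
        private val ioExecutor: Executor
    ) : PagedList.BoundaryCallback<User>() {

        @Volatile private var isLoading = false // guard against duplicate at-end signals

        // Called when the item at the end of the database-backed PagedList has been loaded.
        override fun onItemAtEndLoaded(itemAtEnd: User) {
            if (isLoading) return
            isLoading = true
            ioExecutor.execute {
                val more = webservice.usersAfter(itemAtEnd.name)  // hypothetical request
                dao.insertAll(more)   // writing to Room invalidates the data source; UI updates
                isLoading = false
            }
        }
    }

You attach it when building the paged list, for example with LivePagedListBuilder's setBoundaryCallback.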
So using a boundary callback is pretty simple. You can set it on your paged list builder, and that gives you the database-plus-network solution, all isolated in that one callback. Let's recap what the paging
library is. So the paging library provides
paging from database, from network, and importantly from both together, as we just saw,
and it can load that data directly into recycler view. It extends this live data of
lists, this observable list pattern that we
like, because it keeps our UI simple and lets us contain all the complex logic on one side. It's configurable: we have configurable load size, prefetch distance and placeholders. 1.0 was just released, so please try it out and give it a
spin. >>YIGIT BOYAR: Where to go
next? We will provide more details and more samples here.
We have an amazing codelab. You can check it out in the codelab area or you can try it online. We also have samples on GitHub where we implement different data sources, so you can compare and contrast the behavior of different data sources, and you can also see how to handle things like errors and retries. But if you have been sitting here for the last 30 minutes wondering whether these two guys have never heard about cursors, rest assured, we did hear about that. That's what we wanted to use initially when we started this project, because we already have cursor adapters and the cursor takes care of all the work, right? Wrong. It's very unpredictable and inefficient, and your cursor can end up locked, so be very careful. Room and paging together avoid this problem because we create much smaller queries and do not rely on the paging behavior of cursors. There's an amazing blog post about this; we will post the link after the talk. It's amazing. Cursors are not the right way to do paging. We did it wrong.
>>CHRIS CRAIK: Did it wrong ten years ago at least.
>>YIGIT BOYAR: Ten years ago. But it’s not me. Okay. [Laughter] so paging is part of
Jetpack. Our new initiative to accelerate
Android development. We had a great many talks at this I/O; this was the last one. If you were not able to go to all the sessions, check everything on YouTube, it's all recorded. And you can learn more about Jetpack on our website. Thank you. (Applause)
>>Thank you for joining this session. Grand ambassadors will
assist with directing you through the designated exits.
We will be making room for those who registered for the next
session. If you registered for the next session in this room we
ask that you please clear the room and return via the
registration line outside. Thank you.
Distributed TensorFlow Training
>>Welcome. Please fill in the seats near the front of the room. Thank you.
>>At this time please find your seat. Our session will begin soon.
>>Hello, everyone. Welcome. What a busy last few days this
has been. Thank you for being with us
until the very end. My name is Priya.
>>ANJALI SRIDHAR: And I’m Anjali.
>>PRIYA GUPTA: We are on the TensorFlow team. We are so excited to be here
today to tell you about distributed TensorFlow training.
Let me grab the clicker. Okay. Hopefully most of you know what
TensorFlow is. It’s an open source Machine Learning framework used extensively both
inside and outside Google. For example if you’re trying the
smart compose feature that was launched a couple days ago, that
feature uses TensorFlow. TensorFlow allows you to build,
train and predict using neural networks such as this. In training we learn the parameters of the network using data. Training complex neural networks with large amounts of data can often take a long time. In the graph you can see the training time on the X axis and the accuracy of predictions on the Y axis. As you can see, it took more
than 80 hours to get to 75% accuracy. If you have some
experience running complex Machine Learning models this
might sound rather familiar to you and it might make you feel something
like this. If your training takes only a
few minutes to a few hours you’ll be productive and happy
and you can try out new ideas faster. When it starts to take a few
days, maybe you can manage and run a few things in parallel.
When it starts to take a few weeks your progress will slow down and
it becomes expensive to try out every new idea. And when it starts to take more
than a month I think it’s not even worth thinking about. And this is not an exaggeration. Training complex models can take
up to a week on a single but powerful
GPU like a Tesla P100. So a natural question to ask is
how can we make training fast? There are a number of things you
can try. You can use the faster
accelerators such as a TPU. I’m sure you’ve heard all about them
in the last couple of days here. Your input pipeline might be the
bottleneck; there are a number of guidelines on the TensorFlow website that you can follow to improve the performance of your input pipeline. In this talk we will focus on distributed training, that is, running training in parallel on
multiple devices such as CPUs, GPUs, or TPUs in
order to make your training faster.
With the techniques that we will talk about in this talk, you can
bring down your training time from weeks to
hours with just a few lines of code
and a few powerful GPUs. In the graph you can see the images per
second processed while training an image recognition model. As
you can see as we increase the number of GPUs from 1 to 4 to 8,
the images per second processed can
almost double every time. We will come back to these
performance numbers later. Before we get into the details
of scaling in TensorFlow, first I want to cover a few high level concepts
and architectures in distributed training. This will give us a strong
foundation to understand the various solutions. As you’re focused on training
today, let’s take a look at what a typical training loop looks
like. Let’s say you have a simple model like this with a couple of hidden
layers; each layer has a bunch of weights and biases, also called trainable variables. A training step begins with processing the input data. We then feed this input into the model and compute the predictions in the forward pass. We then compare the predictions with the input labels and compute the loss. Then in the backward pass we compute the gradients, and finally we update the model's parameters using the gradients. This is known as one training step, and it repeats until you reach the
desired accuracy. Let’s say you begin your training with a
simple machine under your desk with a multi-core CPU. Luckily TensorFlow handles
scaling for you automatically. Next you may speed up by adding
an accelerator to your machine, but with distributed training you can go
even further. You can go from one machine with
a single device to one machine with multiple devices and finally to
multiple machines with possibly multiple devices each connected
over the network. With a number of techniques
eventually it’s possible to scale to
hundreds of devices and that’s what we do in a lot of Google systems. In the rest
of this talk we will use the terms device, worker or
accelerator to refer to processing units such as GPUs or
TPUs. So how does distributed training
work? There are a number of ways to go
about when you think about distributing your training.
What approach you pick depends on the size of your model, the
amount of training data you have, and the available devices. The most
common architecture and distributed training is what is
known as data parallelism. In data parallelism we run the same
model and computation on each worker but with a different
slice of the input data. Each device computes the loss and gradient, uses gradients to
updates the model parameters. And the updated model is used in
the next round of computation. There are two common approaches
when you think about how to update the model using these gradients. The first approach is what is
known as async parameter server approach. In this approach we designate
some devices as parameter servers as
shown in blue here. These servers hold the
parameters of the model. Others are designated as
workers, as shown in green here. Workers do the bulk of the
computation. Each worker fetches the parameters from the parameter server, it
then computes the loss and gradients, sends the gradients
back to the parameter server which then updates the models
parameters using these gradients.
Each worker does this independently so this allows us
to scale this approach to a large number of workers. This
has worked well for many models in Google where training workers
might be preempted by high priority production jobs or where there’s asymmetry
between the workers or where machines might go down for
regular maintenance. And all of this doesn’t hurt the scaling
because the workers are not waiting on each other.
The down side of this approach, however, is that workers can get out of
sync, they’re computing their gradients on stale parameter values and this
can delay convergence. The second approach is what is
known as synchronous allreduce. This has become more common with
the rise of fast accelerators such
as TPUs or GPUs. In this approach each worker has a copy
of parameters on its own, there are no special parameter
servers. Each worker computes the loss
and gradients based on a subset of
training samples. Once gradients are computed the
workers communicate among themselves to propagate the
gradients and update the model parameters. All the workers are synchronized
which means the next round of computation doesn’t begin until
each worker has received the updated gradients and updated its copy of the model. When you have fast devices in a controlled environment, the variance of step time between the different workers can be small. Combined with strong communication links between the devices, the overall overhead of synchronization can be small. Whenever practical, this approach can lead to faster convergence. A class of algorithms called
allreduce can be used to efficiently
combine the gradients across the different workers. By adding them up and then
copying them to the different workers, it’s a
fused algorithm that can be very
efficient. There are many allreduce
algorithms available depending on the time of communication
available between the different workers. One common algorithm is what is
known as ring allreduce. The workers are arranged in a ring; each worker sends gradients to its successor and receives gradients from its predecessor. At the end of the algorithm each worker has received a copy of the combined gradients. Ring allreduce uses network bandwidth optimally. It can also overlap the gradient computation at the lower layers of the network with the transmission of gradients at the higher layers, which means it can further reduce the training time. Ring allreduce is just one approach. We have a team in Google working on fast implementations of allreduce for various device topologies. The bottom line is allreduce can
be fast when working with multiple devices on a single
machine or a small number of machines. So given these two architectures
and data parallelisms you may be wondering which approach should
you pick. There isn't one right answer. The parameter server approach is preferable if you have a large number of not-so-powerful or not-so-reliable machines, for example a large cluster of machines with just CPUs. The synchronous allreduce approach is preferable if you have fast devices with strong communication links between them. The parameter server approach has been around for a while and is supported well. TPUs use the allreduce approach out of the box.
In the next section of this talk we will show you how you can
scale your training using the allreduce approach on
multiple GPUs with just a few lines of code.
Before I get into that I just want to mention another type of
distributed training known as model parallelism that you may
have heard of. A simple way to think about
model parallelism is when your model
is so big that it doesn’t fit in the memory of one device so you
divide the model into smaller parts and through those
computations on different workers with the same training
samples. For example, you could put different layers of your model on
different devices. These however most devices have big
enough memory that most models can fit in their memory. So in
the rest of this talk we will continue to focus on data
parallelism. Let’s see how you can do this
TensorFlow. As I already mentioned, we are going to focus
on multiple GPUs with the allreduce architecture. I’m
pleased to introduce the new distribution strategy API. This allows you to distribute
your training in TensorFlow with very little modification to your
code. The distribution strategy API
you no longer need to place your ops or
parameters on specific devices. You don’t need to worry about
structuring your modeling in a way that the gradients or losses across
devices are aggregated correctly. Distribution
strategy does that for you. It is easy to use and fast to
train. Now let’s look at some code to
see how you can use this API. In our example we are going to be using TensorFlow's high-level API called Estimator. If you've used this API before, you might be familiar with the following snippet of code to create a custom estimator. It requires three arguments. The first one is a function that defines your model: the parameters of your model, how you compute the loss and the gradients, and how you update the model's parameters. The second argument is the directory where you want to persist the state of your model. And the third argument is a configuration called RunConfig, where you can specify various options; in this case we used the default RunConfig. Once you create the estimator, you can start your training by calling the train method with the input function that provides your training data. Given this code to do the training on one device, how can you change it to run on multiple GPUs? You simply need to add one line of code: create something called MirroredStrategy and pass it to the RunConfig. That's all the code change you need to scale this code to multiple GPUs.
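In the TF 1.x Estimator API shown in the talk, the change is roughly this Python sketch (model_fn, input_fn and the model directory are whatever you already had; the contrib module has moved in later TensorFlow releases):

    import tensorflow as tf

    # The one added line: a strategy that mirrors variables across local GPUs.
    distribution = tf.contrib.distribute.MirroredStrategy()
    config = tf.estimator.RunConfig(train_distribute=distribution)

    estimator = tf.estimator.Estimator(
        model_fn=model_fn,        # your existing model function
        model_dir="/tmp/resnet",  # where checkpoints and summaries are persisted
        config=config)

    estimator.train(input_fn=input_fn)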
Mirrored strategy is a type of distribution strategy, the API I just mentioned. With this API you don't need to make any changes to your model function, your input function or your training loop. You don't need
to specify your devices. If you want to run on all available
devices it will automatically detect that and run your
training on all available GPUs. So that’s it. Those are all the
code changes you need. This API is available and you
can try it out today. Let me quickly talk about what
mirror strategy does. Mirrored strategy implements the
synchronous allreduce architecture that we talked
about out of the box for you. In mirrored strategy the model’s
parameters are mirrored across devices hence the name mirrored
strategy. Each device computes the loss and gradients based on a subset of the input data. The gradients are aggregated
across the workers using an allreduce algorithm that is
appropriate for your device topology.
As I already mentioned with mirrored strategy you don’t need
to make any changes to your model or your training loop. This is because we have changed the underlying components of TensorFlow to be distribution-aware, for example the optimizer, summaries, et cetera. You don't need to make any changes to your input function as long as you're using the recommended TensorFlow Dataset API. You can save a checkpoint with one distribution strategy, or none, and resume with another. And summaries work as expected as well, so you can continue to visualize your training in TensorBoard.
I’ll now hand it off to Anjali to show you cool demos and performance
numbers. (Applause). >>ANJALI SRIDHAR: Thanks,
Priya, for the great introduction to mirrored
strategy. Before we run the demo, let’s get familiar with a
few configurations. I'm going to be running the ResNet-50 model from the TensorFlow model garden. It uses skip connections for better gradient flow. The TensorFlow model garden is a repo with a collection of models; if you're new to TensorFlow, this is a great resource to start with. I'm going to be using the ImageNet data set as input to model training. It's a collection of over a million images that have been categorized into a thousand labels. I'm going to instantiate an n1-standard instance with Tesla V100 GPUs. Let's run the demo now. As I mentioned, I'm creating an n1-standard instance and attaching eight NVIDIA Tesla V100 GPUs. To run our TensorFlow model we
need to install a few drivers and
packages and here is a gist with all the
commands required. I’m going to make this public so
you can set up an instance yourself and try running the
model. Let’s open an SSH connection to
the instance by clicking on a button
here. This should bring up a terminal like this. I’ve already cloned the garden
model repo. We are going to be running this command inside the resnet directory. We are going to run the imagenet_main file. The model directory points to the bucket, we point our data directory to the SSD disk that has the ImageNet data set, the batch size is 1024, or 128 per GPU, and the number of GPUs is eight, which is what we want to train our model on. So let's run this model now.
Let’s take a look at some of the code changes that are involved to
change the ResNet model function. This is the ResNet main function in the model garden repo. First we instantiate the strategy and pass it to the run config. We create an estimator object with the run config, and those are all the code changes you need to distribute the ResNet model. Let’s go
back and see how our training is going. So we have run for a few hundred
steps. At the bottom of the screen you can see the metrics. Let’s look at TensorBoard. This is from a run where I’ve run the model for 90,000 steps.
The orange and red lines are the training and evaluation losses. So as the number of steps increases you see the loss decreasing. Let’s look at evaluation accuracy, and this is when we are training ResNet-50 over eight GPUs. At around 91,000 steps we are able
to achieve a 75% accuracy. Let’s see what this looks like when we train on a single GPU, so let’s toggle the TensorBoard buttons on the left and look at the loss curves. The blue lines are one GPU and the red and orange are eight. You can see the loss doesn’t decrease as rapidly as it does with eight GPUs. Here are the evaluation accuracy curves.
Let’s compare using wall time. So we run the same model for the same amount of time. And when we run it over multiple GPUs we are able to achieve higher accuracy faster, or train our model faster. Let’s look at a few performance benchmarks on the DGX-1. It’s
a machine on which we run deep learning models. We are running mixed precision training with a batch size of 256. It also has V100 GPUs. The graph shows the number of GPUs on the X axis and images per second on the Y axis. As we go from one GPU to eight we are able to achieve a speedup of 7x. And this is
performance right out of the box with no tuning. We are actively
working on improving performance so that you’re able to achieve more speed up and get
more images per second when you distribute your model across
multiple GPUs. So far we have been talking
about the core part of model training and distributing your
model using mirrored strategy. Let’s say now you deployed your
model on multiple GPUs. You’re going to expect to see the same kind of boost in images per second when you do that, but you may not be able to see as many images per second compared to one GPU. You may not see the boost in performance, and the reason for that is often the input pipeline. When you run your model on a
single GPU the input pipeline is
preprocessing the data and making it available for
training. But GPUs process and compute
data much faster than a CPU. This means that when you
distribute your model across multiple GPUs, the input
pipeline is often not able to keep up with the training. It quickly becomes a bottleneck. For the rest of the talk I’m going to show you how TensorFlow makes it easy for you to use tf.data to build efficient input pipelines. Here is a simple input pipeline
for ResNet-50. We are going to use the tf.data APIs. When you have lots of data and different data formats and you want to perform complex transformations on this data, you want to be using the tf.data APIs to build your pipeline. First we get the list of input files that contain your images and labels. Then we are going to read these files using the TFRecordDataset reader. We are going to shuffle the records, repeat them a few times depending on whether you want to run your model for a couple of epochs, apply a map transformation that parses and preprocesses the records, and finally batch the input into a batch size that you desire.
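A rough sketch of that pipeline in TF 1.x-style code (the file pattern, feature names, and parse function are hypothetical placeholders; the real ResNet-50 input pipeline in the model garden is more involved):

```python
import tensorflow as tf

NUM_EPOCHS = 2     # hypothetical values for illustration
BATCH_SIZE = 128

def parse_and_preprocess(serialized_example):
    # Hypothetical parser: decode the Example proto and the JPEG image.
    features = tf.parse_single_example(
        serialized_example,
        {"image/encoded": tf.FixedLenFeature([], tf.string),
         "image/class/label": tf.FixedLenFeature([], tf.int64)})
    image = tf.image.decode_jpeg(features["image/encoded"], channels=3)
    image = tf.image.resize_images(image, [224, 224])
    return image, features["image/class/label"]

files = tf.data.Dataset.list_files("gs://my-bucket/imagenet/train-*")  # hypothetical path
dataset = files.flat_map(tf.data.TFRecordDataset)  # read the TFRecord files
dataset = dataset.shuffle(buffer_size=10000)       # shuffle the records
dataset = dataset.repeat(NUM_EPOCHS)               # repeat for a few epochs
dataset = dataset.map(parse_and_preprocess)        # decode and preprocess
dataset = dataset.batch(BATCH_SIZE)                # batch to the desired size
```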
The input pipeline can be thought of as an ETL process: extract, transform and load. In the extract phase we are reading from storage. In the transform phase we are
applying the different transformations.
And finally in the load phase we are providing this processed data
to the accelerator for training. So how does this apply to the
example that we just saw? In the extract phase we list the files and read them using the TFRecordDataset reader. In the transform phase we apply the shuffle, repeat, map and batch transformations. And finally in the load phase we tell TensorFlow how to grab the data from the dataset. This is what our input pipeline
looks like. We have the extract, transform and load phases happening sequentially. This means when the CPU is busy preprocessing the data the accelerator is idle, and when the accelerator is training the model the CPU is idle. But the different phases of the ETL process use different hardware resources: the preprocessing happens on the CPU while the training happens on the
accelerator. So if we can parallelize these different phases, then we can
overlap the preprocessing of data on the CPU with training of the model on the GPU. This is called pipelining. So we can use pipelining and some parallelization techniques to build more efficient input pipelines.
Let’s look at a few of these techniques. First you can parallelize file
reading. Let’s say you have a lot of data across a cloud storage service. You want to read multiple files in parallel, and you can do this using the num_parallel_reads argument. This allows you to increase your effective read throughput. We can also parallelize the map
transformations: you can run the different transformations in the map function in parallel by using the num_parallel_calls argument. Typically you would set this argument to the number of cores of your CPU. And finally you want to call prefetch at the end of
your pipeline. You can buffer data for the next training step
while the accelerator is still training the current step.
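Applying those three techniques to the earlier sketch might look like the following (again a sketch: NUM_CPU_CORES and the path are placeholders, it reuses parse_and_preprocess, NUM_EPOCHS and BATCH_SIZE from the previous snippet, and num_parallel_reads on TFRecordDataset may require a newer 1.x release than the one current at this talk):

```python
import tensorflow as tf

NUM_CPU_CORES = 8  # hypothetical; typically the number of CPU cores available

files = tf.data.Dataset.list_files("gs://my-bucket/imagenet/train-*")  # hypothetical path
dataset = tf.data.TFRecordDataset(files, num_parallel_reads=32)  # parallel file reading
dataset = dataset.shuffle(buffer_size=10000)
dataset = dataset.repeat(NUM_EPOCHS)
dataset = dataset.map(parse_and_preprocess,
                      num_parallel_calls=NUM_CPU_CORES)          # parallel map calls
dataset = dataset.batch(BATCH_SIZE)
dataset = dataset.prefetch(buffer_size=1)  # prepare step N+1 while the accelerator trains step N
```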
This is what we had before, and this is the improvement we can get. Here the different phases of the input pipeline are happening in parallel with training. We are able to see the CPU is preprocessing data for step two while the accelerator is still training step one. The training time per step is now the maximum of the preprocessing time and the training time on the accelerator. As
you can see the accelerator is still not 100% utilized. There are a few advanced techniques that improve this. We can use fused transformation ops for some of these API calls. Shuffle and repeat, for example, can be replaced by the fused shuffle_and_repeat op. This parallelizes buffering the elements for epoch N plus one with producing the elements of epoch N. We can also replace map and batch with the equivalent fused map_and_batch op. This parallelizes the map transformation with adding the
input tensors into a batch. With these techniques we are able to process data much faster, make it available to the accelerator for training, and improve the training speed.
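In 1.x-era TensorFlow these fused ops lived in tf.contrib.data and were applied with dataset.apply; roughly as follows (a sketch reusing the names from the previous snippets; the ops later moved to tf.data.experimental):

```python
import tensorflow as tf

files = tf.data.Dataset.list_files("gs://my-bucket/imagenet/train-*")  # hypothetical path
dataset = tf.data.TFRecordDataset(files, num_parallel_reads=32)

# Fused shuffle + repeat: buffering elements for the next epoch overlaps with
# producing elements of the current one.
dataset = dataset.apply(
    tf.contrib.data.shuffle_and_repeat(buffer_size=10000, count=NUM_EPOCHS))

# Fused map + batch: the map transformation overlaps with adding the resulting
# tensors to the batch.
dataset = dataset.apply(
    tf.contrib.data.map_and_batch(parse_and_preprocess, batch_size=BATCH_SIZE))

dataset = dataset.prefetch(buffer_size=1)
```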
I hope this gives you a good idea of how you can use the tf.data APIs to build
efficient input pipelines when you train your model.
So far we have been talking about training on a single
machine and multiple devices. What if you wanted to train on
multiple machines? You can use the Estimator’s train_and_evaluate API. Train_and_evaluate uses the async parameter server approach. This API is used widely within
Google and it scales well to a large
number of machines. Here is a link to the API where you can
learn more on how to use it.
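A minimal sketch of that API, assuming the estimator from the earlier snippet and train/eval input functions defined elsewhere; on multiple machines each process typically learns its role in the cluster from the TF_CONFIG environment variable:

```python
import tensorflow as tf

# Sketch only: estimator, train_input_fn and eval_input_fn are assumed to be
# defined elsewhere (for example, the Estimator created earlier).
train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=100000)
eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn)

# Each worker and parameter server runs this same call; TF_CONFIG tells it
# which role to play in the cluster.
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
```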
We are also excited to be working on a number of new distribution strategies. We are working on a multi-machine mirrored strategy. We are also working on adding
distribution strategy support to TPUs and directly in tf.Keras. In
this talk we talked about the different concepts related to
distributed training architectures and APIs. When you go home today here are
three things to keep in mind when you train your model.
Distribute your training to make it faster. To do this, you want to use
distribution strategy APIs. They’re easy to use and fast. Input pipeline performance is important, so use the tf.data APIs to build efficient input pipelines. Here are a few TensorFlow
resources. First we have the distribution strategy API. You can try using mirrored
strategy to train your model across multiple
GPUs. Here is a link to the ResNet-50 model garden example so you can try running this example. It has mirrored strategy API support enabled. Here is also a link to the input pipeline performance guide which has more techniques you can use to build efficient pipelines. And here is the link to the gist that I mentioned in the demo. You can try setting up your own instance and running the ResNet-50 model garden example. So this is a combined effort
involving a lot of folks on the TensorFlow team and we are
really excited to be heading in this direction. We are happy to
take questions offline near the community meetup spot. Thank you for attending our talk and we hope you had a great I/O. (Applause) (Session concluded)
