-
wlach
hey all!
-
wlach
pmoore: thanks for setting this up
-
wlach
I have some thoughts on this, I was wondering if we might kick things off by starting a discussion in mozilla.tools?
-
pmoore
wlach: you're welcome! thanks for joining =)
-
pmoore
wlach: sure
-
wlach
ok, I'll write something up today :)
-
pmoore
wlach: note eventually there should be a dedicated mailing list, but it totally makes sense to start a thread in mozilla.tools to get discussion started. see
bug 1281017
-
wlach
I just noticed that you had already posted to m.tools.treeherder
-
wlach
(and m.tools.taskcluster)
-
wlach
too many tools mailing lists :)
-
pmoore
:)
-
pmoore
yes, i tried to pick the mailing lists for the teams inside platform operations - i'm not sure if i got them all
-
pmoore
from my sketchy memory of the teams that were presented in the platform operations all hands meeting we had
-
pmoore
wlach: i'm curious what your thoughts were
-
pmoore
wlach: so for taskcluster products, we have schemas like these to define the APIs:
references.taskcluster.net/queue/v1/api.json (just an example)
-
wlach
pmoore: interesting, is that a standard?
-
pmoore
wlach: that refers to schemas for request/response payloads (e.g.
schemas.taskcluster.net/queue/v1/create-task-request.json)
-
pmoore
-
wlach
so one thing from the treeherder end of things is that we already have an ecosystem for generating documentation and browsable api's, mostly with django-rest-framework
-
pmoore
wlach: no, it is a bit custom home-grown. there are some competing standards like json hyperschema, swagger, etc - but we have our own which has entities around things like taskcluster authorization scopes
-
wlach
I'm not really inclined to want to spend time recreating that, though I think generating something like what you show should be pretty easy
-
pmoore
wlach: sure
-
wlach
and since treeherder is switching to taskcluster authentication
-
wlach
... likely the same sorts of concerns you have for your API's apply to ours
-
pmoore
right - it would be nice to share standards, to cut down on the work of all teams going slightly different ways
-
AutomatedTester
pmoore: I would love to see some kind of RFC model, like the Rust group use, for proposing APIs and things
-
pmoore
and also to have a homogeneous view of all the apis across teams
-
pmoore
AutomatedTester: ah interesting, do you have a link to the rfc model they use?
-
AutomatedTester
-
pmoore
cool, thanks
-
AutomatedTester
pmoore: it's kinda like the python PEP model
-
AutomatedTester
that way we can build up from a core
-
AutomatedTester
and then make changes and people can comment
-
pmoore
wlach: so i don't necessarily propose we do everything the taskcluster-way - it was more about just getting together, seeing what we are all doing, finding common ground and working out how we can move in a direction where we intend to adopt some mechanism so moving forwards we can expose apis in a shared ecosystem of docs, clients, etc
-
pmoore
AutomatedTester: that sounds like a good approach, i will read up on it
-
wlach
I'm not sure if such a heavyweight process would be right for internal tools. python and rust have the rfc process because they have so many users
-
AutomatedTester
wlach: we can make it as heavy weight or lightweight as we want
-
AutomatedTester
but being open and a good process for comment is good
-
wlach
well no arguments that communication is good :) but what would you like to see above and beyond people announcing changes in mozilla.tools or where-ever, which is more or less what happens now?
-
AutomatedTester
wlach: yes I would like it to be before we announce
-
AutomatedTester
that it is done
-
wlach
that seems fair, at least for features/api's that people are using
-
wlach
with treeherder we had some internal api's which we recently removed, but no one noticed because they weren't using them
-
AutomatedTester
wlach: my concern is that people will pick a model that suits the engineer coding them without any input from other people who would consume them
-
AutomatedTester
wlach: which is what we have now
-
AutomatedTester
EngProd has a history of creating stuff without input before or during the process
-
AutomatedTester
and then people go "That's not really what I wanted"
-
AutomatedTester
Treeherder is a great example of this
-
wlach
oh don't I know that :)
-
AutomatedTester
so I would love a process, RFC/PEP style or whatever, where people can comment
-
AutomatedTester
if they don't, well, that's then their problem
-
AutomatedTester
pmoore: I do have some concerns about auto-generated tools from schemas
-
wlach
I think I have observed this problem more at the level of projects than api's
-
wlach
but obv the api is a reflection of the project, so if there are requirements/design problems there, then that will be reflected in the api's as well
-
AutomatedTester
wlach: exactly that
-
AutomatedTester
wlach: Platform Ops has the brightest people I have ever worked with yet we make so many stupid mistakes
-
AutomatedTester
whatever process we decide it should help limit our stupid mistakes :)
-
pmoore
AutomatedTester: tell me about your concerns
-
wlach
yeah I think for larger initiatives/features/projects some kind of RFC process would make sense
-
pmoore
(i also have concerns, i'm just curious to get them all out in the open) :)
-
wlach
I just don't want to have to write a 500 word RFC to remove a useless API that no one uses in treeherder :)
-
AutomatedTester
pmoore: so... from experience, the APIs might not be idiomatic to the language that they generate for
-
wlach
(although I'm happy to write a post on mozilla.tools letting people know I'm going to do that)
-
AutomatedTester
pmoore: more often than not they create God Objects
-
pmoore
wlach: i noticed in the rust rfc guide it says "Many changes, including bug fixes and documentation improvements can be implemented and reviewed via the normal GitHub pull request workflow. Some changes though are "substantial", and we ask that these be put through a bit of a design process and produce a consensus among the Rust community and the sub-teams."
-
pmoore
wlach: so maybe it can just be that we propose something, and people can flag it as substantial if they have concerns, and if nobody flags it, we can proceed with a lightweight process?
-
wlach
pmoore: I think I would trust people to make the right call for their own projects... if we have a persistent problem with people not asking for feedback on larger changes we can always revisit
-
AutomatedTester
pmoore: take Bugsy as example (
bugsy.readthedocs.io/en/stable) I think the search API would be horrendous if it wasn't a fluent API and I don't think autogenerated clients would notice that (see God Object argument again)
-
AutomatedTester
but hand generating tools also has a cost...
-
AutomatedTester
it's a balance
-
pmoore
AutomatedTester: agreed. i guess the interface of any auto-generated client is always going to encapsulate the API calls, rather than an abstract object view of the system - e.g. you are not going to be sending and receiving objects which have useful methods associated with them
-
pmoore
it will always be classes for calling API points, and getting back objects/types which represent the received payload
-
pmoore
however, maybe that is still useful for consistency, and more object-oriented clients can be built on top, if wanted? at least to get a solid and consistent treatment of the API (backoff settings, retry mechanics, correct encoding / parsing etc)
-
AutomatedTester
pmoore: sure
-
AutomatedTester
autogenerate the primitives and then have handcrafted APIs for doing the heavy lifting
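
The split described above (generated primitives, handcrafted layer on top) might look roughly like the sketch below. It is only an illustration: the `REFERENCE` dictionary is a hypothetical, heavily simplified stand-in for an API reference file, and `GeneratedClient`, `Tasks`, `is_finished` and `fake_transport` are made-up names, not part of any taskcluster client. The transport is injected so the sketch runs without a network.

```python
from typing import Any, Callable, Dict

# Hypothetical, simplified API reference. Real reference files carry more
# detail (scopes, schemas, descriptions); the URL is a placeholder.
REFERENCE = {
    "baseUrl": "https://queue.example.invalid/v1",
    "entries": [
        {"name": "getTask", "method": "get", "route": "/task/<taskId>"},
        {"name": "cancelTask", "method": "post", "route": "/task/<taskId>/cancel"},
    ],
}

# A transport takes (HTTP method, URL) and returns the parsed response.
Transport = Callable[[str, str], Dict[str, Any]]


class GeneratedClient:
    """Dumb client: one method per reference entry, nothing more."""

    def __init__(self, reference: Dict[str, Any], transport: Transport):
        self._base = reference["baseUrl"]
        self._transport = transport
        for entry in reference["entries"]:
            setattr(self, entry["name"], self._make_method(entry))

    def _make_method(self, entry: Dict[str, str]):
        def call(**params: Any) -> Dict[str, Any]:
            # Substitute <param> placeholders in the route.
            route = entry["route"]
            for key, value in params.items():
                route = route.replace("<%s>" % key, str(value))
            return self._transport(entry["method"].upper(), self._base + route)

        call.__name__ = entry["name"]
        return call


class Tasks:
    """Handcrafted layer: a more 'human' view built on the primitives."""

    def __init__(self, client: GeneratedClient):
        self._client = client

    def is_finished(self, task_id: str) -> bool:
        task = self._client.getTask(taskId=task_id)
        return task.get("state") in ("completed", "failed")


def fake_transport(method: str, url: str) -> Dict[str, Any]:
    # Stand-in for real HTTP; the "state" field is invented for the demo.
    return {"method": method, "url": url, "state": "completed"}


client = GeneratedClient(REFERENCE, fake_transport)
print(client.getTask(taskId="XYZ")["url"])
print(Tasks(client).is_finished("XYZ"))
```

The generated class knows nothing about what a task means; anything like `is_finished` lives in the handwritten layer and can change without touching the generator.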
-
pmoore
i think we probably have quite a long path until we get to that point though - i'd propose we start by at least working out what features of an API we'd like to capture and how we could present those APIs in a useful fashion
-
pmoore
AutomatedTester: have you had a chance to take a look at the generated clients that taskcluster provides? (just as a starting point for discussion)
docs.taskcluster.net/manual/tools/clients
-
pmoore
(i only talk about taskcluster, as it is the system i know about - i'm not familiar with the other APIs we have, but am keen to find out)
-
pmoore
so, for example, here are the go libraries generated from those same schemas:
godoc.org/github.com/taskcluster/taskcluster-client-go
-
pmoore
obviously, the generated libraries can only be as good as the references and schemas they were generated from :)
-
AutomatedTester
pmoore: so the api.method(....) stuff makes it that nothing is discoverable
-
AutomatedTester
while that is great for low-level stuff, for a human to use it they need to know the transport layer stuff
-
AutomatedTester
which imo is not something people should ever care about
-
AutomatedTester
I want X so I should call tc.get_x()
-
AutomatedTester
whatever happens below that should not matter to me
-
AutomatedTester
if TC decide to change the endpoint, I shouldn't have to update my code
-
AutomatedTester
at worst I get the latest version of my hand crafted API
-
pmoore
AutomatedTester: i'm not sure i entirely agree - so for example, the APIs are the interface, rather than the client methods - i.e. no team should just change the method, since that defines the agreed interface
-
pmoore
AutomatedTester: this way, the interface is clearly bounded, it is well described, and can be called from any language etc (whereas if the interface is the client, it forces the user to use client side code of ours)
-
pmoore
AutomatedTester: i think this model has worked well for service providers with much bigger APIs than ours (e.g. aws) - so I think it can work to document the main services with a list of endpoints, and the user to find the right endpoint
-
pmoore
AutomatedTester: e.g. needing to know the command "git", the subcommands i can pass to it, and the options of those subcommands is a bit like needing to know the queue service, its individual API endpoints, and the options for those API endpoints...
-
pmoore
AutomatedTester: e.g. a command line client might support a call like `mws taskcluster queue get-task --task-id XYZ`
-
pmoore
or `mws treeherder submit-job --payload '{.....}'`
-
pmoore
i think that could be a reasonable interface
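
A command line shaped like the two examples above could be sketched with nested subcommands, as below. This is an assumption-laden illustration: `mws` does not exist as far as this log shows, and the parser layout (service, then component, then call) is only one guess at how such a tool might be organised.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical `mws` command line client."""
    parser = argparse.ArgumentParser(prog="mws")
    services = parser.add_subparsers(dest="service", required=True)

    # mws taskcluster queue get-task --task-id XYZ
    taskcluster = services.add_parser("taskcluster")
    components = taskcluster.add_subparsers(dest="component", required=True)
    queue = components.add_parser("queue")
    queue_calls = queue.add_subparsers(dest="call", required=True)
    get_task = queue_calls.add_parser("get-task")
    get_task.add_argument("--task-id", required=True)

    # mws treeherder submit-job --payload '{.....}'
    treeherder = services.add_parser("treeherder")
    th_calls = treeherder.add_subparsers(dest="call", required=True)
    submit_job = th_calls.add_parser("submit-job")
    submit_job.add_argument("--payload", required=True)

    return parser


args = build_parser().parse_args(
    ["taskcluster", "queue", "get-task", "--task-id", "XYZ"]
)
print(args.service, args.component, args.call, args.task_id)
# prints: taskcluster queue get-task XYZ
```

In a generated tool the subcommands would presumably come from the API references rather than being written out by hand as they are here.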
-
AutomatedTester
pmoore: so you're saying that people would rather api.method("{username:foo, password:bar}", arg1=login, arg2=to, arg3=tc) instead of tc.login(foo, bar) ?
-
pmoore
no
-
AutomatedTester
AWS's python client hides the transport stuff and gives you boto.connect_s3()
-
pmoore
so my example was for a command line client above, rather than a language library, but we can talk about that too if you like
-
pmoore
i see the authentication as being managed via any client, just like you get a ~/.aws/config which can store credentials, so aws calls don't need the username, password
-
AutomatedTester
that's not my point
-
AutomatedTester
my point is, people will take primitives and make something "human" on top
-
AutomatedTester
e.g. get_running_instances() over api.method(arg1=get, arg2=running, arg3=instances)
-
pmoore
but didn't we agree on that? ("autogenerate the primitives and then have handcrafted APIs for doing the heavy lifting")
-
AutomatedTester
so not sure what your disagreement is with them
-
AutomatedTester
then*
-
pmoore
AutomatedTester: so i wanted to talk about what was possible in terms of client generation, but what i disagreed with is that a supported interface should be a client library - i think the supported interface should be the APIs. if people want to write handcrafted APIs on top, whether inside or outside the team, i think the APIs should be the agreed interface
-
pmoore
sorry, handcrafted libraries
-
AutomatedTester
ahhh right
-
pmoore
i think realistically, people should care about transport, since knowing what is going on under the hood has an impact on performance and cost
-
AutomatedTester
experience tells me they won't care
-
AutomatedTester
and tbqh... a good API should mean they shouldn't have to care
-
pmoore
so i think to focus on what we agree on, it may be useful to generate primitives, and we'd certainly all like to have generated docs
-
AutomatedTester
👍🏼
-
pmoore
:)
-
AutomatedTester
no disagreement there :)
-
pmoore
cool
-
jgraham
I'm curious which part of the treeherder API is "not really what people wanted"
-
jgraham
I mean plenty of the internals obviously are
-
» jgraham doesn't really understand why autogeneration of HTTP APIs is so useful
-
jgraham
I mean HTTP is already quite a usable protocol
-
jgraham
*autogeneration of wrappers around HTTP APIs
-
dustin
I'm curious why we're so focused on autogenerating clients
-
wlach
I put some thoughts into m.tools newsgroup
-
wlach
hopefully not derailing this effort :)
-
wlach
dustin: I wonder if it may just have been the most controversial part of the proposal
-
pmoore
jgraham: for example, type safety in statically typed languages, so you get compile time errors if you try to call a method that doesn't exist, or pass it a type it doesn't expect
-
pmoore
jgraham: another might be consistent handling of authentication, encoding, http backoff strategy, etc
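
As one illustration of the "consistent backoff strategy" mentioned above, a shared retry helper might look like the sketch below. The names `call_with_backoff` and `TransientError` and all the default values are assumptions made for the example; nothing here is taken from an existing client.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a 5xx reply)."""


def call_with_backoff(request, retries=5, base_delay=0.1, max_delay=30.0,
                      sleep=time.sleep):
    """Call request(), retrying transient failures with exponential backoff.

    The delay doubles on each attempt, is capped at max_delay, and is
    scaled by a random factor so many clients do not retry in lockstep.
    """
    for attempt in range(retries + 1):
        try:
            return request()
        except TransientError:
            if attempt == retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))


# Usage: a request that fails twice and then succeeds.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("temporarily unavailable")
    return "ok"


# sleep is replaced with a no-op so the demo finishes immediately.
print(call_with_backoff(flaky_request, sleep=lambda seconds: None))
```

Whether such a helper is generated or handwritten, the point is that every client calls the same one, so retry behaviour is the same across services.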
-
jgraham
That doesn't sound like generated libraries, it sounds like handwritten utilities for the most part
-
jonasfj
the auto-generated clients are "stupid" clients. Similar to aws-sdk with exception of boto... all they do is hide HTTP, ensure retries, auth, encoding, and in some languages depending on schema friendliness facilitate type checking...
-
jonasfj
IMO, boto (not boto3) is an example of how smart client libraries trying to wrap everything in objects can make REST apis hard to use :)
-
jonasfj
(because you never know when you're making an API call, or what REST call you are making)
-
jonasfj
anyways, agree generated clients need not be the focus... the stuff we would want to generate is just dumb REST api wrappers... useful, but nothing fancy or complicated... (and yes, sometimes you'll write a handwritten library on top of that too, depending on the use-case and the REST api)
-
jonasfj
final note, before I disappear again: converting taskcluster reference files to swagger or json hyper schema is likely possible... if someone would bother to figure out how either of those work :)
-
jonasfj
I tried, but turns out I was too lazy... maybe one day I'll find the energy to properly understand them... and figure out how to turn them into dumb client libraries and docs..
-
ekyle
pmoore: do you have a problem statement? Maybe something specific? I am at a loss of how a meta api makes life easier.
-
wlach
I imagine the high-level vision is to provide something like "Amazon S3 for Mozilla Platform Services"
-
wlach
(which I ++ approve of)
-
ekyle
wlach: I am not sure what "Amazon S3 for Mozilla Platform Services" is either. I assume it is not just storage.
-
wlach
ekyle: yeah I would think it would include visualization and querying... I think the idea is that for adding new stuff, or querying existing stuff, developers should be able to self-serve without needing to ask us for permission (or to do extra work)
-
ekyle
Even the most trivial type system for a meta API will be enormous work. The important part of an API is how the various calls interact with each other: The overall model it uses, which is best described in english, or some other human language.
-
wlach
maybe it would be better to define this in terms of user stories
-
ekyle
wlach: I agree, a unified permission system is a good start
-
wlach
certainly with treeherder people have needed a lot of handholding to get new systems reporting to it, it would be nice if that weren't required
-
ekyle
wlach: I agree discoverability is a problem we should solve
-
ekyle
wlach: is the hand-holding a problem of explaining the API, or a permissions/security model (like issuing apikeys)
-
wlach
ekyle: probably more so explaining, but the permissions/security model is also a pain
-
wlach
it would be pretty neat if submitters to treeherder just used taskcluster credentials
-
ekyle
wlach: yes, that would be nice to unify the permission models
-
ekyle
AWS has a nice generic model that can be managed by a central agent (one that does not know the details of a "resource") and other systems can rely on it to manage permissions
-
jonasfj
taskcluster-auth pretty much is an attempt at that... but without replicating the IAM policies that AWS has, which nobody can reason about.
-
wlach
actually, even better would be if people could (mostly) just submit stuff to taskcluster, and didn't even have to worry about treeherder
-
jonasfj
that is pretty much what taskcluster-treeherder ensures...
-
ekyle
jonasfj: you have a model for taskcluster-auth ? I would like to see an entity relationship graph.
-
ekyle
(or db schema)
-
jonasfj
not sure why db schema would be interesting (we have no db, but nosql)
-
jonasfj
at high level, it's Clients(clientId, accessToken, scopes)
-
jonasfj
Roles(role, scopes)
-
jonasfj
with a rule that the scope: "assume:<role>" expands to all the scopes for a role..
-
ekyle
jonasfj: are "scopes" the same as aws "resources"?
-
jonasfj
and the general rule for scopes that "<anything>*" matches any scope starting with "<anything>" (ie. '*' expands, but only if at the end of the scope string)
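
The two rules described so far (a trailing "*" matches by prefix, and "assume:<role>" expands to that role's scopes) can be sketched in a few lines. This is a simplified reading of the description in this log, not the taskcluster-auth implementation; the function names and the example roles are made up, and it assumes a held "assume:*" expands every role because of the prefix rule.

```python
def satisfies(held, required):
    """True if the held scope grants the required scope.

    A trailing '*' on the held scope matches any scope with that prefix;
    a '*' anywhere else has no special meaning.
    """
    if held.endswith("*"):
        return required.startswith(held[:-1])
    return held == required


def expand(scopes, roles):
    """Expand 'assume:<role>' scopes until no new scopes appear.

    roles maps a role name to the list of scopes that role carries.
    """
    expanded = set(scopes)
    changed = True
    while changed:
        changed = False
        for role, role_scopes in roles.items():
            if any(satisfies(s, "assume:" + role) for s in expanded):
                new = set(role_scopes) - expanded
                if new:
                    expanded |= new
                    changed = True
    return expanded


# Hypothetical roles, for illustration only.
roles = {
    "tester": ["queue:create-task:test/*"],
    "admin": ["assume:tester", "queue:cancel-task:*"],
}

print(satisfies("queue:cancel-task:*", "queue:cancel-task:XYZ"))  # True
print(sorted(expand({"assume:admin"}, roles)))
```

The loop runs until nothing changes because one role can itself carry an "assume:" scope for another role, as "admin" does here.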
-
jonasfj
I guess sort of...
-
jonasfj
you can't make any conditions on them or anything like that...
-
jonasfj
in aws you can make all sorts of powerful things like conditions, regexps etc...
-
jonasfj
-
jonasfj
as a permission model it's really simple... just sets of strings
-
jonasfj
with "*" at the end of a string having special meaning...
-
jonasfj
a permission set is expressed as a set of strings... and you can easily do the subset operation... which you can't do for IAM policies
-
jonasfj
ie. given two IAM policies A and B, there is no way to tell if A > B or A < B or if they are incomparable...
-
jonasfj
with TC scopes that's easy to test... this is super powerful
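
The comparison described above can be sketched directly: scope set A covers scope set B if every scope in B is granted by some scope in A. The code below is a minimal illustration under the prefix rule as described in this log; `satisfies`, `covers` and `compare` are made-up names, and role expansion is assumed to have been applied already.

```python
def satisfies(held, required):
    """True if the held scope grants the required scope (trailing '*' = prefix)."""
    if held.endswith("*"):
        return required.startswith(held[:-1])
    return held == required


def covers(a, b):
    """True if scope set a grants every scope in scope set b."""
    return all(any(satisfies(held, required) for held in a) for required in b)


def compare(a, b):
    """Classify two scope sets as equal, ordered, or incomparable."""
    a_covers_b = covers(a, b)
    b_covers_a = covers(b, a)
    if a_covers_b and b_covers_a:
        return "equal"
    if a_covers_b:
        return "A > B"
    if b_covers_a:
        return "A < B"
    return "incomparable"


print(compare({"queue:*"}, {"queue:cancel-task:XYZ"}))        # A > B
print(compare({"queue:get-task"}, {"queue:*"}))               # A < B
print(compare({"queue:get-task"}, {"queue:cancel-task:*"}))   # incomparable
```

The same `satisfies` check works when the required scope itself ends in "*": "queue:*" covers "queue:cancel-task:*" because the latter starts with "queue:".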
-
ekyle
jonasfj: looks like TC has one type of conditional (prefix matching), and scopes only have one property: "name", i guess?
-
jonasfj
yeah
-
jonasfj
scopes are just strings... they aren't registered anywhere...
-
jonasfj
well
-
jonasfj
we register which scopes a user has... but that's it...
-
jonasfj
for example I can require a scope: queue:cancel-task:<taskId>
-
ekyle
jonasfj: thank you for including what *can not* be done in your slideshow. That helped.
-
jonasfj
yeah, once you understand it it's very simple... limitations are imo bearable...