monolithic and microservice services

On Friday, August 19, 2016 at 4:37:48 AM UTC+8, kodonnell wrote:

I suspect this may be out of scope, but here's a possibly cool (possibly
dumb) idea, which I figure there's no harm in floating. (Sorry if I'm using
the wrong terminology below.)

Some services really make sense as remote microservices (e.g. long-running
tasks with small messages), while others make more sense as monolithic
"local" functions (e.g. quick tasks with very large messages). I basically
think it would be cool (and possibly useful) if nameko could figure out
which is best in each case. For example:

- to run "remotely", it uses a normal RPC call
- to run locally, the service class is just imported and used directly
(maybe incorporating multiprocessing somehow)
- nameko does some smart inference (e.g. based on message size etc.), as
well as measurement of past behaviour (e.g. maybe hello_world runs 100x
quicker locally), and chooses whichever is best. Alternatively, the user can
specify how they want it run.

I (naively?) don't think it's actually that hard to do, at least in a hacky
sense. The local/remote split is already supported, apart from
un-decorating the @rpc'd methods for local use. The rest would just be
adding tweaks like recording timings for service calls, plus the inference
logic. That said, doing it in a nice, robust, generic, user-friendly way
will probably require a lot more thought.
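
To make the hacky version a bit more concrete, here's roughly the sort of
thing I have in mind from the caller's side. Everything below is invented
for illustration (`HelloService`, `smart_call` and the size threshold are
all made up); the only real nameko piece is the standalone
`ClusterRpcProxy`:

```python
import sys

from nameko.standalone.rpc import ClusterRpcProxy

from my_services import HelloService  # hypothetical module holding the service class

CONFIG = {"AMQP_URI": "amqp://guest:guest@localhost"}
SIZE_THRESHOLD = 1024 * 1024  # arbitrary cut-off for "very large messages"


def smart_call(method_name, payload):
    """Crude local-vs-remote dispatch: big payloads stay in-process."""
    if sys.getsizeof(payload) > SIZE_THRESHOLD:
        # "local": call the method directly on the imported class,
        # skipping serialisation and the broker entirely
        return getattr(HelloService(), method_name)(payload)

    # "remote": a normal nameko RPC call over AMQP
    with ClusterRpcProxy(CONFIG) as cluster:
        return getattr(cluster.hello_service, method_name)(payload)
```

The inference bit would then just be a matter of swapping that crude size
check for something driven by recorded timings.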

On Friday, August 19, 2016 at 3:01:02 PM UTC+12, Matt Yule-Bennett wrote:

I think I understand the rationale: sometimes you offload processing to
another machine, and other times you do it in-process? This is
traditionally more in the realm of a tool like celery than of nameko.

There are many reasons to adopt a microservice architecture, and they’re
almost all structural:

* Services can be developed by different teams
* Services can be deployed independently and run on different hardware
* Services should be internally cohesive and externally decoupled

Most of these would make something like what you've suggested impossible to
implement, or would otherwise be at odds with it.

Having said that, it is possible to pretty much reimplement celery using
nameko. You could fairly easily write a service that decides on the fly
whether to process a request in-process or offload it to a peer via async
RPC. This would just be an application built with nameko, though, rather
than any new features or structural changes.
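
For example, a sketch of the sort of service I mean, where the size rule
and the `do_work` helper are placeholders and everything else is existing
nameko RPC machinery:

```python
from nameko.rpc import rpc, RpcProxy


def do_work(payload):
    # the actual computation, shared by both paths
    return len(payload)


class WorkerService:
    name = "worker"

    # an RPC proxy pointing back at this same service cluster, i.e. its peers
    peers = RpcProxy("worker")

    @rpc
    def handle(self, payload):
        if len(payload) > 1024 * 1024:
            # very large message: cheaper to handle it right here than to
            # serialise it again and ship it over the broker
            return do_work(payload)
        # otherwise offload it to a peer instance via async RPC
        reply = self.peers.process.call_async(payload)
        return reply.result()

    @rpc
    def process(self, payload):
        return do_work(payload)
```

Each instance can handle a request itself or push it back onto the cluster
for any peer (possibly itself) to pick up.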


Yup, correct understanding.

Performance was a big consideration regarding microservices within our
team: partly making sure the normal bits aren't much slower (due to
messaging etc.), but also being able to get better performance at
bottlenecks when we need to, simply by spinning up more instances of a
service. Also, part of our motivation was to get away from celery, with
which we have had a somewhat stormy relationship.

I'm quite possibly wrong, but the 'manager service' doesn't seem to do
things 'in process': at best it still has to
serialize/transmit/deserialize the message to the manager service, then
serialize/transmit/deserialize it again to the actual service, run the
service, and repeat all of that in reverse. I was thinking of literally
running the service method in-process on the object already in memory
(like normal monolithic Python), skipping all the middle-men. Does that
make sense? I'm also not sure how it'd be at odds with the first point
you've listed, or necessarily the last two: the services are still
decoupled and deployed independently, but if they also exist in-process
(for whatever reason, e.g. someone keeps their services in a single code
base), then there's the option to run them in-process.

Agreed, it wouldn't be useful in the format you've suggested, but if users
could simply use e.g. `UberRpcProxy` instead of `ClusterRpcProxy`, then it
might be quite handy. (As an aside, is there e.g. a `FakeRpcProxy` which
lets users run services entirely in-process? It could be handy for testing,
or for those who don't have RabbitMQ installed, etc.)
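
Related to that aside: nameko's `worker_factory` helper in
`nameko.testing.services` looks like it gets part of the way there, if I'm
reading it right. It builds a service instance entirely in-process, with
its dependencies replaced by mocks, so you can call the methods directly
with no RabbitMQ involved. A throwaway example:

```python
from nameko.rpc import rpc, RpcProxy
from nameko.testing.services import worker_factory


class HelloService:
    name = "hello_service"

    other = RpcProxy("other_service")  # a dependency, just to show it gets mocked

    @rpc
    def hello_world(self, name):
        return "hello, {}".format(name)


# build an instance entirely in-process; dependencies like `other` are
# replaced with MagicMock objects, so no RabbitMQ is needed
service = worker_factory(HelloService)
assert service.hello_world("nameko") == "hello, nameko"
```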

Anyway, it's just an idea - if it's not useful to a significant number of
others, then best to leave it out.
