Nameko and multiprocessing

Hi folks,

We are currently using nameko in my team, and we are happy with it for the
most part.

We would like to launch heavy computations from nameko services, and those
computations need to run in parallel. I'm not talking about the nameko
entrypoint itself, but about an entrypoint that calls code which needs to run
on several processes.

As of today, we are using Python's multiprocessing:
https://docs.python.org/2/library/multiprocessing.html.

We tried to wrap our code into a nameko service, and everything went south
when using the library, namely because it doesn't play nicely with eventlet.

Has anyone ever used multiprocessing with a nameko service? Or is there a
preferred way to run computations in parallel with nameko (with or without
multiprocessing)? Of course, we have already tried using greenthreads for
our computations. As they produce a lot of write operations on a
database, performance was poor, even worse than what we achieve
without multiprocessing at all.

Thanks in advance,

Manfred

Hi Manfred,

I've not tested it, but I would not expect Eventlet and Multiprocessing to
play well together.

You should not need multiprocessing though. Nameko will happily run many
concurrent entrypoints. The concurrency is controlled by the `max_workers`
config key. The default value is 10 so if you've not changed it, it's
probably restricting throughput.

A limitation is that a service's concurrent workers all run within the same
Python process (unlike multiprocessing, which runs jobs in separate Python
processes). Because of the GIL, a single Python process effectively executes
Python code on one core at a time, so on a multi-core machine multiprocessing
will perform much better than a single Nameko service (if the work is
CPU-bound). You can work around this by simply running multiple instances of
your service.

Matt.

One simple way we achieve this is to have nameko dispatch events which are
handled by another nameko service, or by the same service that dispatched the
event. Might be a naive way to solve the problem, but it kinda fits the
solution pattern in general and the code base becomes quite neat :)

Running a multiprocessing pool inside a Nameko process is super-hairy (I used
concurrent.futures).

You end up with all sorts of ghost services, that you can no longer manage
via Supervisord (which is what I use), and you have to kill them manually.
I spent a decent amount of time wondering why my workers were returning
results from outdated code!

My fix was to switch to multithreading, and use that to marshal a
collection of Nameko processes. In essence, I have a top-level Nameko
process, which then spawns additional processes. With this, I can
distribute the processing load across hosts, as well as processors.

Geoff
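
A rough illustration of the thread-based fan-out described above, using only the standard library. This is not Geoff's code: `call_worker` is a hypothetical placeholder for a blocking call to a separate worker process (for example an RPC call to another Nameko service), and `fan_out` and `max_threads` are names invented for this sketch. Threads are a reasonable fit here because each one spends most of its time waiting for a remote worker rather than computing.

```python
from concurrent.futures import ThreadPoolExecutor


def call_worker(chunk):
    """Hypothetical stand-in for a blocking call to a separate worker process.

    In a real deployment this would send the chunk to another service and
    wait for the reply. Here it squares the numbers locally so that the
    sketch runs on its own.
    """
    return [n * n for n in chunk]


def fan_out(chunks, max_threads=4):
    """Send each chunk to a worker from its own thread, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() preserves the order of the inputs; list() waits for all calls.
        return list(pool.map(call_worker, chunks))


if __name__ == "__main__":
    print(fan_out([[1, 2], [3], [4, 5, 6]]))
```

The heavy computation itself still has to happen in other processes for this to use more than one core; the threads only coordinate the calls.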

Thank you both for your replies.

@Johan I'm not sure I understand your solution correctly. I would like to
use multiprocessing _after_ the nameko worker has been spawned. As nameko
workers are spawned in process-bound greenthreads, you can use
multiprocessing in those.

I don't see why using an event entrypoint would change any of this?

Cheers,

Manfred

--
You received this message because you are subscribed to the Google Groups
"nameko-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to nameko-dev+unsubscribe@googlegroups.com.
To post to this group, send email to nameko-dev@googlegroups.com.
To view this discussion on the web, visit
https://groups.google.com/d/msgid/nameko-dev/847e605a-8cb6-417f-a985-35ff751c228d%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
Manfred Chebli
manfred.chebli@forcity.io
<https://www.linkedin.com/company/forcity> <https://twitter.com/forcity_co> <http://www.forcity.com>

Hi Matt,

Sorry for the late reply.

We still have the same solution: we use nameko services as an entrypoint, and
then delegate the work to another process, which can spawn new Python
processes instead of greenthreads. It's not ideal, and we are looking to
improve this part in the future, but we have not taken the time to reflect on
it yet.
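
A minimal sketch of that kind of delegation, using only the standard library. It is an illustration under assumptions, not the code used in production: `WORKER_SOURCE` and `run_in_child_processes` are hypothetical names, the worker is inlined as a string only to keep the example self-contained, and the "heavy" job is a toy sum of squares. How `subprocess` behaves under eventlet monkey-patching is not covered here and would need checking in a real service.

```python
import json
import subprocess
import sys

# Hypothetical CPU-bound job, run by each child interpreter. It reads a JSON
# list of integers on stdin and writes the sum of their squares to stdout.
WORKER_SOURCE = """
import json, sys
numbers = json.load(sys.stdin)
json.dump(sum(n * n for n in numbers), sys.stdout)
"""


def run_in_child_processes(chunks):
    """Start one child interpreter per chunk and return results in order."""
    children = []
    for chunk in chunks:
        child = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        # Hand the chunk over, then close stdin so the child sees EOF.
        child.stdin.write(json.dumps(chunk))
        child.stdin.close()
        children.append(child)

    # All children are running by now; collect their output one by one.
    results = []
    for child in children:
        output = child.stdout.read()
        child.stdout.close()
        if child.wait() != 0:
            raise RuntimeError("worker exited with an error")
        results.append(json.loads(output))
    return results


if __name__ == "__main__":
    print(run_in_child_processes([[1, 2, 3], [4, 5]]))
```

Each child is a fresh interpreter with its own GIL, so the chunks can run on different cores. The body of an entrypoint would call something like `run_in_child_processes` and return or store the results.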

> Presumably the reason for using multiprocessing was to escape the
> limitations of the GIL, and use more than one core. I think what Johan was
> suggesting was to ditch multiprocessing altogether -- if you distribute the
> work using events, then each one can be picked up by a separate service
> instance (running in a separate process).

This could be a good solution, but it will need some refactoring here and
there in our code before we can implement it. I'm not sure the refactoring
can be avoided, so I'll keep this idea in mind.

Thanks for your help on this matter.

Manfred

Manfred,

Did you come to a solution for this in the end?

Presumably the reason for using multiprocessing was to escape the
limitations of the GIL, and use more than one core. I think what Johan was
suggesting was to ditch multiprocessing altogether -- if you distribute the
work using events, then each one can be picked up by a separate service
instance (running in a separate process).

Matt.
