Nameko Web Server Issues

Hi everyone,

I have been using Nameko in production for the last 10 months now and I've
suddenly run into some weird issues for my http endpoints. Out of all the
calls made to any of the http endpoints in the services, about 50% of the
requests end up giving a 404.

Can someone please point me to the right direction as to what I might be
missing.

I have tried increasing max_workers without any effect.

I am attaching the typical behaviour. The requesting server is set to retry
as a result it does get through in 2-3 tries. But I am unable to figure out
the cause for this error in the first place.

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000279

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 200 195
0.003991

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000294

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200 195
0.003918

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200 195
0.003909

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000287

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000244

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000234

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000227

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200 195
0.003943

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200 195
0.003900

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000288

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 200 195
0.005321

[13/Dec/2017 10:27:49] "POST /api/v1/analytics/device_gps HTTP/1.1" 200 195
0.003912

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000278

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200 195
0.004134

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000276

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000232

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000243

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404 361
0.000292

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200 195
0.003832

Another instance where I am seeing this behaviour:

[13/Dec/2017 07:22:47] "POST /api/v1/analytics/write_distance HTTP/1.1" 200
198 0.007928

[13/Dec/2017 07:26:01] "POST /api/v1/analytics/write_distance HTTP/1.1" 200
198 0.007584

[13/Dec/2017 07:27:12] "POST /api/v1/analytics/write_distance HTTP/1.1" 404
361 0.000307

[13/Dec/2017 07:28:44] "POST /api/v1/analytics/write_distance HTTP/1.1" 200
198 0.007077

[13/Dec/2017 07:31:08] "POST /api/v1/analytics/write_distance HTTP/1.1" 404
361 0.000284

[13/Dec/2017 07:32:43] "POST /api/v1/analytics/write_distance HTTP/1.1" 200
198 0.007327

[13/Dec/2017 07:33:49] "POST /api/v1/analytics/write_distance HTTP/1.1" 404
361 0.000291

[13/Dec/2017 07:36:53] "POST /api/v1/analytics/write_distance HTTP/1.1" 404
361 0.000303

[13/Dec/2017 07:38:15] "POST /api/v1/analytics/write_distance HTTP/1.1" 404
361 0.000247

[13/Dec/2017 07:39:16] "POST /api/v1/analytics/write_distance HTTP/1.1" 404
361 0.000289

[13/Dec/2017 07:42:37] "POST /api/v1/analytics/write_distance HTTP/1.1" 404
361 0.000322

[13/Dec/2017 07:43:20] "POST /api/v1/analytics/write_distance HTTP/1.1" 200
198 0.007246

I am running Nameko 2.6.0, Python 2.7.12
Please let me know if any additional info is needed.

PS: All the rpc related entrypoints are working absolutely fine for me

Thanks,
Someshwar

Hi Someshwar,

There's not enough info here to know what's going on. Can you post your
service code and more details about how you're hosting and running them?

Are these logs from the same nameko service instance, or a cluster? (i.e.
does a single instance alternate between 200 and 404?)

Matt.

···

On Wednesday, December 13, 2017 at 10:48:17 AM UTC, some...@vahanalytics.com wrote:

Hi everyone,

I have been using Nameko in production for the last 10 months now and I've
suddenly run into some weird issues for my http endpoints. Out of all the
calls made to any of the http endpoints in the services, about 50% of the
requests end up giving a 404.

Can someone please point me to the right direction as to what I might be
missing.

I have tried increasing max_workers without any effect.

I am attaching the typical behaviour. The requesting server is set to
retry as a result it does get through in 2-3 tries. But I am unable to
figure out the cause for this error in the first place.

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000279

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003991

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000294

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003918

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003909

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000287

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000244

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000234

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000227

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003943

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003900

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000288

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.005321

[13/Dec/2017 10:27:49] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003912

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000278

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.004134

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000276

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000232

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000243

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000292

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003832

Another instance where I am seeing this behaviour:

[13/Dec/2017 07:22:47] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007928

[13/Dec/2017 07:26:01] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007584

[13/Dec/2017 07:27:12] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000307

[13/Dec/2017 07:28:44] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007077

[13/Dec/2017 07:31:08] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000284

[13/Dec/2017 07:32:43] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007327

[13/Dec/2017 07:33:49] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000291

[13/Dec/2017 07:36:53] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000303

[13/Dec/2017 07:38:15] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000247

[13/Dec/2017 07:39:16] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000289

[13/Dec/2017 07:42:37] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000322

[13/Dec/2017 07:43:20] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007246

I am running Nameko 2.6.0, Python 2.7.12
Please let me know if any additional info is needed.

PS: All the rpc related entrypoints are working absolutely fine for me

Thanks,
Someshwar

Hi Matt,

Thanks for getting back.

This issue is happening to only one of my projects

Now, I have 4 services in this file called service.py and I'm not running
it with a ServiceContainer or ServiceRunner. Just 4 classes inside a file.

This is only 1 service instance and it is alternating between 200 and 404
for the same. I am using Supervisor to run the service with the usual:

exec ./analytics_env/bin/nameko run --config ./config.yml service

Surprisingly enough, I removed one of the http entrypoints that I could
move somewhere else and the problems seems to have been mitigated. Now, I
am consistently getting 200s for the requests.

My followup comments:
1) The above seems to make no sense.
2) There is no chance that there was an error in that function
3) The function was calling another api in return internally and processing
a few things (I am using this function as sort of a router to hit other
services further along my stack).
4) I do not believe that having another api call inside this http
entrypoint could be the issue because I have other entrypoints that are
doing so without any issues.

Even though it seems to suddenly have been resolved, I want to understand
if I should continue to have http api endpoints with nameko or should I use
a more robust platform like nodejs/flask to cater to apis while nameko
works well with rpc services. (I already do most of my api servings from a
nodejs server anyways).

Thanks,
Someshwar

···

On Wednesday, December 13, 2017 at 5:29:25 PM UTC+5:30, Matt Yule-Bennett wrote:

Hi Someshwar,

There's not enough info here to know what's going on. Can you post your
service code and more details about how you're hosting and running them?

Are these logs from the same nameko service instance, or a cluster? (i.e.
does a single instance alternate between 200 and 404?)

Matt.

On Wednesday, December 13, 2017 at 10:48:17 AM UTC, > some...@vahanalytics.com wrote:

Hi everyone,

I have been using Nameko in production for the last 10 months now and
I've suddenly run into some weird issues for my http endpoints. Out of all
the calls made to any of the http endpoints in the services, about 50% of
the requests end up giving a 404.

Can someone please point me to the right direction as to what I might be
missing.

I have tried increasing max_workers without any effect.

I am attaching the typical behaviour. The requesting server is set to
retry as a result it does get through in 2-3 tries. But I am unable to
figure out the cause for this error in the first place.

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000279

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003991

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000294

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003918

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003909

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000287

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000244

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000234

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000227

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003943

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003900

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000288

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.005321

[13/Dec/2017 10:27:49] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003912

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000278

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.004134

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000276

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000232

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000243

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000292

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003832

Another instance where I am seeing this behaviour:

[13/Dec/2017 07:22:47] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007928

[13/Dec/2017 07:26:01] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007584

[13/Dec/2017 07:27:12] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000307

[13/Dec/2017 07:28:44] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007077

[13/Dec/2017 07:31:08] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000284

[13/Dec/2017 07:32:43] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007327

[13/Dec/2017 07:33:49] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000291

[13/Dec/2017 07:36:53] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000303

[13/Dec/2017 07:38:15] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000247

[13/Dec/2017 07:39:16] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000289

[13/Dec/2017 07:42:37] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000322

[13/Dec/2017 07:43:20] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007246

I am running Nameko 2.6.0, Python 2.7.12
Please let me know if any additional info is needed.

PS: All the rpc related entrypoints are working absolutely fine for me

Thanks,
Someshwar

Well that is curious. There is probably something funny going on with the
four services sharing the same werkzeug WebServer (SharedExtensions are
shared per-process, not per-service). If you're able to share the code
(privately) I will try to figure out what was going on.

Re: Nameko HTTP vs NodeJS, I think it depends on your use-case. There are
some things that you will get for free by using Nameko, such as the
built-in RPC proxies to talk to other services and event handlers. Having
said that even at my workplace our client-facing APIs are in NodeJS.

···

On Wednesday, December 13, 2017 at 12:57:39 PM UTC, some...@vahanalytics.com wrote:

Hi Matt,

Thanks for getting back.

This issue is happening to only one of my projects

Now, I have 4 services in this file called service.py and I'm not running
it with a ServiceContainer or ServiceRunner. Just 4 classes inside a file.

This is only 1 service instance and it is alternating between 200 and 404
for the same. I am using Supervisor to run the service with the usual:

exec ./analytics_env/bin/nameko run --config ./config.yml service

Surprisingly enough, I removed one of the http entrypoints that I could
move somewhere else and the problems seems to have been mitigated. Now, I
am consistently getting 200s for the requests.

My followup comments:
1) The above seems to make no sense.
2) There is no chance that there was an error in that function
3) The function was calling another api in return internally and
processing a few things (I am using this function as sort of a router to
hit other services further along my stack).
4) I do not believe that having another api call inside this http
entrypoint could be the issue because I have other entrypoints that are
doing so without any issues.

Even though it seems to suddenly have been resolved, I want to understand
if I should continue to have http api endpoints with nameko or should I use
a more robust platform like nodejs/flask to cater to apis while nameko
works well with rpc services. (I already do most of my api servings from a
nodejs server anyways).

Thanks,
Someshwar

On Wednesday, December 13, 2017 at 5:29:25 PM UTC+5:30, Matt Yule-Bennett > wrote:

Hi Someshwar,

There's not enough info here to know what's going on. Can you post your
service code and more details about how you're hosting and running them?

Are these logs from the same nameko service instance, or a cluster? (i.e.
does a single instance alternate between 200 and 404?)

Matt.

On Wednesday, December 13, 2017 at 10:48:17 AM UTC, >> some...@vahanalytics.com wrote:

Hi everyone,

I have been using Nameko in production for the last 10 months now and
I've suddenly run into some weird issues for my http endpoints. Out of all
the calls made to any of the http endpoints in the services, about 50% of
the requests end up giving a 404.

Can someone please point me to the right direction as to what I might be
missing.

I have tried increasing max_workers without any effect.

I am attaching the typical behaviour. The requesting server is set to
retry as a result it does get through in 2-3 tries. But I am unable to
figure out the cause for this error in the first place.

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000279

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003991

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000294

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003918

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003909

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000287

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000244

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000234

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000227

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003943

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003900

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000288

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.005321

[13/Dec/2017 10:27:49] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003912

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000278

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.004134

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000276

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000232

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000243

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000292

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003832

Another instance where I am seeing this behaviour:

[13/Dec/2017 07:22:47] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007928

[13/Dec/2017 07:26:01] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007584

[13/Dec/2017 07:27:12] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000307

[13/Dec/2017 07:28:44] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007077

[13/Dec/2017 07:31:08] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000284

[13/Dec/2017 07:32:43] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007327

[13/Dec/2017 07:33:49] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000291

[13/Dec/2017 07:36:53] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000303

[13/Dec/2017 07:38:15] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000247

[13/Dec/2017 07:39:16] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000289

[13/Dec/2017 07:42:37] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000322

[13/Dec/2017 07:43:20] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007246

I am running Nameko 2.6.0, Python 2.7.12
Please let me know if any additional info is needed.

PS: All the rpc related entrypoints are working absolutely fine for me

Thanks,
Someshwar

Thanks Matt for this information (SharedExtensions are shared per-process,
not per-service).

I have sent you the offending entrypoint function on gitter and we can
continue there. I don't think there is any point following up on this post
since nothing really came out of it. You can actually go ahead and delete
this post entirely from the google group to reduce unactionable clutter :stuck_out_tongue:

···

On Wed, Dec 13, 2017 at 7:24 PM, Matt Yule-Bennett < bennett.matthew@gmail.com> wrote:

Well that is curious. There is probably something funny going on with the
four services sharing the same werkzeug WebServer (SharedExtensions are
shared per-process, not per-service). If you're able to share the code
(privately) I will try to figure out what was going on.

Re: Nameko HTTP vs NodeJS, I think it depends on your use-case. There are
some things that you will get for free by using Nameko, such as the
built-in RPC proxies to talk to other services and event handlers. Having
said that even at my workplace our client-facing APIs are in NodeJS.

On Wednesday, December 13, 2017 at 12:57:39 PM UTC, > some...@vahanalytics.com wrote:

Hi Matt,

Thanks for getting back.

This issue is happening to only one of my projects

Now, I have 4 services in this file called service.py and I'm not running
it with a ServiceContainer or ServiceRunner. Just 4 classes inside a file.

This is only 1 service instance and it is alternating between 200 and 404
for the same. I am using Supervisor to run the service with the usual:

exec ./analytics_env/bin/nameko run --config ./config.yml service

Surprisingly enough, I removed one of the http entrypoints that I could
move somewhere else and the problems seems to have been mitigated. Now, I
am consistently getting 200s for the requests.

My followup comments:
1) The above seems to make no sense.
2) There is no chance that there was an error in that function
3) The function was calling another api in return internally and
processing a few things (I am using this function as sort of a router to
hit other services further along my stack).
4) I do not believe that having another api call inside this http
entrypoint could be the issue because I have other entrypoints that are
doing so without any issues.

Even though it seems to suddenly have been resolved, I want to understand
if I should continue to have http api endpoints with nameko or should I use
a more robust platform like nodejs/flask to cater to apis while nameko
works well with rpc services. (I already do most of my api servings from a
nodejs server anyways).

Thanks,
Someshwar

On Wednesday, December 13, 2017 at 5:29:25 PM UTC+5:30, Matt Yule-Bennett >> wrote:

Hi Someshwar,

There's not enough info here to know what's going on. Can you post your
service code and more details about how you're hosting and running them?

Are these logs from the same nameko service instance, or a cluster?
(i.e. does a single instance alternate between 200 and 404?)

Matt.

On Wednesday, December 13, 2017 at 10:48:17 AM UTC, >>> some...@vahanalytics.com wrote:

Hi everyone,

I have been using Nameko in production for the last 10 months now and
I've suddenly run into some weird issues for my http endpoints. Out of all
the calls made to any of the http endpoints in the services, about 50% of
the requests end up giving a 404.

Can someone please point me to the right direction as to what I might
be missing.

I have tried increasing max_workers without any effect.

I am attaching the typical behaviour. The requesting server is set to
retry as a result it does get through in 2-3 tries. But I am unable to
figure out the cause for this error in the first place.

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000279

[13/Dec/2017 10:27:26] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003991

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000294

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003918

[13/Dec/2017 10:27:29] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003909

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000287

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000244

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000234

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000227

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003943

[13/Dec/2017 10:27:39] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003900

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000288

[13/Dec/2017 10:27:47] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.005321

[13/Dec/2017 10:27:49] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003912

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000278

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.004134

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000276

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000232

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000243

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 404
361 0.000292

[13/Dec/2017 10:28:07] "POST /api/v1/analytics/device_gps HTTP/1.1" 200
195 0.003832

Another instance where I am seeing this behaviour:

[13/Dec/2017 07:22:47] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007928

[13/Dec/2017 07:26:01] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007584

[13/Dec/2017 07:27:12] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000307

[13/Dec/2017 07:28:44] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007077

[13/Dec/2017 07:31:08] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000284

[13/Dec/2017 07:32:43] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007327

[13/Dec/2017 07:33:49] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000291

[13/Dec/2017 07:36:53] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000303

[13/Dec/2017 07:38:15] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000247

[13/Dec/2017 07:39:16] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000289

[13/Dec/2017 07:42:37] "POST /api/v1/analytics/write_distance HTTP/1.1"
404 361 0.000322

[13/Dec/2017 07:43:20] "POST /api/v1/analytics/write_distance HTTP/1.1"
200 198 0.007246

I am running Nameko 2.6.0, Python 2.7.12
Please let me know if any additional info is needed.

PS: All the rpc related entrypoints are working absolutely fine for me

Thanks,
Someshwar

--

You received this message because you are subscribed to a topic in the
Google Groups "nameko-dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/
topic/nameko-dev/SWUysVY77ps/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
nameko-dev+unsubscribe@googlegroups.com.
To post to this group, send email to nameko-dev@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/
msgid/nameko-dev/7b6f946d-a5a3-4478-ae15-204d9ba9f675%40googlegroups.com
<https://groups.google.com/d/msgid/nameko-dev/7b6f946d-a5a3-4478-ae15-204d9ba9f675%40googlegroups.com?utm_medium=email&utm_source=footer&gt;
.

For more options, visit https://groups.google.com/d/optout\.

--
Someshwar Dash
Co-Founder and CTO,
Vahanalytics

@Someshwar_Dash hi, did you figure out a reason for that? I have absolutely the same behaviour for one of my services. I have 8 of them build with nameko using absolutely same principles and all of them are working just fine except one. Each has two simple http functions for health and readyness check and they both failed for one specific service 50/50 with 404 status code.

@http('GET', '/healthz')
def healthz(self):
    return 200, json.dumps({'message': 'Healthy service'})

I have the only one pod for this service.

The difference is that service uses WebSocketHubProvider and when I comment WS part out it starts receiving 200 in 100% of the time without any problems.

Python 3.6, Nameko 2.12.0

@vashchukmaksim We determined (in private Gitter) that the problem above was caused by an upstream misconfiguration in nginx or similar.

Your problem has the same symptoms but must be a different cause. Can you post the full code for the service? I will try to reproduce the problem.

@mattbennett Sure! Thanks for the response. Here is a service code. It has some inner deps such as authentication dependency etc. I tried to remove auth completely but it didn’t solve the problem (and it works 100% fine in all other services).

import json
from datetime import datetime, timedelta

from nameko.rpc import rpc
from nameko.web.handlers import http
from nameko.extensions import DependencyProvider
from nameko.dependency_providers import Config
from nameko.web.websocket import WebSocketHubProvider, rpc as rpc_ws

from nameko_sentry import SentryReporter
from nameko_sqlalchemy import DatabaseSession

from tg_nameko.http import http
from tg_nameko.logger import WorkerLogger
from tg_nameko.auth import AuthDependencyProvider as Auth
from tg_nameko.rpc.utils import rpc_errors, rpc_response

from ..models import SocketsModel
from ..conf import DeclarativeBase

__all__ = [
    'WebSocketsGatewayBaseSvc'
]

AVAILABLE_CHANNELS = ["scheduler"]

def auth() -> Auth.Api:
    """
    Hack to have correct auth dependency object
    in Service methods
    """
    return Auth()
  

class ContainerIdentifier(DependencyProvider):
    def get_dependency(self, worker_ctx):
        return id(self.container)

class WebSocketsGatewayBaseSvc():
    
    # Service name
    name = 'service_ws_gateway'
    
    # Dependencies
    config = Config()
    db = DatabaseSession(DeclarativeBase)
    # container_id = ContainerIdentifier()
    websocket_hub = WebSocketHubProvider()
    log = WorkerLogger('service_ws_gateway')
    sentry = SentryReporter()
    auth = auth()

    @http('GET', '/healthz')
    def healthz(self):
        return 200, json.dumps({'message': 'Healthy service'})

    @http('GET', '/readyz')
    def readyz(self):
        return 200, json.dumps({'message': 'Service is ready'})

    @rpc_ws
    def unsubscribe(self, socket_id, channel):
        if channel in AVAILABLE_CHANNELS:
            try:
                self.websocket_hub.unsubscribe(socket_id, channel)
            except Exception as e:
                self.log.logger.error(e)
                return 'can not unsubscribe from a channel {}, unexpected error'.format(channel)

        # Result message
        return 'unsubscribed from {}'.format(channel)

    @rpc_ws
    def subscribe(self, socket_id, channel, token):

        # Check if channel that user wants to be subscribed to
        # is available
        if channel in AVAILABLE_CHANNELS:

            # Validate token and get user UUID
            user_uuid = None
            if token:
                try:
                    decoded_token = self.auth.decode_any(token)
                    user_uuid = decoded_token.get('sub', {}).get('uuid')
                except Exception as e:
                    self.log.logger.error(e)
                    return 'can not subscribe to a channel {}, token is invalid'.format(channel)

            # Subscribe to channel
            try:
                self.websocket_hub.subscribe(socket_id, channel)
            except Exception as e:
                self.log.logger.error(e)
                return 'can not subscribe to a channel {}, unexpected error'.format(channel)

            # Write user_uuid and socket_id to the db
            socket_data = {
                'channel': channel,
                'user_uuid': user_uuid,
                'socket_id': socket_id,
                'expire': datetime.utcnow() + timedelta(hours=4)
            }

            socket = SocketsModel(**socket_data)
            self.db.add(socket)
            self.db.commit()

        # Result message
        return 'subscribed to {}'.format(channel)

    @rpc_ws
    @rpc_errors(log.logger)
    def send(self, user_uuid, channel, event, data):

        # Find all the sockets for the given user with a given channel
        # to send a message to
        sockets = self.db.query(SocketsModel).\
            filter(SocketsModel.user_uuid == user_uuid).\
            filter(SocketsModel.channel == channel).\
            filter(SocketsModel.expire > datetime.utcnow()).\
            all()

        # If list is not empty send a message
        # to each of them
        failed = []
        for socket in sockets:
            try:
                self.websocket_hub.unicast(socket.socket_id, event, {
                    **data,
                    'channel': channel
                })
            except Exception as e:
                failed.append(socket.socket_id)

        # Some messages were not delivered
        if len(sockets) == 0 or len(failed) == len(sockets):
            return rpc_response(400, {'user_uuid': user_uuid, 'channel': channel}, {
                'message': 'No sockets available for user and channel'
            })
        
        return rpc_response(200, None, {
            'message': 'Message was sent to a socket'
        })

When I comment all functions with rpc_ws decorator http requests pass 100% of the time. It’s hard to share a repo since anyway deps are in a different repo and it won’t run. Please let me know if any additional information is needed. Also I have the following setup for WebServer but it is also used in every other service and works fine.

import json
from werkzeug.wrappers import Response

from nameko.web.server import WebServer as BaseWebServer
from nameko.web.handlers import HttpRequestHandler as BaseHttpRequestHandler

from .exceptions import HttpError

__all__ = [
    'WebServer',
    'HttpRequestHandler',
    'http'
]


class WebServer(BaseWebServer):
    """
    WebServer that sends auth information in
    request context
    """

    def context_data_from_headers(self, request):
        context_data = super().context_data_from_headers(request)

        # Access token
        access_token = request.cookies.get('access_token', None)
        if access_token:
            context_data['access_token'] = access_token

        # Refresh token
        refresh_token = request.cookies.get('refresh_token', None)
        if refresh_token:
            context_data['refresh_token'] = refresh_token

        # CSRF token
        csrf_token = request.cookies.get('csrf_token', None)
        if csrf_token:
            context_data['csrf_token'] = csrf_token

        # CSRF token (header)
        csrf_token_header = request.cookies.get('csrf_token_header', None)
        if csrf_token_header:
            context_data['csrf_token_header'] = csrf_token_header

        # Updated context
        return context_data


class HttpRequestHandler(BaseHttpRequestHandler):
    """
    Http handler that handles custom context data
    and HTTP exceptions
    """

    server = WebServer()

    def response_from_exception(self, exc):
        if isinstance(exc, HttpError):

            # Common information
            error = {
                'status': exc.status_code,
                'code': exc.error_code,
                'title': exc.title,
                'detail': exc.detail
            }

            # Optional information
            for prop in ['id', 'meta']:
                if exc.__getattribute__(prop):
                    error.update({[prop]: exc.__getattribute__(prop)})

            # Convert to JSONAPI Spec
            jsonapi_error = {
                'errors': [error]
            }

            # Respond
            return Response(
                json.dumps(jsonapi_error),
                status=exc.status_code,
                content_type='application/vnd.api+json'
            )

        return HttpRequestHandler.response_from_exception(self, exc)


# http decorator
http = HttpRequestHandler.decorator

I will try to remove as much as possible from that service and create a single repo.

Hi, don’t know if you had time to go through the code above. I’ve created a repo without inner dependencies so it’s easier to run. You can find it here: https://github.com/vashchukmaksim/nameko-ws-gateway-test

Still can’t make it work. Any thoughts?

Hi @vashchukmaksim,

I finally had a chance to look at this. Sorry for the delay.

This is happening because you’ve defined your own WebServer subclass. It is an artefact of Nameko’s SharedExtension logic, and in particular the default sharing_key.

When you have a service with multiple @http entrypoints, they all need to share one web server instance, so nameko.web.server.WebServer is a SharedExtension subclass. The first @http entrypoint that is bound to the service container creates a WebServer instance, and then all subsequent @http entrypoints on the container re-use that previously created one.

The sharing_key on the SharedExtension subclass determines when instances can be re-used, and it defaults to type(cls). This is a bit fragile, unfortunately.

The WebSocketHubProvider also declares a WebServer sub-extension, because it needs a server, and it ought to use a pre-existing one if it exists, because you’re only listening one on port.

The WebSocketHubProvider doesn’t know about your subclass though, and is using the built-in webserver at nameko.web.server.WebServer. So what’s happening here is that you’ve got two WebServer instances, because they have different sharing keys (nameko.web.server.WebServer and src.service.http.WebServer).

I don’t understand why there is no error thrown in this situation, but the two web server instances are somehow listening to the same port and so requests get distributed between them.

You can work around this by setting an appropriate sharing_key. It’s a bit clunky but adding this to your WebServer subclass will fix it:

    @property
    def sharing_key(self):
        return BaseWebServer

Wow, yep, not obvious but perfectly clear now. Works fine after a fix. Thanks for the solution and detailed explanation :raised_hands: