high availability rabbitmq hangs on reply?

Hi,
We're trying to use the nameko framework over a HA rabbitmq cluster.
It seems that the server receives the messages, but the replies get "stuck"
in the reply queue until the ClusterRpcProxy object throws a timeout.
Is there an issue with HA?
Thanks

A few words regarding our architecture:
We have a cluster of 3 rabbitmq servers behind a load balancer. All are
configured in ha-all policy. The clients that are using ClusterRpcProxy
have the URI of the load balancer.

Should we remove the load balancer, expose the 3 rabbit servers and replace
the AMQP_URI with the URI of the 3 servers?

···

On Tuesday, May 31, 2016 at 1:57:02 PM UTC+3, tsachi...@gmail.com wrote:

Hi,
We're trying to use the nameko framework over a HA rabbitmq cluster.
It seems that the server receives the messages, but the replies get
"stuck" in the reply queue until the ClusterRpcProxy object throws a
timeout.
Is there an issue with HA?
Thanks

It should work. Rabbit clustering and HA are transparent to nameko (or any
other client). The docs <https://www.rabbitmq.com/clustering.html&gt; state
that you can read from or write to any node in the cluster.

We are currently using a load-balanced rabbit cluster for RPC with no
issue, although we don't have HA turned on. I have tested a fully HA'd
cluster previously though and my recollection is that it worked fine, so I
don't think that's the issue.

What are you using for load-balancing? I think that's likely to be the
culprit. We use ELB without issue, and I've seen HAProxy used too.

Have you tried removing the load-balancer and connecting to nodes on a
round-robin basis?

···

On Tuesday, May 31, 2016 at 12:38:37 PM UTC+1, tsachi...@gmail.com wrote:

A few words regarding our architecture:
We have a cluster of 3 rabbitmq servers behind a load balancer. All are
configured in ha-all policy. The clients that are using ClusterRpcProxy
have the URI of the load balancer.

Should we remove the load balancer, expose the 3 rabbit servers and
replace the AMQP_URI with the URI of the 3 servers?

On Tuesday, May 31, 2016 at 1:57:02 PM UTC+3, tsachi...@gmail.com wrote:

Hi,
We're trying to use the nameko framework over a HA rabbitmq cluster.
It seems that the server receives the messages, but the replies get
"stuck" in the reply queue until the ClusterRpcProxy object throws a
timeout.
Is there an issue with HA?
Thanks

Hi,

We are using Amazon's ELB for load balancing.
And yes, connecting directly to the nodes works well.
Connecting to the load balancer hangs on the reply most of the times, but
occasionally it works.

We don't know why it happens.

···

On Tuesday, May 31, 2016 at 9:15:16 PM UTC+3, Matt Bennett wrote:

It should work. Rabbit clustering and HA are transparent to nameko (or any
other client). The docs <https://www.rabbitmq.com/clustering.html&gt; state
that you can read from or write to any node in the cluster.

We are currently using a load-balanced rabbit cluster for RPC with no
issue, although we don't have HA turned on. I have tested a fully HA'd
cluster previously though and my recollection is that it worked fine, so I
don't think that's the issue.

What are you using for load-balancing? I think that's likely to be the
culprit. We use ELB without issue, and I've seen HAProxy used too.

Have you tried removing the load-balancer and connecting to nodes on a
round-robin basis?

On Tuesday, May 31, 2016 at 12:38:37 PM UTC+1, tsachi...@gmail.com wrote:

A few words regarding our architecture:
We have a cluster of 3 rabbitmq servers behind a load balancer. All are
configured in ha-all policy. The clients that are using ClusterRpcProxy
have the URI of the load balancer.

Should we remove the load balancer, expose the 3 rabbit servers and
replace the AMQP_URI with the URI of the 3 servers?

On Tuesday, May 31, 2016 at 1:57:02 PM UTC+3, tsachi...@gmail.com wrote:

Hi,
We're trying to use the nameko framework over a HA rabbitmq cluster.
It seems that the server receives the messages, but the replies get
"stuck" in the reply queue until the ClusterRpcProxy object throws a
timeout.
Is there an issue with HA?
Thanks

That is very interesting. In my tests I have not combined load-balancing,
clustering and HA all together, but it should obviously work.

It's unlikely to be a bug in nameko, but it might be one in kombu or
pyamqp. Which versions of those libraries and rabbitmq are you using?

Can you create a reproducible example? I would be interested in playing
around with it.

···

On Tuesday, May 31, 2016 at 8:02:15 PM UTC+1, tsachi...@gmail.com wrote:

Hi,

We are using Amazon's ELB for load balancing.
And yes, connecting directly to the nodes works well.
Connecting to the load balancer hangs on the reply most of the times, but
occasionally it works.

We don't know why it happens.

On Tuesday, May 31, 2016 at 9:15:16 PM UTC+3, Matt Bennett wrote:

It should work. Rabbit clustering and HA are transparent to nameko (or
any other client). The docs <https://www.rabbitmq.com/clustering.html&gt;
state that you can read from or write to any node in the cluster.

We are currently using a load-balanced rabbit cluster for RPC with no
issue, although we don't have HA turned on. I have tested a fully HA'd
cluster previously though and my recollection is that it worked fine, so I
don't think that's the issue.

What are you using for load-balancing? I think that's likely to be the
culprit. We use ELB without issue, and I've seen HAProxy used too.

Have you tried removing the load-balancer and connecting to nodes on a
round-robin basis?

On Tuesday, May 31, 2016 at 12:38:37 PM UTC+1, tsachi...@gmail.com wrote:

A few words regarding our architecture:
We have a cluster of 3 rabbitmq servers behind a load balancer. All are
configured in ha-all policy. The clients that are using ClusterRpcProxy
have the URI of the load balancer.

Should we remove the load balancer, expose the 3 rabbit servers and
replace the AMQP_URI with the URI of the 3 servers?

On Tuesday, May 31, 2016 at 1:57:02 PM UTC+3, tsachi...@gmail.com wrote:

Hi,
We're trying to use the nameko framework over a HA rabbitmq cluster.
It seems that the server receives the messages, but the replies get
"stuck" in the reply queue until the ClusterRpcProxy object throws a
timeout.
Is there an issue with HA?
Thanks