Multiple instances of same service, different args/options?


#1

Hi,

I am trying to write a service that will subscribe to some external streaming data source, and publish that data in RabbitMQ. I already wrote an Entrypoint class that subscribes to the external data source and triggers an event for each incoming message, so all the service class does is publish this message to Rabbit.
This all works fine.
However, in the next phase there will be a LOT of data, far too much for one instance of the service to handle. I was thinking of spinning up many instances of the service and [somehow] indicating to each instance which subset of the external data it should listen to and publish. (Splitting the data into subsets is easy, and each subset can be identified by a simple string.) I was thinking perhaps a command-line parameter could be passed to the service at startup, but I don’t know if/how that can be done.
Do you have any suggestions? Thanks very much in advance.


#2

It’s pretty simple to do exactly as you’ve described.

Your custom extension can read whatever is parsed from the config file through the ServiceContainer. In your extension, simply add:

def setup(self):  # probably best to do this in setup
    subset_identifier = self.container.config.get("SUBSET_IDENTIFIER")
    ...

You can put any arbitrary information you like into config.yaml.
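
To make the idea a bit more concrete, here is a minimal sketch of the lookup, written so it runs without the framework or a broker. Only the `SUBSET_IDENTIFIER` key and the `self.container.config.get(...)` call come from the snippet above; the helper `read_subset_identifier`, the class `SubsetAwareExtension`, the stand-in container, and the value `subset-1` are all made up for illustration. In a real service the container is supplied by the framework and the class would be your existing Entrypoint.

```python
from types import SimpleNamespace


def read_subset_identifier(config, key="SUBSET_IDENTIFIER"):
    """Return the subset identifier from a dict-like config, failing fast if absent."""
    value = config.get(key)
    if not value:
        raise ValueError(f"{key} is missing from the service config")
    return value


class SubsetAwareExtension:
    """Hypothetical stand-in for a custom entrypoint; only setup() is sketched."""

    def __init__(self, container):
        # Assumption: in a real service the framework binds the container for you.
        self.container = container
        self.subset_identifier = None

    def setup(self):
        # Same lookup as in the snippet above, with a check for a missing key.
        self.subset_identifier = read_subset_identifier(self.container.config)


# Hypothetical stand-in for the ServiceContainer, holding config as parsed from YAML.
container = SimpleNamespace(config={"SUBSET_IDENTIFIER": "subset-1"})
extension = SubsetAwareExtension(container)
extension.setup()
print(extension.subset_identifier)
```

Failing in setup when the key is missing is a design choice in this sketch: an instance that does not know its subset stops at startup rather than silently subscribing to nothing (or to everything).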


#3

Hi Matt,
Thanks for the quick response. I was considering this. I noticed in the docs that environment variables can be set and then the config file can reference those environment variables. Is there a particular reason the entrypoint couldn’t just read the environment variable directly at startup? We’ve been trying to avoid config files until now because it complicates deployment. (We compile the service project into a conda package that is deployed whole, so knowing the exact path to the config file to reference it on the command line at startup is not trivial…)
Thanks again for your help, and for making an awesome product - looking forward to the new version!

  • Alan

#4

You can read directly from os.environ if that’s easier for you.
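
For example, something along these lines would do, as a rough sketch rather than anything prescribed by the framework. The variable name `SUBSET_IDENTIFIER` is borrowed from the earlier reply; the function name `subset_from_env` and the value `subset-1` are hypothetical.

```python
import os


def subset_from_env(environ=None, name="SUBSET_IDENTIFIER"):
    """Read the subset identifier from the environment, failing fast if unset."""
    # Accepting a mapping makes the function easy to test without touching os.environ.
    environ = os.environ if environ is None else environ
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError(f"environment variable {name} is not set") from None


# Demonstration with an explicit mapping instead of the real environment.
print(subset_from_env({"SUBSET_IDENTIFIER": "subset-1"}))
```

Each instance would then be started with the variable set to a different value, and the entrypoint would call `subset_from_env()` with no arguments during setup.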

Note the changes in this PR too, which should land soon. This makes the parsed config available as a global variable. It also introduces the --define argument, which allows you to specify config values directly on the command line.


#5

Awesome, thanks Matt.

  • Alan