Event subscription in a PyTango device
Hi all, Here at MAX-IV, we have a lot of higher level devices that subscribe to change events from lower level devices (typically, valve devices subscribing to a PLC device). One issue I have with the event subscription is that the event callback thread and a client request thread might run concurrently, since there is no implicit locking. Therefore I'm using explicit locking but it is not a perfect solution; it is actually quite hard to maintain. For instance, the device can deadlock if those two threads end up waiting for the explicit lock and the monitor lock (the event callback thread might try to acquire the monitor lock if it pushes an event for instance). A solution could be to have the monitor lock (called AutoTangoMonitor in C++) accessible via a device method:
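Something along these lines (just a sketch; the device and attribute names are made up, and `get_tango_monitor` is the new method I'm suggesting):

```python
import PyTango
from PyTango.server import Device

class Valve(Device):

    def init_device(self):
        Device.init_device(self)
        self.plc = PyTango.DeviceProxy("plc/device/1")
        self.plc.subscribe_event("Inputs",
                                 PyTango.EventType.CHANGE_EVENT,
                                 self.on_plc_change)

    def on_plc_change(self, event):
        # get_tango_monitor() would return a context manager wrapping the
        # device monitor (AutoTangoMonitor), so this callback runs
        # serialized with client request threads and can safely push events
        with self.get_tango_monitor():
            self.handle_plc_value(event.attr_value.value)

    def handle_plc_value(self, value):
        # update internal state, possibly push events, etc.
        self._plc_value = value
```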
Even better, the device itself could provide a method for event subscription:
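The Valve device from the previous sketch would then simply do (`Device.subscribe_event` being the proposed, not yet existing, helper):

```python
    def init_device(self):
        Device.init_device(self)
        self.plc = PyTango.DeviceProxy("plc/device/1")
        # proposed helper: same arguments as DeviceProxy.subscribe_event,
        # preceded by the proxy; the callback is automatically run under
        # the device monitor lock
        self.subscribe_event(self.plc, "Inputs",
                             PyTango.EventType.CHANGE_EVENT,
                             self.on_plc_change)

    def on_plc_change(self, event):
        # no explicit locking needed here anymore
        self.handle_plc_value(event.attr_value.value)
```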
The implementation of the method would look like:
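Roughly (again just a sketch, reusing the hypothetical `get_tango_monitor` from the first snippet):

```python
# would live in PyTango's high-level Device class
def subscribe_event(self, proxy, attr_name, event_type, callback,
                    *args, **kwargs):
    def locked_callback(event):
        # take the device monitor before running user code, so the
        # callback is serialized with client request threads
        with self.get_tango_monitor():
            callback(event)
    return proxy.subscribe_event(attr_name, event_type,
                                 locked_callback, *args, **kwargs)
```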
This is the best solution I could find, and it doesn't look too hard to implement. Please let me know if there is a better way to deal with that kind of issue; otherwise, I'll file a feature request. Thanks, Vincent
Hi Vincent, I stumbled across a similar problem some time ago. I thought about exposing the Tango monitor to Python, but I suspect that this would just create another deadlock, this time between the Tango monitor and the Python GIL. Example:

- th1: client request to read an attribute
- th1: locks the Tango monitor
- th2: event callback
- th2: locks the Python GIL
- th2: tries to lock the Tango monitor (held by th1)
- th1: tries to lock the Python GIL (held by th2)
- deadlock

My workaround for the problem is to have a worker thread waiting for jobs: any time I have a blocking Tango call, I throw it into the worker thread. I call this a workaround on purpose because it is not actually a solution. I think the real problem is that Tango is using the same lock to handle different things. Anyway, another thing you might try is to completely disable the Tango serialization model:
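Something like this in the server startup code (the device class name is a placeholder; the relevant call is `Util.set_serial_model`):

```python
import sys
import PyTango
from PyTango.server import Device, run

class MyDevice(Device):
    pass  # placeholder device class

def main():
    util = PyTango.Util(sys.argv)
    # completely disable Tango serialization: no monitor is taken for
    # client requests, polling callbacks or event callbacks
    util.set_serial_model(PyTango.SerialModel.NO_SYNC)
    run([MyDevice], util=util)

if __name__ == "__main__":
    main()
```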
This will disable the Tango monitor completely. The default value is BY_DEVICE, which normally prevents concurrent access to the device. If your device cannot handle concurrent reads/commands, then it is completely up to you to implement the serialization you need. Hope it helps, Tiago
Thanks Tiago for the quick answer! The GIL deadlock makes sense, I hadn't thought about that. However, disabling the serialization or having a worker thread is not going to work for me. My plan is to have the devices working like a typical single-threaded asynchronous program. The fact that the server is actually multi-threaded is not a problem as long as each client request / polling callback / event callback runs one after the other. This solution has the following characteristics:

- no lock, so it is easier to maintain and there is less cognitive load
- no performance drawback, since the parallelization is limited by the GIL anyway
- blocking I/O calls are not an issue, since I'm mostly relying on change events from lower devices

Another way to achieve the same idea would be to use a local client calling a dedicated command:
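For example (a sketch; the device names and the HandleEvent command are made up):

```python
import PyTango
from PyTango.server import Device, command

class Valve(Device):

    def init_device(self):
        Device.init_device(self)
        # a regular client connection to this very device
        self.myself = PyTango.DeviceProxy(self.get_name())
        self.plc = PyTango.DeviceProxy("plc/device/1")
        self.plc.subscribe_event("Inputs",
                                 PyTango.EventType.CHANGE_EVENT,
                                 self.on_plc_change)

    def on_plc_change(self, event):
        # forward the event through a normal client call so that the real
        # handling goes through the usual request path and is therefore
        # serialized by the Tango monitor
        self.myself.command_inout("HandleEvent",
                                  float(event.attr_value.value))

    @command(dtype_in=float)
    def HandleEvent(self, value):
        # runs like any other client request
        self._last_value = value
```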
But it feels like overdoing it (extra thread, extra socket) and it has limitations (no exception handling, single data type). I thought about two solutions to avoid the GIL/monitor deadlock. The first one is to expose the Tango monitor but to release the GIL when the program tries to acquire it, just like AutoPythonGIL does (in pytgutils.h) but the other way around. There is a paragraph about this in the boost.python HowTo, but I'm definitely not a Boost expert so I might be missing something. The second one seems even simpler: an optional lock argument could be added to the subscribe method to pass the device reference to the C++ callback:
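That is, something like this (the `lock` keyword is the proposed addition, it does not exist today):

```python
# inside init_device(), subscribing to the PLC with the proposed extra argument
self.plc.subscribe_event("Inputs",
                         PyTango.EventType.CHANGE_EVENT,
                         self.on_plc_change,
                         lock=self)  # proposed: device whose monitor to take
```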
The subscribe_event method would save the device reference in the PyCallBackPushEvent object, just like it does with the callback. Then, when an event is received, the monitor lock would be acquired before the GIL (the change would go in src/boost/cpp/callback.cpp).
Again, I might be missing something, but let me know if you plan to experiment; I'll be happy to help! Thanks, Vincent
Hi Vincent, sorry for the late answer.

(replying to Vincent M) I assume you are I/O bound (not CPU bound). If that is the case, you might consider gevent. I have been working on an experimental gevent-friendly PyTango server, and the results seem promising. The code is already available in the latest version of PyTango; here is a snippet (first sketch below). I have already used this in a server which is in production in some beamlines at the ESRF. Be aware that if your server communicates with other devices, it should use ``PyTango.gevent.DeviceProxy``.

Anyway, I think your suggestion to export the TangoMonitor is feasible. Do you feel confident enough to make a pull request on GitHub, or do you prefer that I do the implementation? As a principle I try to keep the API as close as I can to the TANGO C++ API, so I would avoid changing the signature of subscribe_event if possible. FYI, ``PyTango.gevent.DeviceProxy`` is equivalent to creating a plain DeviceProxy with the gevent green mode (second sketch below).
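A sketch of what such a gevent-mode server can look like (device and attribute names are just placeholders; the relevant parts are ``PyTango.gevent.DeviceProxy`` and the ``green_mode`` argument of ``PyTango.server.run``):

```python
import PyTango.gevent
from PyTango import GreenMode
from PyTango.server import Device, attribute, run

class Gauge(Device):

    def init_device(self):
        Device.init_device(self)
        # gevent-aware proxy for talking to the lower-level device
        self.plc = PyTango.gevent.DeviceProxy("plc/device/1")

    @attribute(dtype=float)
    def pressure(self):
        # blocking I/O only suspends the current greenlet; the server
        # keeps handling other requests in the meantime
        return self.plc.read_attribute("Pressure").value

if __name__ == "__main__":
    run([Gauge], green_mode=GreenMode.Gevent)
```

And the equivalence mentioned above, spelled out (the device name is an example):

```python
import PyTango.gevent
from PyTango import DeviceProxy, GreenMode

# the dedicated gevent module...
plc = PyTango.gevent.DeviceProxy("plc/device/1")

# ...is just a shortcut for asking for the gevent green mode explicitly
plc = DeviceProxy("plc/device/1", green_mode=GreenMode.Gevent)
```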
Hi Tiago, Yes, most of our devices are actually I/O bound, which is why I'm very interested in taking advantage of single-threaded asynchronous libraries like gevent. Your example looks very promising! Is it possible to disable the monitor lock completely in your example? A typical use case for that is:
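A long-running command during which other clients must still be able to read attributes, sketched below (names made up):

```python
import gevent
from PyTango.server import Device, attribute, command

class Oven(Device):

    def init_device(self):
        Device.init_device(self)
        self._temperature = 0.0

    @attribute(dtype=float)
    def temperature(self):
        # should stay readable by other clients at any time
        return self._temperature

    @command
    def Bake(self):
        # long-running command: with the monitor disabled, the server can
        # keep answering attribute reads while this greenlet sleeps
        for _ in range(600):
            gevent.sleep(1)
            self._temperature += 0.1  # pretend to follow the hardware
```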
Also, I've been looking into asyncio and I find those explicit coroutines very interesting, especially with the new async and await keywords now available in Python 3.5. In a Tango server, it could give us a very elegant syntax (see the sketch at the end of this message). There's also a library called aiozmq to interface asyncio with zmq, so it could help us move toward a pure Python implementation once the CORBA-to-zmq transition is over.

All right for the pull request! I don't really feel comfortable with Boost, but I'll see if there is someone in the team to help me with it. Since you don't want to change the DeviceProxy.subscribe_event prototype, we'll try to make the monitor lock available through the server API, probably with a method called Device.get_monitor_lock() that returns a context manager.

Oh, I hadn't thought of replacing `push_event` with `fire_events` in order to avoid the deadlock. I'll give it a try as well! Thanks, Vincent
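The kind of syntax I have in mind (purely illustrative; it assumes an asyncio-aware server mode and proxies whose calls return awaitables, neither of which exists yet):

```python
import asyncio
import PyTango
from PyTango.server import Device, command

class Valve(Device):

    def init_device(self):
        Device.init_device(self)
        # imaginary: a proxy whose calls return awaitables
        self.plc = PyTango.DeviceProxy("plc/device/1",
                                       green_mode=PyTango.GreenMode.Asyncio)

    @command
    async def Open(self):
        # while we await the PLC, the single event loop is free to serve
        # other client requests and event callbacks
        await self.plc.write_attribute("OpenCmd", 1)
        await asyncio.sleep(0.5)
        self.set_state(PyTango.DevState.OPEN)
```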
(replying to Vincent M) When you run the server in gevent mode, the TANGO serial model is set to NO_SYNC, virtually disabling the monitor.

Yes, asyncio rocks! I agree the new keywords look very interesting. I have to investigate a little better if/how to change the TANGO event loop to make it asyncio friendly. Honestly I haven't thought about it too much because they are still changing a lot of things in the Python API; that's why until now all my efforts have gone into gevent. If there is any asyncio expert/fan listening: I'd be glad to team up with you to make this work.

I didn't know about aiozmq. It might be interesting to see how they did it to steal some ideas for Tango. A pure Python implementation would help in making it coroutine friendly (Matias Guijarro has already tested a pure Python implementation of TANGO using a pure Python CORBA implementation). The problem is that implementing all the TANGO logic for both client and server would take a lot of effort.

(replying to Vincent M) Me neither. I can help if you need. Thanks for the insights. Cheers, Tiago
(replying to TCoutinho) Alright, that definitely makes sense!

(replying to TCoutinho) Well, I had a look and realized it was pretty easy to implement, since you did most of the work by using an executor for gevent. I committed an asyncio executor here, but it is still untested. I'll play with it a bit before sending a pull request.

(replying to TCoutinho) Yes, it surely is a huge amount of work! Maybe some day… I'll have a discussion with the team to see if someone else is interested in having the monitor lock available in PyTango. I think people are also interested in exposing other methods like fill_attr_polling_buffer, and I have a few pieces of code that could be useful to have in the library, so you might receive a few pull requests over the next few weeks. Cheers, Vincent