Thread hold GIL(Global interpreter Lock) around 6-12 second while reading from proxy of remote DS
|
|
---|---|
Dear All, We are using Python 2.7.12, Tango 9.2 ,PyTango 8.1.8 and taurus 3.7. In our application architecture we have two threads, the main Thread handles GUI updates operation and other thread handles data collection operations. In data collection thread main task is to create device proxy for local as well as remote machine and also collect attribute list(by using get_attribute_list() methods). we are updating the attribute list and it's properties on regular time interval(5 sec). We have observed that while reading the attribute or getting attribute list from remote device server, we are suspecting that it holds GIL for 6-12 second. This issue in-turn hangs GUI because main Thread is not getting GIL for that particular time. I have attached example and log file to support above scenario.In which, I have created two thread, main thread prints Count value and second thread creates device proxy and read attribute repetitively.In this example we observed that main thread get GIL after 10-12 secs.Please go through log file for same. This is some portion of log file : 2018-03-08 18:59:07,223 MainThread : Thread: MainThread,count:787 2018-03-08 18:59:07,216 Thread-1 : Exception occurred 2018-03-08 18:59:07,223 Thread-1 : Reading attribute Date: 2018-03-08 18:59:13,233 Thread-1 : Exception occurred 2018-03-08 18:59:13,233 Thread-1 : Reading attribute Date: 2018-03-08 18:59:19,244 Thread-1 : Exception occurred 2018-03-08 18:59:19,244 Thread-1 : Reading attribute Date: 2018-03-08 18:59:13,233 MainThread : Thread: MainThread,count:786 2018-03-08 18:59:25,254 Thread-1 : Exception occurred 2018-03-08 18:59:25,254 Thread-1 : Reading attribute Date: 2018-03-08 18:59:25,254 MainThread : Thread: MainThread,count:785 2018-03-08 18:59:31,265 Thread-1 : Exception occurred 2018-03-08 18:59:31,265 MainThread : Thread: MainThread,count:784 Explanation about logs: In between main thread count 786 to 785, GIL was held by thread 1 which is around 12 second.This is observed throughout the code. Is this type of behavior expected ? is there anything wrong in our implementation? If this is the desired behavior then can we have attribute read method with timeout parameter. Thanks & Regards, TCS GMRT
Regards,
TCS_GMRT |
|
|
---|---|
As a general rule, libtango and pytango should share the same major and minor version (for a version X.Y.Z, X and Y should match). Could you try with pytango 9.2.2 and see if it makes any difference? Thanks, |
|
|
---|---|
Thanks VIncent for prompt reply. I have tried with PyTango 9.2.2 but not get any help for above issue.The issue is still observed while reading attribute. This is some portion of logs: 2018-03-09 19:54:39,889 MainThread : Thread: MainThread, count:954540 2018-03-09 19:54:39,884 Thread-1 : Exception occured 2018-03-09 19:54:39,890 Thread-1 : Reading attribute Date: 2018-03-09 19:54:48,910 Thread-1 : Exception occured 2018-03-09 19:54:48,910 MainThread : Thread: MainThread, count:954539 In between count value count:954540 to count:954539, there is almost 9 seconds gap which indicate that Thread 1 is holding GIL for that period of time. Please correct me if my understanding is wrong or I am not missing some step.Please go through attached code snippet and correct me if it need to update. Thanks & Regards, Hitesh Patel
Regards,
TCS_GMRT |
|
|
---|---|
Thanks for trying with the last pytango release, unfortunately I can't reproduce your issue. How long does it take for you before the main thread gets stuck? Also, what OS do you use exactly? And what happens on the server side? Is the exception a timeout? |
|
|
---|---|
|
Hi, Can you try a simple pytango client (no gui) to see how the system behaves? Do you know where the Exception occured message is coming from? Don't be so quick to blame the GIL. I would bet more on a taurus lock holding the application than the GIL. |
|
|
---|---|
|
Hi again, Sorry but I didn't see the attachment in your initial post. Please ignore the taurus lock statement and the request to make a simple pytango client. I got confused when you posted you were using taurus. I will have a better look at code. |
|
|
---|---|
Hi TCoutinho, Thanks for your detailed response for GIL releasing in PyTango. Please go through attached code snippet. I have created worker thread from main thread in which I have created client for remote device server using deviceProxy API and repetitively reading Date attribute of that device.while testing we have observed that if server is working slowly then reading attribute from thread is holding GIL because at that time main thread is not printing count. If server is working normally then this issue is not observed. So, we are looking for some timeout parameter while reading method. please let me know if you want any other information. Thanks, Hitesh
Regards,
TCS_GMRT |
|
|
---|---|
Could you replace: with: and see if it helps?Thanks |
|
|
---|---|
Hi Vincent M, Thanks for your suggestion. I will try your suggestion and update the status. Actually,we have already changed the way of reading attribute.we have used PyTango.futures mode as GreenMode which is helping us to have attribute result in predefined timeout parameter. we have used as reading method as >> dev.read_attribute('Date',wait=True,timeout=1) The above way has helped us to have attribute result within predefined timeout or else timeout exception. Thanks, Hitesh
Regards,
TCS_GMRT |
|
|
---|---|
The reason I asked this is because the "dev.<attribute_name>" interface calls "get_attribute_list" which, according to Tiago, does not release the GIL. In theory, this call should not take seconds, but maybe something is off with your server. On the other hand, "read_attribute" does release the GIL properly. So depending on your results, we should be able to tell you whether your problem comes indeed from the combination of "get_attribute_list", the GIL, and your server (which is my main bet), or if we should look in another direction. Cheers |