After a container segfaults and is restarted, objexp and other clients cannot seem to connect to components within the container. Why?
A: The shutdown() command is oneway.

The shutdown() command accepted by the Manager (and by the Containers) to request a shutdown is a oneway method, since ORB::shutdown() will be called before the invocation is completed (see also the explanation given in the book Advanced CORBA Programming with C++). The method therefore returns immediately. The Manager needs some time to actually shut down, because it must ensure that there are no pending activities. As a result, the Manager and its JVM remain active for several seconds after the shutdown() call has returned. We will look for a better solution with ACS 4.0. For the time being, applications should check (if possible) for the Manager JVM in the process table, or wait for some seconds (10 should be a reasonable value), before assuming that the Manager has really shut down.

Q: When I try to stop and start a container again it fails. Why?

A: Either the acsStop* scripts have failed or, more likely, a developer's component implementation fails to kill a thread it spawns.

objexp can be used to manipulate components that have been shut down and restarted (by restarting a container) without restarting objexp itself because the components are persistent objects. This is accomplished by the acsStartContainer script assigning what is more or less a static TCP port to the container it runs. What does this have to do with the inability to restart a container? A lot, believe it or not! Much like the Manager shutdown delay described above, the acsStopContainer command uses a CORBA oneway command to stop the container. That is, just because acsStopContainer returns control does not necessarily mean the container has really shut down. Furthermore, the container has no real control over which threads are started and, more importantly, stopped by your component code. In a worst case scenario:
Nine times out of ten, the scenario depicted above is what is really going on, but there are other possible culprits:
Q: After a container segfaults and is restarted, objexp and other clients cannot seem to connect to components within the container. Why?

Note: With ACS 4.1.1, we implemented extra logic in the acsStartContainer script itself to work around segfaulting components.

A: Even though the process segfaulted and control has been returned to the console, you must issue the acsStopContainer command to reclaim the TCP port.

When a C++ container segfaults as a result of poorly implemented components, and the container was started by the acsStartContainer script, you must run the acsStopContainer command to reclaim the TCP port number. If you do not, the next time acsStartContainer is run it picks a new TCP port for the container. objexp and other clients of the components keep using the old TCP port, which causes CORBA NO_RESOURCES exceptions and makes it appear as though the container and components are broken when in fact they are not. The detailed summary is the following:
-- DavidFugate - 17 Sep 2004
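
The advice in the answers above, not to assume that a Manager or container is gone just because a oneway stop call has returned, can be approximated by a script that polls for a bounded time. The following is only a minimal sketch and not part of ACS: the helper names `port_is_listening` and `wait_until_port_free` are hypothetical, and it assumes that a TCP port which no longer accepts connections is an acceptable stand-in for the process-table check suggested above. The default of 10 seconds mirrors the value suggested in the first answer.

```python
import socket
import time


def port_is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on (host, port)."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_port_free(host: str, port: int,
                         max_wait: float = 10.0,
                         poll_interval: float = 0.5) -> bool:
    """Poll until nothing listens on (host, port) or max_wait seconds elapse.

    Returns True if the port was seen to be free, False if it was still
    in use when the deadline passed.
    """
    deadline = time.monotonic() + max_wait
    while True:
        if not port_is_listening(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A caller would run the stop script, then call `wait_until_port_free(host, port)` with the host and port of the container in question (both have to be supplied by the caller; no default is assumed here) and only start the container again once the function returns True. A free port does not prove that every thread of the old process has exited, so checking the process table remains the more reliable test where it is available.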