After a container segfaults and is restarted, objexp and other clients cannot seem to connect to components within the container. Why?
The shutdown() command accepted by the Manager (and by the Containers) to request a shutdown is a oneway method, because ORB::shutdown() will be called before the invocation completes (see also the explanation given in the book Advanced CORBA Programming with C++).
Therefore the method returns immediately.
The Manager needs some time to actually shut down, because it must ensure that there are no pending activities. The Manager and its JVM therefore remain active for several seconds after the shutdown() call has returned.
We will look for a better solution with ACS 4.0.
For the time being, applications should check (if possible) for the Manager JVM in the process table, or wait a few seconds (10 should be a reasonable value), before assuming that the Manager has really shut down.
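The check described above can be sketched in Python as a polling loop with a timeout. This is only an illustration, not part of ACS: the helper names process_running and wait_until_gone are hypothetical, the use of pgrep assumes a Unix-like host, and the pattern that identifies the Manager JVM in the process table depends on your installation and is not given here.

```python
import subprocess
import time


def process_running(pattern):
    """Return True if `pgrep -f` finds a process whose command line matches pattern.

    Assumption: pgrep is available on the host (typical on Linux).
    """
    result = subprocess.run(
        ["pgrep", "-f", pattern],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # pgrep exits with 0 when at least one process matched.
    return result.returncode == 0


def wait_until_gone(is_running, timeout=10.0, poll_interval=0.5):
    """Poll is_running() until it returns False or the timeout expires.

    Returns True if the process disappeared in time, False otherwise.
    The 10 second default follows the value suggested in the text.
    """
    deadline = time.monotonic() + timeout
    while is_running():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

A caller would pass its own check, for example wait_until_gone(lambda: process_running(pattern)) with a pattern chosen for the local Manager command line, and treat a False result as "the Manager is still shutting down".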
The reason objexp can be used to manipulate components that have been shut down and restarted (by restarting a container) without restarting objexp itself is that the components are persistent objects. This is accomplished by the acsStartContainer script assigning what is more or less a static TCP port to the container it runs. What does this have to do with the inability to restart a container? A lot, believe it or not!
Much like the manager shutdown delay described above, the acsStopContainer command uses a CORBA oneway command to stop the container. That is, just because acsStopContainer returns control does not necessarily mean the container has really shut down! Furthermore, the container has no real control over which threads are started and, more importantly, stopped by your component code. In a worst case scenario:
Nine times out of ten the scenario depicted above is what is really going on, but there are indeed other possible culprits:
With ACS 4.1.1, we added extra logic to the acsStartContainer script itself to work around segfaulting components.
When a C++ container segfaults as a result of poorly implemented components, and the container was started by the acsStartContainer script, you must run the acsStopContainer command to reclaim the TCP port number. If you do not, the next time acsStartContainer is run it picks a new TCP port for the container. objexp and other clients of the components keep using the old TCP port, which causes CORBA "no resources" exceptions and makes the container and components appear broken when in fact they are not.
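The stop-before-start order described above could be wrapped in a small helper such as the following Python sketch. It is not part of ACS: the function restart_container is hypothetical, and the exact command lines are passed in by the caller because the arguments the scripts accept should be checked against your ACS version.

```python
import subprocess


def restart_container(stop_cmd, start_cmd):
    """Run the stop command to completion, then launch the start command.

    stop_cmd and start_cmd are argument lists supplied by the caller
    (hypothetical example: ["acsStopContainer", "myContainer"]).
    Returns the stop command's exit code and the Popen handle of the
    start command.
    """
    # Always run the stop command first, even if the container has already
    # crashed, so that the port recorded for it can be reclaimed.
    stop = subprocess.run(stop_cmd)
    # Launch the start command without waiting for it, since a container
    # normally keeps running until it is stopped.
    start = subprocess.Popen(start_cmd)
    return stop.returncode, start
```

Because the stop request is oneway, a caller may also want to wait a few seconds between the two steps; that delay is left out here to keep the sketch short.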
The detailed summary is the following:
-- DavidFugate - 17 Sep 2004