Information
We are currently investigating an issue with the editor of some pages. Please save your work and avoid to create new pages until this banner is gone.
All the tests were done sending 10000 logs in total, sending them as fast as possible:
Scenario | Process | Threads | Output | Logging Proxy | Central Logger | Oneway | Total | Average | Min | Max |
---|---|---|---|---|---|---|---|---|---|---|
1 | Client | 1 | Log (stdout) | 18380 us | 1.8 us | 1.6 us | 308 us | |||
1 | Client | 1 | Log (stdout) | 601893 us | 60.1 us | 5.9 us | 525.8 us | |||
2 | Client | 1 | Log (stdout) | 681522 us | 68.1 us | 52.8 us | 531.7 us | |||
3 | Client | 1 | Log (stdout) | 337458 us | 33.7 | 19.2 us | 353.4 us | |||
4 | Client | 1 | Log (file) | 18639 us | 1.8 us | 1.6 us | 295 us | |||
4 | Client | 1 | Log (file) | 548046 us | 54.8 us | 6.3 us | 783.2 us | |||
5 | Client | 1 | Log (file) | 619841 us | 61.9 us | 52.5 us | 946.9 us | |||
6 | Client | 1 | Log (file) | 320333 us | 32.0 | 16.8 us | 393.5 us | |||
7 | Client | 1 | stdout | 18841 us | 1.8 us | 0.8 us | 447.9 us | |||
8 | Client | 1 | file | 21358 us | 2.1 us | 1.4 us | 332.8 us | |||
9 | Client | 1 | no-op | 834 us | 0.1 us | 0.02 us | 95.2 us | |||
10 | Client | 10 | Log (stdout) | 64520.4 us | 6.4 us | 1.7 us | 610.9 us | |||
10 | Client | 10 | Log (stdout) | 1415860 us | 141.5 us | 4.9 us | 12617.4 us | |||
11 | Client | 10 | Log (stdout) | 2950060 us | 294.902 us | 66.257 us | 5005.26 us | |||
12 | Client | 10 | Log (stdout) | 958351 us | 95.75 us | 16.7 us | 65872.4 us | |||
13 | Client | 10 | Log (file) | 63096.3 us | 6.2 us | 1.7 us | 1007.8 us | |||
13 | Client | 10 | Log (file) | 930486 us | 93.0 us | 6.8 us | 15150.4 us | |||
14 | Client | 10 | Log (file) | 2319040 us | 231.8 us | 54.7 us | 5164.8 us | |||
15 | Client | 10 | Log (file) | 660748 us | 66.0 us | 15.3 us | 67563.6 us | |||
16 | Client | 10 | stdout | 301182 us | 30.1 us | 0.8 us | 1655.3 us | |||
17 | Client | 10 | file | 210906 us | 21.06 us | 1.5 us | 285.3 us | |||
18 | Client | 10 | no-op | 2409 us | 0.2 us | 0.02 us | 257.84 us | |||
19 | Component | 1 | Log (stdout) | |||||||
20 | Component | 1 | Log (stdout) | |||||||
21 | Component | 1 | Log (stdout) | |||||||
22 | Component | 1 | Log (file) | |||||||
23 | Component | 1 | Log (file) | |||||||
24 | Component | 1 | Log (file) | |||||||
25 | Component | 1 | stdout | |||||||
26 | Component | 1 | file | |||||||
27 | Component | 1 | no-op | |||||||
28 | Component | 10 | Log (stdout) | |||||||
29 | Component | 10 | Log (stdout) | |||||||
30 | Component | 10 | Log (stdout) | |||||||
31 | Component | 10 | Log (file) | |||||||
32 | Component | 10 | Log (file) | |||||||
33 | Component | 10 | Log (file) | |||||||
34 | Component | 10 | stdout | |||||||
35 | Component | 10 | file | |||||||
36 | Component | 10 | no-op |
From the statistics results, we can infer that (in a local environment) the biggest performance hit comes from the LoggingProxy itself and not the communication with the Central Logger. Going a bit further we found two slow downs down the path:
xmlElement = XMLParser::parseString(iter->c_str(), true);
And
ts.tv_sec=0; ts.tv_nsec=100; //FIXME: This nanosleep provide stability to the logging system. It shouldn't exist :( nanosleep(&ts,NULL);
Which combined with several mutexes, end up generating thread contention on the logging system.
The impact is of about 60/70 μs on average for each log call, which is not a big number, but we would like to be closer to the 1/10 μs range.
There are different strategies that could be used to fix this problem, such as refactoring the code, replacing the XML parsing library or the mechanism used within the library, re-structuring the use of mutexes to multiple instances or changing them to be RW mutexes.
In the case of the central logger, it's not a mystery the slow down comes from the remote call to the central logger.
m_logger->writeRecords(reclist);
Here the slowdown accounts for ~70/80 μs on a local deployment, but can get closer to 1 ms when the central logger is on a different machine than the process generating the logs.
Again, there are several alternatives to improve the thread contention problems, for example by queuing this task and dispatching them on a separate thread. It is also possible to replace the writeRecords method to be oneway on the IDL, which makes it 2x/3x faster on the local deployment, and it also makes it invariant to the deployment as this call does not wait for a reply from the remote object. The drawback is losing the ability to assess whether the message was sent or there was a problem along the way, which could be leveraged by the use of a callback object.