The UpdatingBatch performance effect

It is always nice to have benchmarks to check whether what we use really affects performance. That’s why I ran this UpdatingBatch performance assessment: to see how much using an UpdatingBatch can improve performance. I also wanted to find the best chunk size when creating, updating or deleting a lot of objects, meaning the number of objects you include in one batch before committing it and starting a new one. Of course you can add all your updates to the same batch, but then you could get a transaction error from your database because the transaction would become too big to handle.
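To make the chunking idea concrete, here is a minimal sketch in plain Java. It is not FileNet code: `ChunkedBatcher` and its `commit` callback are hypothetical names used only for illustration. In a real client, the callback is where you would fill a fresh UpdatingBatch with the chunk and send it with `updateBatch()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Buffers pending changes and commits them in fixed-size chunks. */
class ChunkedBatcher<T> {
    private final int chunkSize;
    // Hypothetical stand-in for "send one batch to the server".
    private final Consumer<List<T>> commit;
    private final List<T> pending = new ArrayList<>();
    private int commits = 0;

    ChunkedBatcher(int chunkSize, Consumer<List<T>> commit) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        this.chunkSize = chunkSize;
        this.commit = commit;
    }

    /** Queues one change and commits automatically once the chunk is full. */
    void add(T change) {
        pending.add(change);
        if (pending.size() >= chunkSize) {
            flush();
        }
    }

    /** Commits whatever is pending; call once at the end for the last partial chunk. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        commit.accept(new ArrayList<>(pending)); // hand over a copy of the chunk
        pending.clear();
        commits++;
    }

    int commits() {
        return commits;
    }

    public static void main(String[] args) {
        ChunkedBatcher<String> batcher = new ChunkedBatcher<>(40,
                chunk -> System.out.println("committing " + chunk.size() + " changes"));
        for (int i = 0; i < 1000; i++) {
            batcher.add("folder-" + i);
        }
        batcher.flush();
        System.out.println(batcher.commits() + " commits");
    }
}
```

Keeping each commit to a bounded number of changes is what protects the database from one oversized transaction. The final `flush()` matters: without it, a partial last chunk would never be sent.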

As usual I did three tests: locally, over LAN and over the Internet. The absolute execution times are not really relevant, since we don’t have the same machines or the same Internet connection. What really matters here is how the execution time evolves with the chunk size. In the LAN and Internet environments the bottleneck is the network; in the local one it is the machine’s resources.

Creation of 1000 sub-folders


On the local chart, we can see that the machine was struggling with resources (it is not my usual ESXi server but a VM on my laptop). Here the bottleneck isn’t the network, so the line isn’t as smooth, but we can still conclude that with the same resources and no network problem, the UpdatingBatch is still faster, which means it consumes fewer resources. The NO_REFRESH mode is also slightly faster, but the difference is not as significant as I would have thought.

Over LAN, we can see that when network traffic becomes more of an issue, the refresh mode also matters more: with an UpdatingBatch, execution is three times faster when using NO_REFRESH. Also, since traffic is the limiting factor, increasing the chunk size has more impact, because it decreases the number of round trips between the client and the server. And last but not least, the difference between using an UpdatingBatch and saving each creation individually is huge: the UpdatingBatch makes creation 5 times faster when refreshing, and 7 times faster when not refreshing! Increasing the chunk size has a positive impact up to about 40 updates per batch. After that the gain is no longer significant, because the data transfer time outweighs the time spent creating the request. So there is no need to use 5,000: you would just risk a database failure because of a huge transaction, for no performance improvement 🙂
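The plateau is easier to picture by counting round trips. The sketch below is plain arithmetic, not a measurement: `RoundTrips.count` is a hypothetical helper that returns how many server calls are needed to send a given number of changes at a given chunk size.

```java
/** Counts the server calls needed to send a number of changes in chunks. */
class RoundTrips {
    static int count(int objects, int chunkSize) {
        if (objects < 0 || chunkSize < 1) {
            throw new IllegalArgumentException("objects must be >= 0 and chunkSize >= 1");
        }
        // Ceiling division: a partial last chunk still costs one round trip.
        return (objects + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        int[] chunkSizes = {1, 10, 20, 40, 100, 1000};
        for (int size : chunkSizes) {
            System.out.println("chunk size " + size + " -> "
                    + count(1000, size) + " round trips");
        }
    }
}
```

For 1000 objects this gives 1000, 100, 50, 25, 10 and 1 round trips for chunk sizes 1, 10, 20, 40, 100 and 1000. Going from 1 to 40 already removes 975 of the 999 avoidable round trips, so everything beyond that can only shave off the remaining 24.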

Over an Internet connection, where network traffic is also an issue, we observe the same thing as with LAN: the UpdatingBatch makes creation 8 times faster. It is also interesting to see that performance stabilizes once the chunk size goes above 10. A chunk size between 20 and 60 gives really good performance, so there is no need to go higher than that.

From the three charts, we can conclude that the UpdatingBatch uses less bandwidth and fewer server resources. But the biggest gain is on the network, which is good since that is generally where the bottleneck is.

Deletion of 1000 sub-folders

Here, as for the creation, the machine is struggling. Even if it will be clearer on the LAN and Internet charts, we can see that there isn’t really a difference between refresh and no refresh for deletion. However, using an UpdatingBatch makes things quicker, taking between half and a third of the time, so here again the UpdatingBatch consumes fewer resources.

On the LAN and Internet charts, it is interesting to see that the RefreshMode really has no impact at all, with or without an UpdatingBatch. That makes sense since it is a deletion, and there is actually nothing to refresh. But it is always good to see it in practice.

From these three charts, we can conclude that the UpdatingBatch makes deletion a lot faster, more than 5 times faster, while using less bandwidth and fewer resources.

Creation of documents

In this section we will compare execution times when creating documents: first documents with small content (1 KB), then bigger ones (50 MB), to see the difference when there is more data to transfer.

From these two charts, we can see that the chunk size has less effect with big files. This makes sense: the amount of data to transfer is much larger, so the time spent creating the request and communicating with the server becomes less significant compared to the transfer time. The same goes for the update mode: the cost of updating the batch becomes insignificant compared to the transfer time. However, increasing the chunk size still has a positive impact, even if it is really small. With small files, the refresh mode and the chunk size matter a lot, because there are many more transactions with the server, and reducing that time has much more effect. To conclude, if you are working with a large number of small files, refresh only if you need to, and pay attention to the chunk size. With bigger files it is still good practice, but don’t expect a huge performance improvement.
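A back-of-the-envelope model shows why batching matters so much less for big files. It is only a sketch under loud assumptions: `TransferModel` is a hypothetical class, the total time is modelled as a fixed overhead per round trip plus the raw transfer time, and the overhead and bandwidth values in `main` are made-up illustration numbers, not measurements from these tests.

```java
/** Toy cost model: fixed overhead per round trip plus raw transfer time. */
class TransferModel {
    static double estimateSeconds(int documents, long bytesPerDocument, int chunkSize,
                                  double overheadPerCallSeconds, double bytesPerSecond) {
        int roundTrips = (documents + chunkSize - 1) / chunkSize; // ceiling division
        double transferSeconds = documents * (double) bytesPerDocument / bytesPerSecond;
        return roundTrips * overheadPerCallSeconds + transferSeconds;
    }

    /** Fraction of the total time saved by batching versus one call per document. */
    static double relativeGain(int documents, long bytesPerDocument, int chunkSize,
                               double overheadPerCallSeconds, double bytesPerSecond) {
        double unbatched = estimateSeconds(documents, bytesPerDocument, 1,
                overheadPerCallSeconds, bytesPerSecond);
        double batched = estimateSeconds(documents, bytesPerDocument, chunkSize,
                overheadPerCallSeconds, bytesPerSecond);
        return (unbatched - batched) / unbatched;
    }

    public static void main(String[] args) {
        // Assumed values for illustration only: 50 ms per call, 10 MB/s link.
        double overhead = 0.05;
        double bandwidth = 10_000_000.0;
        System.out.println("1 KB documents:  gain = "
                + relativeGain(100, 1024L, 20, overhead, bandwidth));
        System.out.println("50 MB documents: gain = "
                + relativeGain(100, 50L * 1024 * 1024, 20, overhead, bandwidth));
    }
}
```

Under these assumptions the per-call overhead dominates for 1 KB documents, so batching removes most of the total time, while for 50 MB documents the same saving is a small fraction of a transfer-bound total. The real ratios depend on your own latency and bandwidth.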

 
