*In this paper, we analyze the effect of the stochastic features of the Push-sum protocol on the variance of its convergence rate. After a theoretical introduction, we present experiments executed on 10 randomly generated topologies, with 1000 repetitions for each topology. A second experiment analyzes the behavior of the parameter estimation. We analyze the obtained results and derive theoretical conclusions.*

*Keywords: distributed computing; distributed signal processing; Push-sum protocol; gossip-based aggregation algorithms*

This paper is devoted to the Push-sum protocol, which is classified as a gossip-based aggregation algorithm. We analyze the effect of the stochastic features of this protocol on the variance of its convergence rate. The first part summarizes general knowledge about distributed computing. The second part introduces the set of gossip-based aggregation algorithms. The last theoretical part deals with the Push-sum protocol, which is then analyzed within the practical part. The practical part consists of experiments executed on 10 different randomly generated topologies (the experiment was repeated 1000 times for every topology). The obtained results are depicted and analyzed. At the end, we analyze the behavior of the estimation.

Distributed computing is a branch of computer science that studies systems executing distributed algorithms [1]. These systems consist of spatially distributed entities whose goal is to communicate with one another in order to fulfill a specific functionality as a whole. The mutual communication among them is realized by message passing. We distinguish between two types of message passing [1]:

- *Synchronous message passing* - a system is formed by synchronized entities that are always ready to receive a message from the other entities in the system.
- *Asynchronous message passing* - a system can consist of devices that are not able to respond in real time.

Distributed systems are typically characterized by the following properties:

- the entities in a system do not share a common clock.
- the entities are only partially aware of the other elements in the system and are poorly informed about the system as a whole.
- the entities can be located far away from each other.
- the entities may not be equivalent to one another.
- the entities are usually not informed about a failure at another entity.

There are many categories into which distributed algorithms can be divided. In this paper, we have focused our attention on so-called gossip-based aggregation algorithms. The set of gossip-based aggregation algorithms is a type of computer-to-computer communication inspired by the spreading of information between people via gossip in a social environment. Nowadays, a communication solution utilizing a gossip-based manner of information interchange is often preferred in systems with specific characteristics, such as a complex structure, a distributed character of the computation process, an extensive area that a distributed system covers, etc. According to [2], the set of gossip algorithms can be characterized by three main attributes:

- *Scalability* - a growth of a network's size does not significantly degrade the performance of a system executing a gossip algorithm. Each entity of a distributed system sends a fixed number of messages regardless of the system's size, and it performs a simple algorithm for information exchange at a small rate. In order to fulfill a specific functionality, particular entities only need to be aware of the identity of the entities situated in an adjacent area (which can be either local-area or small wide-area) and a small set of constants.
- *Adaptability* - gossip algorithms allow entities to be easily removed from a distributed system. In some cases, the mechanisms to execute this action are implemented directly in an algorithm.
- *Graceful degradation* - for many gossip algorithms, there is a threshold value *f*. If it is not exceeded, the correct functionality of an algorithm is guaranteed. Thus, the reliability of a gossip algorithm is determined by the probability that no more than *f* failures happen. It is hard to determine this probability, since it can be affected by parameters that are complicated to evaluate. Therefore, it is expected in practical implementations that the algorithm's functionality does not degrade rapidly even if the number of failures exceeds this threshold. Algorithms fulfilling this demand are said to degrade gracefully.

The protocol Push-sum is classified as a gossip-based aggregation algorithm formed by an iterative pair-wise distribution of aggregated values within a network.
Its main goal is to compute sums or averages of the values held by all the entities in a distributed system.
The Push-sum protocol is considered to be highly resistant to failures.
During the whole process, each entity stores two values: the value of the averaged quantity and the weight.
The value of the averaged quantity is set to the initial value of the entity at the beginning of the whole process.
The initial weight equals 1 for all the entities in a distributed system.
The protocol [3] is executed as follows:
During every iteration, each entity chooses one of its neighbors uniformly at random and sends it a pair: half of its value of the averaged quantity and half of its weight.
The sending entity keeps the other halves of both quantities in its inner memory. Afterward, each entity sums the pairs it holds, including all received ones, and computes the estimate of the average as the ratio of these two sums.
This procedure is repeated until the system reaches consensus, i.e., the difference between the maximum and minimum estimate within the system is less than 0.00015.
According to the authors of [4], the correctness of the Push-sum protocol can be verified in terms of a fundamental property known as *mass conservation*.
It says that the sum of all the values of the averaged quantity in a system is constant in every iteration, and that the same condition holds for the weights; consequently, the ratio of these global sums (the value being estimated) is preserved during the whole process.
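The procedure described above can be sketched in Python. This is a minimal illustrative simulation, not the authors' implementation: the pair-wise halving, the uniformly random neighbor choice, the initial weight of 1, the 0.00015 consensus threshold, and the mass-conservation invariant follow the text, while the function names and the synchronous round structure are our own assumptions.

```python
import random

def push_sum_round(values, weights, neighbors, rng):
    """One iteration: every entity keeps half of its (value, weight)
    pair and sends the other half to a uniformly chosen neighbor."""
    n = len(values)
    new_v, new_w = [0.0] * n, [0.0] * n
    for i in range(n):
        j = rng.choice(neighbors[i])
        new_v[i] += values[i] / 2; new_w[i] += weights[i] / 2
        new_v[j] += values[i] / 2; new_w[j] += weights[i] / 2
    return new_v, new_w

def run_push_sum(init, neighbors, eps=0.00015, rng=None):
    """Iterate until the consensus event: max - min of the estimates
    (value / weight) drops below eps. Returns (estimates, iterations)."""
    rng = rng or random.Random(0)
    v = [float(x) for x in init]
    w = [1.0] * len(init)          # initial weight equals 1 everywhere
    total_v, total_w = sum(v), sum(w)
    iters = 0
    while True:
        est = [vi / wi for vi, wi in zip(v, w)]
        if max(est) - min(est) < eps:
            return est, iters
        v, w = push_sum_round(v, w, neighbors, rng)
        # mass conservation: both global sums stay constant
        assert abs(sum(v) - total_v) < 1e-9
        assert abs(sum(w) - total_w) < 1e-9
        iters += 1
```

On a small connected topology, e.g. a 4-entity ring with initial values 1 to 4, every estimate ends up within the consensus threshold of the global average 2.5, since the weighted average of the estimates always equals the global ratio of the conserved sums.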

Within the practical part, we have focused on examining the effect of the stochastic features of the Push-sum protocol on its convergence rate, i.e., how the random choice of the receiver of a message affects the overall number of iterations necessary for the protocol to complete. We generated 10 topologies with the generator presented in [5] and depicted all the obtained results in Fig. 1. The systems consist of 50 entities, and their topologies can be classified as dense. Since the Push-sum protocol is a converging algorithm executed in an iterative manner, it was necessary to define a convergence event to indicate the convergence. We used the event already defined above, with the difference less than 0.00015, which ensures a high precision of the computation process. The vertical axis in Fig. 1 represents a particular network, and the horizontal axis shows the number of iterations necessary for a system to converge. We executed 1000 repetitions of the same experiment for each topology. From the graph, we can see that the slower the protocol is for a particular topology, the more spread out the obtained data tends to be; therefore, the variance is much larger in these topologies. A black diamond labels the average value computed from the 1000 repetitions for one topology. Empty circles label the numbers of iterations necessary for a system to converge (there may be more than one repetition with the same value). The circles for different topologies are distinguished from each other by different colors.
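The kind of measurement behind Fig. 1 can be sketched as follows. This is only an illustrative stand-in for the actual experiment: we use a small complete graph instead of the paper's 50-entity generated topologies, 100 instead of 1000 repetitions, and uniformly random initial values, all of which are our own assumptions; only the convergence event with the 0.00015 threshold matches the text.

```python
import random
import statistics

def push_sum_iterations(init, neighbors, eps=0.00015, rng=None):
    """Number of iterations until max - min of the estimates drops below eps."""
    rng = rng or random.Random()
    n = len(init)
    v, w = [float(x) for x in init], [1.0] * n
    iters = 0
    while True:
        est = [vi / wi for vi, wi in zip(v, w)]
        if max(est) - min(est) < eps:
            return iters
        nv, nw = [0.0] * n, [0.0] * n
        for i in range(n):
            j = rng.choice(neighbors[i])  # uniformly random receiver
            nv[i] += v[i] / 2; nw[i] += w[i] / 2
            nv[j] += v[i] / 2; nw[j] += w[i] / 2
        v, w = nv, nw
        iters += 1

# A dense example topology: complete graph on 20 entities.
n = 20
neighbors = [[j for j in range(n) if j != i] for i in range(n)]
init_rng = random.Random(42)
init = [init_rng.uniform(0, 100) for _ in range(n)]

# Repeat the run with different random neighbor choices and
# inspect the spread of the convergence times.
runs = [push_sum_iterations(init, neighbors, rng=random.Random(r))
        for r in range(100)]
print("mean iterations:", statistics.mean(runs),
      "std dev:", statistics.pstdev(runs))
```

Only the random neighbor choices differ between repetitions; the initial values and the topology are fixed, so the spread of `runs` isolates the stochastic component of the convergence rate.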

*Fig. 1 The variance of the convergence rate*

*Fig. 2 The behavior of the Push-sum protocol*

This paper provides an analysis of how the stochastic features of the Push-sum protocol affect its convergence rate. We executed 10 000 experiments (1000 repetitions × 10 topologies) and depicted the obtained results. From the results, we can see that the stochastic features of the Push-sum protocol affect distributed systems with a slower convergence rate much more significantly. At the end, we showed the behavior of the parameter estimation.

Research described in this paper was financed by the National Sustainability Program under grant LO1401. For the research, infrastructure of the SIX Center was used.

[1] Kenyeres, M. (2015). Optimalization of Distributed Classification of the Convergence Event. In Proceedings of the 21st Conference STUDENT EEICT 2015. Vysoké učení technické v Brně, Fakulta elektrotechniky a komunikačních technologií.

[2] Lin, M. J., Marzullo, K., & Masini, S. (2000). Gossip versus deterministically constrained flooding on small networks. In Distributed Computing (pp. 253-267). Springer Berlin Heidelberg.

[3] Kempe, D., Dobra, A., & Gehrke, J. (2003, October). Gossip-based computation of aggregate information. In Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science (pp. 482-491). IEEE.

[4] Jesus, P., Baquero, C., & Almeida, P. S. (2009, January). Fault-tolerant aggregation by flow updating. In Distributed Applications and Interoperable Systems (pp. 73-86). Springer Berlin Heidelberg.

[5] Kenyeres, J., Kenyeres, M., Rupp, M., & Farkas, P. (2013). Connectivity-Based Self-Localization in WSNs. Radioengineering, 22(3).