Short Buffer Handling

The traditional behavior of networking APIs when a user-provided buffer is insufficient is to provide as much data as possible and truncate the rest. Sometimes the user receives a notice that some data was truncated, and sometimes no notification is given. Thus it is the user's responsibility to detect when datagrams are too short and recover in some way (such as re-requesting the data).

The difficulty with using this approach in Spread is that when the application has to recover in this way, some properties of the message are lost. For example, if the message was a SAFE message, the other members can rightly assume that every member will either get the data or fail to get it only because it crashes or disconnects from Spread. With truncation, some members might get part of the data and have to recover the rest. The data can also be lost even though the process continues to execute correctly, which makes it difficult for the other members to detect the fault. Essentially, because each message has attached meaning, such as ordering or reliability guarantees, unpredictable loss of data in an otherwise reliable system compromises the very semantics we want to use. It is possible to check for this loss and recover, but the costs are significant, especially when weighed against the cost of avoiding the problem in the first place.

Thus, unlike UDP datagrams, Spread messages are designed to be reliable even with short buffers. The method used is straightforward. Spread will never truncate large messages unless you explicitly ask it to. When you call SP_receive with a data buffer or groups list too short to hold all the data, the SP_receive function will return with an error code of GROUPS_TOO_SHORT or BUFFER_TOO_SHORT, and NO data or groups will be returned. The only information that will be returned is in the following parameters:

service_type: set to the correct type for the message.

sender: is empty.
num_groups: set to the number of groups the groups parameter needs to accept to avoid a GROUPS_TOO_SHORT error. This number is returned as a negative number. If sufficient groups were given, a 0 will be returned.

groups: is empty.

mess_type: set to the message type field the application sent with the original message. This is only a short int (16 bits). This value is already endian-corrected before the application receives it.

endian_mismatch: set to the size, in bytes, of the data buffers needed to completely receive this message and avoid a BUFFER_TOO_SHORT error. This number is returned as a negative number. If the buffers were large enough, a 0 will be returned.

mess: is empty.

So, when SP_receive returns one of the *_TOO_SHORT errors, you can examine the service_type and mess_type fields to get some information about what kind of message Spread is trying to give you. You can then examine the num_groups and endian_mismatch fields to discover how large your buffers need to be. You then enlarge your application buffers and call SP_receive again. It should return with the message and without error (unless something else is also wrong).

This retry approach is safe with multi-threaded applications because each call succeeds or fails on its own. If two threads retry for the same message, one will get it and the other will get the message after it (which is what would happen anyway if they were not retrying).

The retry approach does, however, require that the application check for errors when calling SP_receive. If a *_TOO_SHORT error occurs, the application must either enlarge its buffers or call SP_receive again with the DROP_RECV flag set, as described below. If it ignores the errors or does not correct the short buffers, the application will loop forever, calling SP_receive and never receiving anything.
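The retry procedure described above can be sketched in C. This is a minimal illustration under stated assumptions, not code from the Spread distribution: every name beginning with sketch_ or SKETCH_ is hypothetical, the error-code values are stand-ins for the definitions in sp.h, and the receive call is abstracted behind a callback so that the example compiles without the library. The fake receiver at the end only mimics the contract described in this section (negative required sizes, no data on error); which of the two errors it reports when both buffers are short is an arbitrary choice of the fake.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in values so the sketch compiles without Spread installed.
 * A real program includes sp.h and uses its definitions instead. */
#define SKETCH_BUFFER_TOO_SHORT (-15)
#define SKETCH_GROUPS_TOO_SHORT (-16)
#define SKETCH_NO_MEMORY        (-1000) /* hypothetical, not a Spread code */
#define SKETCH_MAX_GROUP_NAME   32

/* Hypothetical callback with the short-buffer contract described above.
 * In a real program it would wrap SP_receive; ctx would carry the mailbox. */
typedef int (*sketch_recv_fn)(void *ctx, int max_groups, int *num_groups,
                              char (*groups)[SKETCH_MAX_GROUP_NAME],
                              int *endian_mismatch,
                              int max_mess_len, char *mess);

struct sketch_bufs {
    int max_groups;                         /* capacity of groups, in entries */
    char (*groups)[SKETCH_MAX_GROUP_NAME];
    int max_mess_len;                       /* capacity of mess, in bytes */
    char *mess;
};

/* Receive one message, enlarging the buffers whenever a *_TOO_SHORT error
 * reports (as a negative number) how much room is needed. Returns the result
 * of the final receive call, or SKETCH_NO_MEMORY if a buffer cannot grow. */
int sketch_receive_retry(sketch_recv_fn recv_one, void *ctx,
                         struct sketch_bufs *b, int *num_groups)
{
    assert(recv_one != NULL && b != NULL && num_groups != NULL);
    for (;;) {
        int endian_mismatch = 0;
        int grew = 0;
        int ret = recv_one(ctx, b->max_groups, num_groups, b->groups,
                           &endian_mismatch, b->max_mess_len, b->mess);
        if (ret != SKETCH_BUFFER_TOO_SHORT && ret != SKETCH_GROUPS_TOO_SHORT)
            return ret;                     /* success or an unrelated error */

        if (*num_groups < 0) {              /* groups list was too short */
            int need = -*num_groups;
            void *p = realloc(b->groups, (size_t)need * sizeof *b->groups);
            if (p == NULL) return SKETCH_NO_MEMORY;
            b->groups = p;
            b->max_groups = need;
            grew = 1;
        }
        if (endian_mismatch < 0) {          /* data buffer was too short */
            int need = -endian_mismatch;
            char *p = realloc(b->mess, (size_t)need);
            if (p == NULL) return SKETCH_NO_MEMORY;
            b->mess = p;
            b->max_mess_len = need;
            grew = 1;
        }
        if (!grew) return ret;              /* nothing to fix: do not spin */
    }
}

/* Fake receiver standing in for a Spread connection (illustration only). */
struct sketch_fake { int groups_needed; int mess_len; int calls; };

int sketch_fake_recv(void *ctx, int max_groups, int *num_groups,
                     char (*groups)[SKETCH_MAX_GROUP_NAME],
                     int *endian_mismatch, int max_mess_len, char *mess)
{
    struct sketch_fake *f = ctx;
    (void)groups;
    f->calls++;
    *num_groups = (max_groups < f->groups_needed) ? -f->groups_needed : 0;
    *endian_mismatch = (max_mess_len < f->mess_len) ? -f->mess_len : 0;
    if (*endian_mismatch < 0) return SKETCH_BUFFER_TOO_SHORT;
    if (*num_groups < 0) return SKETCH_GROUPS_TOO_SHORT;
    if (f->mess_len > 0) memset(mess, 'x', (size_t)f->mess_len);
    *num_groups = f->groups_needed;
    return f->mess_len;
}
```

Starting from empty buffers, the loop needs one failed call to learn both sizes and a second call to receive the message. The check on grew keeps the sketch from falling into the endless loop that the text warns about when an application does not correct its buffers.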
If the application does not want to receive the entire data buffer or groups list, it has the option of calling SP_receive with the service_type parameter set to the DROP_RECV flag. When this is done, Spread will treat the message just like most networking systems do: it returns all the data and groups that fit in the available space and truncates the rest. It will still return an error value informing the application that it has lost data. In simple applications, or ones with relaxed or specialized requirements, this might be more useful than having to check for error values and retry the SP_receive.
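A truncating receive of this kind might look like the following C sketch. It is an illustration under assumptions rather than the library's own code: the sk_ and SK_ names are hypothetical stand-ins for the types and constants that a real program would take from sp.h, the callback type only mirrors the parameter order of SP_receive, and the fake receiver models nothing beyond the behavior described in this section. The sketch makes no claim about what num_groups or endian_mismatch contain after a truncated receive.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins so the sketch compiles without Spread installed.
 * A real program includes sp.h and uses its definitions instead. */
typedef int   sk_mailbox;
typedef int   sk_service;
typedef short sk_int16;
#define SK_MAX_GROUP_NAME   32
#define SK_DROP_RECV        0x01000000
#define SK_BUFFER_TOO_SHORT (-15)
#define SK_GROUPS_TOO_SHORT (-16)

/* Same parameter order as SP_receive. */
typedef int (*sk_receive_fn)(sk_mailbox mbox, sk_service *service_type,
                             char sender[SK_MAX_GROUP_NAME], int max_groups,
                             int *num_groups, char groups[][SK_MAX_GROUP_NAME],
                             sk_int16 *mess_type, int *endian_mismatch,
                             int max_mess_len, char *mess);

/* Receive one message and accept truncation instead of retrying.
 * Sets *lost to 1 when the call reports that data or groups were dropped.
 * Returns whatever the receive call returned. */
int sk_receive_lossy(sk_receive_fn receive, sk_mailbox mbox,
                     char groups[][SK_MAX_GROUP_NAME], int max_groups,
                     char *mess, int max_mess_len, int *lost)
{
    sk_service service_type = SK_DROP_RECV; /* ask for truncation */
    char sender[SK_MAX_GROUP_NAME];
    int num_groups = 0;
    int endian_mismatch = 0;
    sk_int16 mess_type = 0;
    int ret;

    assert(receive != NULL && lost != NULL);
    ret = receive(mbox, &service_type, sender, max_groups, &num_groups,
                  groups, &mess_type, &endian_mismatch, max_mess_len, mess);
    *lost = (ret == SK_BUFFER_TOO_SHORT || ret == SK_GROUPS_TOO_SHORT);
    return ret;
}

/* Fake receiver (illustration only): always delivers a 26-byte message. */
int sk_fake_receive(sk_mailbox mbox, sk_service *service_type,
                    char sender[SK_MAX_GROUP_NAME], int max_groups,
                    int *num_groups, char groups[][SK_MAX_GROUP_NAME],
                    sk_int16 *mess_type, int *endian_mismatch,
                    int max_mess_len, char *mess)
{
    static const char data[] = "abcdefghijklmnopqrstuvwxyz";
    int len = (int)sizeof data - 1;
    int drop = (*service_type & SK_DROP_RECV) != 0;

    (void)mbox; (void)sender; (void)max_groups; (void)groups;
    *num_groups = 0;
    *mess_type = 0;
    *endian_mismatch = 0;
    if (max_mess_len >= len) {
        memcpy(mess, data, (size_t)len);
        return len;                          /* whole message delivered */
    }
    if (drop)
        memcpy(mess, data, (size_t)max_mess_len); /* keep what fits */
    else
        *endian_mismatch = -len;             /* report size, return no data */
    return SK_BUFFER_TOO_SHORT;
}
```

With an 8-byte buffer the helper returns the error code, sets the lost flag, and leaves the first 8 bytes of the message in the buffer. The caller trades the delivery guarantees discussed earlier for a single call with no retry, which is why the error value still has to be checked.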