#### Finding Clustering Configurations to Accurately Infer Packet Structures from Network Data

##### Othman Esoul, Neil Walkinshaw

Clustering is often used to reverse engineer network protocols from captured network traces. The performance of clustering techniques often depends on the selection of various parameters, which can have a severe impact on clustering quality. In this paper we experimentally investigate the effect of four different parameters on the clustering of network traces. We also determine the optimal parameter configuration for traces from four different network protocols. Our results indicate that the choice of distance measure and the length of the message have the most substantial impact on cluster accuracy. Depending on the type of protocol, the $n$-gram length can also have a substantial impact.
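To make three of the parameters named above concrete, the following is a minimal sketch of how message length, $n$-gram length, and the distance measure could interact when comparing two messages. It is an illustration under stated assumptions, not the authors' implementation: the helper names `byte_ngrams` and `jaccard_distance`, the use of byte-level $n$-grams, and the choice of Jaccard distance are all hypothetical choices made for this example.

```python
from typing import Set


def byte_ngrams(message: bytes, n: int, max_len: int) -> Set[bytes]:
    """Return the set of byte n-grams in the first max_len bytes of a message.

    max_len stands in for the "message length" parameter and n for the
    n-gram length; both names are assumptions made for this sketch.
    """
    prefix = message[:max_len]
    return {prefix[i:i + n] for i in range(len(prefix) - n + 1)}


def jaccard_distance(a: Set[bytes], b: Set[bytes]) -> float:
    """Jaccard distance between two n-gram sets (one possible distance measure)."""
    if not a and not b:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(a | b)


if __name__ == "__main__":
    m1 = b"GET /index"
    m2 = b"GET /about"
    # Comparing only the first 4 bytes sees just the shared "GET " prefix.
    short = jaccard_distance(byte_ngrams(m1, 2, 4), byte_ngrams(m2, 2, 4))
    # Comparing all 10 bytes also sees the differing paths.
    full = jaccard_distance(byte_ngrams(m1, 2, 10), byte_ngrams(m2, 2, 10))
    print(short, full)
```

With only the first four bytes considered, the two messages are indistinguishable, whereas the full messages are placed further apart. This suggests why truncation length and $n$-gram length can change which messages a clustering algorithm groups together.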
