Core Concepts
The authors propose an online semantic-based clustering approach to monitor the life cycles of code errors in log data. They also introduce a novel metric for evaluating temporal log clusters, and report that their approach outperforms similar systems on an industrial dataset.
Abstract
Log analysis is crucial in software maintenance, which creates a need to summarize and monitor logs over time. The authors introduce a semantic-based clustering method to track defects efficiently. In collaboration with Software Reliability Engineers, they develop criteria for successful monitoring of log evolution. Their algorithm dynamically updates log clusters based on semantic representations of the logs, and they introduce a novel evaluation metric for the resulting temporal clusters. Experiments on industrial and public datasets show better performance than traditional methods.
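To make the idea of dynamically updated log clusters concrete, here is a minimal Python sketch of online clustering over log messages. It is not the authors' algorithm: the bag-of-words `embed` function is a toy stand-in for a real semantic encoder, and the `OnlineLogClusterer` class and its `max_distance` threshold are hypothetical names and values chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(message):
    """Toy stand-in for a semantic encoder (assumption): bag-of-words counts.

    Only alphabetic tokens are kept, so variable parts such as IDs are dropped.
    """
    return Counter(re.findall(r"[a-z]+", message.lower()))


def cosine_distance(a, b):
    """Cosine distance between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


class OnlineLogClusterer:
    """Assigns each incoming log to the nearest cluster or opens a new one."""

    def __init__(self, max_distance=0.3):
        # max_distance is a hypothetical threshold, not a value from the paper.
        self.max_distance = max_distance
        self.centroids = []  # summed vectors; cosine ignores their scale
        self.sizes = []

    def add(self, message):
        vec = embed(message)
        best, best_dist = None, float("inf")
        for idx, centroid in enumerate(self.centroids):
            dist = cosine_distance(vec, centroid)
            if dist < best_dist:
                best, best_dist = idx, dist
        if best is not None and best_dist <= self.max_distance:
            # Update the matched cluster in place as new logs arrive.
            self.centroids[best].update(vec)
            self.sizes[best] += 1
            return best
        # No cluster is close enough: start a new one.
        self.centroids.append(vec)
        self.sizes.append(1)
        return len(self.centroids) - 1


clusterer = OnlineLogClusterer()
first = clusterer.add("Connection timeout while contacting database node 12")
second = clusterer.add("Connection timeout while contacting database node 47")
third = clusterer.add("NullPointerException in payment handler")
```

In this sketch the two timeout messages land in the same cluster and the exception opens a second one. A real system would replace `embed` with a learned sentence encoder and would also need a policy for tracking clusters over time, neither of which is shown here.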
Statistics
From a private industrial dataset, we extracted two months’ worth of data, comprising 57,000 error logs.
The monitored applications generate around 100,000 logs per day, with about 1.5% related to errors and defects.
For the public Loghub datasets (HDFS_2, Linux, Zookeeper, OpenStack), we used samples of 2,000 logs each.
The algorithm's hyperparameters were set to θ = 0.05, α = 0.1, and γ = 100.
Quotes
"We suggest an online semantic-based clustering approach to error logs that dynamically updates the log clusters."
"Our solution outperforms similar systems when tested with an industrial dataset."
"We hope that our work encourages further temporal exploration in defect datasets."