Publication overview
2019
An Algorithm for Message Type Discovery in Unstructured Log Data
TOVARŇÁK, Daniel
Basic information
Original name
An Algorithm for Message Type Discovery in Unstructured Log Data
Authors
TOVARŇÁK, Daniel
Edition
Prague, Proceedings of the 14th International Conference on Software Technologies - Volume 1: ICSOFT, p. 665-676, 12 pp. 2019
Publisher
SciTePress
Other information
Language
English
Type of outcome
Proceedings paper
Country of publisher
Portugal
Confidentiality degree
is not subject to a state or trade secret
Publication form
electronic version available online
Marked to be transferred to RIV
Yes
RIV identification code
RIV/00216224:14610/19:00110676
Organization
Ústav výpočetní techniky – Repository – Repository
ISBN
978-989-758-379-7
ISSN
EID Scopus
Keywords in English
log abstraction; message type discovery; log management; logging; unstructured data
Links
EF16_019/0000822, research and development project.
Changed: 9/9/2020 05:52, RNDr. Daniel Jakubík
Abstract
In the original language
Log message abstraction is a common way of dealing with the unstructured nature of log data. It refers to the separation of the static and dynamic parts of a log message so that both parts can be accessed independently, allowing the message to be abstracted into a more structured representation. To facilitate this task, so-called message types and the corresponding matching patterns must first be discovered; only then can this pattern set be used to pattern-match individual log messages in order to extract dynamic information and impose some structure on them. Because the manual discovery of message types is a tiresome and error-prone process, we have focused our research on data mining algorithms that are able to discover message types in already generated log data. Since we have identified several deficiencies in the existing algorithms that limit their capabilities, we propose a novel algorithm for message type discovery that addresses these deficiencies.
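
To make the matching step described in the abstract concrete, here is a minimal Python sketch. It is not the algorithm proposed in the paper: it only illustrates how an already discovered pattern set might be applied to separate the static and dynamic parts of a message. The message types, regular expressions, field names, and sample log lines are all hypothetical and written by hand for illustration.

```python
import re

# Hypothetical message types, written by hand for illustration only.
# The static text is the literal part of each pattern; the named groups
# capture the dynamic parts.
MESSAGE_TYPES = {
    "user_login": re.compile(
        r"User (?P<user>\S+) logged in from (?P<ip>\S+)"
    ),
    "conn_closed": re.compile(
        r"Connection to (?P<host>\S+) closed after (?P<seconds>\d+) s"
    ),
}


def abstract_message(message):
    """Return (message_type, dynamic_fields) for the first matching pattern.

    Returns (None, {}) when no pattern in the set matches the whole message.
    """
    for type_name, pattern in MESSAGE_TYPES.items():
        match = pattern.fullmatch(message)
        if match:
            return type_name, match.groupdict()
    return None, {}


if __name__ == "__main__":
    for line in (
        "User alice logged in from 10.0.0.5",
        "Connection to db01 closed after 42 s",
        "Something the pattern set does not cover",
    ):
        print(abstract_message(line))
```

In this sketch the pattern set is fixed in advance; discovering such patterns automatically from already generated log data is the problem the paper addresses.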