Constructing attribute weights from computer audit data for effective intrusion detection

Wei Wang*, Xiangliang Zhang, Sylvain Gombault

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

41 Scopus citations


Attribute construction and selection from audit data is the first and a very important step in anomaly intrusion detection. In this paper, we present several cross frequency attribute weights to model user and program behaviors for anomaly intrusion detection. The frequency attribute weights include plain term frequency (TF) and various forms of term frequency-inverse document frequency (tfidf), referred to as Ltfidf, Mtfidf and LOGtfidf. Nearest Neighbor (NN) and k-NN methods with Euclidean and Cosine distance measures, as well as principal component analysis (PCA) and a Chi-square test method, are applied to these frequency attribute weights for anomaly detection. Extensive experiments are performed on command data from Schonlau et al. The test results show that the LOGtfidf weight gives better detection performance than plain frequency and the other types of weights. Using the LOGtfidf weight, the simple NN method and the PCA method achieve better masquerade detection results than the other 7 methods in the literature, while the Chi-square test consistently returns the worst results. The PCA method is suitable for fast intrusion detection because it reduces data dimensionality, while the NN and k-NN methods are suitable for detection on a small data set because they require no training process. An HTTP log data set collected in a real environment and the sendmail system call data from the University of New Mexico (UNM) are used as well, and the results also demonstrate the effectiveness of the LOGtfidf weight for anomaly intrusion detection.
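
The general idea in the abstract (weight command frequencies, then score a test block by its distance to the nearest training block) can be illustrated with a minimal sketch. The abstract does not give the definitions of Ltfidf, Mtfidf or LOGtfidf, so the code below assumes a generic textbook tf-idf with a smoothed idf term; it is not the paper's weighting. The function names, the toy command blocks and the choice to ignore commands unseen in training are all hypothetical.

```python
import math
from collections import Counter

def idf_weights(training_blocks, vocabulary):
    """Smoothed inverse document frequency per term (a textbook variant,
    assumed here; not one of the paper's Ltfidf/Mtfidf/LOGtfidf weights)."""
    n = len(training_blocks)
    weights = []
    for term in vocabulary:
        df = sum(1 for block in training_blocks if term in block)
        weights.append(math.log((1 + n) / (1 + df)) + 1.0)
    return weights

def tfidf_vector(block, vocabulary, idf):
    """Raw term count times idf; terms outside the vocabulary are ignored."""
    counts = Counter(block)
    return [counts[term] * w for term, w in zip(vocabulary, idf)]

def cosine_distance(u, v):
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def nn_anomaly_score(test_vec, train_vecs):
    """Nearest-neighbor score: distance to the closest training vector."""
    return min(cosine_distance(test_vec, t) for t in train_vecs)

# Toy "normal" command blocks for one user (made-up data).
train_blocks = [
    ["ls", "cd", "ls", "cat"],
    ["cd", "ls", "vi", "ls"],
    ["cat", "ls", "cd", "cd"],
]
vocabulary = sorted({cmd for block in train_blocks for cmd in block})
idf = idf_weights(train_blocks, vocabulary)
train_vecs = [tfidf_vector(b, vocabulary, idf) for b in train_blocks]

normal_block = ["ls", "cd", "cat", "ls"]    # same command mix as training
odd_block = ["nc", "chmod", "vi", "vi"]     # mostly unseen commands

score_normal = nn_anomaly_score(tfidf_vector(normal_block, vocabulary, idf), train_vecs)
score_odd = nn_anomaly_score(tfidf_vector(odd_block, vocabulary, idf), train_vecs)
print(score_normal, score_odd)
```

A block would be flagged as anomalous when its score exceeds a chosen threshold. Because the score is computed directly against stored training vectors, no model is fitted in advance, which matches the abstract's remark that NN and k-NN need no training process.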

Original language: English (US)
Pages (from-to): 1974-1981
Number of pages: 8
Journal: Journal of Systems and Software
Issue number: 12
State: Published - Dec 2009
Externally published: Yes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2023-09-21

Keywords


  • Chi-square
  • Distance measures
  • Intrusion detection
  • Masquerade detection
  • Principal component analysis
  • k-Nearest neighbor

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture

