An Improved Model for Web Usage Mining and Web Traffic Analysis
Abstract
The World Wide Web (WWW) continues to grow, both in the volume of information served by web servers and in the number of requests from web users. Providing web administrators with meaningful information about users’ access behaviour and usage patterns has become a necessity for improving the quality of web information services. Existing models do not make use of fully detailed web log data collected over a long period. There is a need for a model that analyses usage patterns across different aspects of log files collectively and over a longer duration. In this paper, web log data was collected from the Information and Telecommunications Unit of Obafemi Awolowo University, Ile-Ife. The data was studied comprehensively to identify the most important input variables for the web usage mining model and for web traffic analysis. An improved web mining model was designed using the Unified Modelling Language (UML). The model was simulated in the Waikato Environment for Knowledge Analysis (WEKA) software using the Naïve Bayes classifier. The performance of the simulated model was validated using the following metrics: accuracy, recall, precision, true positive rate, false positive rate, and ROC area. The model had a precision of 0.810, meaning that 81% of the Naïve Bayes classifier’s predictions matched the original class. The area under the ROC curve had a minimum value of 0.993, which indicates that the level of bias attributed to the predictions made by the Naïve Bayes classifier is 0.7% of all predictions.
Full Text: PDF DOI: 10.15640/jcsit.v6n1a5
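The validation metrics named in the abstract can be illustrated with a minimal Python sketch that derives precision, recall (the true positive rate), false positive rate, and accuracy from the confusion counts of a single class. The `ConfusionCounts` type, the function names, and the counts in the usage note are hypothetical and chosen only for illustration; they are not taken from the paper or from WEKA's output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Per-class confusion counts (hypothetical container, not a WEKA type)."""
    tp: int  # instances of the class predicted as the class
    fp: int  # instances of other classes predicted as the class
    fn: int  # instances of the class predicted as another class
    tn: int  # instances of other classes predicted as another class


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 instead of raising when a metric is undefined (empty denominator).
    return numerator / denominator if denominator else 0.0


def precision(c: ConfusionCounts) -> float:
    # Share of predictions for the class that are correct.
    return _ratio(c.tp, c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    # Share of the class's instances that were found; identical to the true positive rate.
    return _ratio(c.tp, c.tp + c.fn)


def false_positive_rate(c: ConfusionCounts) -> float:
    # Share of other-class instances wrongly assigned to the class.
    return _ratio(c.fp, c.fp + c.tn)


def accuracy(c: ConfusionCounts) -> float:
    # Share of all instances classified correctly.
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)
```

As a usage example with invented counts, `ConfusionCounts(tp=81, fp=19, fn=9, tn=891)` gives a precision of 81 / (81 + 19) = 0.81: of every 100 instances assigned to the class, 81 actually belong to it. The same counts give a recall of 81 / 90 = 0.9.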