Welcome to Mozkito

Mozkito is a general-purpose framework that lets developers and data miners mine their own software archives effectively, without writing the same mining scripts over and over again.

The Brief

Mozkito is not a full-fledged mining tool that supports everything out of the box. Instead, our goal is to provide a uniform platform for many state-of-the-art approaches and techniques for mining version archives. Its modular architecture makes it easy for users, miners, and researchers to extend and improve. With Mozkito, we explicitly target researchers: we provide a tool set of standard mining techniques, and also a platform that allows them to make newly developed (and possibly published) mining approaches reproducible. Read More

Featured Articles

  • How to mine a version control system? The Mozkito versions tool provides an executable that can be used to mine individual version control systems. Read more
  • How to mine a bug database? To demonstrate how to use mozkito-issues to mine bug databases, we will mine the Mozilla bug tracker for the project Rhino—a JavaScript engine written in Java. Read more
  • How to merge persons? We advise Mozkito users to merge persons. Multiple Mozkito tools model person objects. Unfortunately, software repositories often use different user databases. Read more

Most Recent Publications

It’s not a Bug, it’s a Feature: On the Data Quality of Bug Databases @ICSE_2013

In a manual examination of more than 7,000 issue reports, we found 33.8% of all issue reports to be misclassified. This misclassification introduces bias in bug prediction models, confusing bugs and features: On average, 39% of files marked as defective actually never had a bug. This study was carried out using the Mozkito framework. [read more...]
 

Most Recent Posts

The Persistence Module

The mozkito-persistence module is the most fundamental module of Mozkito. It provides the mechanisms and functionality to persist Mozkito model objects into a relational database and to load them back again. [read more...]

How to merge persons?

We advise Mozkito users to merge persons. Multiple Mozkito tools model person objects. Unfortunately, software repositories often use different user databases. Thus, many developers and users being active in both systems ... [read more...]
 

Motivation

These citations express our motivation to develop and use Mozkito as a mining framework. Using Mozkito provides the opportunity to share the data sets and source code of mining approaches, and it allows studies to be replicated even if the data sets cannot be published.

Robles, Gregorio. 2010. “Replicating MSR: A Study of the Potential Replicability of Papers Published in the Mining Software Repositories Proceedings.” In 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010), 171–180. IEEE.
We have analyzed the papers that contained any experimental analysis of software projects for their potentiality of being replicated. In this regard, three main issues have been addressed: i) the public availability of the data used as case study, ii) the public availability of the processed dataset used by researchers and iii) the public availability of the tools and scripts. A total number of 171 papers have been analyzed from the six workshops/working conferences up to date. Results show that MSR authors use in general publicly available data sources, mainly from free software repositories, but that the amount of publicly available processed datasets is very low.