Abstract

The Multiple Instance Regression (MIR) problem arises when a data set is a collection of bags, where each bag contains multiple instances that share the same real-valued label. The goal is to train a regression model that can accurately predict the label of an unlabeled bag. Many remote sensing applications can be studied within this setting. We propose a novel probabilistic framework for MIR that represents bag labels with a mixture model. It is based on the assumption that each bag contains a prime instance that is responsible for the bag label. An expectation-maximization algorithm is proposed to maximize the likelihood of the mixture model. The mixture model MIR framework is flexible, and several existing MIR algorithms can be described as its special cases. The proposed algorithms were evaluated on synthetic data and on remote sensing data for aerosol retrieval and crop yield prediction. The results show that the proposed MIR algorithms achieve higher accuracy than previous state-of-the-art methods.
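To make the prime-instance idea concrete, here is a minimal sketch of one way such a mixture model could be fit with expectation-maximization. It is not the authors' implementation. It assumes a linear predictor, Gaussian noise with a single shared variance, and a uniform prior over which instance in a bag is the prime instance. The function names `fit_prime_instance_em` and `predict_bag` are hypothetical, and predicting a new bag by averaging instance predictions is likewise an assumption of this sketch.

```python
import numpy as np


def fit_prime_instance_em(bags, labels, n_iter=50, sigma2_floor=1e-8):
    """Illustrative EM for a prime-instance mixture (assumptions: linear
    predictor, shared Gaussian noise variance, uniform prime-instance prior).

    bags   : list of non-empty arrays, each of shape (n_b, d)
    labels : array of shape (B,), one real-valued label per bag
    """
    X = np.vstack(bags)                                # all instances, (N, d)
    sizes = np.array([len(b) for b in bags])
    bag_id = np.repeat(np.arange(len(bags)), sizes)    # bag index per instance
    y = np.asarray(labels, dtype=float)[bag_id]        # bag label per instance
    log_prior = -np.log(sizes)[bag_id]                 # uniform prior 1 / n_b

    resp = np.exp(log_prior)                           # start from the prior
    for _ in range(n_iter):
        # M-step: weighted least squares with responsibilities as weights.
        sw = np.sqrt(resp)
        w, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        res2 = (y - X @ w) ** 2
        sigma2 = max(np.sum(resp * res2) / len(bags), sigma2_floor)

        # E-step: posterior probability that each instance is its bag's
        # prime instance, normalized within the bag (shifted for stability).
        log_r = log_prior - res2 / (2.0 * sigma2)
        bag_max = np.full(len(bags), -np.inf)
        np.maximum.at(bag_max, bag_id, log_r)
        r = np.exp(log_r - bag_max[bag_id])
        resp = r / np.bincount(bag_id, weights=r, minlength=len(bags))[bag_id]
    return w, sigma2, resp


def predict_bag(bag, w):
    """Predict a bag label as the mean instance prediction (an assumption)."""
    return float(np.mean(np.asarray(bag) @ w))
```

In this sketch, a bag with a single instance always gives that instance a responsibility of 1, so a data set of single-instance bags reduces to ordinary least squares. A bias term can be included by appending a constant column to every bag.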

The datasets used in the experiments are available for download.

Paper citation

Please cite as:
Wang, Z., Lan, L., Vucetic, S., Mixture Model for Multiple Instance Regression and Applications in Remote Sensing, 2011.