This paper presents a novel preprocessing method for color-to-gray document image conversion. In contrast to conventional methods, which are designed for natural images and aim to preserve the contrast between different classes in the converted gray image, the proposed method reduces the contrast (i.e., the intensity variance) within the text class as much as possible. It is based on learning a linear filter from a predefined data set of text and background pixels such that the filter: 1) minimizes the output response when applied to background pixels and 2) maximizes the output response when applied to text pixels, while minimizing the intensity variance within the text class. The proposed method, called learning-based color-to-gray, is intended as a preprocessing step for document image binarization. A data set of 46 historical document images is created and used to evaluate the proposed method both subjectively and objectively. The results demonstrate the effectiveness of the method and its substantial impact on the performance of state-of-the-art binarization methods. Four other Web-based image data sets are created to evaluate the scalability of the proposed method.
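To make the learning step concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes that the filter is a single weight vector applied to each pixel's RGB values, and that the two objectives can be approximated by ridge-regularized least squares with a target response of 1 for every text pixel and 0 for every background pixel. Pushing all text responses toward the same constant raises them relative to the background and, at the same time, shrinks their variance. The names `learn_filter`, `color_to_gray`, and `ridge` are hypothetical.

```python
import numpy as np


def learn_filter(text_pixels, background_pixels, ridge=1e-6):
    """Fit a linear color filter w by regularized least squares.

    Assumption (not from the paper): text pixels are regressed onto the
    constant target 1 and background pixels onto the target 0.
    Both inputs are arrays of shape (n_pixels, 3) with values in [0, 1].
    """
    text = np.asarray(text_pixels, dtype=float)
    back = np.asarray(background_pixels, dtype=float)

    # Stack both classes into one design matrix with matching targets.
    X = np.vstack([text, back])
    y = np.concatenate([np.ones(len(text)), np.zeros(len(back))])

    # Solve the normal equations (X^T X + ridge * I) w = X^T y.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)


def color_to_gray(image, w):
    """Apply the learned filter to an (H, W, 3) image.

    Returns an (H, W) response map clipped to [0, 1]; high values
    indicate text-like pixels under the assumptions above.
    """
    response = np.asarray(image, dtype=float) @ w
    return np.clip(response, 0.0, 1.0)
```

In use, `learn_filter` would be called once on labeled text and background samples, and `color_to_gray` would then be applied to each document image before it is passed to a binarization method. A single weight vector without a bias term can only separate the two classes when their mean colors are not proportional to each other, which is one reason this sketch should be read as an illustration of the idea rather than as a substitute for the optimization described in the paper.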