Probabilistic programming provides a structured approach to signal processing algorithm design. The design task is formulated as a generative model, and the algorithm is derived through automatic inference. Efficient inference is a major challenge; e.g., the Shafer-Shenoy algorithm (SS) performs badly on models with large treewidth, which arise from various real-world problems. We focus on reducing the size of discrete models with large treewidth by storing intermediate factors in compressed form, thereby decoupling the variables through conditioning on introduced weights.This work proposes pruning of these weights using Kullback-Leibler divergence. We adapt a strategy from the Gaussian mixture reduction literature, leading to Kullback-Leibler Tensor Belief Propagation (KL-TBP), in which we use agglomerative hierarchical clustering to subsequently merge pairs of weights. Experiments using benchmark problems show KL-TBP consistently achieves lower approximation error than existing methods with competitive runtime.