EntropyLong: Efficient Long-Context Training via Uncertainty Prediction
EntropyLong: An Information-Theoretic Approach to Long-Text Training Data

EntropyLong is a long-text data construction method based on predictive uncertainty. It identifies locations of missing information by measuring a model’s prediction entropy, retrieves relevant distant context, and verifies whether this context reduces uncertainty, ensuring that the training data contains genuine long-range dependencies.
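The measure-and-verify idea can be illustrated with a minimal Python sketch. It is not the EntropyLong implementation: the function names, the entropy threshold, and the minimum entropy drop are all hypothetical, the next-token distributions are toy values rather than model outputs, and the retrieval step is left out entirely. It only shows how Shannon entropy could flag uncertain positions and how a candidate context might be accepted when it lowers that entropy.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def high_entropy_positions(dists: Sequence[Sequence[float]],
                           threshold: float) -> List[int]:
    """Indices whose predictive entropy exceeds `threshold`.

    `threshold` is an assumed hyperparameter, not a value from the source.
    """
    return [i for i, d in enumerate(dists) if entropy(d) > threshold]


def context_reduces_uncertainty(before: Sequence[float],
                                after: Sequence[float],
                                min_drop: float) -> bool:
    """True if adding context lowers entropy by at least `min_drop` nats.

    `before` and `after` are the distributions at the same position
    without and with the candidate context; `min_drop` is an assumption.
    """
    return entropy(before) - entropy(after) >= min_drop


# Toy distributions over a 4-token vocabulary (illustrative values only).
uniform = [0.25, 0.25, 0.25, 0.25]   # model is maximally unsure
peaked = [0.97, 0.01, 0.01, 0.01]    # model is nearly certain

flagged = high_entropy_positions([peaked, uniform, peaked], threshold=1.0)
keep_context = context_reduces_uncertainty(uniform, peaked, min_drop=0.5)
```

In a real pipeline the distributions would presumably come from a language model's softmax output at each position, and the `after` distribution would be recomputed with the retrieved context prepended to the input.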