Partial Retraining Without Forgetting: US AI Researchers Open the Door to Cheaper Model Training

Source: CIO Magazine

A recent study finds that the phenomenon of 'catastrophic forgetting' during the fine-tuning of large AI models may be due to temporary biases in the model's outputs rather than an actual loss of knowledge. By selectively retraining only specific components, such as the self-attention and MLP layers, researchers demonstrated that models could retain existing abilities while acquiring new ones. The method showed promising results in reducing retraining costs and improving model stability, particularly for multimodal models such as LLaVA and Qwen 2.5-VL.
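The selective-retraining idea described above can be sketched as a simple parameter filter: freeze everything except the targeted components. The layer names below follow common transformer naming conventions and are illustrative assumptions, not identifiers from the study itself.

```python
# Hypothetical sketch of selective fine-tuning: keep only self-attention
# and MLP parameters trainable, freezing the rest. Parameter names are
# illustrative (typical transformer naming), not from the paper.

def select_trainable(param_names, patterns=("self_attn", "mlp")):
    """Return the parameter names that should remain trainable."""
    return [n for n in param_names if any(p in n for p in patterns)]

# Example parameter list for one transformer block plus embeddings/head.
params = [
    "embed_tokens.weight",
    "layers.0.self_attn.q_proj.weight",
    "layers.0.mlp.gate_proj.weight",
    "layers.0.input_layernorm.weight",
    "lm_head.weight",
]

trainable = select_trainable(params)
# Only the self-attention and MLP weights are selected; embeddings,
# layer norms, and the output head stay frozen during fine-tuning.
print(trainable)
```

In a real training framework, the same filter would drive which parameters receive gradient updates (for example, by disabling gradients on every parameter whose name does not match), so the bulk of the model's weights, and its existing knowledge, are left untouched.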

The implications of this research are significant for enterprises, potentially redefining AI model maintenance strategies. Analysts believe the approach could alleviate the long-standing problem of catastrophic forgetting, enabling more efficient updates without performance degradation. However, they caution that further validation across diverse environments is essential to establish its robustness in practical applications.

Adopting such strategies will require rigorous oversight and clear workflows from enterprises to ensure consistency and efficacy during partial retraining. Overall, the findings point to a promising route for AI developers, enabling smaller, more frequent updates to complex models without retraining from scratch.

👉 Read the original: CIO Magazine