
A Simple Demo of Continual Learning of LLMs

  • Apr 25, 2024
  • 3 min read

Updated: Nov 12, 2025


I Introduction

Background, Objective, and Challenges


Background

Large Language Models (LLMs) have developed rapidly and can solve a wide range of problems. However, frequently re-training LLMs is not feasible due to the high training cost. Hence, continual learning is needed to keep extending the capabilities of LLMs.


Objective

Automatically crawl new data from the web to fine-tune the model.


Function

We want the LLM to have conversational skills and a certain ability to remember past conversations.


Feedback

A feedback loop can be applied to enhance its overall abilities and reduce catastrophic forgetting.


Challenges

  1. OpenAI API — Challenge: our API access was deactivated. Solution: use the open-source Llama model instead.

  2. Crawling data — Challenge: difficulty in crawling Quora’s data directly. Solution: use a crawling API.

  3. Training data format — Challenge: a plain question-and-answer format cannot support multi-turn dialogue. Solution: use a chat template format for causal language modeling.

  4. Perplexity — Challenge: the existing evaluation code reloaded the model on every evaluation, which is time-consuming. Solution: rewrite the code to load the model once.


II Implementation

Data Source

Raw data: We crawled questions and answers from the technology topic on Quora (https://www.quora.com/topic/Technology).

Pre-processed data: First, we filtered out questions without answers. Then, we transformed the remaining Q&A pairs into chat format, where the “user” asks questions and the “assistant” provides answers.

Training data: We downloaded Quora question-and-answer datasets from Hugging Face; the first 5 of them were used for training.

Features: question, answer_text, upvotes, comment_count, answer_shared, answer_date
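As a minimal sketch of the pre-processing step (the record fields follow the features listed above; the function name `to_chat_examples` and the exact filtering rule are our assumptions, not the project’s actual code):

```python
import json

def to_chat_examples(records):
    """Filter out unanswered questions and convert the rest to chat format."""
    examples = []
    for rec in records:
        answer = (rec.get("answer_text") or "").strip()
        if not answer:  # drop questions without answers
            continue
        examples.append({
            "messages": [
                {"role": "user", "content": rec["question"]},
                {"role": "assistant", "content": answer},
            ]
        })
    return examples

raw = [
    {"question": "What is continual learning?",
     "answer_text": "Training a model on new data over time."},
    {"question": "An unanswered question?", "answer_text": ""},
]
chat_data = to_chat_examples(raw)
print(json.dumps(chat_data, indent=2))
```

Each resulting `messages` list can then be rendered for causal-LM training with a tokenizer’s chat template.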




Code

Data Processing Loop

Crawl data from quora.com via the API and output it in .json format.


prepare_data.py Converts the raw data into the format we need.


summarize.py Uses an LLM to summarize the raw text into a high-quality dataset.



Training and Evaluation Loop

cl_trainer.py Feed new datasets to the model continually.


eval_metrics.py Perplexity is used to evaluate the performance of our model. 


demo.py A simple demo to chat with the trained models.
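Since reloading the model on every evaluation was the bottleneck noted in the Challenges section, the evaluation code can cache the model after the first load. A minimal sketch of this load-once pattern, with a hypothetical `load_model` standing in for the real `transformers` loader:

```python
_MODEL_CACHE = {}

def load_model(name):
    # Hypothetical loader; the real code would call from_pretrained() here.
    load_model.calls += 1
    return f"<model:{name}>"
load_model.calls = 0

def get_model(name):
    """Load the model once and reuse it for every subsequent evaluation."""
    if name not in _MODEL_CACHE:
        _MODEL_CACHE[name] = load_model(name)
    return _MODEL_CACHE[name]

for _ in range(5):            # five evaluation rounds
    model = get_model("gpt2")
```

The expensive load happens exactly once, no matter how many evaluation rounds run.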



III Results

Results and evaluation


Train We use GPT-2 as our pre-trained model (total parameters: 124M).


  • Train Device: RTX 4090

  • To study the model’s ability to continually learn new knowledge, we choose 5 parts from the dataset “quora_qa” and feed them to the model sequentially.


  • We use a LoRA adapter as the trainable parameters and merge the adapter only after training on the final stream.


  • Each stream is trained for 3 epochs before the next stream arrives.
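The staged schedule above (5 sequential streams, 3 epochs each) can be sketched as follows. This is a minimal sketch: `split_streams` and `train_one_stream` are hypothetical names, and the training call is a stub for the real LoRA fine-tuning step in cl_trainer.py:

```python
def split_streams(dataset, n_streams):
    """Split a dataset into n_streams contiguous parts, in arrival order."""
    base, rem = divmod(len(dataset), n_streams)
    streams, start = [], 0
    for i in range(n_streams):
        end = start + base + (1 if i < rem else 0)
        streams.append(dataset[start:end])
        start = end
    return streams

def train_one_stream(stream, epochs):
    # Stub: the real code would run LoRA fine-tuning on this stream here.
    return {"examples_seen": len(stream) * epochs}

dataset = list(range(23))  # toy dataset of 23 examples
log = [train_one_stream(s, epochs=3) for s in split_streams(dataset, 5)]
```

Each stream is trained to completion before the next one arrives, mirroring the continual-learning setup described above.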




Evaluation


Perplexity

PPL = exp( −(1/N) · Σₙ₌₁ᴺ log p(wₙ | w₁, …, wₙ₋₁) )

where wₙ represents the n-th character of the input text, and N is the length of the input text.
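As a worked check of this definition, a minimal pure-Python sketch that computes perplexity from per-token conditional probabilities (the function name is ours):

```python
import math

def perplexity(token_probs):
    """PPL = exp(-(1/N) * sum of log p(w_n | w_1..w_{n-1}))."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# A model that assigns uniform probability 1/4 to each of 4 tokens
ppl = perplexity([0.25, 0.25, 0.25, 0.25])  # ≈ 4.0
```

A uniform model over a vocabulary of size V has perplexity V, which is why lower perplexity indicates a better-fit model.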






[Figure: perplexity across the data streams after each training stage (x-axis: Data Stream). Recovered values, grouped by stage:

Stage 1: 470.7
Stage 2: 420.4, 434.1
Stage 3: 230.7, 233.7, 237.3
Stage 4: 374.3, 453.1, 441.0, 460.0
Stage 5: 185.7, 178.1, 196.1, 202.7, 205.3]




Model Generation




High-Level Insight

Continual Acquiring Knowledge

By training the LLM on a sequential data stream with LoRA updates, the model is able to continually acquire new knowledge.


Scale Matters

The generation quality is limited by the scale of the model (GPT-2). To fully exploit the advantages of scaling laws, a larger model is needed.




IV Conclusion

Conclusion and future work

Future Direction

  1. Memory augmentation: This enhancement could potentially enable the LLM to retain and utilize past interactions more effectively.

  2. Pre-training methods: DAP-training incorporates a novel soft-masking mechanism and a new proxy to preserve general knowledge, effectively integrating new and existing knowledge while preventing catastrophic forgetting.

  3. Against catastrophic forgetting: MemoryBank allows the LLM to mimic human memory, adapting to user personalities and selectively reinforcing or forgetting details over time.


References


Data: 


Model:


Framework/package: 

  • PyTorch, peft, transformers, crawlbase


Literature:





 
 
 
