Task 1: Commonsense Dialogue Response Generation | Kaihua Ni
Kaihua Ni, an alumnus of the University of Leeds with a major in Artificial Intelligence, has extensive experience in AI and deep learning algorithms, having worked at major companies such as Augmentum and CareerBuilder. His expertise lies in natural language processing and the nuances of human conversation.
Winning Strategy:
Kaihua’s strategy for the Commonsense Persona-Grounded Dialogue Challenge (CPDC) 2024 was rooted in a two-pronged approach, combining fine-tuning of a large language model (LLM) with expert prompt engineering:
- Fine-Tuning the LLM:
Utilizing transfer learning techniques, Kaihua adapted the pretrained parameters of the LLM to the specific conversational style and knowledge domain of the persona being emulated. The model was fine-tuned on a curated dataset of dialogues, written works, and other textual representations of the persona, significantly enhancing its ability to mimic the persona’s syntactic and semantic patterns (see the first sketch after this list).
- Prompt Engineering:
Kaihua crafted optimized prompts that encapsulated the context of the conversation while embedding subtle cues aligned with the persona’s characteristics. This steered the model toward responses that were contextually relevant and infused with the persona’s idiosyncratic communication style (see the prompt-template sketch after this list).
- Advanced NLP Techniques:
Attention mechanisms and context window adjustments were employed to maintain coherence and context retention across multi-turn dialogues (see the context-window sketch after this list). Kaihua also developed a custom evaluation metric aligned with the challenge’s criteria to iteratively assess and refine the model’s performance.
- Ethical Considerations:
Ethical aspects were critically addressed, ensuring that the AI’s mimicry respected the privacy and dignity of the individual. Kaihua implemented strict boundaries on the use of personal information and incorporated safeguards against generating inappropriate or harmful content (see the filtering sketch after this list).
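The snippet below is a minimal sketch of the fine-tuning step described under “Fine-Tuning the LLM”, assuming the Hugging Face Transformers and Datasets libraries; the base model (`gpt2`), the toy persona corpus, and the hyperparameters are placeholders rather than the winning configuration.

```python
# Minimal fine-tuning sketch with Hugging Face Transformers. "gpt2", the toy
# dialogue, and the hyperparameters are placeholders, not the winning setup.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "gpt2"  # stand-in for the actual LLM used in the submission
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Curated persona corpus: dialogues and other texts written as the persona.
corpus = [
    {"text": "Persona: I am a retired sailor who restores old boats.\n"
             "User: Any plans this weekend?\n"
             "Persona: Down at the harbour, same as every weekend."},
]
train_set = Dataset.from_list(corpus).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="persona-ft", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=train_set,
    # Causal LM objective: labels are the input tokens themselves.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```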
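For the prompt-engineering prong, the following sketch shows one way persona cues can be embedded directly in the generation context. The template wording, fields, and style hint are invented for illustration; the actual prompts used in the submission are not reproduced here.

```python
# Illustrative prompt template for persona-conditioned generation.
PROMPT_TEMPLATE = """You are role-playing the following persona:
{persona_facts}

Stay in character: keep replies {style_hint}, and ground them in the
persona facts above whenever relevant.

Conversation so far:
{dialogue_history}
Persona:"""

def build_prompt(persona_facts, dialogue_history, style_hint="brief and warm"):
    """Assemble a generation prompt that embeds persona cues in the context."""
    return PROMPT_TEMPLATE.format(
        persona_facts="\n".join(f"- {fact}" for fact in persona_facts),
        dialogue_history="\n".join(dialogue_history),
        style_hint=style_hint,
    )

print(build_prompt(
    ["I am a retired sailor.", "I restore old boats on weekends."],
    ["User: Any plans this weekend?"],
))
```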
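The context-window handling mentioned under “Advanced NLP Techniques” can be illustrated by a simple policy that always keeps the persona block and drops the oldest turns first. The token budget and tokenizer choice below are assumptions, not Kaihua’s actual settings.

```python
# Sketch of a context-window policy for multi-turn dialogue: the persona
# block is always retained and older turns are dropped first.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder tokenizer

def fit_context(persona_block, turns, max_tokens=1024):
    """Return persona + as many recent turns as fit in the token budget."""
    budget = max_tokens - len(tokenizer.encode(persona_block))
    kept = []
    for turn in reversed(turns):          # walk from the newest turn backwards
        cost = len(tokenizer.encode(turn))
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    return persona_block + "\n" + "\n".join(reversed(kept))
```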
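As a rough illustration of the safeguards described under “Ethical Considerations”, the sketch below redacts common personal identifiers and refuses responses matching a blocklist. The patterns and blocklist terms are placeholders, not the safeguards actually deployed.

```python
# Minimal post-generation safeguard sketch: redact personal identifiers and
# block clearly unsafe content. Patterns and terms are illustrative only.
import re

PII_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),    # e-mail addresses
    re.compile(r"\b(?:\+?\d[\s-]?){7,15}\b"),      # phone-like numbers
]
BLOCKLIST = {"violent threat", "self-harm instructions"}  # placeholder terms

def sanitize(response: str) -> str:
    """Redact PII and refuse responses that match unsafe terms."""
    if any(term in response.lower() for term in BLOCKLIST):
        return "I'd rather not go into that."
    for pattern in PII_PATTERNS:
        response = pattern.sub("[redacted]", response)
    return response
```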
Task 2: Commonsense Persona Knowledge Linking | Iris Lin
Iris Lin, an experienced Machine Learning Engineer, specializes in developing and optimizing large-scale machine learning models. With a robust background in deep learning, natural language processing, and recommendation systems, Iris is proficient in Scala, Python, Java, Spark, PyTorch, TensorFlow, and various machine learning libraries.
Winning Strategy:
Team biu_biu, managed by Iris Lin, employed a comprehensive approach to developing a commonsense persona knowledge linker for CPDC Task 2, leveraging cutting-edge techniques and tools:
1. Baseline Evaluation with ComFact Model:
The team began by running the provided ComFact baseline model on the hidden test set to pinpoint its limitations and identify areas for improvement in linking persona commonsense facts to dialogue contexts.
2. Dataset Curation and Enhancement:
The team merged the ConvAI2 and PeaCoK datasets to refine the training process, focusing on conversations that incorporate persona knowledge and ensuring the model was trained on highly relevant data. They also employed GPT-3.5-Turbo to create a synthetic dataset by labelling persona facts in 20,000 conversations, providing a diverse and extensive training foundation (see the labelling sketch after this list).
3. Model Fine-Tuning with Deberta-V3:
The team fine-tuned DeBERTa-v3, a model known for its strong performance across NLP tasks. A rigorous hyperparameter search was conducted to optimize performance, particularly in capturing the nuances of persona knowledge linking (see the fine-tuning sketch after this list).
4. Comprehensive Model Evaluation:
The model was rigorously evaluated under two settings: predicting head and tail facts separately, and predicting them simultaneously. This dual-testing strategy allowed the team to thoroughly assess the model’s versatility and pinpoint areas for further improvement (see the evaluation sketch after this list).
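The synthetic labelling in step 2 can be sketched with the OpenAI Python client as below. The prompt wording, the yes/no parsing, and the zero-temperature setting are assumptions for illustration, and an OPENAI_API_KEY must be available in the environment.

```python
# Sketch of synthetic relevance labelling with GPT-3.5-Turbo.
# Prompt wording and output parsing are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

LABEL_PROMPT = (
    "Dialogue context:\n{context}\n\n"
    "Persona fact: {fact}\n\n"
    "Is this persona fact relevant to the last utterance? Answer yes or no."
)

def label_fact(context: str, fact: str) -> bool:
    """Ask GPT-3.5-Turbo for a relevance label on one (context, fact) pair."""
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": LABEL_PROMPT.format(context=context, fact=fact)}],
        temperature=0,
    )
    return reply.choices[0].message.content.strip().lower().startswith("yes")
```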
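Step 3 can be illustrated with a small sketch that fine-tunes microsoft/deberta-v3-base as a (context, fact) relevance classifier and runs an Optuna-backed search through Trainer.hyperparameter_search. The toy rows, search space, and trial count are assumptions rather than the team’s actual configuration.

```python
# Sketch of DeBERTa-v3 fine-tuning plus hyperparameter search.
# Requires the `optuna` package for Trainer.hyperparameter_search.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

CHECKPOINT = "microsoft/deberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)

# Each toy row pairs a dialogue context with a candidate persona fact.
rows = [
    {"context": "I spend every weekend at the harbour.",
     "fact": "This person enjoys sailing.", "label": 1},
    {"context": "I spend every weekend at the harbour.",
     "fact": "This person is afraid of water.", "label": 0},
]
data = Dataset.from_list(rows).map(
    lambda ex: tokenizer(ex["context"], ex["fact"], truncation=True),
)

def model_init():
    # A fresh model is created for every search trial.
    return AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)

trainer = Trainer(
    model_init=model_init,
    args=TrainingArguments(output_dir="deberta-linker", num_train_epochs=1),
    train_dataset=data,
    eval_dataset=data,                  # toy reuse; use a held-out split in practice
    data_collator=DataCollatorWithPadding(tokenizer),
)

# Search learning rate and batch size; the objective defaults to eval loss.
best = trainer.hyperparameter_search(
    hp_space=lambda trial: {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-5, log=True),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16]),
    },
    n_trials=2,
    direction="minimize",
)
print(best)
```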
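Finally, the dual evaluation in step 4 amounts to scoring predictions on head facts, tail facts, and the combined set. The toy records and the use of scikit-learn’s F1 score below are illustrative assumptions, not the challenge’s official metric implementation.

```python
# Sketch of the dual evaluation setting: F1 on head facts, tail facts,
# and both together. The records here are toy placeholders.
from sklearn.metrics import f1_score

records = [
    {"gold": 1, "pred": 1, "type": "head"},
    {"gold": 0, "pred": 1, "type": "head"},
    {"gold": 1, "pred": 1, "type": "tail"},
    {"gold": 0, "pred": 0, "type": "tail"},
]

def f1_for(subset):
    """Binary F1 over a subset of (gold, pred) records."""
    gold = [r["gold"] for r in subset]
    pred = [r["pred"] for r in subset]
    return f1_score(gold, pred)

print("head-only :", f1_for([r for r in records if r["type"] == "head"]))
print("tail-only :", f1_for([r for r in records if r["type"] == "tail"]))
print("combined  :", f1_for(records))
```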