Winner’s Solution Overview: KDD Cup 2024 - Team NVIDIA

Team NVIDIA, a group of data scientists and technologists, brought diverse skills to the KDD Cup 2024. Key members include Gilberto, a former #1-ranked Kaggle competitor with a background in electrical engineering; Chris, who holds a Ph.D. in computational science and mathematics and has worked across a range of industries; Benedikt Schifferer, a manager of Applied Research specializing in recommender systems and LLMs; Ivan Sorokin, a senior LLM technologist; Ahmet Erdem, a Kaggle Grandmaster and open-source contributor; and Simon, a senior LLM technologist focused on deep learning for computer vision and NLP.

Winning Strategy:

Team NVIDIA’s strategy for the KDD Cup 2024 centered on five fine-tuned Qwen2-72B models, one for each of the competition’s five tracks, combining cutting-edge techniques with substantial computational resources:

  1. Data Transformation and Model Training:

• The team transformed data from six public datasets, including Amazon-M2 and MMLU, into 500k question-answer pairs across 40 tasks and 5 task types.

• They fine-tuned multiple Qwen2-72B models with QLoRA on nodes of 8x NVIDIA A100 80GB GPUs, using the Axolotl training framework with DeepSpeed for efficiency.
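An Axolotl-style QLoRA run is typically described in a YAML config. The fragment below is an illustrative sketch only; apart from the base model name, every value here is a hypothetical placeholder, not the team’s actual settings:

```yaml
base_model: Qwen/Qwen2-72B-Instruct
load_in_4bit: true            # QLoRA: 4-bit quantized base weights
adapter: qlora
lora_r: 64                    # hypothetical LoRA rank
lora_alpha: 16
lora_target_modules:
  - q_proj
  - v_proj
sequence_len: 2048
micro_batch_size: 1
gradient_accumulation_steps: 8
deepspeed: deepspeed_configs/zero3.json   # shard states across the 8 GPUs
```

Freezing the 4-bit base weights and training only small low-rank adapters is what makes fine-tuning a 72B model feasible on a single 8-GPU node.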

  2. Model Optimization and Fine-Tuning:

• Fine-tuning involved adjusting LoRA parameters and experimenting with different weights for model adapters to optimize performance across various tasks.

• The models were trained with specific prompts tailored to simulate an online shopping assistant, enhancing task-specific performance.
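Experimenting with different weights for model adapters amounts to scaling each adapter’s low-rank update before merging it into the base weights. The pure-Python sketch below illustrates the idea on toy matrices; `merge_lora` and its inputs are hypothetical helpers, not the team’s code (a real solution would use torch tensors via peft):

```python
def matmul(A, B):
    """Multiply two matrices given as lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def merge_lora(W, adapters, weights):
    """Return W + sum_i weights[i] * (alpha_i / r_i) * B_i @ A_i.

    Each adapter is (A, B, alpha, r): A is r x d_in, B is d_out x r.
    """
    merged = [row[:] for row in W]
    for (A, B, alpha, r), w in zip(adapters, weights):
        delta = matmul(B, A)          # d_out x d_in low-rank update
        scale = w * alpha / r
        for i in range(len(merged)):
            for j in range(len(merged[0])):
                merged[i][j] += scale * delta[i][j]
    return merged

# Toy example: 2x2 identity base weight, one rank-1 adapter at full weight.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]            # r=1, d_in=2
B = [[0.5], [0.5]]          # d_out=2, r=1
merged = merge_lora(W, [(A, B, 1.0, 1)], [1.0])
# merged == [[1.5, 0.5], [0.5, 1.5]]
```

Tuning the per-adapter weights lets one adapter dominate on tasks it handles well while others contribute smaller corrections.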

  3. Quantization and Inference Optimization:

• To meet the competition’s strict hardware and inference-time constraints, Team NVIDIA applied 4-bit AWQ quantization and ran batched inference with the vLLM engine, significantly reducing the models’ memory footprint.

• During inference, logits processors were applied to constrain the models’ outputs, ensuring well-formed structured responses such as numbers and comma-separated lists.
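A logits processor is a callable that rewrites the next-token logits at each decoding step; vLLM accepts such callables via `SamplingParams(logits_processors=[...])`. The sketch below shows the general pattern with plain Python lists standing in for tensors; `ALLOWED_TOKEN_IDS` is a hypothetical set of digit/comma token ids, not the team’s actual constraint logic:

```python
import math

# Hypothetical set of token ids for digits and commas in the tokenizer.
ALLOWED_TOKEN_IDS = {3, 7, 11}

def restrict_to_allowed(token_ids, logits):
    """Set every disallowed token's logit to -inf so it cannot be sampled.

    Matches the (generated_token_ids, logits) signature vLLM expects,
    with logits as a plain list of floats here for illustration.
    """
    return [
        logit if tok_id in ALLOWED_TOKEN_IDS else -math.inf
        for tok_id, logit in enumerate(logits)
    ]

# Toy vocabulary of 12 tokens, uniform logits: only 3 survive masking.
masked = restrict_to_allowed([], [0.1] * 12)
```

Because disallowed tokens get probability zero after the softmax, the model can only ever emit well-formed structured output.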

  4. Ensemble Techniques:

• The final submissions for each track involved sophisticated ensembles of base models and multiple LoRA adapters, fine-tuned to enhance the accuracy and robustness of the solutions.
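The write-up does not detail how the base models and adapters were combined, but one common pattern for classification-style tracks is a per-example majority vote over the models’ answers, sketched below (`majority_vote` is a hypothetical helper for illustration):

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: one answer list per model, aligned by example.

    Returns the most common answer for each example; ties are broken
    by whichever answer appears first among the models.
    """
    return [
        Counter(answers).most_common(1)[0][0]
        for answers in zip(*predictions)
    ]

# Three models, two examples: models agree A/B on ex. 0 and B on ex. 1.
votes = majority_vote([["A", "B"], ["A", "C"], ["B", "B"]])
# votes == ["A", "B"]
```

For generative tracks, alternatives such as averaging logits across adapters serve the same robustness goal.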

Impact and Contributions:

Team NVIDIA’s comprehensive approach showcased their technical prowess and ability to innovate within constraints, leading to their first-place victory in all five competition tracks. Their work demonstrates the powerful capabilities of LLMs in handling diverse and complex real-world NLP tasks, particularly in a competitive setting with limited hardware resources.