Large Language Models and Code Copilots for AI-Assisted Programming
Introduction
This tutorial delves into the evolution of AI-assisted programming, tracing its roots to E. W. Dijkstra’s seminal idea of computer-assisted programming and to Natural Language Processing (NLP) and probabilistic language models. It highlights the recent transformative impact of modern transformer-based large language models (LLMs) trained on Big Code, leveraging software naturalness to revolutionize tasks like code generation, completion, translation, and defect detection. Pioneering examples include GitHub Copilot (powered by OpenAI Codex), GPT models, Meta’s Code Llama, Google’s Gemini Code Assist, Amazon CodeWhisperer, Alibaba’s Qwen, and Codeium.
Participants will explore advancements in context-aware, multilingual programming models that enhance the adaptability of both local and cloud-based LLMs in diverse ecosystems. Core LLM architectures, their downstream applications, and challenges in integrating NLP methodologies with software naturalness will be examined. The tutorial highlights reinforcement learning with human feedback, focusing on alignment techniques that enhance fairness, safety, and performance in code generation by large language models. The session demonstrates AI-assisted programming extensions to Apple’s Xcode and LLM agent development, showcasing tools like Copilot to streamline mobile development and empower participants to evaluate, benchmark, and deploy LLMs effectively.
The tutorial will also focus on general techniques for benchmarking and evaluation of LLMs for AI-assisted programming. Models are assessed using code-specific benchmarks such as HumanEval and CodeNet, which provide standardized datasets for evaluating code generation and completion. Performance metrics like Pass@k, BLEU, CodeBLEU, and functional correctness are analyzed to quantify the quality of generated code. Real-world effectiveness is gauged through human evaluations and deployment case studies, which provide valuable insights into user experiences and practical challenges.
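As an illustration of one of these metrics, the sketch below implements the unbiased Pass@k estimator popularized by the "Evaluating Large Language Models Trained on Code" paper linked under Relevant Links: given n generated samples for a task, of which c pass the unit tests, it estimates the probability that at least one of k randomly drawn samples is correct. The function name and signature here are illustrative, not part of any specific benchmark's API.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator.

    n: total number of samples generated for the task
    c: number of those samples that pass the unit tests
    k: number of samples hypothetically drawn at random

    Returns the estimated probability that at least one of the
    k drawn samples is correct, i.e. 1 - C(n-c, k) / C(n, k).
    """
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k
        # samples must include at least one correct one.
        return 1.0
    # Compute 1 - C(n-c, k) / C(n, k) as a numerically stable product.
    return 1.0 - math.prod((n - c - i) / (n - i) for i in range(k))

# Example: 4 samples, 2 correct, drawing 1 sample -> probability 0.5
print(pass_at_k(4, 2, 1))
```

In practice the benchmark-level score averages this per-task estimate over all tasks in the dataset (e.g. the 164 problems of HumanEval).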
Additionally, advanced evaluation methodologies are discussed, including fine-grained analysis to identify common errors, assess model robustness, and measure performance on adversarial inputs. Comparative studies across different programming languages and domains illustrate the adaptability and limitations of various models, including emerging LLM coding agents that demonstrate cutting-edge advancements in multilingual programming and cross-domain functionality.
Lastly, LLMs and LLM agents have profound implications for computer science, driving advancements in the search for efficient algorithms and automating problem-solving in competitive programming. By tackling complex programming challenges, they open new avenues for understanding algorithm design, optimization, and the theoretical foundations of computation.
Tutorial Speakers

Chee Wei Tan
Dr. Tan received the M.A. and Ph.D. degrees in Electrical Engineering from Princeton University. He is currently with the College of Computing and Data Science, Nanyang Technological University, Singapore. He was a postdoctoral scholar in the NetLab Group at Caltech, a senior fellow of the Science at Extreme Scales program at the Institute for Pure and Applied Mathematics at UCLA, and a visiting faculty member at Tencent AI Lab and Qualcomm R&D (QRC). His research interests are distributed optimization, Generative AI, networks, and edge learning.

Yuchen Wang
Yuchen Wang is a Ph.D. candidate at Nanyang Technological University and a full-stack software engineer specializing in mobile development. She received her B.E. and M.S. degrees in Computer Science, both with Honours with the Highest Distinction, from the National University of Singapore. Her research mainly focuses on AI-Assisted Programming and Sound and Music Computing.
Affiliation
NTU Singapore
Relevant Links
- Evaluating Large Language Models Trained on Code
- StarCoder: may the source be with you!
- Code Llama: Open Foundation Models for Code
- DeepSeek-Coder: When the Large Language Model Meets Programming – The Rise of Code Intelligence
- Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework
- Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
- Copilot for Xcode: Exploring AI-Assisted Programming by Prompting Cloud-based Large Language Models
- Aligning Crowd-Sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models
Slides
Coming soon.