Posts

This page keeps track of my study notes.

2023

NEFTune: Noisy Embedding Instruction Fine Tuning

2 minute read

Published:

This paper proposes NEFTune, a simple trick that adds noise to the embedding vectors during training and improves the outcome of instruction fine-tuning by a large margin. If you are using the SFT trainer from Hugging Face, you can apply this trick by adding a single line of code!
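
As a concrete illustration, here is a minimal sketch of what that single line looks like, assuming a recent version of Hugging Face TRL in which `SFTConfig` exposes the `neftune_noise_alpha` parameter; the model name, dataset, and output directory are placeholders, not recommendations.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder instruction dataset with a "text" field.
dataset = load_dataset("stanfordnlp/imdb", split="train")

config = SFTConfig(
    output_dir="./neftune-sft",
    # The "one line": NEFTune adds uniform noise, scaled by
    # alpha / sqrt(sequence_length * embedding_dim), to the
    # embedding vectors during training only.
    neftune_noise_alpha=5,
)

trainer = SFTTrainer(
    model="facebook/opt-350m",  # placeholder base model
    args=config,
    train_dataset=dataset,
)
trainer.train()
```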

LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models

4 minute read

Published:

This paper introduces LLM-Planner, which uses LLMs as high-level planners for embodied agents, allowing them to generate plans and adapt them to the current environment. Experiments on the ALFRED dataset show that, using less than 0.5% of the paired training data, LLM-Planner achieves performance competitive with recent baselines trained on the full training set. A hypothetical sketch of the prompting setup follows below.
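
To make the idea more concrete, below is a hypothetical sketch of few-shot prompting an LLM for a high-level plan in the spirit of LLM-Planner; the prompt format, subgoal names, and helper function are illustrative assumptions, not the paper's exact implementation.

```python
def build_planner_prompt(examples, goal, visible_objects):
    """Assemble a few-shot prompt that asks an LLM for a high-level plan,
    conditioning on retrieved examples and currently observed objects."""
    parts = []
    for ex in examples:
        parts.append(f"Task: {ex['goal']}\nPlan: {ex['plan']}")
    parts.append(
        f"Task: {goal}\n"
        f"Visible objects: {', '.join(visible_objects)}\n"
        "Plan:"
    )
    return "\n\n".join(parts)

# Illustrative in-context example in an ALFRED-style subgoal format.
examples = [
    {
        "goal": "put a heated potato on the counter top",
        "plan": "Navigation potato, PickupObject potato, "
                "Navigation microwave, HeatObject potato, "
                "Navigation countertop, PutObject potato countertop",
    }
]
prompt = build_planner_prompt(
    examples,
    goal="put a clean mug in the coffee machine",
    visible_objects=["mug", "sink", "coffeemachine"],
)
# The returned string would be sent to an LLM; re-prompting with updated
# visible objects lets the agent adapt its plan to the environment.
```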

Notes on Recent Talks about Autonomous Intelligence by Yann LeCun

8 minute read

Published:

This note collects insights from Yann LeCun, often referred to as one of the godfathers of deep learning. In his talk, he discussed the limitations of current machine learning and self-supervised learning methods. He emphasized the need for objective-driven AI and introduced a modular cognitive architecture built around a world model. He also presented the Joint-Embedding Predictive Architecture (JEPA), a new approach to self-supervised learning.

Contrastive Decoding: Open-ended Text Generation as Optimization

3 minute read

Published:

The paper proposes a new decoding method for open-ended text generation, called contrastive decoding (CD), which aims to generate text that is fluent, coherent, and informative by exploiting the contrast between the behaviors of a large expert model and a smaller amateur model.
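
As a rough sketch of the core scoring rule (a single greedy step, whereas the paper searches over this objective with beam search): tokens outside the expert's plausibility set are masked out, and the remaining tokens are ranked by the difference between expert and amateur log-probabilities. The alpha cutoff of 0.1 follows the paper's default, but the function itself is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_decoding_step(expert_logits, amateur_logits, alpha=0.1):
    """One greedy CD step: pick the token maximizing
    log p_expert - log p_amateur within the expert's plausible set."""
    expert_logprobs = F.log_softmax(expert_logits, dim=-1)
    amateur_logprobs = F.log_softmax(amateur_logits, dim=-1)

    # Plausibility constraint: keep only tokens whose expert probability
    # is at least alpha times the expert's maximum token probability.
    cutoff = torch.log(torch.tensor(alpha)) + expert_logprobs.max(dim=-1, keepdim=True).values
    plausible = expert_logprobs >= cutoff

    # Contrastive objective: expert minus amateur log-probability,
    # with implausible tokens removed from consideration.
    cd_scores = (expert_logprobs - amateur_logprobs).masked_fill(~plausible, float("-inf"))
    return cd_scores.argmax(dim=-1)

# Toy usage with random logits over a 10-token vocabulary.
expert_logits = torch.randn(1, 10)
amateur_logits = torch.randn(1, 10)
next_token = contrastive_decoding_step(expert_logits, amateur_logits)
```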