Posts by Tags

JEPA

Notes on Recent Talks about Autonomous Intelligence by Yann LeCun

8 minute read

Published:

This note collects insights from Yann LeCun, often referred to as one of the godfathers of deep learning. In his talks, he discussed the limitations of current machine learning and self-supervised learning methods, emphasized the need for objective-driven AI, and introduced a modular cognitive architecture centered on a world model. He also presented the Joint-Embedding Predictive Architecture (JEPA), a new approach for learning such world models.
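For readers who want the gist in code, below is a minimal PyTorch sketch of the JEPA idea under simplifying assumptions (toy MLP encoders, a single latent variable, stop-gradient on the target branch). Real instantiations such as I-JEPA use ViT encoders and an EMA target encoder to prevent representation collapse; none of the names here are from the talks.

```python
import torch
import torch.nn as nn

class JEPA(nn.Module):
    """Toy sketch of a Joint-Embedding Predictive Architecture."""

    def __init__(self, in_dim: int, emb_dim: int, z_dim: int):
        super().__init__()
        # Separate encoders for the context x and the target y.
        self.enc_x = nn.Sequential(nn.Linear(in_dim, emb_dim), nn.ReLU(),
                                   nn.Linear(emb_dim, emb_dim))
        self.enc_y = nn.Sequential(nn.Linear(in_dim, emb_dim), nn.ReLU(),
                                   nn.Linear(emb_dim, emb_dim))
        # Predictor maps the context embedding, plus a latent z that absorbs
        # unpredictable detail, to the target embedding.
        self.pred = nn.Linear(emb_dim + z_dim, emb_dim)

    def energy(self, x: torch.Tensor, y: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        s_x, s_y = self.enc_x(x), self.enc_y(y)
        s_y_hat = self.pred(torch.cat([s_x, z], dim=-1))
        # detach() stands in for the stop-gradient / EMA target encoder
        # used in practice to avoid collapse.
        return ((s_y_hat - s_y.detach()) ** 2).mean()
```

The key design choice is that prediction error is measured in representation space rather than input space, so the model is never forced to predict unpredictable low-level detail.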

Knowledge graph

LLM

NEFTune: Noisy Embedding Instruction Fine Tuning

2 minute read

Published:

This paper proposes NEFTune, a simple trick that adds noise to embedding vectors during training and improves the outcome of instruction fine-tuning by a large margin. If you are using Hugging Face's SFT trainer, you can apply this trick by adding a single line of code!
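For illustration, here is a minimal PyTorch sketch of the trick itself; the uniform noise scaled by α/√(L·d) follows the paper, while the function name is mine:

```python
import torch

def neftune_noise(embeddings: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """Add NEFTune noise to token embeddings (training time only).

    embeddings: (batch, seq_len, dim) output of the embedding layer.
    Noise is Uniform(-1, 1) scaled by alpha / sqrt(seq_len * dim),
    as in the paper.
    """
    _, seq_len, dim = embeddings.shape
    scale = alpha / (seq_len * dim) ** 0.5
    noise = torch.empty_like(embeddings).uniform_(-1, 1)
    return embeddings + scale * noise
```

If memory serves, recent versions of TRL's SFTTrainer expose this as a `neftune_noise_alpha` argument, which is the one-line change the excerpt refers to.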

LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models

4 minute read

Published:

This paper introduces LLM-Planner, which uses LLMs as high-level planners for embodied agents, allowing them to generate and adapt plans according to the current environment. Experiments on the ALFRED dataset show that, using less than 0.5% of the paired training data, LLM-Planner achieves performance competitive with recent baselines trained on the full training data.
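As a rough sketch of how such a planner can be wired up: retrieve similar training tasks as in-context exemplars, inject the objects currently visible to the agent into the prompt, and re-plan whenever the agent gets stuck. The word-overlap retriever, prompt format, and function names below are my assumptions (the paper uses a learned BERT-based retriever), not the authors' code.

```python
from collections import Counter
from typing import Callable

def retrieve_exemplars(goal: str, train_set: list[dict], k: int) -> list[dict]:
    """kNN retrieval of in-context exemplars by simple word overlap
    (a stand-in for the paper's learned retriever)."""
    def score(ex: dict) -> int:
        a = Counter(goal.lower().split())
        b = Counter(ex["goal"].lower().split())
        return sum((a & b).values())
    return sorted(train_set, key=score, reverse=True)[:k]

def make_prompt(goal: str, observed: list[str],
                exemplars: list[dict], done: list[str]) -> str:
    shots = "\n\n".join(
        f"Task: {ex['goal']}\nPlan: {', '.join(ex['plan'])}" for ex in exemplars
    )
    return (f"{shots}\n\nTask: {goal}\n"
            f"Visible objects: {', '.join(observed)}\n"
            f"Completed subgoals: {', '.join(done) or 'none'}\nPlan:")

def replan(goal: str, observed: list[str], done: list[str],
           train_set: list[dict], call_llm: Callable[[str], str],
           k: int = 9) -> list[str]:
    """Grounded (re-)planning: called initially and again whenever the agent
    gets stuck, with currently observed objects injected into the prompt."""
    exemplars = retrieve_exemplars(goal, train_set, k)
    prompt = make_prompt(goal, observed, exemplars, done)
    return [s.strip() for s in call_llm(prompt).split(",")]
```

Here `call_llm` is whatever wrapper you use around your LLM API; the returned plan is a list of high-level subgoals for a low-level controller to execute.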

Contrastive Decoding: Open-ended Text Generation as Optimization

3 minute read

Published:

The paper proposes a new decoding method for open-ended text generation, called contrastive decoding (CD), which aims to generate text that is fluent, coherent, and informative by exploiting the contrast between the behaviors of a large expert model and a small amateur model.
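A minimal sketch of one greedy decoding step, following the paper's formulation (score tokens by log p_expert − log p_amateur, restricted to tokens the expert itself finds plausible); the function name and the choice of α are illustrative:

```python
import torch

def contrastive_decoding_step(expert_logits: torch.Tensor,
                              amateur_logits: torch.Tensor,
                              alpha: float = 0.1) -> int:
    """One greedy step of contrastive decoding.

    expert_logits / amateur_logits: (vocab,) next-token logits from the
    large expert LM and the small amateur LM on the same prefix.
    """
    log_p_exp = torch.log_softmax(expert_logits, dim=-1)
    log_p_ama = torch.log_softmax(amateur_logits, dim=-1)

    # Plausibility constraint: keep only tokens with expert probability at
    # least alpha times the expert's max, so the contrast cannot reward
    # tokens both models consider absurd.
    plausible = log_p_exp >= log_p_exp.max() + torch.log(torch.tensor(alpha))

    cd_score = log_p_exp - log_p_ama      # expert/amateur contrast
    cd_score[~plausible] = float("-inf")  # mask implausible tokens
    return int(cd_score.argmax())
```

In the paper this score is maximized with beam search rather than the greedy argmax shown here.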

LLM fine-tuning

NEFTune: Noisy Embedding Instruction Fine Tuning

2 minute read

Published:

This paper proposes NEFTune, a simple trick that adds noise to embedding vectors during training and improves the outcome of instruction fine-tuning by a large margin. If you are using Hugging Face's SFT trainer, you can apply this trick by adding a single line of code!

LLM planning

LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models

4 minute read

Published:

This paper introduces LLM-Planner, which uses LLMs as high-level planners for embodied agents, allowing them to generate and adapt plans according to the current environment. Experiments on the ALFRED dataset show that, using less than 0.5% of the paired training data, LLM-Planner achieves performance competitive with recent baselines trained on the full training data.

LLM reasoning

Contrastive Decoding: Open-ended Text Generation as Optimization

3 minute read

Published:

The paper proposes a new decoding method for open-ended text generation, called contrastive decoding (CD), which aims to generate text that is fluent, coherent, and informative by exploiting the contrast between the behaviors of a large expert model and a small amateur model.