Artificially Speaking

#12 - Executable Code Actions Elicit Better LLM Agents

This research paper explores using executable Python code as the action format for Large Language Model (LLM) agents. The authors introduce CodeAct, a framework in which the LLM generates Python code, an interpreter executes it, and the agent revises earlier actions or emits new ones based on the resulting observations. Experiments across 17 LLMs show that CodeAct outperforms widely used alternatives such as text or JSON action formats on complex tasks, with up to a 20% higher success rate. The authors also build a new instruction-tuning dataset, CodeActInstruct, to improve open-source LLMs' CodeAct capabilities, and use it to train CodeActAgent, an open-source agent capable of sophisticated tasks. The paper concludes by discussing the potential benefits and risks of such autonomous agents.
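To make the act-and-observe loop described above concrete, here is a minimal sketch in Python. It is not the paper's implementation: `execute_action`, `scripted_policy`, and `run_agent` are hypothetical names, and the scripted policy is a hard-coded stand-in for an LLM. It only illustrates the general idea of emitting code as an action, reading back the output or error as an observation, and revising the next action accordingly.

```python
import contextlib
import io
import traceback


def execute_action(code, namespace):
    """Run one code action; return its printed output or error as the observation."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # Illustration only: real agents should run code in a sandbox.
            exec(code, namespace)
    except Exception:
        # Feed the final traceback line (e.g. "NameError: ...") back to the agent.
        return buffer.getvalue() + traceback.format_exc().strip().splitlines()[-1]
    return buffer.getvalue().strip()


def scripted_policy(history):
    """Hypothetical stand-in for an LLM: chooses the next action from past observations."""
    if not history:
        return "print(total)"  # first attempt uses an undefined name
    last_observation = history[-1][1]
    if "NameError" in last_observation:
        # Revise the action after observing the error.
        return "total = sum(range(5))\nprint(total)"
    return None  # nothing left to do


def run_agent(policy, max_turns=5):
    """Alternate between code actions and observations until the policy stops."""
    namespace = {}  # shared across turns so variables persist
    history = []
    for _ in range(max_turns):
        action = policy(history)
        if action is None:
            break
        observation = execute_action(action, namespace)
        history.append((action, observation))
    return history
```

Calling `run_agent(scripted_policy)` yields two turns: the first observation is a `NameError`, and the second, revised action prints `10`. In a real agent the policy would be an LLM call and the execution would happen in an isolated environment.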