AUTOLLM

By Mohammad Areeb
February 11, 2024
3 min read

In this post, I will give a brief overview of the 'AUTOLLM' concept, a proprietary technology developed by InSituate. The journey of AUTOLLM, from its underlying motivation to its implementation, marks a stepping stone toward Artificial General Intelligence (AGI).

The era of Machine Learning has witnessed the development of various autonomous agents over time, the most fundamental being automated hyperparameter tuning. Instead of playing around with learning rates and model parameters, and spending countless hours staring at a screen comparing model performances, researchers removed this need entirely by automating the tuning process. Such techniques came to be grouped under a single umbrella, collectively named AUTOML.
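
To see what this looks like in practice, here is a minimal AUTOML-style sketch using scikit-learn's GridSearchCV; the toy dataset and the parameter grid are illustrative choices, not a recommendation:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Toy data standing in for a real training set
X, y = make_classification(n_samples=200, random_state=0)

# Exhaustively try each value of the regularization strength C
# and keep the one with the best cross-validated score.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    cv=3,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```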

Now, with Large Language Models coming into the picture and achieving commendable performance across varied domains, one becomes curious to extend the techniques of automated ML to automation in LLMs [ever thought about it?]. A series of natural questions follows this thought. What are the possible hyperparameters one would wish to tune autonomously? How do we traverse the search space of a parameter? Does this traversal have a defined direction, and can the search be made efficient? ...and the list continues. AUTOLLM, aimed at autonomously configuring Large Language Models, is a major breakthrough on the road to AGI, transforming months of effort into a single click!

Below I have laid the foundation of AUTOLLM and tried to answer each of the natural queries raised above. Ready? Let's dive in.

A defined configuration of a Large Language Model is a set of values for its hyperparameters, including its temperature, max_new_tokens, top_p, top_k, context_window, and the instruction prompt. In this blog, we are concerned only with the instruction prompt, for two chief reasons. First, all the other hyperparameters are numeric, so their automated tuning can be handled by techniques already developed in AUTOML. Second, as experiments have shown, the instruction prompt of an LLM is the most critical parameter affecting the model's responses. The other variables do affect model performance, but the response's dependency on them is insignificant compared with its dependency on the prompt.
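
To make 'a defined configuration' concrete, here is a minimal sketch of one as a Python dataclass. The field names mirror the list above, and the default values are illustrative assumptions only:

```python
from dataclasses import dataclass

@dataclass
class LLMConfig:
    # Numeric hyperparameters: automated tuning of these is already
    # well covered by standard AUTOML techniques.
    temperature: float = 0.7
    max_new_tokens: int = 256
    top_p: float = 0.9
    top_k: int = 50
    context_window: int = 4096
    # The one text-valued hyperparameter, and the focus of AUTOLLM.
    instruction_prompt: str = "You are a helpful assistant."
```

Only instruction_prompt lives in a text space; everything else can be handed off to a numeric tuner.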

The LLM instruction prompt lives in the semantic text space of a given language, guiding the LLM to generate responses. Automating its tuning runs into the major challenge of traversing the prompt space, which can be really huge [can we even define its dimension?]. Two questions suffice to frame the problem: what defines a 'change' in the prompt, and how do we bring about this 'change'?

Interpreting a 'change' in the instruction of an LLM, with tuning in mind, naturally drives one to think of a 'slightly better' prompt than the previous one. So a naive solution is to keep 'changing' the LLM prompt to a 'slightly better' version iteratively, up to a defined stopping criterion [a hard cap on iterations is a good idea here]. However, believe me, it doesn't work! Try giving an instruction string to GPT and asking it to 'slightly improve' on it. Repeat this at least 10 times... you will realize you are on the wrong path.
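
For reference, the naive baseline looks roughly like this. It is a sketch only: llm (an instruction-to-completion callable) and evaluate (a prompt-scoring function) are hypothetical stand-ins, not a real API:

```python
def naive_improve(prompt: str, llm, evaluate, max_iters: int = 10) -> str:
    """Naive baseline: repeatedly ask the LLM to 'slightly improve'
    its own instruction prompt, with a hard cap on iterations as the
    stopping criterion."""
    best_prompt, best_score = prompt, evaluate(prompt)
    for _ in range(max_iters):
        prompt = llm(f"Slightly improve this instruction prompt:\n{prompt}")
        score = evaluate(prompt)
        if score > best_score:
            best_prompt, best_score = prompt, score
    # In practice the prompt drifts far outside the region of interest.
    return best_prompt
```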

The problem is that 'improving' the prompt [whether it is actually an 'improvement' or not is a question for a later day] iteratively leaves us searching a much, much broader prompt space than the one we are actually interested in. AUTOLLM instead achieves traversal of the prompt space by simply 're-phrasing' the instruction prompt... why does this work?

Repeat the above experiment with GPT, but this time ask it to simply 're-phrase' the instruction. ...notice the difference?

Well, one can argue that this is analogous to Grid Search in AUTOML, where we are not interested in actually improving a parameter, but only in bringing about some random shifts in it [like +/- 1 for integer parameters]. After exhaustively searching the space, we compare the performances at the different nodes of the grid to arrive at the tuned value.
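
In code, this grid-style comparison of neighboring values for an integer parameter might look like the following sketch, where objective is a hypothetical scoring function (higher is better):

```python
def neighbor_search(value: int, objective, radius: int = 1) -> int:
    """Compare the current value and its +/- radius neighbors on the
    grid, with no notion of an 'improvement direction', and return
    the best-scoring node."""
    candidates = [value + d for d in range(-radius, radius + 1)]
    return max(candidates, key=objective)
```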

AUTOLLM operates on a Grid-Search-inspired algorithm to autonomously configure the instruction prompt of a Large Language Model. A node in the grid is a defined configuration of the LLM. To explore the neighboring nodes in the grid, the best available move is to add some random noise to the instruction prompt, and this is achieved by having the LLM 're-phrase' the prompt.
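
Putting the pieces together, here is a hedged sketch of that loop. It is my illustration of the idea as described above, not InSituate's actual implementation; llm and evaluate are the same hypothetical callables as before:

```python
def autollm_tune(seed_prompt: str, llm, evaluate,
                 n_neighbors: int = 5, n_rounds: int = 3) -> str:
    """Grid-search-inspired prompt tuning: a node is a prompt, and its
    neighbors are generated by asking the LLM to *re-phrase* it, which
    acts as unbiased noise rather than a directed 'improvement'."""
    best_prompt, best_score = seed_prompt, evaluate(seed_prompt)
    for _ in range(n_rounds):
        # Generate neighboring nodes by rephrasing, not improving.
        neighbors = [
            llm("Re-phrase the following instruction without changing "
                f"its meaning:\n{best_prompt}")
            for _ in range(n_neighbors)
        ]
        # Compare performance across the nodes and keep the best one.
        for candidate in neighbors:
            score = evaluate(candidate)
            if score > best_score:
                best_prompt, best_score = candidate, score
    return best_prompt
```

Note that the rephrasing request deliberately avoids any notion of 'better'; that choice is what keeps the search unbiased, as discussed next.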

Further, for the results to be reliable, we need to ensure that the exploration direction is NOT biased in the grid. 'Improving' the instruction prompt is bound to add bias to the search direction, and hence quite a few neighboring nodes are likely to be skipped. 'Re-phrasing' ensures randomness in the search, leading to reliable and accurate results.

In conclusion, AUTOLLM operates on a grid-search-inspired algorithmic design to autonomously configure the instruction prompt of an LLM. This marks a major step towards automation in Large Language Models. Further research is being carried out at InSituate on how to make this search efficient while maintaining unbiasedness and accuracy.
