A Compact Guide to Large Language Models (Part 3: Applying LLMs)
Applying Large Language Models
There are a few paths one can take when applying large language models to a given use case. Generally speaking, they fall into two categories, proprietary services and open source models, though there is some crossover between them. We'll briefly cover the pros and cons of each and the scenarios each fits best.
Proprietary services
As the first widely available LLM-powered service, OpenAI's ChatGPT was the explosive charge that brought LLMs into the mainstream. ChatGPT provides a nice user interface (or API) where users can feed prompts to one of many models (GPT-3.5, GPT-4, and more) and typically get a fast response. These are among the highest-performing models, trained on enormous data sets, and they are capable of extremely complex tasks, both from a technical standpoint, such as code generation, and from a creative perspective, like writing poetry in a specific style.
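As a rough illustration, this is what calling such a hosted model through its API can look like. This is a minimal sketch, assuming the openai Python package is installed and an OPENAI_API_KEY environment variable is set; the model name and prompt are only placeholders.

```python
# A minimal sketch of calling a proprietary, hosted LLM over its API.
# Assumes the `openai` package and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

# Note: the prompt (and any data in it) is sent to the provider's servers.
response = client.chat.completions.create(
    model="gpt-4",  # placeholder: any hosted chat model name
    messages=[{"role": "user", "content": "Write a haiku about data pipelines."}],
)
print(response.choices[0].message.content)
```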
The downside of these services is the absolutely enormous amount of compute required not only to train them (OpenAI has said GPT-4 cost over $100 million to develop) but also to serve responses. For this reason, these extremely large models will likely remain under the control of the organizations that build them, and they require you to send your data to their servers in order to interact with their language models. This raises privacy and security concerns, and it also subjects users to "black box" models whose training and guardrails they have no control over. Also, because of the compute required, these services are not free beyond very limited use, so cost becomes a factor when applying them at scale.
In summary: Proprietary services are great to use if you have very complex tasks, are okay with sharing your data with a third party, and are prepared to incur costs if operating at any significant scale.
Open source models
The other avenue for language models is the open source community, where there has been similarly explosive growth over the past few years. Communities like Hugging Face host hundreds of thousands of contributed models that can help solve a wide range of specific use cases such as text generation, summarization, and classification. The open source community has been quickly catching up to the performance of the proprietary models, but it ultimately still hasn't matched the performance of something like GPT-4.
It does currently take a little bit more work to grab an open source model and start using it, but progress is moving very quickly to make them more accessible to users.
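For instance, here is a small sketch of running an open source model locally with the Hugging Face transformers library; the summarization task and the checkpoint name are only illustrative choices among many.

```python
# A small sketch of running an open source model locally with Hugging Face
# transformers. The task and checkpoint are illustrative; many others exist.
from transformers import pipeline

# Downloads the weights on first use and runs inference on your own machine,
# so the text never leaves your environment.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "Large language models are neural networks trained on vast amounts of text. "
    "They can generate, summarize, and classify text, and open source checkpoints "
    "let teams run these capabilities entirely on their own infrastructure."
)
print(summarizer(article, max_length=60, min_length=10)[0]["summary_text"])
```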
Another huge upside to using open source models is the ability to fine-tune them on your own data. Since you're not dealing with the black box of a proprietary service, there are techniques that let you take open source models and train them on your specific data, greatly improving their performance in your specific domain.
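As one illustration of what that can look like, here is a rough sketch of fine-tuning a small open source model on domain-specific text with the Hugging Face transformers and datasets libraries. The checkpoint name, file path, and training settings are placeholder assumptions, and real projects often prefer parameter-efficient methods such as LoRA; this is a sketch, not a recipe.

```python
# A rough sketch of fine-tuning an open source causal language model on your
# own text. Model name, file path, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "distilgpt2"  # assumption: any small causal LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2-style models lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load your domain-specific corpus, one example per line (hypothetical path).
dataset = load_dataset("text", data_files={"train": "my_domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=1),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("finetuned-model")
```

The fine-tuned checkpoint can then be loaded back with the same pipeline shown earlier, pointed at the local "finetuned-model" directory instead of a hub name.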
Conclusion and general guidelines
Ultimately, every organization is going to have unique challenges to overcome, and there isn't a one-size-fits-all approach when it comes to LLMs. As the world becomes more data-driven, everything, including LLMs, will rely on having a strong foundation of data. LLMs are incredible tools, but they have to be used and implemented on top of this strong data foundation.