Do you need a model for handling private data or discussing controversial/taboo subjects?

Note: leading providers of LLMs are very careful in utilizing private data. In a recent release, OpenAI stated that “As of March 1, 2023, data sent to the OpenAI API will not be used to train or improve OpenAI models”. So you only really need to consider hosting a model locally if privacy is as significant as it would be in the healthcare or defense sectors.

Yes:

<aside> ➡️ Dolly - one of the largest open-source LLMs

</aside>

Databricks.png

<aside> ➡️ Bloom - A multilingual model which you can demo on HuggingFace spaces (HuggingFace account required).

</aside>

Bloom.png

<aside> ➡️ GPT-NeoX - You can access GTP-NeoX code to create a generator yourself.

</aside>

Note: there is no ready-made interface on their website to start playing around with these models. You will need an API and a front-end for that. See the next section for recommendations. Use an open-source model. You will need to host these yourself. Online instructions are widely available on how to do so, but it will require some coding work.

No:

Use models offered by OpenAI or similar closed-source company:

<aside> ➡️ GPT3.5 or GPT4 through OpenAI - State-of-the-art text generators

</aside>

OpenAI.png