Rise of the Open Source ChatGPT Clones

First there was Open-Assistant, then OpenChatKit appeared, and now there is the newly announced ColossalChat. That makes three projects (that I know of!) that aim to give everybody the ability to create their own ChatGPT clone.

The basic components of a ChatGPT clone are:

  1. large language model as its base

  2. instruct dataset for fine-tuning the large language model

  3. tools and pipeline for generating and curating the instruct dataset

  4. tools and pipeline for fine-tuning and alignment of the model

  5. tools for system management (i.e., user management, pre-prompt management)

  6. tools for operations

  7. content moderation system to identify when the model produces an undesired, unethical, or illegal response

  8. user interface to expose the functionality
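
To make the list a bit more concrete, here is a minimal sketch of how a few of these components might fit together at request time: the pre-prompt (item 5), the model (item 1), and content moderation (item 7). Nothing here comes from Open-Assistant, OpenChatKit, or ColossalChat. The names `ChatPipeline`, `echo_model`, and `REFUSAL` are hypothetical, the "model" is a stub that echoes the user, and the keyword filter is only a stand-in for a real moderation system.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical refusal text returned when moderation rejects a response.
REFUSAL = "Sorry, I can't help with that."


@dataclass
class ChatPipeline:
    generate: Callable[[str], str]  # stand-in for the fine-tuned LLM (items 1 to 4)
    pre_prompt: str                 # managed by the system tools (item 5)
    blocked_terms: List[str]        # toy keyword list standing in for moderation (item 7)

    def build_prompt(self, user_message: str) -> str:
        # Prepend the pre-prompt so the model sees the system instructions first.
        return f"{self.pre_prompt}\nUser: {user_message}\nAssistant:"

    def is_allowed(self, text: str) -> bool:
        # Case-insensitive substring check; real systems typically use a classifier.
        lowered = text.lower()
        return not any(term in lowered for term in self.blocked_terms)

    def reply(self, user_message: str) -> str:
        response = self.generate(self.build_prompt(user_message))
        return response if self.is_allowed(response) else REFUSAL


def echo_model(prompt: str) -> str:
    # Placeholder "model": repeats the last user line instead of running an LLM.
    user_lines = [line for line in prompt.splitlines() if line.startswith("User: ")]
    return "You said: " + user_lines[-1][len("User: "):]


pipeline = ChatPipeline(
    generate=echo_model,
    pre_prompt="You are a helpful assistant.",
    blocked_terms=["forbidden"],
)
print(pipeline.reply("hello"))
print(pipeline.reply("something FORBIDDEN"))
```

In a real clone, `generate` would call the fine-tuned model and the user interface (item 8) would call `reply`. The point of the sketch is only that the moderation check sits between the model and the user.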