Design ChatGPT
Key Concepts
Choosing a Database: Select a NoSQL database capable of handling large amounts of data.
Reliable and Resilient Infrastructure: Design the ChatGPT infrastructure for scalability, security, and performance.
Rate Limiting: Implement rate limiting to mitigate DDoS attempts aimed at the system and protect it from abuse.
Functional Requirements
The core functional requirements for ChatGPT are creating, updating, viewing, and deleting conversations.
Additionally, rating a response by giving thumbs up or down can help train the model.
Text-based inputs in English are assumed, and inputs go through a sanitization phase to remove profanity and detect insults.
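The sanitization phase can be sketched as a small filter that masks blocked words and flags likely insults. This is only a minimal illustration: the word list, the insult pattern, and the function name sanitize are hypothetical placeholders, and a production system would more likely rely on a curated list or a trained classifier.

```python
import re

# Hypothetical placeholder blocklist; not taken from the original design.
BLOCKED_WORDS = {"darn", "heck"}

# Hypothetical pattern for direct insults such as "you are an idiot".
INSULT_PATTERNS = [
    re.compile(r"\byou(?:'re| are) (?:an? )?(?:idiot|moron)\b", re.IGNORECASE),
]


def sanitize(text):
    """Return (clean_text, flagged).

    clean_text has blocked words replaced by asterisks of the same length;
    flagged is True when any insult pattern matches the original text.
    """
    def mask(match):
        word = match.group(0)
        return "*" * len(word) if word.lower() in BLOCKED_WORDS else word

    clean = re.sub(r"[A-Za-z']+", mask, text)
    flagged = any(pattern.search(text) for pattern in INSULT_PATTERNS)
    return clean, flagged
```

A flagged message could be rejected or routed for review before it ever reaches the model; which of the two applies is a policy choice the design leaves open.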
Non-functional Requirements
Non-functional requirements, such as latency, security, and scalability, are also essential. Server response latency can be high because model inference on the back end is computationally expensive.
Login flows and rate limiting help mitigate DDoS attempts aimed at the system and keep it safe.
The rate limiter must itself be scalable so that the system can serve many concurrent users without degradation.
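One common way to implement per-user rate limiting is a token bucket. The sketch below is an assumption about how it could look, not the design's prescribed algorithm: the class names, the default capacity of 5, and the refill rate of 1 token per second are all illustrative.

```python
import time


class TokenBucket:
    """Bucket holding up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.tokens = float(capacity)
        self.last_refill = clock()

    def allow(self):
        # Add the tokens earned since the last call, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        # Spend one token per request if available.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keeps one bucket per user ID (in memory, for illustration only)."""

    def __init__(self, capacity=5, refill_rate=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.buckets = {}

    def allow(self, user_id):
        bucket = self.buckets.get(user_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_rate, self.clock)
            self.buckets[user_id] = bucket
        return bucket.allow()
```

To meet the scalability requirement, the per-user bucket state would probably need to live in a shared store rather than in one process's memory, so that every API server sees the same counters.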
The average message size is estimated at 100 bytes, so 200 million messages per day translate to 20 GB per day, 7.3 terabytes annually, and 73 terabytes over ten years.
A scalable database should be designed to handle this storage, with NoSQL being a good fit.
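The storage estimate can be checked with a few lines of arithmetic. The script below uses only the two figures quoted above (100 bytes per message, 200 million messages per day) and assumes decimal units and a 365-day year; it works out to roughly 73 TB over ten years.

```python
# Back-of-envelope storage estimate from the quoted figures.
AVG_MESSAGE_BYTES = 100
MESSAGES_PER_DAY = 200_000_000
DAYS_PER_YEAR = 365

GB = 10**9   # decimal gigabyte
TB = 10**12  # decimal terabyte

daily_bytes = AVG_MESSAGE_BYTES * MESSAGES_PER_DAY
yearly_bytes = daily_bytes * DAYS_PER_YEAR
ten_year_bytes = yearly_bytes * 10

print(f"per day:   {daily_bytes / GB:.1f} GB")     # 20.0 GB
print(f"per year:  {yearly_bytes / TB:.1f} TB")    # 7.3 TB
print(f"ten years: {ten_year_bytes / TB:.1f} TB")  # 73.0 TB
```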
High-level Design
The system's high-level design should include a conversation service, which manages user dialogue, and the ChatGPT model, which is the system's core component.
Before a message reaches the model, it is sanitized and analyzed to ensure it meets the content standards, such as the absence of profanity and insults.
The model's results are stored in a single conversation database. Finally, a thumbs-down rating signals that a response was unsatisfactory and can feed later retraining of the model, while a risk model assesses the legitimacy of user ratings.
REST API
The conversation service is a REST API that supports creating, deleting, and viewing conversations, as well as sending messages. Each conversation has a unique ID, and so does each message. The user can rate each message with a thumbs up or thumbs down.
The data is stored in a NoSQL database, where the conversation table is keyed by conversation ID and each message contains an ID, text, author, and parent (a reference to the message it replies to).
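The data model and the operations of the conversation service can be sketched with an in-memory store. This is an illustration under assumptions: the class and method names, the use of UUIDs for IDs, and the rating field are hypothetical, and a real service would expose these operations over HTTP and persist them in the NoSQL database.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Message:
    id: str
    text: str
    author: str                    # e.g. "user" or "assistant"
    parent: Optional[str]          # ID of the message this one replies to
    rating: Optional[int] = None   # +1 thumbs up, -1 thumbs down (assumed field)


@dataclass
class Conversation:
    id: str
    messages: Dict[str, Message] = field(default_factory=dict)


class ConversationStore:
    """In-memory stand-in for the conversation table."""

    def __init__(self):
        self._conversations: Dict[str, Conversation] = {}

    def create_conversation(self) -> str:
        conv = Conversation(id=str(uuid.uuid4()))
        self._conversations[conv.id] = conv
        return conv.id

    def send_message(self, conv_id, text, author, parent=None) -> str:
        conv = self._conversations[conv_id]
        if parent is not None and parent not in conv.messages:
            raise KeyError(f"unknown parent message {parent}")
        msg = Message(id=str(uuid.uuid4()), text=text, author=author, parent=parent)
        conv.messages[msg.id] = msg
        return msg.id

    def view_conversation(self, conv_id) -> List[Message]:
        # Messages come back in insertion order.
        return list(self._conversations[conv_id].messages.values())

    def rate_message(self, conv_id, msg_id, thumbs_up: bool) -> None:
        self._conversations[conv_id].messages[msg_id].rating = 1 if thumbs_up else -1

    def delete_conversation(self, conv_id) -> None:
        del self._conversations[conv_id]
```

Storing a parent reference on each message, rather than a flat list, lets one conversation branch when a user edits an earlier message or regenerates a reply.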
ChatGPT Model
ChatGPT uses a Transformer model, which generates text by repeatedly predicting the next token in a sequence.
The model is trained on internet data such as websites, books, and Wikipedia to provide semantically meaningful and grammatically correct replies. At generation time, the model can use different decoding strategies, such as greedy decoding, top-K sampling, or nucleus (top-p) sampling, usually combined with a temperature setting, to choose the next token.
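The decoding strategies named above can be illustrated on a toy list of logits, one score per candidate token. This is a plain-Python sketch for clarity, not the model's actual decoding code; the function names and parameters are assumptions.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(logits):
    """Always pick the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k_sample(logits, k, temperature=1.0, rng=random):
    """Sample from the k highest-scoring tokens only."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top], temperature)
    return rng.choices(top, weights=probs, k=1)[0]


def nucleus_sample(logits, p, temperature=1.0, rng=random):
    """Sample from the smallest set of tokens whose cumulative probability reaches p."""
    probs = softmax(logits, temperature)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

Greedy decoding is deterministic, while top-K and nucleus sampling trade some determinism for variety; temperature controls how strongly the sampling favors the highest-scoring tokens.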
A dataset of question-and-answer pairs is initially used to train the model.
However, since training the model on every possible question-and-answer combination is impossible, a reward model is designed to select the best responses.
The reward model is trained on human judgments of response quality. The chatbot is then fine-tuned using reinforcement learning, where it is rewarded for giving appropriate responses and penalized for inappropriate ones.
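The idea of a reward model selecting the best response can be illustrated with best-of-n selection: score several candidate answers and keep the highest-scoring one. This is a simplified stand-in for the reinforcement learning setup described above, and toy_reward is a hypothetical heuristic, not a learned model.

```python
from typing import Callable, List


def toy_reward(question: str, answer: str) -> float:
    """Hypothetical stand-in for a learned reward model.

    Rewards word overlap with the question and mildly penalizes
    answers longer than 50 words.
    """
    q_words = set(question.lower().split())
    a_words = set(answer.lower().split())
    overlap = len(q_words & a_words)
    length_penalty = 0.01 * max(0, len(answer.split()) - 50)
    return overlap - length_penalty


def best_response(question: str, candidates: List[str],
                  reward_fn: Callable[[str, str], float] = toy_reward) -> str:
    """Return the candidate with the highest reward score."""
    return max(candidates, key=lambda answer: reward_fn(question, answer))
```

In a real system the reward function would be a trained model, and its scores would be used as the training signal for the chatbot rather than only for picking among finished answers.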
Although text input is assumed initially, the chatbot design is expected to extend to other input and output formats, including images, audio, and video.
Even if the initial question-and-answer dataset is small, the model can still provide accurate responses in natural language. The reward model considers the emotion and tone of the question and answer, and it can be retrained continuously to improve accuracy.