LLMs Usage FAQ
Common Questions and Answers about Using LLMs
Force Traditional Chinese Output[edit]
📝 Problem: AI responses contain simplified Chinese characters
💬 Solution:
- Add #zh-TW before your question [1]
- Or say "Use Traditional Chinese commonly used in Taiwan"
Some models cannot reliably distinguish Traditional from Simplified Chinese, so @Will_Huang suggests providing a bilingual English-Traditional Chinese vocabulary list.
Prompt:
```
Use Traditional Chinese commonly used in Taiwan:

Rules
- Use full-width punctuation marks and add spaces between Chinese and English text.
- Below is a common AI terminology correspondence table (English -> Traditional Chinese):
  * Transformer -> Transformer
  * Token -> Token
  * LLM / Large Language Model -> 大語言模型
  * Zero-shot -> 零樣本
  * Few-shot -> 少樣本
  * AI Agent -> AI 代理
  * AGI -> 通用人工智慧
- The following is a table of common Taiwanese terms (English -> Traditional Chinese):
  * create / created -> 建立
  * quality -> 品質
  * information -> 資訊
  * message -> 訊息
  * store -> 儲存
  * search -> 搜尋
  * view -> 檢視、檢視表 (never 視圖)
  * data -> 資料
  * object -> 物件
  * queue -> 佇列
  * stack -> 堆疊
  * invocation -> 呼叫
  * code -> 程式碼
  * running -> 執行
  * library -> 函式庫
  * building -> 建構
  * package -> 套件
  * video -> 影片
  * class -> 類別
  * component -> 元件
  * transaction -> 交易
  * code generation -> 程式碼產生
  * scalability -> 延展性
  * metadata -> Metadata
  * clone -> 複製
  * memory -> 記憶體
  * built-in -> 內建
  * global -> 全域
  * compatibility -> 相容性
  * function -> 函式
  * document -> 文件
  * example -> 範例
  * blog -> 部落格
  * realtime -> 即時
  * integration -> 整合
```
Generating Longer Article Content[edit]
📝 Problem: I want to use LLMs to generate articles of 5000-6000 words, but each attempt only produces articles of 1000-1500 words.
💬 Reason: LLMs have context-window and maximum-output-token limits: each request caps the combined number of input and output tokens, so a single generation typically tops out around 1,000-1,500 words. The recommended workaround is to break the intended article into a planned structure and generate it chapter by chapter.
Solution: If a 5,000-6,000 word article cannot be generated in one pass, pre-plan (for example) a five-chapter outline in your instructions, then generate the chapters in order and combine them into the final article.
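The chapter-by-chapter workflow above can be sketched in Python. Here `generate` is a hypothetical placeholder for whichever LLM API you actually call; the outline titles are illustrative:

```python
def generate(prompt: str) -> str:
    """Hypothetical placeholder: replace with a real LLM API call."""
    return f"[generated text for: {prompt}]"

outline = [
    "Chapter 1: Background",
    "Chapter 2: Core concepts",
    "Chapter 3: Implementation",
    "Chapter 4: Case study",
    "Chapter 5: Conclusion",
]

parts = []
for chapter in outline:
    # Each request stays well under the output limit (~1,100 words),
    # while the full outline keeps the chapters consistent with each other.
    prompt = (
        "You are writing one chapter of a 5,000-6,000 word article.\n"
        f"Overall outline: {'; '.join(outline)}\n"
        f"Now write only: {chapter} (about 1,100 words)."
    )
    parts.append(generate(prompt))

# Combine the five chapters into the full article
article = "\n\n".join(parts)
```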
How to Make AI Process Long Articles[edit]
📝 Problem: Context Length Limitations
LLM models are constrained by context window length limitations. Taking long article translation as an example, since we cannot process the entire content at once, we need to segment the article for processing.
💬 Processing Methods:
Method 1: Switch to a model with a longer context window and a larger maximum output, for example:
- GPT-4o: "16,384 max output tokens"[2] equivalent to approximately 5,461 Chinese characters (16,384/3)
- gemini-2.5-pro: "65,536 max output tokens"[3] equivalent to approximately 21,845 Chinese characters (65,536/3)
- GPT-5: "128,000 max output tokens"[4] equivalent to approximately 42,666 Chinese characters (128,000/3)
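The character estimates above all use the same rule of thumb (one Chinese character ≈ 3 tokens; actual tokenizer ratios vary by model), which is just integer division:

```python
def max_chinese_chars(max_output_tokens: int, tokens_per_char: int = 3) -> int:
    """Coarse estimate: one Chinese character is roughly 3 tokens."""
    return max_output_tokens // tokens_per_char

for model, limit in [("GPT-4o", 16_384), ("gemini-2.5-pro", 65_536), ("GPT-5", 128_000)]:
    print(f"{model}: {limit:,} output tokens ≈ {max_chinese_chars(limit):,} characters")
```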
Method 2: Start a new conversation and transfer the conversation content to the new dialogue. For existing conversations, you can try using this prompt:
As the first prompt for a new conversation, please organize our previous dialogue into:
1. Clear operational steps
2. Instructions to verify the success of each prerequisite step
Method 3: Chunking strategy with context continuity maintenance
When processing long texts, we need to adopt a chunking strategy[5]. To help the model understand earlier chapters while it processes later paragraphs, an effective approach is chunking with previous-chapter summarization:
- First summarize the previous chapters
- Input the summary together with the full text of the next chapter to be processed to the AI
- This maintains context coherence while saving token usage
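A minimal sketch of this summarize-then-process loop, with `summarize` and `process` as hypothetical placeholders for real LLM calls:

```python
def summarize(text: str) -> str:
    """Placeholder: replace with an LLM summarization call."""
    return text[:100]

def process(summary: str, chapter: str) -> str:
    """Placeholder: e.g. translate `chapter`, using `summary` as context."""
    return f"(context: {summary}) -> processed {len(chapter)} chars"

chapters = ["chapter one text ...", "chapter two text ...", "chapter three text ..."]

results, running_summary = [], ""
for chapter in chapters:
    # Send the running summary plus the next chapter's full text together
    results.append(process(running_summary, chapter))
    # Fold the new chapter into the summary for the next iteration
    running_summary = summarize(running_summary + "\n" + chapter)
```

Only the compact summary travels forward, so each request stays within the token budget while keeping the context coherent.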
Overlapping Chunking Strategy
Another chunking strategy suits transcript editing. Transcript formats typically pair timestamps with subtitle text:
1
00:00:00,001 --> 00:00:02,000
So you answer me first

2
00:00:02,000 --> 00:00:06,000
Which country has left you with such a long constitutional gap

3
00:00:06,000 --> 00:00:10,000
Then tell me which country doesn't have such provisions
If segment 3 is sent directly to AI for editing, errors are likely to occur due to lack of previous dialogue context. In this case, we can adopt a content chunking strategy that "allows partial overlap." Here's an example prompt for improving Chinese transcripts[6]:
Your task is to improve Chinese spoken interview transcript paragraphs. You need to add punctuation, ensure paragraph coherence, maintain the original meaning, and rewrite portions of text as needed. Please use Traditional Chinese commonly used in Taiwan.
This is the previous paragraph:
<previous_paragraph>
{PREVIOUS_PARAGRAPH}
</previous_paragraph>
This is the current paragraph:
<current_paragraph>
{CURRENT_PARAGRAPH}
</current_paragraph>
This is the following paragraph:
<next_paragraph>
{NEXT_PARAGRAPH}
</next_paragraph>
This method allows AI to reference both preceding and following context simultaneously, ensuring coherence and accuracy in processing results.
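The three-window prompt above can be assembled mechanically. In this sketch the function name `windows` is illustrative; each segment is paired with its immediate neighbours, with empty strings at the edges:

```python
def windows(segments):
    """Yield (previous, current, next) for each segment, '' at the edges."""
    for i, current in enumerate(segments):
        prev_seg = segments[i - 1] if i > 0 else ""
        next_seg = segments[i + 1] if i < len(segments) - 1 else ""
        yield prev_seg, current, next_seg

segments = [
    "So you answer me first",
    "Which country has left you with such a long constitutional gap",
    "Then tell me which country doesn't have such provisions",
]

for prev_seg, cur, next_seg in windows(segments):
    # Fill the prompt template from the section above
    prompt = (
        f"<previous_paragraph>{prev_seg}</previous_paragraph>\n"
        f"<current_paragraph>{cur}</current_paragraph>\n"
        f"<next_paragraph>{next_seg}</next_paragraph>"
    )
    # send `prompt` to the model here
```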
How to Quickly Understand Code Projects Using AI Programming Tools[edit]
1. Install Visual Studio Code https://code.visualstudio.com/
2. File → Open Folder: Navigate to and select your project directory
3. In the chat panel (GitHub Copilot Chat), switch the chat mode from "Ask" to "Agent". This enables automatic analysis of the project structure, after which you can ask any question about the codebase
How to Solve AI Forgetting Training Content[edit]
📝 Inquiry:
I'd like to ask a follow-up question: If we adopt a "layer-by-layer prompt optimization" approach to improve AI performance, might we encounter the following situation: After multiple rounds of prompt optimization, the AI does learn the relevant skills and performs well, but after some time, it forgets these trained capabilities? I want to understand whether current mainstream AI model platforms all have stable memory retention capabilities - that is, can they continuously remember the training prompts and guidance we've previously provided? Sometimes I feel that AI's memory seems unstable. During the same project, content and requirements that I've already explained to the AI in detail need to be re-explained from scratch after a while, which makes me question the continuity of AI learning.
💬 Response:
Indeed, early AI models, due to shorter context window limitations, were prone to drifting from their original settings. When I encounter such situations, I usually choose to start a completely new conversation and restart the entire interaction process.
Current models have improved significantly in this regard. If the conversation's results are satisfactory, ask the AI to summarize and consolidate the whole exchange, folding the accumulated principles and lessons from the dialogue into an initial prompt for the next session:
Assuming I want to start a new conversation to discuss the same topic, please suggest what complete prompt I should use. This prompt needs to include important content from our entire discussion process:
(1) The core problems and objectives that the original prompt aimed to solve
(2) Important aspects and details related to the original problem that the initial solution method didn't fully consider
How to Solve: AI Doesn't Know AABB[edit]
📝 Inquiry
AI is so dumb! When I ask it about AABB topics, it doesn't know anything. How can it be so difficult to use?
💬 Response
AI base models have knowledge cut-off dates[7]. The knowledge cutoff is the point past which a model's training data contains no events or information; it is set by when the training data was last collected. When a query concerns anything after this cutoff, hallucinations or "I don't know" responses are likely.
Suggested solutions:
- Web Search: Choose an AI tool that supports web search, enable the feature, then resubmit your question.
- Knowledge File Upload: Alternatively, prepare documents or knowledge files about AABB topics[8][9], and upload them to the AI conversation to provide it with prerequisite knowledge before discussing the topic.
If the information is too new or too niche to be found through real-time web search, use the second method and upload knowledge files. Note that the AI will sometimes claim to understand AABB when it actually doesn't (pretending to know); in that case, ask it a few basic conceptual questions to verify.
How to Solve AI Models Generating Incorrect Website Links?[edit]
📝 Question: Currently using AI tools like Grok, ChatGPT, and Perplexity to collect financial data and capital status of the top 10 global companies in specific industries, organizing them into table format. Additionally, I need to attach original website links as reference sources for the data. However, all three AI tools generate completely incorrect website links. Does anyone have solutions to prevent these language models from continuously producing incorrect reference links?
💬 Response: If you are using standard models rather than smarter reasoning models, you can ask the AI to add a preliminary step before it gives conclusions: "Please extract and number relevant webpage paragraphs related to the question's answer, then answer the question based on these paragraphs." This reduces the probability of hallucinations in less capable models.[10][11][12]
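A sketch of wrapping a question with this extract-and-number preliminary step; the function name and exact wording are illustrative, not a fixed API:

```python
def grounded_prompt(question: str, source_text: str) -> str:
    """Illustrative helper: prepend the extract-and-number step."""
    return (
        f"Source material:\n{source_text}\n\n"
        "Step 1: Extract and number the paragraphs from the source material "
        "that are relevant to the question below.\n"
        "Step 2: Answer the question based only on those numbered paragraphs, "
        "citing their numbers; if none support an answer, say so instead of "
        "guessing.\n\n"
        f"Question: {question}"
    )
```

Forcing the model to quote its evidence first, and allowing an explicit "no supporting paragraph" escape, is what cuts down fabricated links and citations.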
Can AI Self-Verify Its Reasoning Errors?[edit]
📝 Query: A thought-provoking philosophical question continues to puzzle me: Do artificial intelligence systems possess the capability to self-detect and expose their own limitations? In other words, can we use AI tools to identify and prove flaws and inaccuracies in AI reasoning processes?
💬 Response: Several approaches can address this self-verification challenge:
Method 1: Multi-Model Cross-Validation Framework
Utilize different AI models for cross-comparison, verifying information accuracy through multiple perspectives and leveraging inter-model differences to identify potential errors.
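A minimal sketch of cross-validation by agreement counting; the stub models stand in for calls to different real providers:

```python
from collections import Counter

def cross_validate(question, model_fns):
    """Ask each model the same question; flag disagreement for review."""
    answers = [fn(question) for fn in model_fns]
    best, votes = Counter(answers).most_common(1)[0]
    agreed = votes == len(answers)
    return best, agreed, answers

# Stub models standing in for, e.g., three different providers:
model_a = lambda q: "42"
model_b = lambda q: "42"
model_c = lambda q: "41"

answer, agreed, raw = cross_validate("What is 6 * 7?", [model_a, model_b, model_c])
# agreed is False here: the models disagree, so the answer needs manual review
```

In practice the answers rarely match verbatim, so a fuzzier comparison (or a judge model) is needed; the point is that disagreement is the error signal.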
Method 2: Structured Reasoning Step Prompts
When staying with the same model rather than switching to a more advanced reasoning model, you can require the AI to execute a key step before reaching a conclusion: "Before making your final conclusion, please list in full detail all evidence supporting it, ranked from highest to lowest relevance. Then answer the question based on these evidence paragraphs." This technique is not recommended for reasoning models, which already incorporate systematic reasoning steps.[13]
Method 3: Web Data Verification Combined with Structured Reasoning
Require the model to proactively search web data for fact-checking while simultaneously combining Method 2's structured reasoning steps, creating a dual verification mechanism.
Solving ChatGPT Generated File Not Found Issues[edit]
📝 Inquiry: Strange, all files generated by ChatGPT fail to download, showing that the file doesn't exist. What's the problem?
💬 Response:
1. When the model switches to using the Code Interpreter tool, Python code will appear in the conversation. Its virtual execution environment has execution time limits[14]. While OpenAI Help Center mentions that "generated download file links expire quickly"[15], the official documentation does not specify exact time limits. Some users have reported approximately one hour[16]. When the time limit is exceeded, the generated files will disappear.
If you keep hitting the same wall in the same conversation, the only solution is to start a new conversation and try again.
2. An alternative is to avoid generating files altogether and instead have the model print the file content directly in the conversation; this works best for text-based formats.
3. Some online tutorials claim that filenames must be changed to English for downloads to work, but in user testing, switching to an English filename does not guarantee a successful download.
Models tested that can download files with Traditional Chinese filenames:
- GPT-5
- GPT-4o
Prompt:
Generate a sample CSV file with columns: product name, quantity, sale date, total amount. Please provide a CSV file using a Traditional Chinese filename.
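For reference, this is roughly the kind of Python Code Interpreter would run for the prompt above (columns and filename are illustrative). Python itself has no trouble writing a Traditional Chinese filename, which is consistent with the user testing noted above: the download failures stem from the sandbox's expiry behaviour, not from filename encoding.

```python
import csv

rows = [
    ["產品名稱", "數量", "銷售日期", "總金額"],  # product, quantity, date, total
    ["鍵盤", "3", "2024-01-15", "2400"],
    ["滑鼠", "5", "2024-01-16", "1500"],
]

# utf-8-sig adds a BOM so Excel detects the encoding correctly
filename = "銷售範例.csv"  # illustrative Traditional Chinese filename
with open(filename, "w", newline="", encoding="utf-8-sig") as f:
    csv.writer(f).writerows(rows)
```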
References[edit]
- ↑ ChatGPT: How to Force Traditional Chinese Output | Learn Technology, Save Time - Learn Technology
- ↑ Model - OpenAI API
- ↑ Gemini 2.5 Pro
- ↑ Model - OpenAI API
- ↑ Evaluating RAG Chunking Strategies Using Traditional Chinese – ihower { blogging }
- ↑ How to Add Punctuation to Whisper Transcripts Using AI | Medium
- ↑ A comprehensive list of Large Language Model knowledge cut off dates - ALLMO: Boost Your Brand's Visibility in AI Search
- ↑ What kinds of documents can I upload to Claude.ai? | Anthropic Help Center
- ↑ Document understanding | Gemini API | Google AI for Developers
- ↑ Reduce hallucinations - Anthropic
- ↑ Improving AI-Generated Responses: Techniques for Reducing Hallucinations - The Learning Agency
- ↑ 9 Prompt Engineering Methods to Reduce Hallucinations (Proven Tips) - Workflows "Step-Back Prompting is a technique where you ask the AI to review its previous response and make sure it is accurate."
- ↑ New prompting rules when using reasoning models (Deep Research) | by Sophie Hundertmark | Medium "Avoid chain-of-thought (CoT) prompting"
- ↑ Answered: ChatGPT "Code interpreter session expired" (2025)
- ↑ Troubleshooting ChatGPT Error Messages | OpenAI Help Center "Ensure that the file is recently generated, files generated by ChatGPT expire quickly"
- ↑ Code Interpreter - maintaining files uploaded to session and session state - ChatGPT - OpenAI Developer Community