Reverse Engineering Cursor's LLM Client
What's actually going on inside Cursor?
We wired TensorZero between Cursor and the LLMs to watch every token fly by... and to bend those API calls to our will.
TensorZero is an open-source framework that helps engineers optimize LLM applications using downstream feedback signals (e.g. production metrics, human feedback, user behavior). We thought it would be fun to see whether we could use TensorZero on the LLM application we ourselves use most: Cursor.
By placing a gateway between Cursor and the LLM providers, we can observe the LLM calls in flight, run evaluations on individual inferences, apply inference-time optimizations, and even experiment with and optimize the prompts and models Cursor uses.
Cursor is optimized for its entire user base. Beyond .cursorrules, is there room for improvement through deeper research and customization for individual users?
What would it look like to empirically experiment with and optimize Cursor based on your personal usage patterns?
Optimization aside, Cursor remains a black box. Wouldn't it be interesting to see exactly what gets sent to the models?
Almost every technique for optimizing, evaluating, and experimenting with LLMs requires data on the inferences made and their real-world outcomes. In this post we focus on the former, diving into how to set up TensorZero as a self-hosted proxy between Cursor and the LLMs it calls. If you'd like to try it yourself, check out the example in our repository. A follow-up post will cover collecting feedback and closing the optimization loop.
Wiring Things Together
The first thing we noticed is that Cursor lets you override OpenAI's base URL and model names. Perfect! TensorZero exposes an OpenAI-compatible inference endpoint, so we can easily configure Cursor to call TensorZero instead of OpenAI.
In our TensorZero configuration, we defined a TensorZero function called cursorzero so that we can automatically experiment with different models and prompts, while storing provider-agnostic inference and feedback data in our database for observability and optimization.
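A minimal sketch of what such a configuration might look like; the variant name and model identifier below are illustrative, not the exact ones we used:

```toml
# tensorzero.toml — minimal sketch; variant name and model are illustrative
[functions.cursorzero]
type = "chat"

[functions.cursorzero.variants.gpt_4_1]
type = "chat_completion"
model = "openai::gpt-4.1"
```

With the gateway running (port 3000 by default), the function can also be exercised by hand through the OpenAI-compatible endpoint, using the same model name Cursor will later send:

```bash
# Assumes a local gateway on its default port; `cursorzero` is the function above.
curl http://localhost:3000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tensorzero::function_name::cursorzero",
    "messages": [{"role": "user", "content": "Explain this nginx.conf."}]
  }'
```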
First Hurdle: Cursor's Server
At first, Cursor couldn't connect to TensorZero running locally. It turns out Cursor first sends requests to its own servers, which do additional processing before calling the LLM, so those servers couldn't reach our gateway at localhost. (This also means your credentials must be forwarded to Cursor's servers, giving them access to your inference and codebase data.)
To test whether our plan was viable, we pointed Cursor at OpenRouter and confirmed we could use its models for Ask, Agent, and Cmd+K interactions in Cursor. We could still use regular Cursor Tab completions, which rely on Cursor's proprietary inline-completion model (which is excellent, so we were glad to keep it).
The solution was to set up a reverse proxy exposing a public endpoint that forwards requests back to our machine. We used Ngrok to keep things simple. Since we were exposing a gateway holding LLM credentials to the public internet, we put Nginx in front of it to authenticate requests. We set Cursor's base URL to our Ngrok endpoint, added the credentials we had configured in Nginx, turned off the built-in models, and finally added our new TensorZero function under model names: tensorzero::function_name::cursorzero.
In the end, the workflow looks like this: Cursor → Ngrok → Nginx (self-hosted) → TensorZero (self-hosted) → LLM providers
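For reference, exposing the local proxy is a single Ngrok command; port 80 here assumes the Nginx listener from our config:

```bash
# Expose local Nginx (port 80) at a public URL; Ngrok prints the endpoint
# to use as Cursor's OpenAI base URL override.
ngrok http 80
```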
But it didn't work.
Second Hurdle: CORS
The authentication handshake was failing.
The Nginx logs showed an OPTIONS request hitting our endpoint, so we configured Nginx to answer OPTIONS requests with the appropriate headers, incrementally adding the headers we saw in responses from the OpenAI API. This initial validation request comes from the local Cursor IDE; the CORS requirement presumably comes from Electron.
After the initial validation, all requests come from Cursor's servers.
Our Nginx configuration for handling the CORS headers:
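The relevant excerpt is below; the complete nginx.conf also appears verbatim in the captured user prompt later in this post:

```nginx
location / {
    # --- pre-flight ---
    if ($request_method = OPTIONS) {
        add_header Access-Control-Allow-Origin "$CORS_ALLOW_ORIGIN" always;
        add_header Access-Control-Allow-Credentials "true" always;
        add_header Access-Control-Allow-Methods "GET,POST,OPTIONS" always;
        add_header Access-Control-Allow-Headers "$CORS_ALLOW_HEADERS" always;
        add_header Access-Control-Max-Age 86400 always;
        return 204;  # 204 No Content is conventional for pre-flight
    }

    # --- authenticated requests ---
    if ($is_authorized = 0) {
        return 401;
    }

    proxy_pass http://gateway:3000;
    proxy_set_header Host $host;

    # expose CORS headers on the real response too
    add_header Access-Control-Allow-Origin "$CORS_ALLOW_ORIGIN" always;
    add_header Access-Control-Allow-Credentials "true" always;
}
```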
It finally worked!
Finally: Observability for Cursor
We could finally see everything Cursor does, including its prompts.
For example, here's what we saw after asking Cursor "how do I get nginx to get the key from an environment variable in my .env?"
System Prompt
You are a an AI coding assistant, powered by tensorzero::function_name::cursorzero. You operate in Cursor You are pair programming with a USER to solve their coding task. Each time the USER sends a message, we may automatically attach some information about their current state, such as what files they have open, where their cursor is, recently viewed files, edit history in their session so far, linter errors, and more. This information may or may not be relevant to the coding task, it is up for you to decide. Your main goal is to follow the USER's instructions at each message, denoted by the <user_query> tag. <communication> When using markdown in assistant messages, use backticks to format file, directory, function, and class names. Use ( and ) for inline math, [ and ] for block math. </communication> <search_and_reading> If you are unsure about the answer to the USER's request or how to satiate their request, you should gather more information. This can be done by asking the USER for more information. Bias towards not asking the user for help if you can find the answer yourself. </search_and_reading> <making_code_changes> The user is likely just asking questions and not looking for edits. Only suggest edits if you are certain that the user is looking for edits. When the user is asking for edits to their code, please output a simplified version of the code block that highlights the changes necessary and adds comments to indicate where unchanged code has been skipped. For example: ```language:path/to/file // ... existing code ... {{ edit_1 }} // ... existing code ... {{ edit_2 }} // ... existing code ... ``` The user can see the entire file, so they prefer to only read the updates to the code. Often this will mean that the start/end of the file will be skipped, but that's okay! Rewrite the entire file only if specifically requested. Always provide a brief explanation of the updates, unless the user specifically requests only the code. These edit codeblocks are also read by a less intelligent language model, colloquially called the apply model, to update the file. To help specify the edit to the apply model, you will be very careful when generating the codeblock to not introduce ambiguity. You will specify all unchanged regions (code and comments) of the file with "// ... existing code ..." comment markers. This will ensure the apply model will not delete existing unchanged code or comments when editing the file. You will not mention the apply model. </making_code_changes> <user_info> The user's OS version is darwin 24.3.0. The absolute path of the user's workspace is /Users/viraj/tensorzero/tensorzero/examples/cursorzero. The user's shell is /bin/zsh. </user_info> You MUST use the following format when citing code regions or blocks: ```12:15:app/components/Todo.tsx // ... existing code ... ``` This is the ONLY acceptable format for code citations. The format is ```startLine:endLine:filepath``` where startLine and endLine are line numbers.
User Prompt
<additional_data> Below are some potentially helpful/relevant pieces of information for figuring out to respond <current_file> Path: nginx/nginx.conf Line: 1 Line Content: `events {}` </current_file> <attached_files> <file_contents> ```path=nginx/nginx.conf, lines=1-46 events {} http { log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent"'; access_log /var/log/nginx/access.log main; map $http_authorization $is_authorized { default 0; "~*^Bearer your-token-here" 1; } server { listen 80; # --- CORS helper macro --- set $CORS_ALLOW_ORIGIN "$http_origin"; # reflect the caller's origin set $CORS_ALLOW_HEADERS "Authorization,Content-Type"; location / { # --- pre-flight --- if ($request_method = OPTIONS) { add_header Access-Control-Allow-Origin "$CORS_ALLOW_ORIGIN" always; add_header Access-Control-Allow-Credentials "true" always; add_header Access-Control-Allow-Methods "GET,POST,OPTIONS" always; add_header Access-Control-Allow-Headers "$CORS_ALLOW_HEADERS" always; add_header Access-Control-Max-Age 86400 always; return 204; # 204 No Content is conventional for pre-flight } # --- authenticated requests --- if ($is_authorized = 0) { return 401; } proxy_pass http://gateway:3000; proxy_set_header Host $host; # expose CORS headers on the real response too add_header Access-Control-Allow-Origin "$CORS_ALLOW_ORIGIN" always; add_header Access-Control-Allow-Credentials "true" always; } } } ``` </file_contents> <manually_added_selection> ```path=nginx/nginx.conf, lines=1-16 events {} http { log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent"'; access_log /var/log/nginx/access.log main; map $http_authorization $is_authorized { default 0; "~*^Bearer your-token-here" 1; } server { listen 80; ``` </manually_added_selection> </attached_files> </additional_data> <user_query> how do I get nginx to get the key from an environment variable in my .env? </user_query>
Response
I see you're trying to use an environment variable for your authentication token in the Nginx configuration instead of hardcoding "your-token-here" in the `nginx.conf` file. Unfortunately, Nginx doesn't directly read environment variables from a `.env` file. However, there are a few approaches you can take: ### Option 1: Use Nginx's environment variable support with a template 1. Create a template file (e.g., `nginx.conf.template`) with a placeholder: ``` map $http_authorization $is_authorized { default 0; "~*^Bearer ${API_TOKEN}" 1; } ``` 2. Then use a script to replace the placeholder with the actual environment variable before starting Nginx: ```bash #!/bin/bash # Load environment variables from .env source .env # Replace placeholder with actual token envsubst '${API_TOKEN}' < /path/to/nginx.conf.template > /path/to/nginx.conf # Start Nginx nginx -g 'daemon off;' ``` ### Option 2: Use Docker environment variables (if using Docker) If you're running Nginx in Docker, you can pass environment variables to the container and use `envsubst`: ```dockerfile FROM nginx:alpine COPY nginx.conf.template /etc/nginx/templates/default.conf.template CMD ["/bin/sh", "-c", "envsubst '${API_TOKEN}' < /etc/nginx/templates/default.conf.template > /etc/nginx/conf.d/default.conf && nginx -g 'daemon off;'"] ``` Then run with: ```bash docker run -e API_TOKEN=your-secret-token -p 80:80 your-nginx-image ``` ### Option 3: Use a separate authentication service Instead of hardcoding the token in Nginx, you could proxy authentication requests to a separate service that has access to environment variables: ``` location /auth { proxy_pass http://auth-service:3001; # This service can read environment variables } ``` For your specific case, I'd recommend Option 1 or 2 depending on your deployment setup. This way, you can keep your token in your `.env` file and have it securely injected into the Nginx configuration at startup time.
Prompt Engineering in Cursor
This example paints a picture of what prompt engineering looks like for a state-of-the-art AI coding assistant.
This prompt and a few others like it (separate prompts for asking questions, editing source code, and working in the shell) are what Cursor uses to answer questions and generate code. We were surprised there isn't more to it: software engineering knowledge has been so thoroughly internalized by contemporary LLMs that a 642-token system prompt is enough to get at the brains behind Cursor. Credit surely goes to the extensive post-training efforts at the major labs.
An Explicit AI Hierarchy?
We found the following snippet particularly interesting:
These edit codeblocks are also read by a less intelligent language model, colloquially called the apply model, to update the file. To help specify the edit to the apply model, you will [...]. You will not mention the apply model.
It's common to mix and match models to optimize for cost and latency, but Cursor explains this hierarchy to the model itself? Interesting.
A/B Testing Models
With TensorZero, we get the full Cursor experience plus control over observability and experimentation for the LLM requests. We've been running CursorZero through several days of heavy-duty software engineering: it has been stable, with no noticeable added latency.
We're currently running a uniform random split across Claude 4.0 Sonnet, GPT-4.1, o4 Mini, and Gemini 2.5 Pro, and it feels just as good as stock Cursor.
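In TensorZero terms, such a split is just equal-weight variants on the cursorzero function. A sketch, with illustrative model identifiers that depend on your provider configuration:

```toml
# Hypothetical uniform A/B split across four variants of `cursorzero`;
# model names are illustrative, not necessarily the exact ones we ran.
[functions.cursorzero.variants.claude_sonnet]
type = "chat_completion"
model = "anthropic::claude-sonnet-4"
weight = 0.25

[functions.cursorzero.variants.gpt_4_1]
type = "chat_completion"
model = "openai::gpt-4.1"
weight = 0.25

[functions.cursorzero.variants.o4_mini]
type = "chat_completion"
model = "openai::o4-mini"
weight = 0.25

[functions.cursorzero.variants.gemini_2_5_pro]
type = "chat_completion"
model = "google_ai_studio_gemini::gemini-2.5-pro"
weight = 0.25
```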
Try It Yourself
Interested in analyzing your own usage of AI coding assistants? You can find instructions on GitHub for reproducing this work with Cursor and OpenAI Codex.
TensorZero can help you understand and optimize your LLM agents, even the ones you didn't build yourself!
But Wait... There's More!
This post showed how we reverse engineered Cursor's LLM client by setting up TensorZero as a self-hosted proxy service, letting us observe, analyze, and experiment with different LLMs while keeping the full Cursor experience.
In the next post in this series, we'll explain how we evaluate AI coding assistants on real-world usage, along with the results of our internal A/B tests across models. Sneak preview: git hooks, tree-sitter, and more.
Then we'll explore whether these feedback signals can be used to improve Cursor by optimizing models and inference around your personal usage patterns.
Original post:
https://www.tensorzero.com/blog/reverse-engineering-cursors-llm-client/