add: post-to-xhs skills
209
skills/post-to-xhs/SKILL.md
Normal file
@@ -0,0 +1,209 @@
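---
name: post-to-xhs
description: >
  Skill for publishing content to Xiaohongshu. Supports two input modes: (1) the user supplies the complete content plus images or image URLs, which are published directly;
  (2) the user supplies a web page URL, from which the content and images are extracted automatically, summarized as appropriate, and then published. If no images can be extracted from the URL,
  the user is prompted to download them manually and provide them. Suitable for publishing any type of content.
---

# Xiaohongshu Content Publishing

Determines the publishing path automatically from the user's input to streamline the publishing process.

## Workflow

```
User input
    │
    ├─ Complete content + images/image URLs → go straight to the publishing flow
    │
    └─ Web page URL → extract content and images with WebFetch
            │
            ├─ Images found → summarize the content as appropriate → publishing flow
            │
            └─ No images → prompt the user to download images manually
                    │
                    └─ After the user provides images → publishing flow
```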

## Step 1: Determine the Input Type

Decide based on the user's input:

- **Complete-content mode**: the user provided a title, body text, and images (local paths or URLs)
- **URL-extraction mode**: the user provided only a web page URL

If unsure, ask the user.
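
The decision above could be sketched as a small classifier. This is only an illustration: the function name `classify_input`, its parameters, and the three return labels are hypothetical and are not part of the skill's scripts.

```python
import re
from typing import Optional, Sequence

# Matches a message that consists of exactly one http(s) URL (assumed heuristic).
URL_ONLY = re.compile(r"^https?://\S+$")


def classify_input(text: str, title: Optional[str] = None,
                   images: Sequence[str] = ()) -> str:
    """Return 'complete', 'url', or 'ask' (hypothetical labels)."""
    stripped = text.strip()
    # Complete-content mode: title, body text and at least one image are present.
    if title and stripped and images:
        return "complete"
    # URL-extraction mode: nothing but a single web page URL was given.
    if not title and not images and URL_ONLY.match(stripped):
        return "url"
    # Anything else is ambiguous, so the user should be asked.
    return "ask"
```

Returning an explicit `"ask"` label keeps the "if unsure, ask the user" rule visible in code instead of guessing a mode.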
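
## Step 2: Process the Content

### Complete-content mode

Use the title and body the user provided as-is, then skip to Step 3.

### URL-extraction mode

1. Fetch the web page content with WebFetch
2. Extract the key information: title, body, image URLs
3. Summarize the content as appropriate, making sure that:
   - the key information stays complete
   - the language reads naturally
   - the style suits Xiaohongshu reading habits

#### Handling image extraction failures

If no image URLs can be extracted from the page, or the image URLs are inaccessible, you **must**:

1. Tell the user that image extraction failed
2. Provide the original page link and ask the user to visit it manually
3. Guide the user to:
   - open the original page in a browser
   - right-click the desired image → "Save image as" or "Copy image address"
   - send over the saved image path or the copied image URL
4. Wait until the user provides images before continuing the publishing flow

**Sample prompt**:
```
No usable images could be extracted from the page. Please fetch them manually:

1. Open the original link: [URL]
2. Find a suitable image, then right-click to save it locally or copy the image address
3. Send me the image path or URL

Once we have the images, we'll continue publishing.
```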
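
## Step 3: Check the Content

### Title check

The title length must be ≤ 38, computed as follows:
- Chinese characters and Chinese punctuation (《》、,。 etc.): 2 each
- English letters / digits / spaces / ASCII punctuation: 1 each

If the title is too long, automatically generate a new title that fits the limit while preserving the meaning.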
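
A minimal sketch of the length rule above. The names `title_weight` and `title_fits` are hypothetical, and treating every non-ASCII character (including emoji) as weight 2 is an assumption, since the rule only names Chinese characters and Chinese punctuation.

```python
MAX_TITLE_WEIGHT = 38  # limit stated in the title check


def title_weight(title: str) -> int:
    """Weighted length: ASCII characters count 1, everything else counts 2.

    Assumption: all non-ASCII characters are treated like Chinese characters.
    """
    return sum(1 if ch.isascii() else 2 for ch in title)


def title_fits(title: str) -> bool:
    """True when the title is within the 38-unit limit."""
    return title_weight(title) <= MAX_TITLE_WEIGHT
```

Under this rule a title made only of Chinese characters can hold at most 19 of them, while a pure ASCII title can hold 38.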

### Body format

- Separate paragraphs with a double line break
- Keep the language natural; avoid a machine-translated feel
- Write in Simplified Chinese
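
The paragraph-separation rule can be enforced mechanically before publishing. The helper below is a hypothetical sketch (the name `normalize_paragraphs` does not appear in the skill's scripts); it only normalizes spacing and does not touch wording or language.

```python
import re


def normalize_paragraphs(text: str) -> str:
    """Collapse any run of blank lines into exactly one blank line."""
    # Normalize Windows line endings first so the split below sees plain "\n".
    unified = text.replace("\r\n", "\n")
    # A paragraph break is a newline, optional whitespace, and another newline.
    parts = re.split(r"\n\s*\n", unified)
    paragraphs = [p.strip() for p in parts if p.strip()]
    return "\n\n".join(paragraphs)
```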
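
## Step 4: Publish to Xiaohongshu

For the complete publishing flow, see [references/publish-workflow.md](references/publish-workflow.md)

### 4.1 User confirms the content

Use `AskUserQuestion` to show the user the content about to be published (title, body, images), and continue only after explicit confirmation.

### 4.2 Choose the publishing mode

Use `AskUserQuestion` to let the user choose a publishing mode:

- **Headless mode** (recommended): runs in the background, fast, no browser window. Reports the result directly once publishing finishes.
- **Windowed mode**: shows the browser window so the content can be previewed. Requires user confirmation before clicking publish.

```
AskUserQuestion example:
  Question: Choose a publishing mode
  Options:
    - Headless mode (recommended): fast background publishing, no preview needed
    - Windowed mode: shows the browser for preview and confirmation
```

### 4.3 Write temporary files

Write the title and body to temporary UTF-8 text files. Do not inline Chinese text in `python -c`.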
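
One way to write the two files, as a sketch only. The helper name `write_post_files` and the temp-directory prefix are assumptions; the file names `title.txt` and `content.txt` follow the commands used in Step 4.4.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def write_post_files(title: str, content: str,
                     directory: Optional[str] = None) -> Tuple[Path, Path]:
    """Write title and body as UTF-8 files and return both paths."""
    # Use the given directory, or create a fresh temporary one (assumed prefix).
    base = Path(directory) if directory else Path(tempfile.mkdtemp(prefix="xhs_post_"))
    base.mkdir(parents=True, exist_ok=True)
    title_file = base / "title.txt"
    content_file = base / "content.txt"
    # An explicit encoding avoids the platform default (e.g. a legacy code page on Windows).
    title_file.write_text(title, encoding="utf-8")
    content_file.write_text(content, encoding="utf-8")
    return title_file, content_file
```

Passing file paths instead of inline text keeps Chinese characters out of the shell command line, which is the concern behind the `python -c` warning.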

### 4.4 Run the pipeline

Run the publishing script in the mode the user chose:

**Headless mode** (add the `--headless` flag):
```bash
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\publish_pipeline.py" --headless --title-file title.txt --content-file content.txt --image-urls "URL1" "URL2"
```

**Windowed mode** (omit `--headless`):
```bash
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\publish_pipeline.py" --title-file title.txt --content-file content.txt --image-urls "URL1" "URL2"
```

**Other options**:
```bash
# Publish to a specific account
python ... --account myaccount ...

# Use local images
python ... --images "C:\path\to\image.jpg"
```

Handle the output:
- `NOT_LOGGED_IN` (exit code 1) → the script switches to windowed mode automatically; ask the user to scan the QR code to log in, then re-run once they confirm
- `READY_TO_PUBLISH` (exit code 0) → move on to the next step according to the mode
- Exit code 2 → report the error
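
The output handling above amounts to a small dispatch on exit code and status marker. The sketch below assumes the markers appear in the script's standard output; the function name and the returned action labels are hypothetical. The `PUBLISHED` marker is taken from `references/publish-workflow.md`.

```python
def next_action(exit_code: int, stdout: str, headless: bool) -> str:
    """Map a pipeline result to the next step (hypothetical action labels)."""
    if exit_code == 1 and "NOT_LOGGED_IN" in stdout:
        # The user must scan the QR code, then the pipeline is run again.
        return "ask_user_to_log_in"
    if exit_code == 0 and "PUBLISHED" in stdout:
        return "report_success"
    if exit_code == 0 and "READY_TO_PUBLISH" in stdout:
        # Headless mode skips the preview (4.5) and goes straight to 4.6.
        return "click_publish" if headless else "ask_user_to_preview"
    # Exit code 2, or any combination not described above.
    return "report_error"
```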
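
### 4.5 User preview confirmation (windowed mode only)

**Only when the user chose windowed mode**, use `AskUserQuestion` to ask the user to check the preview in the browser, and publish only after they confirm.

Headless mode skips this step and goes straight to 4.6.

### 4.6 Click publish

Click the publish button:

```bash
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\cdp_publish.py" click-publish
```

### 4.7 Report the result

Tell the user whether publishing succeeded, based on the command output.

## Important Notes

- **Never publish automatically** - user confirmation must be obtained in Step 4.1 (and again in Step 4.5 in windowed mode)
- **Images are required** - a Xiaohongshu post must include images; without images, do not publish
- **Headless mode**: use the `--headless` flag for automated publishing. If login is required, the script switches to windowed mode automatically
- If a page structure change breaks the selectors, see `references/publish-workflow.md` to update them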

## Account Management

Multiple Xiaohongshu accounts are supported; each account has its own Chrome profile.

### List accounts

```bash
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\cdp_publish.py" list-accounts
```

### Add an account

```bash
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\cdp_publish.py" add-account myaccount --alias "我的账号"
```

### Log in

```bash
# Default account
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\cdp_publish.py" login

# Specific account
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\cdp_publish.py" --account myaccount login
```

### Switch accounts

```bash
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\cdp_publish.py" switch-account
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\cdp_publish.py" --account otheraccount switch-account
```

### Set the default account

```bash
python "C:\Users\admin\AI\.claude\skills\post-to-xhs\scripts\cdp_publish.py" set-default-account myaccount
```
10
skills/post-to-xhs/config/accounts.json
Normal file
@@ -0,0 +1,10 @@
{
  "default_account": "default",
  "accounts": {
    "default": {
      "alias": "默认账号",
      "profile_dir": "C:\\Users\\admin\\AppData\\Local\\Google\\Chrome\\XiaohongshuProfiles\\default",
      "created_at": null
    }
  }
}
196
skills/post-to-xhs/references/publish-workflow.md
Normal file
@@ -0,0 +1,196 @@
# Xiaohongshu Publishing Workflow Reference

This document describes the complete flow for automatically publishing content to the Xiaohongshu creator center via CDP (Chrome DevTools Protocol).

## Prerequisites

1. **Chrome is installed** - standard Google Chrome
2. **Python dependencies are installed** - `websockets`, `requests`
3. **Initial login is done** - logged in to Xiaohongshu at least once (cookies persist in a dedicated profile)

## Flow Overview

```
Draft copy → user confirmation → launch Chrome → check login → navigate to publish page → upload images → fill in title → fill in body → user confirms publish
```

## Detailed Steps

### 1. Launch / Connect to Chrome

Script: `scripts/chrome_launcher.py`

- Check whether a Chrome instance is already listening on `127.0.0.1:9222`
- If not, launch Chrome with the following flags:
  - `--remote-debugging-port=9222`
  - `--user-data-dir=%LOCALAPPDATA%/Google/Chrome/XiaohongshuProfile`
  - `--no-first-run`
  - `--no-default-browser-check`
  - `--headless=new` (headless mode only)
- Wait for the port to become ready (up to 15 seconds)
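
The "wait for the port" step can be illustrated with a plain TCP probe. This is a sketch under assumptions, not the code in `chrome_launcher.py`: the function name `wait_for_port` and the polling interval are hypothetical, while the host, port and 15-second limit come from the list above.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 9222,
                  timeout: float = 15.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the debugging port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or refused): wait a little and retry.
            time.sleep(interval)
    return False
```

A successful connect only shows that the port is open; a stricter check could also request the `/json` endpoint that `cdp_publish.py` uses to list targets.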
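
**About the user data directory**: a dedicated `XiaohongshuProfile` directory is used, fully isolated from the user's everyday browser profile, so it does not interfere with normal browsing. (With multiple accounts, `account_manager.py` keeps one directory per account under `XiaohongshuProfiles`.)

**About headless mode**: when launched with `--headless`, Chrome shows no window, which suits automated publishing. If login or an account switch is needed, the script switches to windowed mode automatically.

### 2. Check Login Status

Script: `scripts/cdp_publish.py` → `check_login()`

- Navigate to `https://creator.xiaohongshu.com`
- Check whether the current URL contains "login" (that is, whether the browser was redirected to the login page)
- Check whether the page contains DOM elements related to user info
- If not logged in, prompt the user to scan the QR code in the Chrome window

### 3. Navigate to the Publish Page

- Target URL: `https://creator.xiaohongshu.com/publish/publish`
- Wait for the page to load completely

### 4. Upload Images

Script: `scripts/cdp_publish.py` → `_upload_images()`

- Locate the `input[type="file"]` element via CDP `DOM.querySelector`
- Set the file paths with the CDP `DOM.setFileInputFiles` command
- Wait for the images to finish uploading and processing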
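
The CDP calls named above can be sketched as follows. This is not the body of `_upload_images()`: the function `upload_files` and the `send` callable (anything that sends one CDP command and returns its result dict) are assumptions for illustration. `DOM.getDocument`, `DOM.querySelector` and `DOM.setFileInputFiles` are real CDP methods.

```python
from typing import Callable, Sequence

# Hypothetical transport: takes a CDP method name and params, returns the result dict.
SendFn = Callable[[str, dict], dict]


def upload_files(send: SendFn, files: Sequence[str],
                 selector: str = 'input[type="file"]') -> int:
    """Attach local files to the first element matching the selector."""
    # DOM.querySelector needs a starting node, so fetch the document root first.
    root_id = send("DOM.getDocument", {})["root"]["nodeId"]
    node_id = send("DOM.querySelector",
                   {"nodeId": root_id, "selector": selector})["nodeId"]
    if not node_id:
        # CDP reports "no match" as nodeId 0.
        raise LookupError(f"no element matches {selector!r}")
    # Paths must be absolute paths on the machine where Chrome runs.
    send("DOM.setFileInputFiles", {"files": list(files), "nodeId": node_id})
    return node_id
```

Because the file input is set through the protocol, no native file-picker dialog has to be driven, which is why this approach also works in headless mode.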

**Image sources**: if an image is given as a URL, it is first downloaded to a temporary directory with `scripts/image_downloader.py` and cleaned up automatically after publishing.

### 5. Fill in the Title

Script: `scripts/cdp_publish.py` → `_fill_title()`

- Locate the title input
- Set its value and dispatch the `input` and `change` events

### 6. Fill in the Body

Script: `scripts/cdp_publish.py` → `_fill_content()`

- Locate the contenteditable editing area (TipTap/ProseMirror editor)
- Split the body into paragraphs, wrap each in a `<p>` tag and write them to innerHTML, inserting an empty `<p><br></p>` line between paragraphs
- Dispatch the `input` event
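
The paragraph-to-HTML conversion described above might look like the sketch below. The function name `body_to_html` is hypothetical, and escaping the text with `html.escape` is an added assumption (the description does not say whether `_fill_content()` escapes its input).

```python
import html
import re


def body_to_html(body: str) -> str:
    """Wrap each paragraph in <p>...</p>, separated by an empty <p><br></p> line."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]
    # Assumption: escape the text so characters like "<" are not parsed as markup.
    wrapped = [f"<p>{html.escape(p)}</p>" for p in paragraphs]
    return "<p><br></p>".join(wrapped)
```

The resulting string would then be assigned to the editor's `innerHTML`, followed by an `input` event so the editor registers the change.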
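
### 7. User Confirmation and Publishing

- After filling in the form, the script pauses and asks the user to check the preview in the browser
- Once the user confirms, the script clicks the publish button
- Alternatively, the user may click the publish button manually

## DOM Selector Reference

> **Note**: The Xiaohongshu front end may change at any time; the selectors below reflect the page structure at the time of writing. If automation fails, re-capture the selectors in the browser DevTools and update the `SELECTORS` dict in `cdp_publish.py`.

| Element | Primary selector | Fallback selector | Notes |
|---|---|---|---|
| Image upload | `input.upload-input` | `input[type="file"]` | Hidden file input, driven directly via CDP |
| Title input | `input[placeholder*="填写标题"]` | `input.d-text` | Title input box |
| Body editor | `div.tiptap.ProseMirror` | `div.ProseMirror[contenteditable="true"]` | TipTap/ProseMirror rich-text editor |
| Publish button | Text match on "发布" (`button` + `.d-button-content .d-text`) | - | Located by iterating over button text |
| Login detection | URL contains "login" | `.user-info, .creator-header` | Redirect detection + DOM element detection |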
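
One possible way to apply the primary/fallback pairs in the table is to build a small JavaScript expression for `Runtime.evaluate`. The helper `first_match_js` is hypothetical and is not taken from `cdp_publish.py`; it encodes the selectors with `json.dumps` so that quotes inside a selector cannot break the generated script.

```python
import json


def first_match_js(primary: str, fallback: str) -> str:
    """Return JS that evaluates to the first selector matching an element, or null."""
    # json.dumps yields a valid JS array literal with all quotes escaped.
    selectors = json.dumps([primary, fallback], ensure_ascii=False)
    return (
        "(function() {"
        f" var selectors = {selectors};"
        " for (var i = 0; i < selectors.length; i++) {"
        " if (document.querySelector(selectors[i])) { return selectors[i]; }"
        " }"
        " return null;"
        "})()"
    )
```

For example, `first_match_js('input[placeholder*="填写标题"]', "input.d-text")` produces an expression that reports which of the two title selectors currently works, which can help when diagnosing a page update.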
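
## Selector Maintenance Guide

When a Xiaohongshu page update breaks the automation:

1. Open `https://creator.xiaohongshu.com/publish/publish` in Chrome
2. Press F12 to open the developer tools
3. Use the element picker (Ctrl+Shift+C) to locate the target element
4. Note the new selector
5. Update the corresponding value in the `SELECTORS` dict in `scripts/cdp_publish.py`

## Error Handling

| Error | Cause | Solution |
|---|---|---|
| Chrome not running | No response on port 9222 | Run `chrome_launcher.py` or start Chrome manually |
| Chrome not found | Non-standard install path | Check the Chrome installation, or specify the path in the script |
| Not logged in | Cookies expired or first use | Scan the QR code in the Chrome window to log in |
| Selector broken | Xiaohongshu page updated | Update the selectors following the maintenance guide above |
| Image upload failed | Wrong file path or unsupported format | Check the image path and make sure the format is jpg/png/webp |
| Publish button not found | Page not fully loaded | Increase the wait time or click publish manually |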

## CLI Usage

All scripts live in the `scripts/` directory.

### Option A: Unified pipeline (recommended)

```bash
# Headless mode (recommended) - no browser window, faster
python publish_pipeline.py --headless --title "Title" --content "Body" --image-urls URL1 URL2

# Headless mode - read the title and body from files
python publish_pipeline.py --headless --title-file title.txt --content-file body.txt --image-urls URL1

# Windowed mode - for debugging or first-time login
python publish_pipeline.py --title "Title" --content "Body" --image-urls URL1 URL2

# Use local image files
python publish_pipeline.py --headless --title "Title" --content "Body" --images img1.jpg img2.jpg

# Fill in and publish automatically
python publish_pipeline.py --headless --title "Title" --content "Body" --image-urls URL1 --auto-publish
```

Output status codes:
- Exit code 0 + `READY_TO_PUBLISH` = form filled in, awaiting confirmation
- Exit code 0 + `PUBLISHED` = published
- Exit code 1 + `NOT_LOGGED_IN` = not logged in, QR scan required (in headless mode the script switches to windowed mode automatically)
- Exit code 2 = other error

### Option B: Step-by-step calls

```bash
# 1. Launch Chrome (optionally with --headless)
python chrome_launcher.py
python chrome_launcher.py --headless

# 2. Check login (exit code 0 = logged in, 1 = not logged in)
python cdp_publish.py check-login
python cdp_publish.py --headless check-login

# 3. Fill in the form
python cdp_publish.py fill --title "Title" --content-file body.txt --images img1.jpg
python cdp_publish.py --headless fill --title "Title" --content-file body.txt --images img1.jpg

# 4. Click publish after the user confirms
python cdp_publish.py click-publish

# Or fill in and publish in one step
python cdp_publish.py --headless publish --title "Title" --content-file body.txt --images img1.jpg
```

### Account management

```bash
# First login or expired session - open the browser for QR-code login
python cdp_publish.py login

# Switch accounts - clear cookies and open the login page
python cdp_publish.py switch-account

# Close Chrome
python chrome_launcher.py --kill

# Restart Chrome (optionally in headless mode)
python chrome_launcher.py --restart
python chrome_launcher.py --restart --headless
```

### Claude Code integration

In Claude Code, call the scripts through the Bash tool. The pipeline approach is recommended:

1. Write the Chinese title and body to temporary text files (UTF-8 encoded)
2. Call `publish_pipeline.py --headless` with the file paths and image URLs
3. Handle the result according to the output status code:
   - Not logged in → the script switches to windowed mode automatically and prompts the user to scan the QR code
   - Form filled in → ask the user to confirm the preview
4. After the user confirms, call `cdp_publish.py click-publish` to publish

**Account switching flow**:
1. Call `cdp_publish.py switch-account`
2. Wait for the user to scan the QR code and confirm
3. Continue with the normal publishing flow
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
309
skills/post-to-xhs/scripts/account_manager.py
Normal file
@@ -0,0 +1,309 @@
"""
Multi-account manager for Xiaohongshu publishing.

Manages multiple Xiaohongshu accounts with separate Chrome profiles:
- Each account has its own user-data-dir for cookie isolation
- Accounts are stored in a JSON config file
- Supports add/remove/list/switch operations

Usage:
    python account_manager.py list
    python account_manager.py add <name> [--alias <alias>]
    python account_manager.py remove <name>
    python account_manager.py info <name>
    python account_manager.py set-default <name>
"""

import json
import os
import sys
import shutil
from typing import Optional

# Config file location
CONFIG_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "config")
ACCOUNTS_FILE = os.path.join(CONFIG_DIR, "accounts.json")

# Base directory for account profiles
PROFILES_BASE = os.path.join(os.environ.get("LOCALAPPDATA", os.path.expanduser("~")),
                             "Google", "Chrome", "XiaohongshuProfiles")

# Default account name (for backward compatibility)
DEFAULT_PROFILE_NAME = "default"


def _ensure_config_dir():
    """Ensure the config directory exists."""
    os.makedirs(CONFIG_DIR, exist_ok=True)


def _load_accounts() -> dict:
    """Load accounts from config file."""
    _ensure_config_dir()
    if os.path.exists(ACCOUNTS_FILE):
        try:
            with open(ACCOUNTS_FILE, "r", encoding="utf-8") as f:
                return json.load(f)
        except (json.JSONDecodeError, IOError):
            pass
    # Default structure
    return {
        "default_account": DEFAULT_PROFILE_NAME,
        "accounts": {
            DEFAULT_PROFILE_NAME: {
                "alias": "默认账号",
                "profile_dir": os.path.join(PROFILES_BASE, DEFAULT_PROFILE_NAME),
                "created_at": None,
            }
        }
    }


def _save_accounts(data: dict):
    """Save accounts to config file."""
    _ensure_config_dir()
    with open(ACCOUNTS_FILE, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)


def get_profile_dir(account_name: Optional[str] = None) -> str:
    """
    Get the Chrome profile directory for a given account.

    Args:
        account_name: Account name. If None, uses the default account.

    Returns:
        Path to the Chrome user-data-dir for this account.
    """
    data = _load_accounts()

    if account_name is None:
        account_name = data.get("default_account", DEFAULT_PROFILE_NAME)

    if account_name not in data["accounts"]:
        # Fallback to default
        account_name = DEFAULT_PROFILE_NAME
        if account_name not in data["accounts"]:
            # Create default account entry
            data["accounts"][account_name] = {
                "alias": "默认账号",
                "profile_dir": os.path.join(PROFILES_BASE, account_name),
                "created_at": None,
            }
            _save_accounts(data)

    return data["accounts"][account_name]["profile_dir"]


def get_default_account() -> str:
    """Get the name of the default account."""
    data = _load_accounts()
    return data.get("default_account", DEFAULT_PROFILE_NAME)


def set_default_account(account_name: str) -> bool:
    """
    Set the default account.

    Returns True if successful, False if account doesn't exist.
    """
    data = _load_accounts()
    if account_name not in data["accounts"]:
        return False
    data["default_account"] = account_name
    _save_accounts(data)
    return True


def list_accounts() -> list[dict]:
    """
    List all registered accounts.

    Returns a list of dicts with account info.
    """
    data = _load_accounts()
    default = data.get("default_account", DEFAULT_PROFILE_NAME)
    result = []
    for name, info in data["accounts"].items():
        result.append({
            "name": name,
            "alias": info.get("alias", ""),
            "profile_dir": info.get("profile_dir", ""),
            "is_default": name == default,
        })
    return result


def add_account(name: str, alias: Optional[str] = None) -> bool:
    """
    Add a new account.

    Args:
        name: Unique account name (used as identifier)
        alias: Display name / description

    Returns True if added, False if name already exists.
    """
    data = _load_accounts()
    if name in data["accounts"]:
        return False

    from datetime import datetime
    profile_dir = os.path.join(PROFILES_BASE, name)
    os.makedirs(profile_dir, exist_ok=True)

    data["accounts"][name] = {
        "alias": alias or name,
        "profile_dir": profile_dir,
        "created_at": datetime.now().isoformat(),
    }
    _save_accounts(data)
    return True


def remove_account(name: str, delete_profile: bool = False) -> bool:
    """
    Remove an account.

    Args:
        name: Account name to remove
        delete_profile: If True, also delete the Chrome profile directory

    Returns True if removed, False if the account is not found or if it is
    the default account and the only one left.
    """
    data = _load_accounts()
    if name not in data["accounts"]:
        return False

    # Don't allow removing the default account if it's the only one
    if name == data.get("default_account") and len(data["accounts"]) == 1:
        return False

    profile_dir = data["accounts"][name].get("profile_dir", "")
    del data["accounts"][name]

    # If we removed the default, set a new default
    if name == data.get("default_account"):
        data["default_account"] = next(iter(data["accounts"].keys()))

    _save_accounts(data)

    # Optionally delete the profile directory
    if delete_profile and profile_dir and os.path.isdir(profile_dir):
        try:
            shutil.rmtree(profile_dir)
        except Exception:
            # Best effort: the account entry is already removed from the config
            pass

    return True


def get_account_info(name: str) -> Optional[dict]:
    """Get info for a specific account."""
    data = _load_accounts()
    if name not in data["accounts"]:
        return None
    info = data["accounts"][name].copy()
    info["name"] = name
    info["is_default"] = name == data.get("default_account")
    return info


def account_exists(name: str) -> bool:
    """Check if an account exists."""
    data = _load_accounts()
    return name in data["accounts"]


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def main():
    import argparse

    parser = argparse.ArgumentParser(description="Xiaohongshu Account Manager")
    sub = parser.add_subparsers(dest="command", required=True)

    # list
    sub.add_parser("list", help="List all accounts")

    # add
    p_add = sub.add_parser("add", help="Add a new account")
    p_add.add_argument("name", help="Account name (unique identifier)")
    p_add.add_argument("--alias", help="Display name / description")

    # remove
    p_rm = sub.add_parser("remove", help="Remove an account")
    p_rm.add_argument("name", help="Account name to remove")
    p_rm.add_argument("--delete-profile", action="store_true",
                      help="Also delete the Chrome profile directory")

    # info
    p_info = sub.add_parser("info", help="Show account info")
    p_info.add_argument("name", help="Account name")

    # set-default
    p_def = sub.add_parser("set-default", help="Set the default account")
    p_def.add_argument("name", help="Account name to set as default")

    # get-profile-dir (for internal use)
    p_dir = sub.add_parser("get-profile-dir", help="Get profile directory for an account")
    p_dir.add_argument("--account", help="Account name (default: default account)")

    args = parser.parse_args()

    if args.command == "list":
        accounts = list_accounts()
        if not accounts:
            print("No accounts configured.")
            return
        print(f"{'Name':<20} {'Alias':<20} {'Default':<10}")
        print("-" * 50)
        for acc in accounts:
            default_mark = "*" if acc["is_default"] else ""
            print(f"{acc['name']:<20} {acc['alias']:<20} {default_mark:<10}")

    elif args.command == "add":
        if add_account(args.name, args.alias):
            print(f"Account '{args.name}' added.")
            print(f"Profile dir: {get_profile_dir(args.name)}")
            print("\nTo log in to this account, run:")
            print(f"  python cdp_publish.py --account {args.name} login")
        else:
            print(f"Error: Account '{args.name}' already exists.", file=sys.stderr)
            sys.exit(1)

    elif args.command == "remove":
        if remove_account(args.name, args.delete_profile):
            print(f"Account '{args.name}' removed.")
        else:
            print(f"Error: Cannot remove account '{args.name}'.", file=sys.stderr)
            sys.exit(1)

    elif args.command == "info":
        info = get_account_info(args.name)
        if info:
            print(f"Name: {info['name']}")
            print(f"Alias: {info.get('alias', '')}")
            print(f"Profile dir: {info.get('profile_dir', '')}")
            print(f"Default: {'Yes' if info.get('is_default') else 'No'}")
            print(f"Created: {info.get('created_at', 'Unknown')}")
        else:
            print(f"Error: Account '{args.name}' not found.", file=sys.stderr)
            sys.exit(1)

    elif args.command == "set-default":
        if set_default_account(args.name):
            print(f"Default account set to '{args.name}'.")
        else:
            print(f"Error: Account '{args.name}' not found.", file=sys.stderr)
            sys.exit(1)

    elif args.command == "get-profile-dir":
        print(get_profile_dir(args.account))


if __name__ == "__main__":
    main()
686
skills/post-to-xhs/scripts/cdp_publish.py
Normal file
@@ -0,0 +1,686 @@
"""
CDP-based Xiaohongshu publisher.

Connects to a Chrome instance via Chrome DevTools Protocol to automate
publishing articles on Xiaohongshu (RED) creator center.

CLI usage:
    # Basic commands
    python cdp_publish.py check-login [--headless] [--account NAME]
    python cdp_publish.py fill --title "Title" --content "Body" --images img1.jpg [--headless] [--account NAME]
    python cdp_publish.py publish --title "Title" --content "Body" --images img1.jpg [--headless] [--account NAME]
    python cdp_publish.py click-publish [--headless] [--account NAME]

    # Account management
    python cdp_publish.py login [--account NAME]            # open browser for QR login
    python cdp_publish.py re-login [--account NAME]         # clear cookies and re-login same account
    python cdp_publish.py switch-account [--account NAME]   # clear cookies + open login for new account
    python cdp_publish.py list-accounts                     # list all configured accounts
    python cdp_publish.py add-account NAME [--alias ALIAS]  # add a new account
    python cdp_publish.py remove-account NAME               # remove an account

Library usage:
    from cdp_publish import XiaohongshuPublisher

    publisher = XiaohongshuPublisher()
    publisher.connect()
    publisher.check_login()
    publisher.publish(
        title="Article title",
        content="Article body text",
        image_paths=["/path/to/img1.jpg", "/path/to/img2.jpg"],
    )
"""

import json
import os
import time
import sys
from typing import Any

# Ensure UTF-8 output on Windows consoles
if sys.platform == "win32":
    os.environ.setdefault("PYTHONIOENCODING", "utf-8")
    try:
        sys.stdout.reconfigure(encoding="utf-8", errors="replace")
        sys.stderr.reconfigure(encoding="utf-8", errors="replace")
    except Exception:
        pass

import requests
import websockets.sync.client as ws_client

# ---------------------------------------------------------------------------
# Configuration - centralised selectors and URLs for easy maintenance
# ---------------------------------------------------------------------------

CDP_HOST = "127.0.0.1"
CDP_PORT = 9222

# Xiaohongshu URLs
XHS_CREATOR_URL = "https://creator.xiaohongshu.com/publish/publish"
XHS_HOME_URL = "https://www.xiaohongshu.com"
XHS_LOGIN_CHECK_URL = "https://creator.xiaohongshu.com"

# DOM selectors (update these when Xiaohongshu changes their page structure)
# Last verified: 2026-02
SELECTORS = {
    # "上传图文" tab - must click before uploading images
    "image_text_tab": "div.creator-tab",
    "image_text_tab_text": "上传图文",
    # Upload area - the file input element for images (visible after clicking tab)
    "upload_input": "input.upload-input",
    "upload_input_alt": 'input[type="file"]',
    # Title input field (visible after image upload)
    "title_input": 'input[placeholder*="填写标题"]',
    "title_input_alt": "input.d-text",
    # Content editor area - TipTap/ProseMirror contenteditable div
    "content_editor": "div.tiptap.ProseMirror",
    "content_editor_alt": 'div.ProseMirror[contenteditable="true"]',
    # Publish button
    "publish_button_text": "发布",
    # Login indicator - URL-based check (redirect to /login if not logged in)
    "login_indicator": '.user-info, .creator-header, [class*="user"]',
}

# Timing
PAGE_LOAD_WAIT = 3  # seconds to wait after navigation
TAB_CLICK_WAIT = 2  # seconds to wait after clicking tab
UPLOAD_WAIT = 6  # seconds to wait after image upload for editor to appear
ACTION_INTERVAL = 1  # seconds between actions


class CDPError(Exception):
    """Error communicating with Chrome via CDP."""


class XiaohongshuPublisher:
    """Automates publishing to Xiaohongshu via CDP."""

    def __init__(self, host: str = CDP_HOST, port: int = CDP_PORT):
        self.host = host
        self.port = port
        self.ws = None
        self._msg_id = 0

    # ------------------------------------------------------------------
    # CDP connection management
    # ------------------------------------------------------------------

    def _get_targets(self) -> list[dict]:
        """Get list of available browser targets (tabs). Retries once on failure."""
        url = f"http://{self.host}:{self.port}/json"
        for attempt in range(2):
            try:
                resp = requests.get(url, timeout=5)
                resp.raise_for_status()
                return resp.json()
            except Exception as e:
                if attempt == 0:
                    print(f"[cdp_publish] CDP connection failed ({e}), restarting Chrome...")
                    from chrome_launcher import ensure_chrome
                    ensure_chrome(self.port)
                    time.sleep(2)
                else:
                    raise CDPError(f"Cannot reach Chrome on {self.host}:{self.port}: {e}")

    def _find_or_create_tab(self, target_url_prefix: str = "") -> str:
        """
        Return the WebSocket debugger URL of a tab to drive.

        Prefers an existing tab matching the URL prefix, then tries to open a
        new tab, and finally falls back to the first available page tab.
        """
        targets = self._get_targets()
        pages = [t for t in targets if t.get("type") == "page"]

        if target_url_prefix:
            for t in pages:
                if t.get("url", "").startswith(target_url_prefix):
                    return t["webSocketDebuggerUrl"]

        # Create a new tab
        resp = requests.put(
            f"http://{self.host}:{self.port}/json/new?{XHS_CREATOR_URL}",
            timeout=5,
        )
        if resp.ok:
            return resp.json().get("webSocketDebuggerUrl", "")

        # Fallback: use first available page
        if pages:
            return pages[0]["webSocketDebuggerUrl"]

        raise CDPError("No browser tabs available.")

    def connect(self, target_url_prefix: str = ""):
        """Connect to a Chrome tab via WebSocket."""
        ws_url = self._find_or_create_tab(target_url_prefix)
        if not ws_url:
            raise CDPError("Could not obtain WebSocket URL for any tab.")

        print(f"[cdp_publish] Connecting to {ws_url}")
        self.ws = ws_client.connect(ws_url)
        print("[cdp_publish] Connected to Chrome tab.")

    def disconnect(self):
        """Close the WebSocket connection."""
        if self.ws:
            self.ws.close()
            self.ws = None

    # ------------------------------------------------------------------
    # CDP command helpers
    # ------------------------------------------------------------------

    def _send(self, method: str, params: dict | None = None) -> dict:
        """Send a CDP command and return the result."""
        if not self.ws:
            raise CDPError("Not connected. Call connect() first.")

        self._msg_id += 1
        msg = {"id": self._msg_id, "method": method}
        if params:
            msg["params"] = params

        self.ws.send(json.dumps(msg))

        # Wait for the matching response
        while True:
            raw = self.ws.recv()
            data = json.loads(raw)
            if data.get("id") == self._msg_id:
                if "error" in data:
                    raise CDPError(f"CDP error: {data['error']}")
                return data.get("result", {})
            # else: it's an event, skip it

    def _evaluate(self, expression: str) -> Any:
        """Execute JavaScript in the page and return the result value."""
        result = self._send("Runtime.evaluate", {
            "expression": expression,
            "returnByValue": True,
            "awaitPromise": True,
        })
        remote_obj = result.get("result", {})
        if remote_obj.get("subtype") == "error":
            raise CDPError(f"JS error: {remote_obj.get('description', remote_obj)}")
        return remote_obj.get("value")

    def _navigate(self, url: str):
        """Navigate the current tab to the given URL and wait for load."""
        print(f"[cdp_publish] Navigating to {url}")
        self._send("Page.enable")
        self._send("Page.navigate", {"url": url})
        time.sleep(PAGE_LOAD_WAIT)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Login check
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def check_login(self) -> bool:
|
||||
"""
|
||||
Navigate to Xiaohongshu creator center and check if the user is logged in.
|
||||
|
||||
Returns True if logged in. If not logged in, prints instructions
|
||||
and returns False.
|
||||
"""
|
||||
self._navigate(XHS_LOGIN_CHECK_URL)
|
||||
time.sleep(2)
|
||||
|
||||
# Check if we got redirected to a login page
|
||||
current_url = self._evaluate("window.location.href")
|
||||
print(f"[cdp_publish] Current URL: {current_url}")
|
||||
|
||||
if "login" in current_url.lower():
|
||||
print(
|
||||
"\n[cdp_publish] NOT LOGGED IN.\n"
|
||||
" Please scan the QR code in the Chrome window to log in,\n"
|
||||
" then run this script again.\n"
|
||||
)
|
||||
return False
|
||||
|
||||
print("[cdp_publish] Login confirmed.")
|
||||
return True
|
||||
|
||||
    def clear_cookies(self, domain: str = ".xiaohongshu.com"):
        """
        Clear cookies and site storage to force re-login.

        Note: Network.clearBrowserCookies clears every cookie in this browser
        profile, not only those for `domain`; `domain` is used for logging only.

        Used when switching accounts.
        """
        print(f"[cdp_publish] Clearing cookies for {domain}...")
        self._send("Network.enable")
        self._send("Network.clearBrowserCookies")
        # Also clear storage
        self._send("Storage.clearDataForOrigin", {
            "origin": "https://www.xiaohongshu.com",
            "storageTypes": "cookies,local_storage,session_storage",
        })
        self._send("Storage.clearDataForOrigin", {
            "origin": "https://creator.xiaohongshu.com",
            "storageTypes": "cookies,local_storage,session_storage",
        })
        print("[cdp_publish] Cookies and storage cleared.")

    def open_login_page(self):
        """
        Navigate to the Xiaohongshu login page for QR code scanning.

        Used for initial login or after clearing cookies for account switch.
        """
        self._navigate(XHS_LOGIN_CHECK_URL)
        time.sleep(2)
        current_url = self._evaluate("window.location.href")
        # Guard against a missing URL so .lower() is never called on None
        if "login" not in (current_url or "").lower():
            # Already logged in, navigate to login page explicitly
            self._navigate("https://creator.xiaohongshu.com/login")
            time.sleep(2)
        print(
            "\n[cdp_publish] Login page is open.\n"
            " Please scan the QR code in the Chrome window to log in.\n"
        )

    # ------------------------------------------------------------------
    # Publishing actions
    # ------------------------------------------------------------------

    def _click_image_text_tab(self):
        """Click the '上传图文' tab to switch to image+text publish mode."""
        print("[cdp_publish] Clicking '上传图文' tab...")
        tab_text = SELECTORS["image_text_tab_text"]
        selector = SELECTORS["image_text_tab"]

        # json.dumps yields valid JS string literals, so quotes inside the
        # selector or tab text cannot break the injected script.
        clicked = self._evaluate(f"""
            (function() {{
                var tabs = document.querySelectorAll({json.dumps(selector)});
                for (var i = 0; i < tabs.length; i++) {{
                    if (tabs[i].textContent.trim() === {json.dumps(tab_text)}) {{
                        tabs[i].click();
                        return true;
                    }}
                }}
                return false;
            }})();
        """)

        if not clicked:
            raise CDPError(
                f"Could not find '{tab_text}' tab. "
                "The page structure may have changed."
            )

        print("[cdp_publish] Tab clicked, waiting for upload area...")
        time.sleep(TAB_CLICK_WAIT)

    def _upload_images(self, image_paths: list[str]):
        """Upload images via the file input element."""
        if not image_paths:
            print("[cdp_publish] No images to upload, skipping.")
            return

        # Normalize paths (forward slashes for CDP)
        normalized = [p.replace("\\", "/") for p in image_paths]

        print(f"[cdp_publish] Uploading {len(image_paths)} image(s)...")

        # Enable DOM domain
        self._send("DOM.enable")

        # Get the document root
        doc = self._send("DOM.getDocument")
        root_id = doc["root"]["nodeId"]

        # Try primary selector, then fallback
        node_id = 0
        for selector in (SELECTORS["upload_input"], SELECTORS["upload_input_alt"]):
            result = self._send("DOM.querySelector", {
                "nodeId": root_id,
                "selector": selector,
            })
            node_id = result.get("nodeId", 0)
            if node_id:
                break

        if not node_id:
            raise CDPError(
                "Could not find file input element.\n"
                "The page structure may have changed. Check references/publish-workflow.md."
            )

        # Use DOM.setFileInputFiles to set the files
        self._send("DOM.setFileInputFiles", {
            "nodeId": node_id,
            "files": normalized,
        })

        print("[cdp_publish] Images uploaded. Waiting for editor to appear...")
        time.sleep(UPLOAD_WAIT)

    def _fill_title(self, title: str):
        """Fill in the article title."""
        print(f"[cdp_publish] Setting title: {title[:40]}...")
        time.sleep(ACTION_INTERVAL)

        for selector in (SELECTORS["title_input"], SELECTORS["title_input_alt"]):
            found = self._evaluate(f"!!document.querySelector('{selector}')")
            if found:
                escaped_title = json.dumps(title)
                self._evaluate(f"""
                    (function() {{
                        var el = document.querySelector('{selector}');
                        var nativeSetter = Object.getOwnPropertyDescriptor(
                            window.HTMLInputElement.prototype, 'value'
                        ).set;
                        el.focus();
                        nativeSetter.call(el, {escaped_title});
                        el.dispatchEvent(new Event('input', {{ bubbles: true }}));
                        el.dispatchEvent(new Event('change', {{ bubbles: true }}));
                    }})();
                """)
                print("[cdp_publish] Title set.")
                return

        raise CDPError("Could not find title input element.")

    def _fill_content(self, content: str):
        """Fill in the article body content using the TipTap/ProseMirror editor."""
        print(f"[cdp_publish] Setting content ({len(content)} chars)...")
        time.sleep(ACTION_INTERVAL)

        for selector in (SELECTORS["content_editor"], SELECTORS["content_editor_alt"]):
            found = self._evaluate(f"!!document.querySelector('{selector}')")
            if found:
                escaped = json.dumps(content)
                self._evaluate(f"""
                    (function() {{
                        var el = document.querySelector('{selector}');
                        el.focus();
                        var text = {escaped};
                        // Escape HTML metacharacters so the text is not parsed as markup
                        var escapeHtml = function(s) {{
                            return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
                        }};
                        var paragraphs = text.split('\\n').filter(function(p) {{ return p.trim(); }});
                        var html = [];
                        for (var i = 0; i < paragraphs.length; i++) {{
                            html.push('<p>' + escapeHtml(paragraphs[i]) + '</p>');
                            if (i < paragraphs.length - 1) {{
                                html.push('<p><br></p>');
                            }}
                        }}
                        el.innerHTML = html.join('');
                        el.dispatchEvent(new Event('input', {{ bubbles: true }}));
                    }})();
                """)
                print("[cdp_publish] Content set.")
                return

        raise CDPError("Could not find content editor element.")

    def _click_publish(self):
        """Click the publish button (found by text content)."""
        print("[cdp_publish] Clicking publish button...")
        time.sleep(ACTION_INTERVAL)

        btn_text = SELECTORS["publish_button_text"]
        clicked = self._evaluate(f"""
            (function() {{
                // Strategy 1: search <button> elements by text
                var buttons = document.querySelectorAll('button');
                for (var i = 0; i < buttons.length; i++) {{
                    var t = buttons[i].textContent.trim();
                    if (t === '{btn_text}') {{
                        buttons[i].click();
                        return true;
                    }}
                }}
                // Strategy 2: search d-button-content / d-text spans
                var spans = document.querySelectorAll('.d-button-content .d-text, .d-button-content span');
                for (var i = 0; i < spans.length; i++) {{
                    if (spans[i].textContent.trim() === '{btn_text}') {{
                        var el = spans[i].closest('button, [role="button"], .d-button, [class*="btn"], [class*="button"]');
                        if (!el) el = spans[i].closest('.d-button-content');
                        if (!el) el = spans[i];
                        el.click();
                        return true;
                    }}
                }}
                return false;
            }})();
        """)

        if clicked:
            print("[cdp_publish] Publish button clicked.")
        else:
            raise CDPError(
                "Could not find publish button. "
                "Please click it manually in the browser."
            )

    # ------------------------------------------------------------------
    # Main publish workflow
    # ------------------------------------------------------------------

    def publish(
        self,
        title: str,
        content: str,
        image_paths: list[str] | None = None,
    ):
        """
        Execute the full publish workflow:
          1. Navigate to creator publish page
          2. Click '上传图文' tab
          3. Upload images (this triggers the editor to appear)
          4. Fill title
          5. Fill content

        Args:
            title: Article title
            content: Article body text (paragraphs separated by newlines)
            image_paths: List of local file paths to images to upload
        """
        if not self.ws:
            raise CDPError("Not connected. Call connect() first.")

        if not image_paths:
            raise CDPError("At least one image is required to publish on Xiaohongshu.")

        # Step 1: Navigate to publish page
        self._navigate(XHS_CREATOR_URL)
        time.sleep(2)

        # Step 2: Click '上传图文' tab
        self._click_image_text_tab()

        # Step 3: Upload images (editor appears after upload)
        self._upload_images(image_paths)

        # Step 4: Fill title
        self._fill_title(title)

        # Step 5: Fill content
        self._fill_content(content)

        print(
            "\n[cdp_publish] Content has been filled in.\n"
            " Please review in the browser before publishing.\n"
        )


# ---------------------------------------------------------------------------
# CLI entry point
# ---------------------------------------------------------------------------

def main():
    import argparse
    from chrome_launcher import ensure_chrome, restart_chrome

    parser = argparse.ArgumentParser(description="Xiaohongshu CDP Publisher")
    parser.add_argument("--headless", action="store_true",
                        help="Use headless Chrome (no GUI window)")
    parser.add_argument("--account", help="Account name to use (default: default account)")
    sub = parser.add_subparsers(dest="command", required=True)

    # check-login
    sub.add_parser("check-login", help="Check login status (exit 0=logged in, 1=not)")

    # fill - fill form without clicking publish
    p_fill = sub.add_parser("fill", help="Fill title/content/images without publishing")
    p_fill.add_argument("--title", required=True)
    p_fill.add_argument("--content", default=None)
    p_fill.add_argument("--content-file", default=None, help="Read content from file")
    p_fill.add_argument("--images", nargs="+", required=True)

    # publish - fill form and click publish
    p_pub = sub.add_parser("publish", help="Fill form and click publish")
    p_pub.add_argument("--title", required=True)
    p_pub.add_argument("--content", default=None)
    p_pub.add_argument("--content-file", default=None, help="Read content from file")
    p_pub.add_argument("--images", nargs="+", required=True)

    # click-publish - just click the publish button on current page
    sub.add_parser("click-publish", help="Click publish button on already-filled page")

    # login - open browser for QR code login (always headed)
    sub.add_parser("login", help="Open browser for QR code login (always headed mode)")

    # re-login - clear cookies and re-login the same account (always headed)
    sub.add_parser("re-login", help="Clear cookies and re-login same account (always headed)")

    # switch-account - clear cookies and open login page (always headed)
    sub.add_parser("switch-account",
                   help="Clear cookies and open login page for new account (always headed)")

    # list-accounts - list all configured accounts
    sub.add_parser("list-accounts", help="List all configured accounts")

    # add-account - add a new account
    p_add = sub.add_parser("add-account", help="Add a new account")
    p_add.add_argument("name", help="Account name (unique identifier)")
    p_add.add_argument("--alias", help="Display name / description")

    # remove-account - remove an account
    p_rm = sub.add_parser("remove-account", help="Remove an account")
    p_rm.add_argument("name", help="Account name to remove")
    p_rm.add_argument("--delete-profile", action="store_true",
                      help="Also delete the Chrome profile directory")

    # set-default-account - set default account
    p_def = sub.add_parser("set-default-account", help="Set the default account")
    p_def.add_argument("name", help="Account name to set as default")

    args = parser.parse_args()
    headless = args.headless
    account = args.account

    # Account management commands that don't need Chrome
    if args.command == "list-accounts":
        from account_manager import list_accounts
        accounts = list_accounts()
        if not accounts:
            print("No accounts configured.")
            return
        print(f"{'Name':<20} {'Alias':<25} {'Default':<10}")
        print("-" * 55)
        for acc in accounts:
            default_mark = "*" if acc["is_default"] else ""
            print(f"{acc['name']:<20} {acc['alias']:<25} {default_mark:<10}")
        return

    elif args.command == "add-account":
        from account_manager import add_account, get_profile_dir
        if add_account(args.name, args.alias):
            print(f"Account '{args.name}' added.")
            print(f"Profile dir: {get_profile_dir(args.name)}")
            print("\nTo log in to this account, run:")
            print(f" python cdp_publish.py --account {args.name} login")
        else:
            print(f"Error: Account '{args.name}' already exists.", file=sys.stderr)
            sys.exit(1)
        return

    elif args.command == "remove-account":
        from account_manager import remove_account
        if remove_account(args.name, args.delete_profile):
            print(f"Account '{args.name}' removed.")
        else:
            print(f"Error: Cannot remove account '{args.name}'.", file=sys.stderr)
            sys.exit(1)
        return

    elif args.command == "set-default-account":
        from account_manager import set_default_account
        if set_default_account(args.name):
            print(f"Default account set to '{args.name}'.")
        else:
            print(f"Error: Account '{args.name}' not found.", file=sys.stderr)
            sys.exit(1)
        return

    # Commands that require Chrome - login/re-login/switch-account always headed
    if args.command in ("login", "re-login", "switch-account"):
        headless = False

    if not ensure_chrome(headless=headless, account=account):
        print("Failed to start Chrome. Exiting.")
        sys.exit(1)

    publisher = XiaohongshuPublisher()
    try:
        if args.command == "check-login":
            publisher.connect()
            logged_in = publisher.check_login()
            if not logged_in and headless:
                print(
                    "[cdp_publish] Headless mode: cannot scan QR code.\n"
                    " Run with 'login' command or without --headless to log in."
                )
            sys.exit(0 if logged_in else 1)

        elif args.command in ("fill", "publish"):
            content = args.content
            if args.content_file:
                with open(args.content_file, encoding="utf-8") as f:
                    content = f.read().strip()
            if not content:
                print("Error: --content or --content-file required.", file=sys.stderr)
                sys.exit(1)

            publisher.connect()
            publisher.publish(title=args.title, content=content, image_paths=args.images)
            print("FILL_STATUS: READY_TO_PUBLISH")

            if args.command == "publish":
                publisher._click_publish()
                print("PUBLISH_STATUS: PUBLISHED")

        elif args.command == "click-publish":
            publisher.connect(target_url_prefix="https://creator.xiaohongshu.com/publish")
            publisher._click_publish()
            print("PUBLISH_STATUS: PUBLISHED")

        elif args.command == "login":
            # Ensure headed mode for QR scanning
            restart_chrome(headless=False, account=account)
            publisher.connect()
            publisher.open_login_page()
            print("LOGIN_READY")

        elif args.command == "re-login":
            # Ensure headed mode, clear cookies, re-open login page for same account
            restart_chrome(headless=False, account=account)
            publisher.connect()
            publisher.clear_cookies()
            time.sleep(1)
            publisher.open_login_page()
            print("RE_LOGIN_READY")

        elif args.command == "switch-account":
            # Ensure headed mode, clear cookies, open login page
            restart_chrome(headless=False, account=account)
            publisher.connect()
            publisher.clear_cookies()
            time.sleep(1)
            publisher.open_login_page()
            print("SWITCH_ACCOUNT_READY")

    finally:
        publisher.disconnect()


if __name__ == "__main__":
    main()
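
Note on the content-filling step: `_fill_content` turns newline-separated text into `<p>` elements inside an injected script. The sketch below shows one way the same HTML could be built on the Python side, with the text escaped before it reaches `innerHTML`. The helper names `paragraphs_to_html` and `build_set_html_script` are hypothetical and not part of this commit; it is an illustration under those assumptions, not the script's actual implementation.

```python
import html
import json


def paragraphs_to_html(text: str) -> str:
    """Convert newline-separated text into escaped <p> blocks with spacer paragraphs."""
    # Drop empty lines, mirroring the filter used in the injected JS
    paragraphs = [p for p in text.split("\n") if p.strip()]
    parts = []
    for i, para in enumerate(paragraphs):
        # html.escape replaces &, < and > so the text cannot be parsed as markup
        parts.append(f"<p>{html.escape(para, quote=False)}</p>")
        if i < len(paragraphs) - 1:
            parts.append("<p><br></p>")
    return "".join(parts)


def build_set_html_script(selector: str, text: str) -> str:
    """Return a JS snippet that assigns the generated HTML to the matched element."""
    # json.dumps yields valid JS string literals for both values
    return (
        "(function() {"
        f"var el = document.querySelector({json.dumps(selector)});"
        f"el.innerHTML = {json.dumps(paragraphs_to_html(text))};"
        "el.dispatchEvent(new Event('input', { bubbles: true }));"
        "})();"
    )
```

Building the markup in Python keeps the injected script short and makes the escaping easy to unit-test without a browser.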
296
skills/post-to-xhs/scripts/chrome_launcher.py
Normal file
@@ -0,0 +1,296 @@
"""
Chrome launcher with CDP remote debugging support.

Manages a dedicated Chrome instance for Xiaohongshu publishing:
- Detects if Chrome is already listening on the debug port
- Launches Chrome with a dedicated user-data-dir for login persistence
- Waits for the debug port to become available
- Supports headless mode for automated publishing without GUI
- Supports switching between headless and headed mode (e.g. for login)
- Supports multiple accounts with separate profile directories
"""

import os
import sys
import time
import socket
import subprocess
import platform
import signal
from typing import Optional

CDP_PORT = 9222
PROFILE_DIR_NAME = "XiaohongshuProfile"
STARTUP_TIMEOUT = 15  # seconds to wait for Chrome to start

# Track the Chrome process we launched so we can kill it later
_chrome_process: subprocess.Popen | None = None
# Track the current account being used
_current_account: Optional[str] = None


def get_chrome_path() -> str:
    """Find Chrome executable on Windows."""
    candidates = []

    # Standard install locations
    for env_var in ("PROGRAMFILES", "PROGRAMFILES(X86)", "LOCALAPPDATA"):
        base = os.environ.get(env_var, "")
        if base:
            candidates.append(os.path.join(base, "Google", "Chrome", "Application", "chrome.exe"))

    for path in candidates:
        if os.path.isfile(path):
            return path

    # Fallback: check PATH
    import shutil
    found = shutil.which("chrome") or shutil.which("chrome.exe")
    if found:
        return found

    raise FileNotFoundError(
        "Chrome not found. Please install Google Chrome or set its path manually."
    )


def get_user_data_dir(account: Optional[str] = None) -> str:
    """
    Return the Chrome profile directory path for a given account.

    Args:
        account: Account name. If None, uses the default account from account_manager.

    Returns:
        Path to the Chrome user-data-dir for this account.
    """
    try:
        from account_manager import get_profile_dir
        return get_profile_dir(account)
    except ImportError:
        # Fallback if account_manager not available
        local_app_data = os.environ.get("LOCALAPPDATA", "")
        if not local_app_data:
            local_app_data = os.path.expanduser("~")
        return os.path.join(local_app_data, "Google", "Chrome", PROFILE_DIR_NAME)


def is_port_open(port: int, host: str = "127.0.0.1") -> bool:
    """Check if a TCP port is accepting connections."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        try:
            s.connect((host, port))
            return True
        except (ConnectionRefusedError, socket.timeout, OSError):
            return False


def launch_chrome(port: int = CDP_PORT, headless: bool = False, account: Optional[str] = None) -> subprocess.Popen | None:
    """
    Launch Chrome with remote debugging enabled.

    Args:
        port: CDP remote debugging port.
        headless: If True, launch Chrome in headless mode (no GUI window).
        account: Account name to use. If None, uses the default account.

    Returns the Popen object if a new process was started, or None if Chrome
    was already running on the target port.
    """
    global _chrome_process, _current_account

    if is_port_open(port):
        print(f"[chrome_launcher] Chrome already running on port {port}.")
        return None

    chrome_path = get_chrome_path()
    user_data_dir = get_user_data_dir(account)
    _current_account = account

    cmd = [
        chrome_path,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={user_data_dir}",
        "--no-first-run",
        "--no-default-browser-check",
    ]

    if headless:
        cmd.append("--headless=new")

    mode_label = "headless" if headless else "headed"
    account_label = account or "default"
    print(f"[chrome_launcher] Launching Chrome ({mode_label}, account: {account_label})...")
    print(f" executable : {chrome_path}")
    print(f" profile dir: {user_data_dir}")
    print(f" debug port : {port}")

    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    _chrome_process = proc

    # Wait for the debug port to become available
    deadline = time.time() + STARTUP_TIMEOUT
    while time.time() < deadline:
        if is_port_open(port):
            print(f"[chrome_launcher] Chrome is ready on port {port}.")
            return proc
        time.sleep(0.5)

    print(
        f"[chrome_launcher] WARNING: Chrome started but port {port} not responding "
        f"after {STARTUP_TIMEOUT}s. It may still be initializing.",
        file=sys.stderr,
    )
    return proc


def kill_chrome(port: int = CDP_PORT):
    """
    Kill the Chrome instance on the given debug port.

    Tries multiple strategies:
    1. Send CDP Browser.close over the browser WebSocket (URL looked up via HTTP)
    2. Terminate the tracked subprocess
    3. Kill by port on Windows (taskkill)
    """
    global _chrome_process

    # Strategy 1: CDP Browser.close
    try:
        import requests
        resp = requests.get(f"http://127.0.0.1:{port}/json/version", timeout=2)
        if resp.ok:
            ws_url = resp.json().get("webSocketDebuggerUrl")
            if ws_url:
                import websockets.sync.client as ws_client
                ws = ws_client.connect(ws_url)
                ws.send('{"id":1,"method":"Browser.close"}')
                try:
                    ws.recv(timeout=2)
                except Exception:
                    pass
                ws.close()
                print("[chrome_launcher] Sent Browser.close via CDP.")
    except Exception:
        pass

    # Wait briefly for Chrome to shut down
    time.sleep(1)

    # Strategy 2: Terminate tracked subprocess
    if _chrome_process and _chrome_process.poll() is None:
        try:
            _chrome_process.terminate()
            _chrome_process.wait(timeout=5)
            print("[chrome_launcher] Terminated tracked Chrome process.")
        except Exception:
            try:
                _chrome_process.kill()
            except Exception:
                pass
    _chrome_process = None

    # Strategy 3: Windows taskkill by port (fallback)
    if sys.platform == "win32" and is_port_open(port):
        try:
            result = subprocess.run(
                ["netstat", "-ano"],
                capture_output=True, text=True, timeout=5
            )
            for line in result.stdout.splitlines():
                parts = line.split()
                # Match the local address column exactly so that, for example,
                # port 9222 does not also match 92220 or a foreign address.
                if (len(parts) >= 5 and parts[1].endswith(f":{port}")
                        and "LISTENING" in line):
                    pid = parts[-1]
                    subprocess.run(
                        ["taskkill", "/F", "/PID", pid],
                        capture_output=True, timeout=5
                    )
                    print(f"[chrome_launcher] Killed process {pid} via taskkill.")
                    break
        except Exception:
            pass

    # Wait for port to be released
    deadline = time.time() + 5
    while time.time() < deadline:
        if not is_port_open(port):
            return
        time.sleep(0.5)

    if is_port_open(port):
        print(f"[chrome_launcher] WARNING: port {port} still open after kill attempt.",
              file=sys.stderr)


def restart_chrome(port: int = CDP_PORT, headless: bool = False, account: Optional[str] = None) -> subprocess.Popen | None:
    """
    Kill the current Chrome instance and relaunch with the specified mode.

    Useful for switching between headless and headed mode (e.g. when login
    is needed during a headless session), or switching accounts.

    Args:
        port: CDP remote debugging port.
        headless: If True, relaunch in headless mode.
        account: Account name to use. If None, uses the default account.

    Returns the Popen object for the new Chrome process.
    """
    account_label = account or "default"
    print(f"[chrome_launcher] Restarting Chrome ({'headless' if headless else 'headed'}, account: {account_label})...")
    kill_chrome(port)
    time.sleep(1)
    return launch_chrome(port, headless=headless, account=account)


def ensure_chrome(port: int = CDP_PORT, headless: bool = False, account: Optional[str] = None) -> bool:
    """
    Ensure Chrome is running with remote debugging on the given port.

    Args:
        port: CDP remote debugging port.
        headless: If True, launch in headless mode when starting a new instance.
            If Chrome is already running, this parameter is ignored.
        account: Account name to use. If None, uses the default account.

    Returns True if Chrome is available, False otherwise.
    """
    if is_port_open(port):
        return True
    try:
        launch_chrome(port, headless=headless, account=account)
        return is_port_open(port)
    except FileNotFoundError as e:
        print(f"[chrome_launcher] Error: {e}", file=sys.stderr)
        return False


def get_current_account() -> Optional[str]:
    """Get the name of the currently active account."""
    return _current_account


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description="Chrome Launcher for CDP")
    parser.add_argument("--headless", action="store_true", help="Launch in headless mode")
    parser.add_argument("--kill", action="store_true", help="Kill the running Chrome instance")
    parser.add_argument("--restart", action="store_true", help="Restart Chrome")
    parser.add_argument("--account", help="Account name to use (default: default account)")
    args = parser.parse_args()

    if args.kill:
        kill_chrome()
        print("[chrome_launcher] Chrome killed.")
    elif args.restart:
        restart_chrome(headless=args.headless, account=args.account)
        print("[chrome_launcher] Chrome restarted.")
    elif ensure_chrome(headless=args.headless, account=args.account):
        print("[chrome_launcher] Chrome is ready for CDP connections.")
    else:
        print("[chrome_launcher] Failed to start Chrome.", file=sys.stderr)
        sys.exit(1)
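
Note on the startup wait: `launch_chrome` polls the debug port until a deadline computed from `time.time()`. The same pattern can be factored into a small reusable helper. The sketch below is one possible shape, assuming a hypothetical function name `wait_for_port` and using `time.monotonic()` so that system clock adjustments do not shorten or lengthen the wait; it is not part of this commit.

```python
import socket
import time


def wait_for_port(port: int, host: str = "127.0.0.1",
                  timeout: float = 15.0, interval: float = 0.5) -> bool:
    """Poll until a TCP port accepts connections or the timeout elapses."""
    # The monotonic clock is unaffected by wall-clock changes
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if nothing is listening
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller could then write `if not wait_for_port(9222): ...` in place of an inline loop, which keeps the launch and shutdown paths symmetrical.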
5
skills/post-to-xhs/scripts/content.txt
Normal file
@@ -0,0 +1,5 @@
微软向 Canary 通道推送了 Windows 11 Insider Preview Build 28020.1546 更新,补丁编号 KB5074176。

本次更新为常规改进与修复,属于小幅迭代更新,没有重大功能变化。

Canary 通道是 Windows Insider 最前沿的测试分支,适合愿意尝鲜和接受不稳定性的用户。
141
skills/post-to-xhs/scripts/image_downloader.py
Normal file
@@ -0,0 +1,141 @@
"""
Image downloader for Xiaohongshu publishing.

Downloads images from URLs to a local temp directory for upload,
and cleans up after publishing is complete.
"""

import os
import sys
import tempfile
import shutil
import uuid
from urllib.parse import urlparse, unquote

import requests

DEFAULT_TIMEOUT = 30  # seconds per download
TEMP_DIR_PREFIX = "xhs_images_"


class ImageDownloader:
    """Download images from URLs and manage a temporary directory for them."""

    def __init__(self, temp_dir: str | None = None):
        if temp_dir:
            self.temp_dir = temp_dir
            os.makedirs(self.temp_dir, exist_ok=True)
            self._owns_dir = False
        else:
            self.temp_dir = tempfile.mkdtemp(prefix=TEMP_DIR_PREFIX)
            self._owns_dir = True
        self.downloaded_files: list[str] = []

    def _guess_extension(self, url: str, content_type: str | None) -> str:
        """Guess file extension from URL path or Content-Type header."""
        # Try URL path first
        path = urlparse(url).path
        _, ext = os.path.splitext(unquote(path))
        if ext and ext.lower() in (".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"):
            return ext.lower()

        # Fall back to Content-Type
        ct_map = {
            "image/jpeg": ".jpg",
            "image/png": ".png",
            "image/gif": ".gif",
            "image/webp": ".webp",
            "image/bmp": ".bmp",
        }
        if content_type:
            for mime, ext in ct_map.items():
                if mime in content_type:
                    return ext

        return ".jpg"  # safe default

    def download(self, url: str, referer: str | None = None) -> str:
        """
        Download a single image and return the local file path.

        Args:
            url: Image URL to download
            referer: Optional Referer header. If None, auto-generates from URL domain.

        Raises requests.RequestException on network errors.
        """
        # Build headers with Referer to bypass hotlink protection
        parsed = urlparse(url)
        if referer is None:
            referer = f"{parsed.scheme}://{parsed.netloc}/"

        headers = {
            "Referer": referer,
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        }

        resp = requests.get(url, timeout=DEFAULT_TIMEOUT, stream=True, headers=headers)
        resp.raise_for_status()

        ext = self._guess_extension(url, resp.headers.get("Content-Type"))
        filename = f"{uuid.uuid4().hex[:12]}{ext}"
        filepath = os.path.join(self.temp_dir, filename)

        with open(filepath, "wb") as f:
            for chunk in resp.iter_content(chunk_size=8192):
                f.write(chunk)

        self.downloaded_files.append(filepath)
        print(f"[image_downloader] Downloaded: {url}")
        print(f" -> {filepath} ({os.path.getsize(filepath)} bytes)")
        return filepath

    def download_all(self, urls: list[str]) -> list[str]:
        """
        Download multiple images. Returns list of local file paths.

        Skips URLs that fail to download (logs the error, continues).
        """
        paths = []
        for url in urls:
            try:
                path = self.download(url)
                paths.append(path)
            except Exception as e:
                print(f"[image_downloader] Failed to download {url}: {e}", file=sys.stderr)
        return paths

    def cleanup(self):
        """Remove all downloaded files and the temp directory."""
        if self._owns_dir and os.path.isdir(self.temp_dir):
            shutil.rmtree(self.temp_dir, ignore_errors=True)
            print(f"[image_downloader] Cleaned up temp dir: {self.temp_dir}")
        else:
            for f in self.downloaded_files:
                try:
                    os.remove(f)
                except OSError:
                    pass
            print(f"[image_downloader] Cleaned up {len(self.downloaded_files)} files.")
        self.downloaded_files.clear()

    def __enter__(self):
        return self

    def __exit__(self, *_):
        self.cleanup()


if __name__ == "__main__":
    # Quick test: download URLs passed as command-line arguments
    if len(sys.argv) < 2:
        print("Usage: python image_downloader.py <url1> [url2] ...")
        sys.exit(1)

    dl = ImageDownloader()
    paths = dl.download_all(sys.argv[1:])
    print(f"\nDownloaded {len(paths)} image(s):")
    for p in paths:
        print(f" {p}")
    print(f"Temp dir: {dl.temp_dir}")
    print("Files will remain until manually cleaned up.")
|
||||
213
skills/post-to-xhs/scripts/publish_pipeline.py
Normal file
@@ -0,0 +1,213 @@
"""
Unified publish pipeline for Xiaohongshu.

Single CLI entry point that orchestrates:
    chrome_launcher → login check → image download → form fill → (optional) publish

Usage:
    # Fill form only (default) - review in browser before publishing
    python publish_pipeline.py --title "标题" --content "正文" --image-urls URL1 URL2
    python publish_pipeline.py --title-file t.txt --content-file body.txt --image-urls URL1

    # Headless mode (no GUI window) - faster for automated publishing
    python publish_pipeline.py --headless --title-file t.txt --content-file body.txt --image-urls URL1

    # Publish to a specific account
    python publish_pipeline.py --account myaccount --title "标题" --content "正文" --image-urls URL1

    # Fill and auto-publish in one step
    python publish_pipeline.py --title "标题" --content "正文" --image-urls URL1 --auto-publish

    # Use local image files instead of URLs
    python publish_pipeline.py --title "标题" --content "正文" --images img1.jpg img2.jpg

Exit codes:
    0 = success (READY_TO_PUBLISH or PUBLISHED)
    1 = not logged in (NOT_LOGGED_IN) - headless auto-fallback will restart headed
    2 = error (see stderr)
"""

import argparse
import os
import sys

# Ensure UTF-8 output on Windows consoles
if sys.platform == "win32":
    os.environ.setdefault("PYTHONIOENCODING", "utf-8")
    try:
        sys.stdout.reconfigure(encoding="utf-8", errors="replace")
        sys.stderr.reconfigure(encoding="utf-8", errors="replace")
    except Exception:
        pass

# Add scripts dir to path so sibling modules can be imported
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
if SCRIPT_DIR not in sys.path:
    sys.path.insert(0, SCRIPT_DIR)

from chrome_launcher import ensure_chrome, restart_chrome
from cdp_publish import XiaohongshuPublisher, CDPError
from image_downloader import ImageDownloader


def main():
    parser = argparse.ArgumentParser(
        description="Xiaohongshu publish pipeline - unified entry point"
    )

    # Title
    title_group = parser.add_mutually_exclusive_group(required=True)
    title_group.add_argument("--title", help="Article title text")
    title_group.add_argument("--title-file", help="Read title from UTF-8 file")

    # Content
    content_group = parser.add_mutually_exclusive_group(required=True)
    content_group.add_argument("--content", help="Article body text")
    content_group.add_argument("--content-file", help="Read content from UTF-8 file")

    # Images
    img_group = parser.add_mutually_exclusive_group(required=True)
    img_group.add_argument(
        "--image-urls", nargs="+", help="Image URLs to download"
    )
    img_group.add_argument(
        "--images", nargs="+", help="Local image file paths"
    )

    # Publish mode
    parser.add_argument(
        "--auto-publish",
        action="store_true",
        default=False,
        help="Click publish button after filling (default: fill only)",
    )

    # Headless mode
    parser.add_argument(
        "--headless",
        action="store_true",
        default=False,
        help="Run Chrome in headless mode (no GUI). Auto-falls back to headed if login is needed.",
    )

    # Optional temp dir for downloaded images
    parser.add_argument(
        "--temp-dir",
        default=None,
        help="Directory for downloaded images (default: auto-created temp dir)",
    )

    # Account selection
    parser.add_argument(
        "--account",
        default=None,
        help="Account name to publish to (default: default account)",
    )

    args = parser.parse_args()
    headless = args.headless
    account = args.account

    # --- Resolve title ---
    if args.title_file:
        with open(args.title_file, encoding="utf-8") as f:
            title = f.read().strip()
    else:
        title = args.title

    if not title:
        print("Error: title is empty.", file=sys.stderr)
        sys.exit(2)

    # --- Resolve content ---
    if args.content_file:
        with open(args.content_file, encoding="utf-8") as f:
            content = f.read().strip()
    else:
        content = args.content

    if not content:
        print("Error: content is empty.", file=sys.stderr)
        sys.exit(2)

    # --- Step 1: Ensure Chrome is running ---
    mode_label = "headless" if headless else "headed"
    account_label = account or "default"
    print(f"[pipeline] Step 1: Ensuring Chrome is running ({mode_label}, account: {account_label})...")
    if not ensure_chrome(headless=headless, account=account):
        print("Error: Failed to start Chrome.", file=sys.stderr)
        sys.exit(2)

    # --- Step 2: Connect and check login ---
    print("[pipeline] Step 2: Checking login status...")
    publisher = XiaohongshuPublisher()
    try:
        publisher.connect()
        logged_in = publisher.check_login()
        if not logged_in:
            publisher.disconnect()
            if headless:
                # Auto-fallback: restart Chrome in headed mode for QR login
                print("[pipeline] Headless mode: not logged in. Switching to headed mode for login...")
                restart_chrome(headless=False, account=account)
                publisher.connect()
                publisher.open_login_page()
            print("NOT_LOGGED_IN")
            sys.exit(1)
    except CDPError as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(2)

    # --- Step 3: Prepare images ---
    image_paths = []
    downloader = None

    if args.image_urls:
        print(f"[pipeline] Step 3: Downloading {len(args.image_urls)} image(s)...")
        downloader = ImageDownloader(temp_dir=args.temp_dir)
        image_paths = downloader.download_all(args.image_urls)
        if not image_paths:
            print("Error: All image downloads failed.", file=sys.stderr)
            sys.exit(2)
    else:
        image_paths = args.images
        # Verify local files exist
        for p in image_paths:
            if not os.path.isfile(p):
                print(f"Error: Image file not found: {p}", file=sys.stderr)
                sys.exit(2)
        print(f"[pipeline] Step 3: Using {len(image_paths)} local image(s).")

    # --- Step 4: Fill form ---
    print("[pipeline] Step 4: Filling form...")
    try:
        publisher.publish(title=title, content=content, image_paths=image_paths)
        print("FILL_STATUS: READY_TO_PUBLISH")
    except CDPError as e:
        print(f"Error during form fill: {e}", file=sys.stderr)
        if downloader:
            downloader.cleanup()
        sys.exit(2)

    # --- Step 5: Publish (optional) ---
    if args.auto_publish:
        print("[pipeline] Step 5: Clicking publish button...")
        try:
            publisher._click_publish()
            print("PUBLISH_STATUS: PUBLISHED")
        except CDPError as e:
            print(f"Error clicking publish: {e}", file=sys.stderr)
            if downloader:
                downloader.cleanup()
            sys.exit(2)

    # --- Cleanup ---
    publisher.disconnect()
    if downloader:
        downloader.cleanup()

    print("[pipeline] Done.")


if __name__ == "__main__":
    main()
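The exit codes and status lines documented in `publish_pipeline.py` lend themselves to a thin wrapper for callers that drive the script from another process. The following is a minimal sketch, not part of the commit: `classify_result` and `run_pipeline` are hypothetical names, and the sketch assumes the script is invoked from the directory that contains it. The status strings and exit codes are taken from the script as shown above.

```python
import subprocess
import sys


def classify_result(returncode: int, stdout: str) -> str:
    """Map the pipeline's exit code and stdout markers to a short state name.

    Hypothetical helper: the state names returned here are illustrative only.
    """
    if returncode == 0:
        # With --auto-publish both markers are printed, so check PUBLISHED first.
        if "PUBLISH_STATUS: PUBLISHED" in stdout:
            return "published"
        if "FILL_STATUS: READY_TO_PUBLISH" in stdout:
            return "ready_to_publish"
        return "unknown_success"
    if returncode == 1:
        # The script prints NOT_LOGGED_IN and exits with code 1.
        return "not_logged_in"
    # Exit code 2 (or anything else): details are on stderr.
    return "error"


def run_pipeline(cli_args: list[str]) -> str:
    """Run publish_pipeline.py with the given CLI arguments and classify the outcome."""
    proc = subprocess.run(
        [sys.executable, "publish_pipeline.py", *cli_args],
        capture_output=True,
        text=True,
        encoding="utf-8",
    )
    return classify_result(proc.returncode, proc.stdout)
```

A caller could, for example, prompt the user to scan the QR code when the result is `not_logged_in` and then retry the same command.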
1
skills/post-to-xhs/scripts/title.txt
Normal file
@@ -0,0 +1 @@
Win11 Build 28020 Canary通道更新
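The title above can be checked against the length rule stated in SKILL.md (limit of 38, with Chinese characters and Chinese punctuation counting 2 and ASCII characters counting 1). The sketch below is illustrative and not part of the commit: `title_weight` and `MAX_TITLE_WEIGHT` are hypothetical names, and treating every non-ASCII character as weight 2 is a simplifying assumption.

```python
MAX_TITLE_WEIGHT = 38  # limit stated in SKILL.md


def title_weight(title: str) -> int:
    """Return the weighted length of a title.

    ASCII characters count 1. Every other character counts 2, which is an
    assumption that covers Chinese characters and full-width punctuation.
    """
    return sum(1 if ord(ch) < 128 else 2 for ch in title)


def title_fits(title: str) -> bool:
    """True if the title is within the weighted length limit."""
    return title_weight(title) <= MAX_TITLE_WEIGHT


if __name__ == "__main__":
    sample = "Win11 Build 28020 Canary通道更新"
    # 24 ASCII characters + 4 Chinese characters * 2 = 32
    print(title_weight(sample), title_fits(sample))
```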