feat: add AI-powered translation sync automation

Adds automated translation file synchronization with AI-powered
translations using LibreTranslate API.

Features:
- Auto-detects missing translation keys from en.json source
- Translates new keys using LibreTranslate (free/public API)
- Preserves i18next placeholders like {{count}} and {{boardName}}
- Batch processing with rate limiting
- Falls back to TODO: prefix on translation failures
- GitHub Action runs daily and creates PRs automatically
- Supports custom LibreTranslate instances and API keys
This commit is contained in:
Igor Pecovnik
2025-12-25 17:10:48 +01:00
committed by SuperKali
parent f1e3e2c553
commit f5d29248d8
3 changed files with 739 additions and 0 deletions

86
.github/workflows/sync-locales.yml vendored Normal file

@@ -0,0 +1,86 @@
name: Sync Translation Files

on:
  workflow_dispatch:
  schedule:
    # Run daily at 00:00 UTC
    - cron: '0 0 * * *'

permissions:
  contents: write
  pull-requests: write

jobs:
  sync-locales:
    name: Sync locale files with en.json
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Run sync script with AI translation
        id: sync
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          OPENAI_MODEL: ${{ vars.OPENAI_MODEL || 'gpt-4o-mini' }}
          OPENAI_API: ${{ vars.OPENAI_API || 'https://api.openai.com/v1' }}
          OPENAI_TIER: ${{ vars.OPENAI_TIER || 'free' }}
          RETRY_FAILED: ${{ vars.RETRY_FAILED || 'false' }}
        run: |
          node scripts/sync-locales.js
        continue-on-error: true

      - name: Check for changes
        id: check_changes
        run: |
          if git diff --quiet src/locales/; then
            echo "has_changes=false" >> $GITHUB_OUTPUT
          else
            echo "has_changes=true" >> $GITHUB_OUTPUT
          fi

      - name: Create Pull Request
        if: steps.check_changes.outputs.has_changes == 'true'
        uses: peter-evans/create-pull-request@v7
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: 'i18n: sync and auto-translate locale files'
          branch: sync/i18n-updates
          title: 'i18n: Auto-translate and sync locale files'
          body: |
            ## Summary
            This PR updates all translation files with missing keys from `en.json` (the source of truth) using AI-powered translation.

            ## What changed
            - New keys have been automatically translated using OpenAI
            - Placeholders like `{{boardName}}` and `{{count}}` are preserved in translations
            - Failed translations are marked with `TODO:` prefix

            ## Translation quality
            The AI translations provide a good starting point but may need review for:
            - Context-specific terminology
            - UI-specific phrasing
            - Cultural nuances
            - Gender agreement in plural forms

            ## Next steps
            Please review the translations and:
            1. Test them in the application
            2. Adjust any translations that don't fit the context
            3. Remove `TODO:` markers from any failed translations

            ---
          labels: |
            i18n
            translation
            automated
          draft: false

244
scripts/README.md Normal file

@@ -0,0 +1,244 @@
# Translation Sync Scripts
This directory contains automation scripts for managing i18n translations.
## sync-locales.js
Automatically syncs all translation files with `src/locales/en.json` (the source of truth) and translates missing keys using AI.
### Features
- **Automatic Detection**: Finds missing keys in all locale files
- **AI Translation**: Uses OpenAI API to automatically translate new keys with high quality
- **Context-Aware**: Provides section/key context to the AI for better translations
- **Placeholder Preservation**: Maintains i18next placeholders like `{{count}}` and `{{boardName}}`
- **Adaptive Rate Limiting**: Automatically adjusts based on model and payment tier
- **Error Handling**: Falls back to `TODO:` prefix if translation fails
- **Smart Prompts**: Uses specialized prompts for technical UI translation
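
The detection step can be pictured as flattening both JSON trees into dot-notation keys and taking the difference. The sketch below is a minimal illustration under that assumption, not a copy of the script; `flattenKeys` and `missingKeys` are hypothetical names and the sample objects are made up.

```javascript
// Hypothetical helpers for illustration; names are not taken from sync-locales.js.

// Flatten a nested locale object into dot-notation keys.
function flattenKeys(obj, prefix = '') {
  const keys = [];
  for (const [key, value] of Object.entries(obj)) {
    const fullKey = prefix ? `${prefix}.${key}` : key;
    if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
      keys.push(...flattenKeys(value, fullKey));
    } else {
      keys.push(fullKey);
    }
  }
  return keys;
}

// Keys present in the source locale but absent from the target locale.
function missingKeys(source, target) {
  const have = new Set(flattenKeys(target));
  return flattenKeys(source).filter((key) => !have.has(key));
}

// Made-up sample data.
const en = { flash: { start: 'Start', done: 'Done' }, items_one: '{{count}} item' };
const de = { flash: { start: 'Starten' } };

console.log(missingKeys(en, de)); // [ 'flash.done', 'items_one' ]
```

Only keys are compared here, so an existing translation is never overwritten just because the English text changed.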
### Usage
#### Local Development
```bash
# Basic usage (requires OpenAI API key)
export OPENAI_API_KEY=sk-...
node scripts/sync-locales.js
# With custom model (default: gpt-4o-mini)
OPENAI_MODEL=gpt-4o node scripts/sync-locales.js
# With custom OpenAI-compatible API endpoint
OPENAI_API=https://api.openai.com/v1 node scripts/sync-locales.js
# For paid tier (much faster - 50-100x speedup)
OPENAI_TIER=paid node scripts/sync-locales.js
# Retry failed translations (keys marked with TODO:)
RETRY_FAILED=true node scripts/sync-locales.js
```
#### GitHub Actions
The workflow runs:
- **Daily** at 00:00 UTC
- **On manual trigger** via `workflow_dispatch`
### Configuration
#### Environment Variables
| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `OPENAI_API_KEY` | OpenAI API key | - | Yes |
| `OPENAI_MODEL` | Model to use for translation | `gpt-4o-mini` | No |
| `OPENAI_API` | API endpoint URL | `https://api.openai.com/v1` | No |
| `OPENAI_TIER` | Account tier for rate limits | `free` | No |
| `RETRY_FAILED` | Retry keys marked with `TODO:` | `false` | No |
#### GitHub Secrets/Variables
To configure the GitHub Action:
1. **Required - Add API key**:
```bash
gh secret set OPENAI_API_KEY
```
2. **Optional - Custom model** (for cost/quality tuning):
```bash
gh variable set OPENAI_MODEL --value "gpt-4o-mini"
```
3. **Optional - Custom endpoint** (for OpenAI-compatible APIs):
```bash
gh variable set OPENAI_API --value "https://api.openai.com/v1"
```
4. **Optional - Set account tier** (for faster translations with paid account):
```bash
gh variable set OPENAI_TIER --value "paid"
```
5. **Optional - Retry failed translations** (re-attempt keys marked with `TODO:`):
```bash
gh variable set RETRY_FAILED --value "true"
```
### OpenAI Setup
#### Getting an API Key
1. Visit [platform.openai.com](https://platform.openai.com/)
2. Sign up or log in
3. Navigate to API Keys section
4. Create a new API key
5. Add it to your environment or GitHub secrets
#### Account Tier & Rate Limits
The script automatically adjusts speed based on your account tier:
| Tier | RPM Limit | Batch Size | Delay | 100 Keys Time |
|------|-----------|------------|-------|---------------|
| **Free** | 3/min | 1 | 21s | ~35 min |
| **Paid (Tier 1-2)** | 200/min | 50 | 300ms | ~1 min |
| **Paid (Tier 3-5)** | 500/min | 50 | 120ms | ~30 sec |
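Note that the script only distinguishes `free` from `paid`: with `OPENAI_TIER=paid` and `gpt-4o-mini` it always uses batch size 50 with a 300ms delay, so the 120ms delay listed for Tier 3-5 is not applied automatically.
To use paid tier rates: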
1. Add a payment method to your OpenAI account (even $5 works)
2. Set `OPENAI_TIER=paid` environment variable
**Recommendation**: With just a $5 balance, you get Tier 1-2 rates, whose request limit (200 RPM) is roughly **65x higher** than the free tier's 3 RPM.
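
The timings in the table follow from the batch size and the pause between batches. Here is a rough sketch of that arithmetic, assuming one pause after every batch except the last and ignoring the time each API request takes; `estimateWaitSeconds` is a hypothetical helper, not part of the script.

```javascript
// Hypothetical helper: total time spent pausing between batches.
// Assumes one pause after every batch except the last, and ignores
// the latency of the API requests themselves.
function estimateWaitSeconds(keyCount, batchSize, delayMs) {
  const batches = Math.ceil(keyCount / batchSize);
  const pauses = Math.max(batches - 1, 0);
  return (pauses * delayMs) / 1000;
}

// Free tier from the table: batch size 1, 21 s delay.
console.log(estimateWaitSeconds(100, 1, 21000)); // 2079 (seconds, about 35 minutes)

// Paid Tier 1-2 from the table: batch size 50, 300 ms delay.
console.log(estimateWaitSeconds(100, 50, 300)); // 0.3 (seconds of pausing)
```

On the paid tier the pauses add up to well under a second, so the time listed in the table presumably reflects request latency rather than the configured delay.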
#### Choosing a Model
| Model | Cost | Quality | Speed | Best For |
|-------|------|---------|-------|----------|
| `gpt-4o-mini` | Low | High | Fast | Most translations (default) |
| `gpt-4o` | Medium | Very High | Fast | Complex UI text |
| `gpt-3.5-turbo` | Very Low | Medium | Very Fast | Simple translations |
**Recommendation**: Start with `gpt-4o-mini` for the best balance of cost, quality, and speed.
### Supported Languages
| Code | Language |
|------|----------|
| `de` | German |
| `es` | Spanish |
| `fr` | French |
| `it` | Italian |
| `ja` | Japanese |
| `ko` | Korean |
| `nl` | Dutch |
| `pl` | Polish |
| `pt` | Portuguese |
| `ru` | Russian |
| `sl` | Slovenian |
| `tr` | Turkish |
| `uk` | Ukrainian |
| `zh` | Chinese (Simplified) |
### Output
The script will:
1. ✅ Show which keys are missing for each language
2. 🤖 Translate missing keys with OpenAI
3. 📊 Show translation statistics (success/failure)
4. ⚠️ Warn about any translation failures
5. 💾 Update locale files with translated content
6. 🔍 Exit with code 0 whether or not files changed (the GitHub workflow detects changes with `git diff`); exit with code 1 only if `OPENAI_API_KEY` is not set
### Example Output
```
🔍 Syncing translation files with en.json (source of truth)
🤖 Using OpenAI API: https://api.openai.com/v1
📦 Model: gpt-4o-mini
✅ API key is configured
✅ Source file has 93 keys
📝 Processing de (German)...
✅ de is up to date (93 keys)
📝 Processing sl (Slovenian)...
⚠️ Found 64 missing keys
🤖 Translating 64 strings with OpenAI...
✅ Updated sl with 64 new keys
✨ Translation files updated successfully!
📊 Summary:
- Total translated: 64 keys
- Please review translations for accuracy and context
```
### AI Translation Features
The script uses specialized prompts to ensure:
1. **Context Awareness**: Provides section/key context for each translation
2. **Technical Terminology**: Knows when to keep terms like "Flash", "SD card", "USB" in English
3. **Placeholder Preservation**: Maintains `{{variables}}` exactly as they appear
4. **UI Appropriateness**: Uses concise, natural text for buttons and labels
5. **Plural Forms**: Handles i18next plural suffixes (_one, _other) correctly
6. **Consistent Tone**: Maintains formal but friendly tone throughout
### Best Practices
1. **Review Translations**: AI translations are excellent but may need context-specific adjustments
2. **Test in App**: Always test translations in the actual application
3. **Handle Plurals**: The script preserves `_one` and `_other` suffixes for plural forms
4. **Check Placeholders**: Verify that `{{variables}}` are correctly preserved
5. **Cultural Nuances**: Review translations for cultural appropriateness
6. **Cost Management**: Use `gpt-4o-mini` for best cost/quality balance
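
The placeholder check in item 4 can be partly automated during review. This is a minimal sketch, assuming both strings are already loaded; `missingPlaceholders` is a hypothetical helper, not a function from the script, and the sample strings are made up.

```javascript
// Hypothetical check, not part of sync-locales.js: report placeholders that
// appear in the English string but not in the translated one.
function missingPlaceholders(sourceText, translatedText) {
  const found = sourceText.match(/\{\{[^}]+\}\}/g) || [];
  return [...new Set(found)].filter((ph) => !translatedText.includes(ph));
}

const english = 'Flashing {{boardName}} ({{count}} files)';
const german = 'Flashe {{boardName}} ({count} Dateien)'; // single braces: broken

console.log(missingPlaceholders(english, german)); // [ '{{count}}' ]
```

An empty result only means the placeholders are present; whether they sit in a grammatically sensible position still needs a human reader.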
### Troubleshooting
#### API Key Not Found
```
❌ OPENAI_API_KEY is not set!
```
**Solution**: Set the environment variable or GitHub secret:
```bash
export OPENAI_API_KEY=sk-...
```
#### Translation Failures
If some translations fail with `TODO:`:
- Check your API key has sufficient credits
- Verify API endpoint is accessible
- Check rate limits (especially for larger translation batches)
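
To see which keys are still pending before re-running with `RETRY_FAILED=true`, a parsed locale file can be scanned for the `TODO:` prefix. This is a minimal sketch; `findTodoKeys` is a hypothetical helper and the locale data is made up.

```javascript
// Hypothetical helper: list dot-notation keys whose value still starts with "TODO:".
function findTodoKeys(locale, prefix = '') {
  const hits = [];
  for (const [key, value] of Object.entries(locale)) {
    const fullKey = prefix ? `${prefix}.${key}` : key;
    if (typeof value === 'string') {
      if (value.startsWith('TODO:')) hits.push(fullKey);
    } else if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
      hits.push(...findTodoKeys(value, fullKey));
    }
  }
  return hits;
}

// Made-up locale data.
const sl = { flash: { start: 'Začni', done: 'TODO: Done' }, cancel: 'TODO: Cancel' };

console.log(findTodoKeys(sl)); // [ 'flash.done', 'cancel' ]
```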
#### Poor Quality Translations
If translations seem off:
- The AI might lack specific UI context
- Try a more capable model like `gpt-4o`
- Manually edit the JSON files to fix issues
- Consider the translation context provided in the prompt
#### Cost Concerns
For cost optimization:
- Use `gpt-4o-mini` (very cost-effective)
- Run the script less frequently
- Review and merge translations in batches
- Consider caching previous translations
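
The caching suggestion is not implemented in the script. One possible shape is a wrapper that remembers results per language and source text; the sketch below is an assumption-laden illustration in which `createCachedTranslator` is a hypothetical name and `fakeTranslate` stands in for the real API call.

```javascript
// Hypothetical in-memory cache keyed by target language and source text.
// translateFn stands in for whatever function performs the real API call.
function createCachedTranslator(translateFn, cache = new Map()) {
  return async function translate(text, targetLang) {
    const key = `${targetLang}\u0000${text}`;
    if (!cache.has(key)) {
      cache.set(key, await translateFn(text, targetLang));
    }
    return cache.get(key);
  };
}

// Usage with a fake translator that counts how often it is called.
let calls = 0;
const fakeTranslate = async (text, lang) => {
  calls += 1;
  return `[${lang}] ${text}`;
};
const translate = createCachedTranslator(fakeTranslate);

const demo = (async () => {
  await translate('Cancel', 'de');
  await translate('Cancel', 'de'); // served from the cache
  await translate('Cancel', 'fr');
  return calls;
})();

demo.then((n) => console.log(n)); // 2
```

A cache like this only helps within one run. To save cost across daily CI runs it would need to be persisted somewhere, and failed results (those starting with `TODO:`) should probably not be stored.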
### Cost Estimation
Approximate costs for translating missing keys (using `gpt-4o-mini`):
- **10 keys**: ~$0.00001
- **50 keys**: ~$0.00005
- **100 keys**: ~$0.0001
*Costs vary based on text length and model used.*

409
scripts/sync-locales.js Normal file

@@ -0,0 +1,409 @@
#!/usr/bin/env node
/**
* Script to sync translation files with the English source of truth.
* Finds missing keys in each language file and translates them using AI.
*/
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const localesDir = path.resolve(__dirname, '../src/locales');
const sourceFile = path.join(localesDir, 'en.json');
// OpenAI configuration
const OPENAI_API_KEY = process.env.OPENAI_API_KEY || '';
const OPENAI_MODEL = process.env.OPENAI_MODEL || 'gpt-4o-mini';
const OPENAI_API = process.env.OPENAI_API || 'https://api.openai.com/v1';
// Language names for better context in translation
const LANGUAGE_NAMES = {
'de': 'German',
'es': 'Spanish',
'fr': 'French',
'it': 'Italian',
'ja': 'Japanese',
'ko': 'Korean',
'nl': 'Dutch',
'pl': 'Polish',
'pt': 'Portuguese',
'ru': 'Russian',
'sl': 'Slovenian',
'tr': 'Turkish',
'uk': 'Ukrainian',
'zh': 'Chinese (Simplified)'
};
// All supported locale files
const localeFiles = Object.keys(LANGUAGE_NAMES).map(code => `${code}.json`);
/**
* Translate text using OpenAI API
*/
async function translateText(text, targetLang, context = '') {
if (!OPENAI_API_KEY) {
throw new Error('OPENAI_API_KEY is not set');
}
// Don't translate if it's a variable placeholder
if (text.startsWith('{{') && text.endsWith('}}')) {
return text;
}
// Skip translation for very short strings or special formats
if (text.length < 2) {
return text;
}
try {
const systemPrompt = `You are a professional translator for a software application called "Armbian Imager" - a tool for flashing operating system images to SD cards and USB drives.
Translate the given text to ${LANGUAGE_NAMES[targetLang]}.
Important rules:
1. Keep technical terms in English when appropriate (e.g., "SD card", "USB", "Flash", "Board", "Image")
2. Preserve ALL placeholders exactly as they appear (e.g., {{count}}, {{boardName}}, {{step}})
3. Use natural, concise UI text appropriate for buttons and labels
4. Maintain formal but friendly tone
5. For plural forms (text ending in _one or _other), translate appropriately for the grammatical number
6. Keep keyboard shortcuts and hotkeys in English
7. Only return the translated text, no explanations
${context ? `Context: ${context}` : ''}`;
const response = await fetch(`${OPENAI_API}/chat/completions`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${OPENAI_API_KEY}`
},
body: JSON.stringify({
model: OPENAI_MODEL,
messages: [
{ role: 'system', content: systemPrompt },
{ role: 'user', content: text }
],
temperature: 0.3,
max_tokens: 500
})
});
if (!response.ok) {
const errorData = await response.json().catch(() => ({}));
throw new Error(`OpenAI API error: ${response.status} - ${JSON.stringify(errorData)}`);
}
const data = await response.json();
const translatedText = data.choices[0]?.message?.content || text;
// Ensure placeholders are preserved
return preservePlaceholders(text, translatedText);
} catch (error) {
console.warn(` ⚠️ Translation failed for "${text}": ${error.message}`);
// Return original text with a marker if translation fails
return `TODO: ${text}`;
}
}
/**
* Preserve i18next placeholders in translated text
*/
function preservePlaceholders(original, translated) {
  // Extract all placeholders from original (e.g., {{count}}, {{boardName}})
  const placeholders = original.match(/\{\{([^}]+)\}\}/g) || [];
  // If no placeholders, return translated as-is
  if (placeholders.length === 0) {
    return translated;
  }
  let result = translated;
  for (const placeholder of new Set(placeholders)) {
    // Already intact: leave it alone. Without this guard the single-brace
    // variant below also matches inside "{{name}}" and yields "{{{name}}}".
    if (result.includes(placeholder)) {
      continue;
    }
    const varName = placeholder.slice(2, -2).trim();
    // Variants the model may have produced, longest forms first so that
    // "{name}" is never replaced inside "%{name}" or "{{ name }}"
    const variants = [
      `{{${varName}}}`,
      `{{ ${varName} }}`,
      `%{${varName}}`,
      `%{ ${varName} }`,
      `{${varName}}`,
      `{ ${varName} }`
    ];
    for (const variant of variants) {
      if (result.includes(variant)) {
        result = result.replaceAll(variant, placeholder);
        break;
      }
    }
    // If the placeholder is still missing, fall back to the heuristic:
    // restore it when the rest of the English text survived verbatim
    if (!result.includes(placeholder)) {
      const originalWithoutPlaceholder = original.replace(placeholder, '');
      if (originalWithoutPlaceholder && result.includes(originalWithoutPlaceholder)) {
        result = result.replace(originalWithoutPlaceholder, original);
      }
    }
  }
  return result;
}
/**
* Translate multiple texts in batch for better performance
*/
async function translateBatch(texts, targetLang, contexts = []) {
const results = [];
// Rate limit configuration based on model and tier
// Actual OpenAI rate limits:
// Free tier: gpt-4o-mini = 3-10 RPM
// Tier 1-2 (paid): gpt-4o-mini = 200 RPM
// Tier 3-5 (paid): gpt-4o-mini = 500 RPM
// Tier 1-5 (paid): gpt-4o = 80-500 RPM
const isPaidTier = process.env.OPENAI_TIER === 'paid';
let batchSize, batchDelay;
if (OPENAI_MODEL.includes('gpt-4o-mini')) {
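// Free tier: one request every 21s (~3 RPM, with a safety margin).
// Paid tier: up to 50 concurrent requests per batch, then a 300ms pause.
// Note: the pause applies per batch, not per request, so large runs can exceed 200 RPM.
batchSize = isPaidTier ? 50 : 1;
batchDelay = isPaidTier ? 300 : 21000;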
} else if (OPENAI_MODEL.includes('gpt-4o')) {
batchSize = isPaidTier ? 40 : 1;
batchDelay = isPaidTier ? 750 : 21000;
} else if (OPENAI_MODEL.includes('gpt-3.5')) {
batchSize = isPaidTier ? 100 : 1;
batchDelay = isPaidTier ? 500 : 21000;
} else {
// Conservative defaults for unknown models
batchSize = 1;
batchDelay = 21000;
}
if (isPaidTier) {
console.log(` 💰 Using paid tier rate limits (batch: ${batchSize}, delay: ${batchDelay}ms)`);
} else {
console.log(` ⭐ Using free tier rate limits (batch: ${batchSize}, delay: ${batchDelay}ms, ~3 RPM)`);
console.log(` 💡 Tip: Add OPENAI_TIER=paid for ~65x faster translations (200 RPM)`);
}
for (let i = 0; i < texts.length; i += batchSize) {
const batch = texts.slice(i, i + batchSize);
const batchContexts = contexts.slice(i, i + batchSize);
const translations = await Promise.all(
batch.map((text, idx) => translateText(text, targetLang, batchContexts[idx] || ''))
);
results.push(...translations);
// Delay between batches to respect API rate limits
if (i + batchSize < texts.length) {
console.log(` ⏳ Progress: ${results.length}/${texts.length} translated...`);
await new Promise(resolve => setTimeout(resolve, batchDelay));
}
}
return results;
}
/**
* Recursively get all keys from an object using dot notation
*/
function getKeys(obj, prefix = '') {
const keys = [];
for (const [key, value] of Object.entries(obj)) {
const fullKey = prefix ? `${prefix}.${key}` : key;
if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
keys.push(...getKeys(value, fullKey));
} else {
keys.push(fullKey);
}
}
return keys;
}
/**
* Collect all missing translations with their paths
*/
function collectMissingTranslations(source, target, path = '', missing = []) {
for (const [key, sourceValue] of Object.entries(source)) {
const fullKey = path ? `${path}.${key}` : key;
if (!(key in target)) {
// Key is missing in target
if (typeof sourceValue === 'object' && sourceValue !== null && !Array.isArray(sourceValue)) {
collectMissingTranslations(sourceValue, {}, fullKey, missing);
} else {
// Provide context for better translations
const context = `Section: ${path}, Key: ${key}`;
missing.push({ path: fullKey, value: sourceValue, context });
}
} else if (typeof sourceValue === 'object' && sourceValue !== null && !Array.isArray(sourceValue)) {
// Recurse into nested objects
collectMissingTranslations(sourceValue, target[key], fullKey, missing);
}
}
return missing;
}
/**
* Collect all translations marked with TODO: for retry
*/
function collectFailedTranslations(source, target, path = '', failed = []) {
for (const [key, sourceValue] of Object.entries(source)) {
const fullKey = path ? `${path}.${key}` : key;
if (key in target) {
const targetValue = target[key];
if (typeof sourceValue === 'object' && sourceValue !== null && !Array.isArray(sourceValue)) {
// Recurse into nested objects
collectFailedTranslations(sourceValue, targetValue, fullKey, failed);
} else if (typeof targetValue === 'string' && targetValue.startsWith('TODO:')) {
// Found a failed translation
const context = `Section: ${path}, Key: ${key} (retry)`;
// Extract the original English value from source
failed.push({ path: fullKey, value: sourceValue, context, isRetry: true });
}
}
}
return failed;
}
/**
* Set a value in a nested object using dot notation
*/
function setByPath(obj, path, value) {
const keys = path.split('.');
let current = obj;
for (let i = 0; i < keys.length - 1; i++) {
if (!(keys[i] in current)) {
current[keys[i]] = {};
}
current = current[keys[i]];
}
current[keys[keys.length - 1]] = value;
}
/**
* Deep clone an object
*/
function deepClone(obj) {
return JSON.parse(JSON.stringify(obj));
}
console.log('🔍 Syncing translation files with en.json (source of truth)\n');
console.log(`🤖 Using OpenAI API: ${OPENAI_API}`);
console.log(`📦 Model: ${OPENAI_MODEL}`);
if (!OPENAI_API_KEY) {
console.error('❌ OPENAI_API_KEY is not set!');
console.log(' Set it with: export OPENAI_API_KEY=your-key-here\n');
process.exit(1);
}
console.log('✅ API key is configured\n');
// Read the source English file
const sourceContent = fs.readFileSync(sourceFile, 'utf-8');
const sourceData = JSON.parse(sourceContent);
const sourceKeys = getKeys(sourceData);
console.log(`✅ Source file has ${sourceKeys.length} keys\n`);
// Check if we should retry failed translations
const retryFailed = process.env.RETRY_FAILED === 'true';
let hasAnyChanges = false;
let totalTranslated = 0;
let totalFailed = 0;
let totalRetried = 0;
// Process each locale file
for (const localeFile of localeFiles) {
const localePath = path.join(localesDir, localeFile);
const localeName = localeFile.replace('.json', '');
console.log(`📝 Processing ${localeName} (${LANGUAGE_NAMES[localeName]})...`);
// Read locale file
const localeContent = fs.readFileSync(localePath, 'utf-8');
const localeData = JSON.parse(localeContent);
const localeKeys = getKeys(localeData);
// Find keys that exist in en.json but are missing from this locale
const missingTranslations = collectMissingTranslations(sourceData, localeData);
// Also collect failed translations if retry is enabled
if (retryFailed) {
const failedTranslations = collectFailedTranslations(sourceData, localeData);
missingTranslations.push(...failedTranslations);
if (failedTranslations.length > 0) {
console.log(` 🔄 Retrying ${failedTranslations.length} failed translations`);
}
}
if (missingTranslations.length === 0) {
console.log(` ✅ ${localeName} is up to date (${localeKeys.length} keys)\n`);
continue;
}
console.log(` ⚠️ Found ${missingTranslations.length} missing keys`);
// Translate missing keys
const textsToTranslate = missingTranslations.map(t => t.value);
const contexts = missingTranslations.map(t => t.context);
console.log(` 🤖 Translating ${textsToTranslate.length} strings with OpenAI...`);
const translatedTexts = await translateBatch(textsToTranslate, localeName, contexts);
// Count successes, failures, and retries
const retryCount = missingTranslations.filter(t => t.isRetry).length;
const failedCount = translatedTexts.filter(t => t.startsWith('TODO:')).length;
totalTranslated += translatedTexts.length - failedCount;
totalFailed += failedCount;
totalRetried += retryCount;
// Create updated locale data
const updatedLocaleData = deepClone(localeData);
// Add translated keys
missingTranslations.forEach((item, index) => {
setByPath(updatedLocaleData, item.path, translatedTexts[index]);
});
// Write updated file
fs.writeFileSync(localePath, JSON.stringify(updatedLocaleData, null, 2) + '\n');
if (failedCount > 0) {
console.log(` ⚠️ Updated ${localeName}: ${translatedTexts.length - failedCount} translated, ${failedCount} failed\n`);
} else {
console.log(` ✅ Updated ${localeName} with ${translatedTexts.length} new keys\n`);
}
hasAnyChanges = true;
}
if (hasAnyChanges) {
console.log('✨ Translation files updated successfully!');
console.log('\n📊 Summary:');
console.log(` - Total translated: ${totalTranslated} keys`);
if (totalRetried > 0) {
console.log(` - Retried: ${totalRetried} previously failed translations`);
}
if (totalFailed > 0) {
console.log(` - Total failed: ${totalFailed} keys (marked with TODO:)`);
console.log(` - Run again with RETRY_FAILED=true to retry failed translations`);
}
console.log(' - Please review translations for accuracy and context');
} else {
console.log('✅ All translation files are up to date!');
}
// Always exit successfully - the workflow checks git diff for changes
process.exit(0);