The Linguistic Data Analyst is responsible for collecting, analyzing, organizing, and cleaning multilingual conversational data, with a strong focus on diplomatic and formal terminology, to prepare high-quality datasets for training Speech-to-Text (STT) AI models.
This role is critical to ensuring linguistic accuracy, terminology consistency, and data readiness for AI model development, particularly in government, diplomatic, and formal communication domains.
Linguistic Data Collection:
- Collect and curate conversational audio and text data (meetings, interviews, speeches).
- Work with multilingual datasets, primarily Arabic and English.
- Ensure compliance with privacy and data governance standards.
Data Cleaning & Structuring:
- Clean datasets by removing noise, duplication, and inconsistencies.
- Normalize formal and semi-formal language usage.
- Organize data by speaker, context, and formality.
Linguistic & Terminology Analysis:
- Extract and standardize diplomatic and official terminology.
- Build and maintain a diplomatic glossary.
AI Training Data Preparation:
- Prepare AI-ready datasets with timestamps and metadata.
- Support annotation teams with linguistic guidelines.
Collaboration & Documentation:
- Work with AI Engineers, Data Scientists, and PMs.
- Document standards and methodologies.
Education:
- Bachelor’s degree in Linguistics, Translation, Arabic/English Studies, or a related field.
Core Skills:
- Strong linguistic analysis skills.
- Experience with conversational or textual datasets.
- High attention to detail.
Technical Skills (Preferred):
- Familiarity with STT and NLP concepts.
- Experience with data annotation workflows.
Languages:
- Arabic: Fluent (mandatory)
- English: Fluent (mandatory)
- Additional languages are a plus.
Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.