AI Engineer

Propio Language Services

Full time Posted: 2 hours ago Other

Hiring from: United States

Description

Propio Language Services is a provider of the highest quality interpretation, translation, and localization services. Our people take pride in every resource we offer, and our users always have access to cutting-edge technology, exceptional support, and collaborative user experiences. We are driven by our passion for innovation, growth, and bridging communication gaps in a diverse world. If you’re passionate about delivering technology-driven solutions and building lasting client relationships while contributing to client growth, Propio could be the ideal place for you.

We are building AI-powered systems that enhance multilingual communication, improve interpreter workflows, and support next-generation AI applications across text, speech, and multimodal experiences.

Propio is hiring an AI Data Strategy Engineer / Applied Scientist, LLM Data, to own the data strategy, curation pipelines, annotation workflows, and evaluation datasets that power our multilingual AI systems.

This is a hands-on technical role for someone who can manage the full AI data lifecycle (acquisition, curation, annotation, quality control, evaluation datasets, and post-training data) in ways that directly improve model performance.

The ideal candidate can build scalable data pipelines, design high-quality annotation and QA processes, identify model failure modes, and close performance gaps through targeted data acquisition, curation, and synthetic data generation.

Requirements

  • Define the end-to-end data roadmap for multilingual and multimodal AI systems, including text, speech, translation, interpretation, low-resource languages, and agentic AI workflows.
  • Design and build dataset curation pipelines for training, post-training, and evaluation, including cleaning, deduplication, filtering, PII redaction, quality scoring, sampling, balancing, and versioning.
  • Create annotation schemas, labeling guidelines, QA rubrics, golden datasets, and reviewer workflows for multilingual, speech, translation, and agentic AI data.
  • Build evaluation datasets and benchmarks, analyze model failure modes, and translate performance gaps into targeted data improvements.
  • Support post-training data workflows such as SFT, instruction tuning, preference data, RLHF/DPO-style data, reward model data, and synthetic data generation.
  • Use modern annotation tools and AWS-based data infrastructure to scale secure, traceable, and compliant AI data workflows.

Qualifications

  • Bachelor’s degree in Computer Science, Machine Learning, Data Science, Computational Linguistics, Linguistics, Statistics, or a related field, or equivalent practical experience.
  • 4+ years of experience in AI data, ML data operations, NLP data engineering, applied ML, speech/translation data, or LLM data workflows.
  • Strong hands-on experience with Python, SQL, and dataset curation pipelines.
  • Experience with annotation workflows, QA rubrics, evaluation datasets, or human-in-the-loop data processes.
  • Familiarity with multilingual NLP, speech data, translation data, low-resource languages, conversational AI, or agentic AI datasets.
  • Working knowledge of AWS data and ML tools such as S3, Glue, SageMaker, Bedrock, Lambda, Step Functions, EKS/ECS, IAM, or KMS.
  • Strong communication skills and ability to work with ML engineers, applied scientists, product teams, linguists, data teams, and vendors.

Preferred Qualifications

  • Master’s or PhD in Computer Science, Machine Learning, NLP, Computational Linguistics, Data Science, Statistics, or a related field.
  • Experience with LLM post-training workflows such as SFT, instruction tuning, preference data, RLHF, DPO, reward modeling, or evaluation data generation.
  • Experience with synthetic data generation, active learning, weak supervision, LLM-as-judge workflows, or automated data quality scoring.
  • Experience with modern annotation and data platforms such as Labelbox, Scale AI, Prodigy, Argilla, Snorkel, Humanloop, or custom internal tooling.
