Build the data foundations that production AI depends on: vector databases, knowledge graphs, and data governance at enterprise scale.
You will design and build the data infrastructure that underpins production AI deployments. This means working across data pipelines, vector stores, embedding infrastructure, and governance tooling so that AI systems have reliable, high-quality data at their core.
What you'll do
- Build and maintain data pipelines feeding production AI systems
- Design and operate vector database infrastructure and embedding pipelines
- Implement data quality, lineage, and governance tooling
- Work closely with ML Engineers and AI Architects on data contracts
- Contribute to customer-facing data platform deployments
What we're looking for
- Solid data engineering background with production experience
- Experience with vector databases (Pinecone, Weaviate, pgvector, or similar)
- Proficiency in Python and SQL; experience with dbt, Airflow, or similar
- Interest in and exposure to AI/ML platform requirements