Engineer, Data Platform
at Pinecone, San Francisco
About Pinecone
Pinecone is on a mission to build the search and database technology to power AI applications for the next decade and beyond. Our fully managed vector database makes it easy to add vector search to AI applications. Since creating the “vector database” category, we have seen demand grow rapidly, and it shows in our user base.
We are a distributed team with clusters in New York, San Francisco, Tel Aviv, and Manchester.
About The Role
Pinecone is seeking a skilled and highly motivated Engineer for our internal Data Platform team. You will drive the development and maintenance of our data infrastructure, ensuring the efficient orchestration, governance, quality, and accessibility of data across the organization. As a senior engineer on the Data team, you will play a critical role in building and optimizing our data ecosystem to enable data delivery, literacy, insights, and data science work at scale.
You will work in a fast-paced and rewarding environment that demands the highest-quality work with minimal supervision. Because we all do a little bit of everything, you will also work as a strong generalist, partner directly with executive leadership, and mentor new data engineers and scientists.
Responsibilities
Design and Build Data Infrastructure
Architect and develop scalable and efficient data infrastructure, encompassing orchestration, a metrics store, a feature store, governance, data quality, alerting infrastructure, and reverse ETL processes.
Enable Data Quality and Governance
Establish robust data quality frameworks, tooling, and governance processes to maintain high data quality and integrity throughout the data lifecycle.
Collaborate with Data Science and Engineering Teams
Partner with data science and engineering teams to understand their requirements and ensure the availability and accessibility of data for modeling, experimentation, and analysis.
What we look for:
A passion for technology
5+ years of experience with SQL and Python
5+ years of experience with designing and developing high performance systems
BS in Computer Science, Math, or a related technical field, or equivalent experience
Strong foundations in databases, warehousing, data infrastructure, ELT/ETL
Proficiency in building and optimizing data infrastructure using modern technologies and frameworks (e.g., Kafka, Airflow, API integrations, CI/CD, Terraform)
Bonus Points:
Experience with orchestration platforms
Experience with data governance infrastructure (e.g., RBAC, data quality, alerting)
Expertise working with cloud-based data warehouse solutions (BigQuery, Snowflake)
Knowledge of and experience with code deployment and Kubernetes resource management