data.world is looking for an experienced software engineer to contribute to our distributed graph system, which uses Apache Spark to build a large knowledge graph with billions of nodes and edges at a high degree of parallelism. The right candidate will have experience deploying and running Apache Spark in a production environment and managing data at scale using relational databases and SQL, and will be comfortable with graph algorithms.
At data.world, you will:
- build Apache Spark code in Scala to drive a pipeline of graph-based transformations.
- work closely with product, engineering, documentation and business stakeholders to ensure the delivery and improvement of the collector product.
- collaborate with a small, dedicated team.
- execute on a key area of the data.world platform.
- learn… constantly.
We’d love to see:
- a BS in a technology or engineering field, or equivalent experience.
- 5+ years of experience as an engineer
- experience working with Apache Spark and with Scala or Java systems
- strong computer science fundamentals, particularly algorithms and graph theory, plus experience with relational data (SQL)
- experience with AWS (Amazon Web Services), which is a strong plus
- strong opinions, loosely held. You admit when you're wrong, and integrate new learnings quickly.
- a craftsperson. You know your way around and take pride in your work.
- an appreciation of the user, even when you're building a CLI or API.
- familiarity with a variety of languages and libraries. You know which tools to use for which tasks.
- the ability to provide, as well as seek out, mentorship.
- passion for continuous integration and test-driven engineering methodologies.
- strong written, verbal, and visual communication skills. You should be able to articulate your decisions, whiteboard new solutions, present ideas concisely, and defend your beliefs.
- an appetite to try new things. You’re curious and excited to improve your process, and always looking to learn. You ask questions and don't shy away from challenges.
Big pluses include:
- interest in the semantic web, RDF, and/or graph-based data storage technologies.
- experience with Docker
- experience with Dropwizard or another Java-based web framework
- experience working in a fast-paced, startup environment
Perks and benefits:
- Successful company with strong leadership, the right values, and a product well-positioned within a growing category
- Innovative technology/architecture (RDF, knowledge graph)
- Agile and highly disciplined engineering culture/practices (CI, TDD, SOC 2, peer review, etc.), not to mention a productive and fun environment.
- Competitive market compensation with a generous bonus structure
- Fully paid health/dental/vision insurance for the whole family
- Charitable corporate programs and volunteer events throughout the year
- Open PTO, and a personalized wellness incentive
- Lots of regularly scheduled team events, including game nights, rock climbing, and cocktail competitions (even virtually for now!)
- A flexible work environment
- A tight-knit team of startup veterans with integrity, passion, and curiosity
If you have the exceptional combination of skills and qualities that we are looking for, then we’re excited to meet you!
Note: We encourage people from underrepresented groups to apply.
We are the world’s largest collaborative data community, and we believe that our people need to represent the diverse nature of the community we serve and the customer base we are winning. We believe that diversity leads to the most creative discussions, ideas, and outcomes.