Overview
This is a ground-floor startup opportunity to develop a novel "big data" mining system that will search and index relevant R&D information throughout the Web. You will help to build a Software-as-a-Service (SaaS) platform that uses cutting-edge techniques for semantic data integration, entity disambiguation, topic modeling, and expertise discovery. Unlike most search solutions, which just deliver content, KNODE aims to identify leading experts and collaborators to accelerate research and life sciences product development. Learn more about KNODE at http://www.knodeinc.com/
Responsibilities
Develop KNODE data pipeline processing and extract-transform-load (ETL) capabilities; optimize for scalability
Implement tools and frameworks for search index and data warehouse management
Define and implement cost-effective Cloud computing and storage resource utilization
Identify and qualify new information sources for the KNODE data integration and warehouse process
Design and optimize the entity-relationship graph database
Contribute to overall KNODE system design and architecture decisions
Take on other responsibilities as required on an Agile project team
Experience
Solid commercial software product development skills and understanding of SaaS implementation using Cloud resources
Experience with parallel data processing (Hadoop) and the related stack, ETL frameworks, and big data management
Programming with Java, as well as scripting languages such as Python, Perl, Bash
Familiarity with Linux system and network administration
Large-scale database design using PostgreSQL or a similar RDBMS; knowledge of key-value stores is a plus
Education
BS degree in a technical discipline and 5+ years of job experience, or MS degree in a technical discipline and 2-3 years of job experience
Compensation
For full-time employees, KNODE offers a competitive compensation package that includes medical and dental benefits, enrollment in a 401(k) plan, 3 weeks of vacation/personal time plus company holidays, and equity stock options.