At Ephesoft, we create advanced capture products to help organizations of all sizes and industries identify and extract meaning from their documents. More than 80% of business information is trapped in unstructured content, and our goal is to make that data actionable. We initiate processes that add value to a company’s existing content and information management platforms. From automating mortgage applications and streamlining insurance claims to accelerating invoice payments and simplifying sales orders, Ephesoft allows our customers to put information to work and do business faster. This is why we were just named to the 2015 Inc. 500 list of fastest-growing privately held companies in the US.

Position: Sr. Data Scientist
Location: Remote
Job/Position Summary:
The Sr. Data Scientist will be responsible for creating analyses and models that help improve customer and operational KPIs, creating measurable value for the organization and its clients.
Primary Roles/Responsibilities:
Understand business needs and apply analytical concepts to provide business solutions.

  • Build predictive models and develop advanced algorithms that extract and classify information from large data sets, and quantify model performance.
  • Analyze data and develop robust statistical algorithms/models that drive the product outputs.
  • Solve high-dimensional problems in a computationally efficient and statistically effective manner.
  • Based on business strategy and knowledge of emerging technologies, drive the architecture and design of the analytical platform.
  • Evaluate emerging technologies that may contribute to our analytical platform.
  • Identify and exploit new patterns in data using various techniques.
  • Create meaningful analysis and modelling outputs that can be easily understood by clients and practically implemented to drive uplifts in performance.
  • Use a range of structured and unstructured data and tools (including text, speech and digital analytics) as required to meet project objectives.
  • Maintain our analytical engines to ensure that, day to day, they deliver the functionality required by analysts and operations.
  • Ensure that the use of client/customer and internal data (e.g., staff, financial) meets regulatory obligations and stays within the risk appetite of the organization.

Desired Skills and Experience:
The ideal candidate must have strong coding ability and familiarity with the Java programming language.
The candidate should be passionate about applications in big data: crunching billions of data points for statistical modeling, data mining for insights, and recommendation solutions.

  • Predictive analytics, data mining, and machine learning and data mining toolkits such as Weka, Mahout, R, NLTK, and Sage.
  • Experience with information retrieval libraries such as Lucene/Solr, and with Hadoop, Hive, Cassandra, HBase, and MongoDB, is a plus.
  • You thrive in a fast-paced, test-driven, collaborative and iterative programming environment.
  • Active participation in the open source communities.

Required education:

  • Bachelor’s

Required experience:

  • Building Predictive Models: 3 years
  • Information Retrieval Libraries: 3 years
  • Machine Learning & Data Mining Toolkits: 3 years
  • Developing Robust Statistical Algorithms: 4 years

To apply, send resume in PDF format to