Mountain View, California or Pune, India
- One Data Scientist position for our Mountain View, CA headquarters to be hired in Q3 2018. On Oct 1st, we are moving to Sunnyvale to accommodate our FogHorn staff doubling in size over 2018.
- Two Data Scientist positions in Pune, India for Q4 2018 hiring are currently open.
- We have an ongoing hiring plan per location per quarter.
The candidate’s experience level will determine both the salary and the seniority level of the Data Science job title. Senior candidates are welcome.
Role and Responsibilities
- Primarily, develop and deploy data mining models in a consulting role for our IoT clients to generate revenue or reduce costs. Projects use Python and scikit-learn, and are deployed on the FogHorn IoT system with our EdgeML component. Projects typically last 2-3 months, but may range from 2 weeks to 6 months.
- Some projects are in Vel, our proprietary functional streaming language that runs in a smaller footprint than Python. Vel parallelizes natively on multicore systems and supports DSP and a variety of numerical methods and algorithms. We provide training on Vel.
- Secondarily, some team members may develop data science-based software applications running on our system, along with other FogHorn engineers.
Candidates must meet ALL of the following qualifications
- Have analyzed, trained and deployed at least three data mining models in the past. If the candidate did not directly deploy their own models, they will have worked with others who have put their models into production. The models should have been validated as robust over at least an initial time period. Past algorithm deployment experience using R, SAS, MATLAB, Spark MLlib or other data mining libraries is also valid, as long as the candidate can deploy in Python now.
- Four or more years of full-time industry work experience, developing data mining models which were deployed and used.
- Core programming experience in Python, using data mining libraries such as scikit-learn. Other relevant Python libraries include NumPy, SciPy and pandas.
- Data mining model or algorithm experience in at least two areas from the following list. More experience on this list is a bonus.
  - Predictive or supervised algorithms may include: regression, neural nets (backpropagation, radial basis functions or architectures from the deep learning list below), decision trees (CART, C5.0, Cubist, Random Forests, XGBoost), Support Vector Machines (SVM), time series or ARIMA.
  - Clustering or unsupervised algorithms may include: K-means, DBSCAN, Gaussian Mixture Models (GMM). Other experience in outlier or anomaly detection, or in detecting drift in non-stationary data, is useful.
- Past algorithm experience is valid across a wide variety of vertical applications (internet advertising, fraud detection, predicting X). Data science project experience in different verticals is generally transferable.
Any of the following extra qualifications will make a candidate more competitive
- Experience deploying (or other substantial experience with) more than two of the data mining algorithms in the list above.
- Training or experience in deep learning with frameworks such as Keras or TensorFlow, and in architectures such as convolutional neural networks (CNN), U-Net, Generative Adversarial Networks (GAN), reinforcement learning, recurrent nets or Long Short-Term Memory (LSTM) networks. Experience in transfer learning, model shrinking or deep compression is helpful as well.
- NOTE: if you don’t have DL experience, we will provide initial training on deep learning, to help prepare you for the mix of projects we have in our pipeline. Come to FogHorn to get into deep learning.
- Vertical experience in Internet of Things (IoT) applications, such as:
  - Smart Cities (elevators, power, video monitoring)
  - Manufacturing (predictive maintenance on cells, or scrap prediction / classification on items produced)
  - Oil and Gas
  - Mobile phone
  - Transportation or automotive
  - Wind Turbines
  - Other IoT
- A mechanical engineering or hard sciences background helps in the development of first-principles models.
- Experience with time series applications, Digital Signal Processing (DSP), Fast Fourier Transforms (FFT), band-pass filtering, extracting features from a spectrogram, or related work is useful.
- Experience with Complex Event Processing (CEP) or other streaming data as a data source for data mining analysis
- Having managed past models in production over their full lifecycle until model replacement is needed. Having developed automated model refreshing on newer data.
- Having developed frameworks for model automation as a prototype for a product.
- Experience with PMML or PFA is of interest (see www.DMG.org). The FogHorn product is a PMML consumer, converting to Vel for execution.
- Our model training may involve the use of GPUs. We have a secure Google Cloud Pub/Sub integration and are working on other cloud integrations. Experience in different model training environments can be helpful.
How To Apply
- To apply, submit your resume and cover letter to HR at email@example.com. Please indicate how you came across the job posting.
- It can be helpful to write one paragraph per past deployed data mining model (for up to 3 past systems) in a cover email. Consider a PSR format: Problem (i.e., a newspaper-style headline and the context), Solution and Result.