Awesome H2O
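Below is a curated list of awesome projects, applications, research, tutorials, courses and books that use H2O, an open source, distributed machine learning platform. H2O offers parallelized implementations of many supervised and unsupervised machine learning algorithms, such as Generalized Linear Models, Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Cox Proportional Hazards, K-Means, PCA and Word2Vec, as well as a fully automatic machine learning algorithm (AutoML).
H2O.ai produces many tutorials, blog posts, presentations and videos about H2O, but the list below is made up of awesome content produced by the greater H2O user community.
We are just getting started with this list, so pull requests are very much appreciated! Please review the contribution guidelines before making a pull request. If you are not a GitHub user but want to contribute, please send an email to community@h2o.ai.
If you think H2O is awesome too, please ⭐ the H2O GitHub repository.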
Blog Posts & Tutorials
- Using H2O AutoML to simplify training process (and also predict wine quality) Aug 4, 2020
- Visualizing ML Models with LIME
- Parallel Grid Search in H2O Jan 17, 2020
- Importing, Inspecting and Scoring with MOJO models inside H2O Dec 10, 2019
- Artificial Intelligence Made Easy with H2O.ai: A Comprehensive Guide to Modeling with H2O.ai and AutoML in Python Jun 12, 2019
- Anomaly Detection With Isolation Forests Using H2O Dec 03, 2018
- Predicting residential property prices in Bratislava using recipes - H2O Machine learning Nov 25, 2018
- Inspecting Decision Trees in H2O Nov 07, 2018
- Gentle Introduction to AutoML from H2O.ai Sep 13, 2018
- Machine Learning With H2O — Hands-On Guide for Data Scientists Jun 27, 2018
- Using machine learning with LIME to understand employee churn Jun 25, 2018
- Analytics at Scale: h2o, Apache Spark and R on AWS EMR Jun 21, 2018
- Automated and unmysterious machine learning in cancer detection Nov 07, 2017
- Time series machine learning with h2o+timetk Oct 28, 2017
- Sales Analytics: How to use machine learning to predict and optimize product backorders Oct 16, 2017
- HR Analytics: Using machine learning to predict employee turnover Sep 18, 2017
- Autoencoders and anomaly detection with machine learning in fraud analytics May 01, 2017
- Building deep neural nets with h2o and rsparkling that predict arrhythmia of the heart Feb 27, 2017
- Predicting food preferences with sparklyr (machine learning) Feb 19, 2017
- Moving largish data from R to H2O - spam detection with Enron emails Feb 18, 2016
- Deep learning & parameter tuning with mxnet, h2o package in R Jan 30, 2017
- Are categorical variables getting lost in your random forests? Oct 28, 2016
Books
- Hands on Time Series with R Rami Krispin. (2019)
- Mastering Machine Learning with Spark 2.x Alex Tellez, Max Pumperla, Michal Malohlava. (2017)
- Machine Learning Using R Karthik Ramasubramanian, Abhishek Singh. (2016)
- Practical Machine Learning with H2O: Powerful, Scalable Techniques for Deep Learning and AI Darren Cook. (2016)
- Disruptive Analytics Thomas Dinsmore. (2016)
- Computer Age Statistical Inference: Algorithms, Evidence, and Data Science Bradley Efron, Trevor Hastie. (2016)
- R Deep Learning Essentials Joshua F. Wiley. (2016)
- Spark in Action Petar Zečević, Marko Bonaći. (2016)
- Handbook of Big Data Peter Bühlmann, Petros Drineas, Michael Kane, Mark J. van der Laan. (2015)
Research Papers
- Predicting Risk of Delays in Postal Deliveries with Neural Networks and Gradient Boosting Machines Matilda Söderholm. (2020)
- Stock Market Analysis using Stacked Ensemble Learning Method Malkar Tackle. (2020)
- H2O AutoML: Scalable Automatic Machine Learning Erin LeDell, Sebastien Poirier. (2020)
- Single-cell mass cytometry on peripheral blood identifies immune cell subsets associated with primary biliary cholangitis Jin Sung Jang, Brian D. Juran, Kevin Y. Cunningham, Vinod K. Gupta, Young Min Son, Ju Dong Yang, Ahmad H. Ali, Elizabeth Ann L. Enninga, Jaeyun Sung and Konstantinos N. (2020)
- Prediction of the functional impact of missense variants in BRCA1 and BRCA2 with BRCA-ML Steven N. Hart, Eric C. Polley, Hermella Shimelis, Siddhartha Yadav, Fergus J. Couch. (2020)
- Innovative deep learning artificial intelligence applications for predicting relationships between individual tree height and diameter at breast height İlker Ercanlı. (2020)
- An Open Source AutoML Benchmark Pieter Gijsbers, Erin LeDell, Sebastien Poirier, Janek Thomas, Bernd Bischl, Joaquin Vanschoren. (2019)
- Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence Sebastian Raschka, Joshua Patterson, Corey Nolet. (2019)
- Human actions recognition in video scenes from multiple camera viewpoints Fernando Itano, Ricardo Pires, Miguel Angelo de Abreu de Sousa, Emilio Del-Moral-Hernandez. (2019)
- Extending MLP ANN hyper-parameters Optimization by using Genetic Algorithm Fernando Itano, Miguel Angelo de Abreu de Sousa, Emilio Del-Moral-Hernandez. (2018)
- askMUSIC: Leveraging a Clinical Registry to Develop a New Machine Learning Model to Inform Patients of Prostate Cancer Treatments Chosen by Similar Men Gregory B. Auffenberg, Khurshid R. Ghani, Shreyas Ramani, Etiowo Code, Brian Denton, Craig Rogers, Benjamin Stockton, David C. Miller, Karandeep Singh. (2018)
- Machine Learning Methods to Perform Pricing Optimization. A Comparison with Standard GLMs Giorgio Alfredo Spedicato, Christophe Dutang and Leonardo Petrini. (2018)
- Comparative Performance Analysis of Neural Networks Architectures on H2O Platform for Various Activation Functions Yuriy Kochura, Sergii Stirenko, Yuri Gordienko. (2017)
- Algorithmic trading using deep neural networks on high frequency data Andrés Arévalo, Jaime Niño, German Hernandez, Javier Sandoval, Diego León, Arbey Aragón. (2017)
- Generic online animal activity recognition on collar tags Jacob W. Kamminga, Helena C. Bisby, Duc V. Le, Nirvana Meratnia, Paul J. Havinga. (2017)
- Soil nutrient maps of Sub-Saharan Africa: assessment of soil nutrient content at 250 m spatial resolution using machine learning Tomislav Hengl, Johan GB Leenaars, Keith D. Shepherd, Markus G. Walsh, Gerard BM Heuvelink, Tekalign Mamo, Helina Tilahun, Ezra Berkhout, Matthew Cooper, Eric Fegraus, Ichsani Wheeler, Nketia A. Kwabena. (2017)
- Robust and flexible estimation of data-dependent stochastic mediation effects: a proposed method and example in a randomized trial setting Kara E. Rudolph, Oleg Sofrygin, Wenjing Zheng, and Mark J. van der Laan. (2017)
- Automated versus do-it-yourself methods for causal inference: Lessons learned from a data analysis competition Vincent Dorie, Jennifer Hill, Uri Shalit, Marc Scott and Dan Cervone. (2017)
- Using deep learning to predict the mortality of leukemia patients Reena Shaw Muthalaly. (2017)
- Use of a machine learning framework to predict substance use disorder treatment success Laura Acion, Diana Kelmansky, Mark van der Laan, Ethan Sahker, DeShauna Jones, Stephan Arndt. (2017)
- Ultra-wideband antenna-induced error prediction using deep learning on channel response data Janis Tiemann, Johannes Pillmann, Christian Wietfeld. (2017)
- Inferring passenger types from commuter eigentravel matrices Erika Fille T. (2017)
- Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500 Christopher Krauss, Xuan Anh Do, Nicolas Huck. (2016)
- Identifying IT purchases anomalies in the Brazilian government procurement system using deep learning Silvio L. Domingos, Rommel N. Carvalho, Ricardo S. Carvalho. (2016)
- Predicting recovery of credit operations on a Brazilian bank Rogério G. Lopes, Rommel N. Carvalho, Marcelo Ladeira, Ricardo S. Carvalho. (2016)
- Deep learning anomaly detection as support fraud investigation in Brazilian exports and anti-money laundering Ebberth L. Paula, Marcelo Ladeira, Rommel N. Carvalho, Thiago Marzagão. (2016)
- Deep learning and association rule mining for predicting drug response in cancer Konstantinos N. Vougas, Thomas Jackson, Alexander Polyzos, Michael Liontos, Elizabeth O. Johnson, Vassilis Georgoulias, Paul Townsend, Jiri Bartek, Vassilis G. Gorgoulis. (2016)
- The value of points of interest information in predicting cost-effective charging infrastructure locations Stéphanie Florence Visser. (2016)
- Adaptive modelling of spatial diversification of soil classification units. Journal of Water and Land Development Stanisław Gruszczyński, Krzysztof Urbański. (2016)
- Scalable ensemble learning and computationally efficient variance estimation Erin LeDell. (2015)
- Superchords: decoding EEG signals in the millisecond range Rogerio Normand, Hugo Alexandre Ferreira. (2015)
- Understanding random forests: from theory to practice Gilles Louppe. (2014)
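Benchmarks
- Are categorical variables getting lost in your random forests? - Benchmark of categorical encoding schemes and their effect on tree-based models (Scikit-learn vs. H2O). Oct 28, 2016
- Deep learning in R - Benchmark of open source deep learning packages in R. Mar 7, 2016
- Szilard's machine learning benchmark - Benchmarks of Random Forest, GBM, Deep Learning and GLM implementations in common open source ML frameworks. Jul 3, 2015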
Presentations
- Pipelines for model deployment Apr 25, 2017
- Machine learning with H2O.ai Jan 23, 2017
Courses
- University of San Francisco (USF) Distributed Data System Class (MSDS 697) - Master of Science in Data Science Program.
- University of Oslo: Introduction to Automatic and Scalable Machine Learning with H2O and R - Research Bazaar 2019
- UCLA: Tools in Data Science (STATS 418) - Master of Applied Statistics Program.
- GWU: Data Mining (Decision Sciences 6279) - Master of Science in Business Analytics.
- University of Cape Town: Analytics Module - Postgraduate Honors Program in Statistical Sciences.
- Coursera: How to Win a Data Science Competition: Learn from Top Kagglers - Advanced Machine Learning Specialization.
Utilities
- Spin up PySpark and PySparkling on AWS
- h2oparsnip R package: a set of wrappers that bind h2o algorithms to the parsnip package
License
To the extent possible under law, H2O.ai has waived all copyright and related or neighboring rights to this work.