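Awesome H2O
Below is a curated list of all the awesome projects, applications, research, tutorials, courses and books that use H2O, an open source, distributed machine learning platform. H2O offers parallelized implementations of many supervised and unsupervised machine learning algorithms, such as Generalized Linear Models, Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Cox Proportional Hazards, K-means, PCA and Word2Vec, as well as a fully automatic machine learning algorithm (AutoML).
H2O.ai produces many tutorials, blog posts, presentations and videos about H2O, but the list below consists of awesome content produced by the greater H2O user community.
We are just getting started with this list, so pull requests are very much appreciated! Please review the contribution guidelines before making a pull request. If you are not a GitHub user and want to make a contribution, please send an email to community@h2o.ai.
If you think H2O is awesome too, please ⭐ the H2O GitHub repository.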
Blog Posts & Tutorials
- Using H2O AutoML to simplify training process (and also predict wine quality) Aug 4, 2020
- Visualizing ML Models with LIME
- Parallel Grid Search in H2O Jan 17, 2020
- Importing, Inspecting and Scoring with MOJO models inside H2O Dec 10, 2019
- Artificial Intelligence Made Easy with H2O.ai: A Comprehensive Guide to Modeling with H2O.ai and AutoML in Python Jun 12, 2019
- Anomaly Detection With Isolation Forests Using H2O Dec 3, 2018
- Predicting residential property prices in Bratislava using recipes - H2O Machine learning Nov 25, 2018
- Inspecting Decision Trees in H2O Nov 7, 2018
- Gentle Introduction to AutoML from H2O.ai Sep 13, 2018
- Machine Learning With H2O — Hands-On Guide for Data Scientists Jun 27, 2018
- Using machine learning with LIME to understand employee churn Jun 25, 2018
- Analytics at Scale: h2o, Apache Spark and R on AWS EMR Jun 21, 2018
- Automated and unmysterious machine learning in cancer detection Nov 7, 2017
- Time series machine learning with h2o+timetk Oct 28, 2017
- Sales Analytics: How to use machine learning to predict and optimize product backorders Oct 16, 2017
- HR Analytics: Using machine learning to predict employee turnover Sep 18, 2017
- Autoencoders and anomaly detection with machine learning in fraud analytics May 1, 2017
- Building deep neural nets with h2o and rsparkling that predict arrhythmia of the heart Feb 27, 2017
- Predicting food preferences with sparklyr (machine learning) Feb 19, 2017
- Moving largish data from R to H2O - spam detection with Enron emails Feb 18, 2016
- Deep learning & parameter tuning with mxnet, h2o package in R Jan 30, 2017
Books
- Big data in psychiatry and neurology, Chapter 11: A scalable medication intake monitoring system Diane Myung-kyung Woodbridge and Kevin Bengtson Wong. (2021)
- Hands on Time Series with R Rami Krispin. (2019)
- Mastering Machine Learning with Spark 2.x Alex Tellez, Max Pumperla, Michal Malohlava. (2017)
- Machine Learning Using R Karthik Ramasubramanian, Abhishek Singh. (2016)
- Practical Machine Learning with H2O: Powerful, Scalable Techniques for Deep Learning and AI Darren Cook. (2016)
- Disruptive Analytics Thomas Dinsmore. (2016)
- Computer Age Statistical Inference: Algorithms, Evidence, and Data Science Bradley Efron, Trevor Hastie. (2016)
- R Deep Learning Essentials Joshua Wiley. (2016)
- Spark in Action Petar Zečević, Marko Bonaći. (2016)
- Handbook of Big Data Peter Bühlmann, Petros Drineas, Michael Kane, Mark J. van der Laan. (2015)
Research Papers
- Automated machine learning: AI-driven decision making in business analytics Marc Schmitt. (2023)
- Water-Quality Prediction Based on H2O AutoML and Explainable AI Techniques Hamza Ahmad Madni, Muhammad Umer, Abid Ishaq, Nihal Abuzinadah, Oumaima Saidani, Shtwai Alsubai, Monia Hamdi, Imran Ashraf. (2023)
- Which model to choose? Performance comparison of statistical and machine learning models in predicting PM2.5 from high-resolution satellite aerosol optical depth Padmavati Kulkarni, V. Sreekanth, Adithi R. Upadhya, Hrishikesh Chandra Gautam. (2022)
- Prospective validation of a transcriptomic severity classifier among patients with suspected acute infection and sepsis in the emergency department Noa Galtung, Eva Diehl-Wiesenecker, Dana Lehmann, Natallia Markmann, Wilma H Bergstrom, James Wacker, Oliver Liesenfeld, Michael Mayhew, Ljubomir Buturovic, Roland Luethy, Timothy E Sweeney, Rudolf Tauber, Kai Kappert, Rajan Somasundaram, Wolfgang Bauer. (2022)
- Depression Level Prediction in People with Parkinson’s Disease during the COVID-19 Pandemic Hashneet Kaur, Patrick Ka-Cheong Poon, Sophie Yuefei Wang, Diane Myung-kyung Woodbridge. (2021)
- Machine Learning-based Meal Detection Using Continuous Glucose Monitoring on Healthy Participants: An Objective Measure of Participant Compliance to Protocol Victor Palacios, Diane Myung-kyung Woodbridge, Jean L. Fry. (2021)
- Maturity of gray matter structures and white matter connectomes, and their relationship with psychiatric symptoms in youth Alex Luna, Joel Bernanke, Kakyeong Kim, Natalie Aw, Jordan D. Dworkin, Jiook Cha, Jonathan Posner. (2021)
- Appendectomy during the COVID-19 pandemic in Italy: a multicenter ambispective cohort study by the Italian Society of Endoscopic Surgery and new technologies (the CRAC study) Alberto Sartori, Mauro Podda, Emanuele Botteri, Roberto Passera, Ferdinando Agresta, Alberto Arezzo. (2021)
- Forecasting Canadian GDP Growth with Machine Learning Shafiullah Qureshi, Ba Chu, Fanny S. Demers. (2021)
- Morphological traits of reef corals predict extinction risk but not conservation status Nussaïbah B. Raja, Andreas Lauchstedt, John M. Pandolfi, Sun W. Kim, Ann F. Budd, Wolfgang Kiessling. (2021)
- Machine Learning as a Tool for Improved Housing Price Prediction Henrik I W. Wolstad and Didrik Dewan. (2020)
- Citizen Science Data Show Temperature-Driven Declines in Riverine Sentinel Invertebrates Timothy J. Maguire, Scott O. C. Mundle. (2020)
- Predicting Risk of Delays in Postal Deliveries with Neural Networks and Gradient Boosting Machines Matilda Söderholm. (2020)
- Stock Market Analysis using Stacked Ensemble Learning Method Malkar Takle. (2020)
- H2O AutoML: Scalable Automatic Machine Learning Erin LeDell, Sebastien Poirier. (2020)
- Single-cell mass cytometry on peripheral blood identifies immune cell subsets associated with primary biliary cholangitis Jin Sung Jang, Brian D. Juran, Kevin Y. Cunningham, Vinod K. Gupta, Young Min Son, Ju Dong Yang, Ahmad H. Ali, Elizabeth Ann L. Enninga, Jaeyun Sung, and Constantinos N. Lazaridis. (2020)
- Prediction of the functional impact of missense variants in BRCA1 and BRCA2 with BRCA-ML Steven N. Hart, Eric C. Polley, Hermella Shimelis, Siddhartha Yadav, Fergus J. Couch. (2020)
- Innovative deep learning artificial intelligence applications for predicting relationships between individual tree height and diameter at breast height İlker Ercanli. (2020)
- An Open Source AutoML Benchmark Peter Gijsbers, Erin LeDell, Sebastien Poirier, Janek Thomas, Berndt Bischl, Joaquin Vanschoren. (2019)
- Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence Sebastian Raschka, Joshua Patterson, Corey Nolet. (2019)
- Human actions recognition in video scenes from multiple camera viewpoints Fernando Itano, Ricardo Pires, Miguel Angelo de Abreu de Sousa, Emilio Del-Moral-Hernandez. (2019)
- Extending MLP ANN hyper-parameters Optimization by using Genetic Algorithm Fernando Itano, Miguel Angelo de Abreu de Sousa, Emilio Del-Moral-Hernandez. (2018)
- askMUSIC: Leveraging a Clinical Registry to Develop a New Machine Learning Model to Inform Patients of Prostate Cancer Treatments Chosen by Similar Men Gregory B. Auffenberg, Khurshid R. Ghani, Shreyas Ramani, Etiowo Usoro, Brian Denton, Craig Rogers, Benjamin Stockton, David C. Miller, Karandeep Singh. (2018)
- Machine Learning Methods to Perform Pricing Optimization. A Comparison with Standard GLMs Giorgio Alfredo Spedicato, Christophe Dutang, and Leonardo Petrini. (2018)
- Comparative Performance Analysis of Neural Networks Architectures on H2O Platform for Various Activation Functions Yuriy Kochura, Sergii Stirenko, Yuri Gordienko. (2017)
- Algorithmic trading using deep neural networks on high frequency data Andrés Arévalo, Jaime Niño, German Hernandez, Javier Sandoval, Diego León, Arbey Aragón. (2017)
- Generic online animal activity recognition on collar tags Jacob W. Kamminga, Helena C. Bisby, Duc V. Le, Nirvana Meratnia, Paul JM Havinga. (2017)
- Soil nutrient maps of Sub-Saharan Africa: assessment of soil nutrient content at 250 m spatial resolution using machine learning [PubMed] Tomislav Hengl, Johan GB Leenaars, Keith D. Shepherd, Markus G. Walsh, Gerard BM Heuvelink, Tekalign Mamo, Helina Tilahun, Ezra Berkhout, Matthew Cooper, Eric Fegraus, Ichsani Wheeler, Nketia A. Kwabena. (2017)
- Robust and flexible estimation of data-dependent stochastic mediation effects: a proposed method and example in a randomized trial setting Kara E. Rudolph, Oleg Sofrygin, Wenjing Zheng, and Mark J. van der Laan. (2017)
- Automated versus do-it-yourself methods for causal inference: Lessons learned from a data analysis competition Vincent Dorie, Jennifer Hill, Uri Shalit, Marc Scott, Dan Cervone. (2017)
- Using deep learning to predict the mortality of leukemia patients Reena Shaw Muthalaly. (2017)
- Use of a machine learning framework to predict substance use disorder treatment success Laura Acion, Diana Kelmansky, Mark van der Laan, Ethan Sahker, DeShauna Jones, Stephan Arndt. (2017)
- Ultra-wideband antenna-induced error prediction using deep learning on channel response data Janis Tiemann, Johannes Pillmann, Christian Wietfeld. (2017)
- Inferring passenger types from commuter eigentravel matrices Erika Fille T. Legara, Christopher P. Monterola. (2017)
- Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500 Christopher Krauss, Xuan Anh Do, Nicolas Huck. (2016)
- Identifying IT purchases anomalies in the Brazilian government procurement system using deep learning Silvio L. Domingos, Rommel N. Carvalho, Ricardo S. Carvalho, Guilherme N. Ramos. (2016)
- Predicting recovery of credit operations on a Brazilian bank Rogério G. Lopes, Rommel N. Carvalho, Marcelo Ladeira, Ricardo S. Carvalho. (2016)
- Deep learning anomaly detection as support fraud investigation in Brazilian exports and anti-money laundering Ebberth L. Paula, Marcelo Ladeira, Rommel N. Carvalho, Thiago Marzagão. (2016)
- Deep learning and association rule mining for predicting drug response in cancer Konstantinos N. Vougas, Thomas Jackson, Alexander Polyzos, Michael Liontos, Elizabeth O. Johnson, Vassilis Georgoulias, Paul Townsend, Jiri Bartek, Vassilis G. Gorgoulis. (2016)
- The value of points of interest information in predicting cost-effective charging infrastructure locations Stéphanie Florence Visser. (2016)
- Adaptive modelling of spatial diversification of soil classification units. Journal of Water and Land Development Krzysztof Urbański, Stanisław Gruszczyński. (2016)
- Scalable ensemble learning and computationally efficient variance estimation Erin LeDell. (2015)
- Superchords: decoding EEG signals in the millisecond range Rogério Normand, Hugo Alexandre Ferreira. (2015)
- Understanding random forests: from theory to practice Gilles Louppe. (2014)
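Benchmarks
- Are categorical variables getting lost in your random forests? - Benchmark of categorical encoding schemes and their effect on tree-based models (Scikit-learn vs H2O). Oct 28, 2016
- Deep learning in R - Benchmark of open source deep learning packages in R. Mar 7, 2016
- Szilard's machine learning benchmark - Benchmarks of Random Forest, GBM, Deep Learning and GLM implementations in common open source ML frameworks. Jul 3, 2015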
Presentations
- Pipelines for model deployment Apr 25, 2017
- Machine learning with H2O.ai Jan 23, 2017
Courses
- University of San Francisco (USF) Distributed Data System Class (MSDS 697) - Master of Science in Data Science Program.
- University of Oslo: Introduction to Automatic and Scalable Machine Learning with H2O and R - Research Bazaar 2019
- UCLA: Tools in Data Science (STATS 418) - Masters of Applied Statistics Program.
- GWU: Data Mining (Decision Sciences 6279) - Masters of Science in Business Analytics.
- University of Cape Town: Analytics Module - Postgraduate Honors Program in Statistical Sciences.
- Coursera: How to Win a Data Science Competition: Learn from Top Kagglers - Advanced Machine Learning Specialization.
Software
- modeltime.h2o R package: Forecasting with H2O AutoML
- Evaporate: Run H2O models in the browser via Javascript. More info here.
- splash R package: Splash a user interface onto H2O MOJO files. More info here.
- h2oparsnip R package: Set of wrappers to bind h2o algorithms with the parsnip package.
- Spin up PySpark and PySparkling on AWS
- Forecast the US demand for electricity: A real-time dashboard of the US electricity demand (forecast using H2O GLM)
- h2o3-pam: Partition Around Medoids (PAM) clustering algorithm in H2O-3
- h2o3-gapstat: Gap Statistic algorithm in H2O-3
License
To the extent possible under law, H2O.ai has waived all copyright and related or neighboring rights to this work.