Rust for machine learning - libraries

From support.qbpro.ru

INTRODUCTION

This article lists machine learning libraries written in Rust.
It is a collection of GitHub repositories, blogs, books, tutorials, forum threads, and articles.
The article is divided into several main categories of libraries and algorithms. It omits libraries
that are no longer maintained and includes almost no small libraries that have not been updated for a long time.

LINEAR ALGEBRA

  • Most packages in this list use ndarray or std::vec.
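
As a rough illustration of what working directly on std::vec looks like, the sketch below stores a matrix row-major in a flat Vec<f64> and multiplies it by a vector. The Matrix type and its from_vec and mat_vec methods are hypothetical names made up for this example; they are not part of ndarray or of any package in this list.

```rust
/// Dense row-major matrix backed by a flat `Vec<f64>`.
/// Illustrative type only; not taken from any crate.
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    /// Builds a matrix from row-major data; returns `None` on a size mismatch.
    fn from_vec(rows: usize, cols: usize, data: Vec<f64>) -> Option<Self> {
        if data.len() == rows * cols {
            Some(Matrix { rows, cols, data })
        } else {
            None
        }
    }

    /// Computes `y = A * x`; returns `None` if `x` has the wrong length.
    fn mat_vec(&self, x: &[f64]) -> Option<Vec<f64>> {
        if x.len() != self.cols {
            return None;
        }
        if self.cols == 0 {
            // `chunks(0)` would panic; the empty sum for each row is 0.0.
            return Some(vec![0.0; self.rows]);
        }
        let y = self
            .data
            .chunks(self.cols) // one chunk per matrix row
            .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum::<f64>())
            .collect();
        Some(y)
    }
}

fn main() {
    let a = Matrix::from_vec(2, 3, vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
        .expect("data length matches 2 x 3");
    let y = a.mat_vec(&[1.0, 0.0, -1.0]).expect("vector length matches");
    println!("{:?}", y); // prints [-2.0, -2.0]
}
```

A flat vector keeps the rows contiguous in memory, which is roughly the layout ndarray uses by default; for anything beyond toy sizes a dedicated crate is the more practical choice.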

SUPPORTING TOOLS

  • Jupyter Notebook
  • evcxr can be used as a Jupyter kernel or as a REPL. These tools are useful for training algorithms and testing machine learning hypotheses.

VISUALIZATION

  • A list of useful resources for data visualization.
  • ASCII line graph:
  • Examples:
  • Data frames:

IMAGE PROCESSING

  • For image processing, try the image-rs library.

It provides algorithms such as linear transformations; similar functionality is available in other libraries as well.
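
To make the idea concrete, the sketch below reads "linear transformation" as a point-wise intensity mapping, out = gain * pixel + bias, applied to an 8-bit grayscale buffer. That reading is an assumption, and the linear_transform function is a hypothetical helper written with the standard library only; it is not the image-rs API.

```rust
/// Applies `out = gain * pixel + bias` to every 8-bit grayscale pixel,
/// rounding the result and clamping it to the 0..=255 range.
/// Hypothetical helper for illustration; not part of image-rs.
fn linear_transform(pixels: &[u8], gain: f32, bias: f32) -> Vec<u8> {
    pixels
        .iter()
        .map(|&p| (gain * f32::from(p) + bias).round().clamp(0.0, 255.0) as u8)
        .collect()
}

fn main() {
    // A tiny 1 x 4 "image": black, dark gray, mid gray, white.
    let image = [0u8, 64, 128, 255];
    // Raise contrast (gain 1.5) and brightness (bias 10).
    let brighter = linear_transform(&image, 1.5, 10.0);
    println!("{:?}", brighter); // prints [10, 106, 202, 255]
}
```

Clamping before the cast keeps bright pixels saturated at 255 instead of relying on the cast's behavior for out-of-range values.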

NATURAL LANGUAGE PROCESSING (PREPROCESSING)

GRAPHS

AUTOML

WORKFLOWS

GPU COMPUTING WITH RUST

SKLEARN-LIKE LIBRARIES

  • The libraries support the following algorithms:
      • Linear Regression
      • Logistic Regression
      • K-Means Clustering
      • Neural Networks
      • Gaussian Process Regression
      • Support Vector Machines
      • Gaussian Mixture Models
      • Naive Bayes Classifiers
      • DBSCAN
      • k-Nearest Neighbor Classifiers
      • Principal Component Analysis
      • Decision Tree
      • Elastic Net
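
For orientation, the first algorithm in the list can be sketched in a few lines. The example below fits a one-feature linear regression with the closed-form ordinary least squares solution, using only the standard library. The fit_line function is a hypothetical name for this sketch and does not reflect the API of any library mentioned in this article.

```rust
/// Fits `y ≈ slope * x + intercept` by ordinary least squares.
/// Returns `None` for empty or mismatched input, or when all `x` are equal.
/// Hypothetical helper for illustration only.
fn fit_line(xs: &[f64], ys: &[f64]) -> Option<(f64, f64)> {
    if xs.len() != ys.len() || xs.is_empty() {
        return None;
    }
    let n = xs.len() as f64;
    let mean_x = xs.iter().sum::<f64>() / n;
    let mean_y = ys.iter().sum::<f64>() / n;

    // Accumulate the covariance term and the variance of x (both unnormalized).
    let mut sxy = 0.0_f64;
    let mut sxx = 0.0_f64;
    for (x, y) in xs.iter().zip(ys) {
        sxy += (x - mean_x) * (y - mean_y);
        sxx += (x - mean_x) * (x - mean_x);
    }
    if sxx == 0.0 {
        return None; // a vertical line has no finite slope
    }
    let slope = sxy / sxx;
    Some((slope, mean_y - slope * mean_x))
}

fn main() {
    // Points lying exactly on y = 2x + 1.
    let xs = [1.0, 2.0, 3.0, 4.0];
    let ys = [3.0, 5.0, 7.0, 9.0];
    match fit_line(&xs, &ys) {
        Some((slope, intercept)) => println!("slope = {slope}, intercept = {intercept}"),
        None => println!("could not fit a line"),
    }
}
```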

STATISTICS

GRADIENT BOOSTING

NEURAL NETWORKS

  • TensorFlow and PyTorch are the most widely used libraries for building neural networks.

GRAPH MODELS

NLP

  • huggingface/tokenizers - The core of tokenizers, written in Rust. Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.
  • guillaume-be/rust-tokenizers - Rust-tokenizer offers high-performance tokenizers for modern language models, including WordPiece, Byte-Pair Encoding (BPE) and Unigram (SentencePiece) models
  • guillaume-be/rust-bert - Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
  • sno2/bertml - Use common pre-trained ML models in Deno!
  • cpcdoy/rust-sbert - Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers)
  • vongaisberg/gpt3_macro - Rust macro that uses GPT3 codex to generate code at compile time
  • proycon/deepfrog - An NLP-suite powered by deep learning
  • ferristseng/rust-tfidf - Library to calculate TF-IDF
  • messense/fasttext-rs - fastText Rust binding
  • mklf/word2vec-rs - pure rust implementation of word2vec
  • DimaKudosh/word2vec - Rust interface to word2vec.
  • lloydmeta/sloword2vec-rs - A naive (read: slow) implementation of Word2Vec. Uses BLAS behind the scenes for speed.

Recommender systems

  • PersiaML/PERSIA - High performance distributed framework for training deep learning recommendation models based on PyTorch.
  • jackgerrits/vowpalwabbit-rs - 🦀🐇 Rusty VowpalWabbit
  • outbrain/fwumious_wabbit - Fwumious Wabbit, fast on-line machine learning toolkit written in Rust
  • hja22/rucommender - Rust implementation of user-based collaborative filtering
  • maciejkula/sbr-rs - Deep recommender systems for Rust
  • chrisvittal/quackin - A recommender systems framework for Rust
  • snd/onmf - fast rust implementation of online nonnegative matrix factorization as laid out in the paper "detect and track latent factors with online nonnegative matrix factorization"
  • rhysnewell/nymph - Non-Negative Matrix Factorization in Rust

Working with text (full-text search)

  • quickwit-inc/quickwit - Quickwit is a big data search engine.
  • bayard-search/bayard - A full-text search and indexing server written in Rust.
  • neuml/txtai.rs - AI-powered search engine for Rust
  • meilisearch/MeiliSearch - Lightning Fast, Ultra Relevant, and Typo-Tolerant Search Engine
  • toshi-search/Toshi - A full-text search engine in rust
  • BurntSushi/fst - Represent large sets and maps compactly with finite state transducers.
  • tantivy-search/tantivy - Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust
  • tinysearch/tinysearch - 🔍 Tiny, full-text search engine for static websites built with Rust and Wasm
  • quantleaf/probly-search - A lightweight full-text search library that provides full control over the scoring calculations
  • andylokandy/simsearch-rs - A simple and lightweight fuzzy search engine that works in memory, searching for similar strings
  • jameslittle230/stork - 🔎 Impossibly fast web search, made for static sites.
  • elastic/elasticsearch-rs - Official Elasticsearch Rust Client

Nearest neighbor search algorithms

  • Enet4/faiss-rs - Rust language bindings for Faiss
  • rust-cv/hnsw - HNSW ANN from the paper "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs"
  • hora-search/hora - 🚀 efficient approximate nearest neighbor search algorithm collections library, implemented in Rust 🦀. horasearch.com
  • InstantDomain/instant-distance - Fast approximate nearest neighbor searching in Rust, based on HNSW index
  • lerouxrgd/ngt-rs - Rust wrappers for NGT approximate nearest neighbor search
  • granne/granne - Graph-based Approximate Nearest Neighbor Search
  • u1roh/kd-tree - k-dimensional tree in Rust. Fast, simple, and easy to use.
  • qdrant/qdrant - Qdrant - vector similarity search engine with extended filtering support
  • rust-cv/hwt - Hamming Weight Tree from the paper "Online Nearest Neighbor Search in Hamming Space"
  • fulara/kdtree-rust - kdtree implementation for rust.
  • mrhooray/kdtree-rs - K-dimensional tree in Rust for fast geospatial indexing and lookup
  • kornelski/vpsearch - C library for finding nearest (most similar) element in a set
  • petabi/petal-neighbors - Nearest neighbor search algorithms including a ball tree and a vantage point tree.
  • ritchie46/lsh-rs - Locality Sensitive Hashing in Rust with Python bindings
  • kampersanda/mih-rs - Rust implementation of multi-index hashing for neighbor searches on 64-bit codes in the Hamming space
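
Most of the libraries above speed up the search with an index (HNSW graphs, k-d trees, hashing) and often return approximate results. As a point of comparison, the sketch below shows the exact brute-force search that such indexes try to avoid: it measures the distance from the query to every point and keeps the k closest. The squared_distance and k_nearest functions are hypothetical helpers written with the standard library; they are not taken from any of the listed crates.

```rust
/// Squared Euclidean distance between two points of equal dimension.
fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns the indices of the `k` points closest to `query`, nearest first.
/// Exact brute-force scan: O(n * d) distance work plus a sort.
/// Hypothetical helper for illustration only.
fn k_nearest(points: &[Vec<f64>], query: &[f64], k: usize) -> Vec<usize> {
    let mut scored: Vec<(f64, usize)> = points
        .iter()
        .enumerate()
        .map(|(i, p)| (squared_distance(p, query), i))
        .collect();
    // `total_cmp` gives a total order on f64, so the sort cannot panic on NaN.
    scored.sort_by(|a, b| a.0.total_cmp(&b.0));
    scored.into_iter().take(k).map(|(_, i)| i).collect()
}

fn main() {
    let points = vec![
        vec![0.0, 0.0],
        vec![1.0, 1.0],
        vec![5.0, 5.0],
        vec![0.5, 0.0],
    ];
    let nearest = k_nearest(&points, &[0.4, 0.1], 2);
    println!("{:?}", nearest); // prints [3, 0]
}
```

A scan like this is a handy correctness baseline when evaluating the recall of an approximate index.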

Reinforcement learning

  • taku-y/border - Border is a reinforcement learning library in Rust.
  • NivenT/REnforce - Reinforcement learning library written in Rust
  • edlanglois/relearn - Reinforcement learning with Rust
  • tspooner/rsrl - A fast, safe and easy to use reinforcement learning framework in Rust.
  • milanboers/rurel - Flexible, reusable reinforcement learning (Q learning) implementation in Rust
  • Ragnaroek/bandit - Bandit Algorithms in Rust
  • MrRobb/gym-rs - OpenAI Gym bindings for Rust

Supervised learning

  • tomtung/omikuji - An efficient implementation of Partitioned Label Trees & its variations for extreme multi-label classification
  • shadeMe/liblinear-rs - Rust language bindings for the LIBLINEAR C/C++ library.
  • messense/crfsuite-rs - Rust binding to crfsuite
  • ralfbiedert/ffsvm-rust - FFSVM stands for "Really Fast Support Vector Machine"
  • zenoxygen/bayespam - A simple bayesian spam classifier written in Rust.
  • Rui_Vieira/naive-bayes - A Naive Bayes classifier written in Rust.
  • Rui_Vieira/random-forests - A Rust library for Random Forests.
  • sile/randomforest - A random forest implementation in Rust
  • tomtung/craftml-rs - A Rust🦀 implementation of CRAFTML, an Efficient Clustering-based Random Forest for Extreme Multi-label Learning
  • nkaush/naive-bayes-rs - A Rust library with homemade machine learning models to classify the MNIST dataset. Built in an attempt to get familiar with advanced Rust concepts.

Unsupervised learning

  • frjnn/bhtsne - Barnes-Hut t-SNE implementation written in Rust.
  • vaaaaanquish/label-propagation-rs - Label Propagation Algorithm by Rust. Label propagation (LP) is graph-based semi-supervised learning (SSL). LGC and CAMLP have been implemented.
  • nmandery/extended-isolation-forest - Rust port of the extended isolation forest algorithm for anomaly detection
  • avinashshenoy97/RusticSOM - Rust library for Self Organising Maps (SOM).
  • diffeo/kodama - Fast hierarchical agglomerative clustering in Rust.
  • kno10/rust-kmedoids - k-Medoids clustering in Rust with the FasterPAM algorithm
  • petabi/petal-clustering - DBSCAN and OPTICS clustering algorithms.
  • savish/dbscan - A naive DBSCAN implementation in Rust
  • gu18168/DBSCANSD - Rust implementation for DBSCANSD, a trajectory clustering algorithm.
  • lazear/dbscan - Dependency free implementation of DBSCAN clustering in Rust
  • whizsid/kddbscan-rs - A rust library inspired by kDDBSCAN clustering algorithm
  • Sauro98/appr_dbscan_rust - Program implementing the approximate version of DBSCAN introduced by Gan and Tao
  • quietlychris/density_clusters - A naive density-based clustering algorithm written in Rust
  • milesgranger/gap_statistic - Dynamically get the suggested clusters in the data for unsupervised learning.
  • genbattle/rkm - Generic k-means implementation written in Rust
  • selforgmap/som-rust - Self Organizing Map (SOM) is a type of Artificial Neural Network (ANN) that is trained using an unsupervised, competitive learning to produce a low dimensional, discretized representation (feature map) of higher dimensional data.

Statistical models

  • Redpoll/changepoint - Includes the following change point detection algorithms:
      • Bocpd -- Online Bayesian Change Point Detection Reference.
      • BocpdTruncated -- Same as Bocpd but truncated the run-length distribution when those lengths are unlikely.
  • krfricke/arima - ARIMA modelling for Rust
  • Daingun/automatica - Automatic Control Systems Library
  • rbagd/rust-linearkalman - Kalman filtering and smoothing in Rust
  • sanity/pair_adjacent_violators - An implementation of the Pair Adjacent Violators algorithm for isotonic regression in Rust

Evolutionary algorithms

  • martinus/differential-evolution-rs - Generic Differential Evolution for Rust
  • innoave/genevo - Execute genetic algorithm (GA) simulations in a customizable and extensible way.
  • Jeffail/spiril - Rust library for genetic algorithms
  • sotrh/rust-genetic-algorithm - Example of a genetic algorithm in Rust and Python
  • willi-kappler/darwin-rs - darwin-rs, evolutionary algorithms with rust

More projects

  • Are we learning yet?, A work-in-progress to catalog the state of machine learning in Rust
  • e-tony/best-of-ml-rust, A ranked list of awesome machine learning Rust libraries
  • The Best 51 Rust Machine learning Libraries, RustRepo
  • rust-unofficial/awesome-rust, A curated list of Rust code and resources
  • Top 16 Rust Machine learning Projects, Open-source Rust projects categorized as Machine learning
  • 39+ Best Rust Machine learning frameworks, libraries, software and resources, ReposHub

Blogs

  • About Rust’s Machine Learning Community, Medium, 2016/1/6, Autumn Engineering
  • Rust vs Python: Technology And Business Comparison, 2021/3/4, Miłosz Kaczorowski
  • I wrote one of the fastest DataFrame libraries, 2021/2/28, Ritchie Vink
  • Polars: The fastest DataFrame library you've never heard of, 2021/1/19, Analytics Vidhya
  • Data Manipulation: Polars vs Rust, 2021/3/13, Xavier Tao
  • State of Machine Learning in Rust – Ehsan's Blog, 2019/5/13, Published by Ehsan
  • Ritchie Vink, Machine Learning Engineer, writes Polars, one of the fastest DataFrame libraries in Python and Rust, Xomnia, 2021/5/11
  • Quickwit: A highly cost-efficient search engine in Rust, 2021/7/13, quickwit, PAUL MASUREL
  • Check out Rust in Production, 2021/8/10, Qovery, @serokell
  • Why I started Rust instead of stick to Python, 2021/9/26, Medium, Geek Culture, Marshal SHI

Tutorials

  • Rust Machine Learning Book, Examples of KMeans and DBSCAN with linfa-clustering
  • Artificial Intelligence and Machine Learning – Practical Rust Projects: Building Game, Physical Computing, and Machine Learning Applications – Dev Guis, 2021/5/19
  • Machine learning in Rust using Linfa, LogRocket Blog, 2021/4/30, Timeular, Mario Zupan, Examples of LogisticRegression
  • Machine Learning in Rust, Smartcore, Medium, The Startup, 2021/1/15, Vlad Orlov, Examples of LinearRegression, Random Forest Regressor, and K-Fold
  • Machine Learning in Rust, Logistic Regression, Medium, The Startup, 2021/1/6, Vlad Orlov
  • Machine Learning in Rust, Linear Regression, Medium, The Startup, 2020/12/16, Vlad Orlov
  • Machine Learning in Rust, 2016/3/7, James, Examples of LogisticRegressor
  • Machine Learning and Rust (Part 1): Getting Started!, Level Up Coding, 2021/1/9, Stefano Bosisio
  • Machine Learning and Rust (Part 2): Linear Regression, Level Up Coding, 2021/6/15, Stefano Bosisio
  • Machine Learning and Rust (Part 3): Smartcore, Dataframe, and Linear Regression, Level Up Coding, 2021/7/1, Stefano Bosisio
  • Tensorflow Rust Practical Part 1, Programmer Sought, 2018
  • A Machine Learning introduction to ndarray, RustFest 2019, 2019/11/12, Luca Palmieri
  • Simple Linear Regression from scratch in Rust, Web Development, Software Architecture, Algorithms and more, 2018/12/13, philipp
  • Interactive Rust in a REPL and Jupyter Notebook with EVCXR, Depth-First, 2020/9/21, Richard L. Apodaca
  • Rust for Data Science: Tutorial 1, dev, 2021/8/25, Davide Del Papa
  • petgraph_review, 2019/10/11, Timothy Hobbs
  • Rust for ML. Rust, Medium, Tempus Ex, 2021/8/1, Michael Naquin
  • Adventures in Drone Photogrammetry Using Rust and Machine Learning (Image Segmentation with linfa and DBSCAN), 2021/11/14, CHRISTOPHER MORAN

Applied resources

  • Deep Learning in Rust: baby steps, Medium, 2016/2/2, Theodore DeRego
  • A Rust SentencePiece implementation, Rust NLP tales, 2020/5/30
  • Accelerating text generation with Rust, Rust NLP tales, 2020/11/21
  • A Simple Text Summarizer written in Rust, Towards Data Science, 2020/11/24, Charles Chan, Examples of Text Sentence Vector, Cosine Distance and PageRank
  • Extracting deep learning image embeddings in Rust, RecoAI, 2021/6/1, Paweł Jankiewic, Examples of ONNX
  • Deep Learning in Rust with GPU, 2021/7/30, Xavier Tao
  • tch-rs pretrain example - Docker for PyTorch rust bindings tch-rs. Example of pretrain model, 2021/8/15, vaaaaanquish
  • Rust ANN search Example - Image search example by approximate nearest-neighbor library In Rust, 2021/8/15, vaaaaanquish
  • dzamkov/deep-learning-test - Implementing deep learning in Rust using just a linear algebra library (nalgebra), 2021/8/30, dzamkov
  • vaaaaanquish/rust-machine-learning-api-example - The axum example that uses resnet224 to infer images received in base64 and returns the results, 2021/9/7, vaaaaanquish
  • Rust for Machine Learning: Benchmarking Performance in One-shot - A Rust implementation of Siamese Neural Networks for One-shot Image Recognition for benchmarking performance and results, UofT Machine Intelligence Student Team
  • Why Wallaroo Moved From Pony To Rust, 2021/8/19, Wallaroo.ai
  • epwalsh/rust-dl-webserver - Example of serving deep learning models in Rust with batched prediction, 2021/11/16, epwalsh
  • Production users - Rust Programming Language, by rust-lang.org
  • Taking ML to production with Rust: a 25x speedup, A LEARNING JOURNAL, 2019/12/1, @algo_luca
  • 9 Companies That Use Rust in Production, serokell, 2020/11/18, Gints Dreimanis
  • Masked Language Model on Wasm, BERT on frontend examples, optim-corp/masked-lm-wasm, 2021/8/27, Optim
  • Serving TensorFlow with Actix-Web, kykosic/actix-tensorflow-example
  • Serving PyTorch with Actix-Web, kykosic/actix-pytorch-example

Forums

  • Natural Language Processing in Rust : rust, 2016/12/6
  • Future prospect of Machine Learning in Rust Programming Language : MachineLearning, 2017/11/11
  • Interest for NLP in Rust? - The Rust Programming Language Forum, 2018/1/19
  • Is Rust good for deep learning and artificial intelligence? - The Rust Programming Language Forum, 2018/11/18
  • ndarray vs nalgebra : rust, 2019/5/28
  • Taking ML to production with Rust | Hacker News, 2019/12/2
  • Who is using Rust for Machine learning in production/research? : rust, 2020/4/5
  • Deep Learning in Rust, 2020/8/26
  • SmartCore, fast and comprehensive machine learning library for Rust! : rust, 2020/9/29
  • Deep Learning in Rust with GPU on ONNX, 2021/7/31
  • Rust vs. C++ the main differences between these popular programming languages, 2021/8/25
  • I wanted to share my experience of Rust as a deep learning researcher, 2021/9/2
  • How far along is the ML ecosystem with Rust?, 2021/9/15

Books

  • Practical Machine Learning with Rust: Creating Intelligent Applications in Rust (English Edition), 2019/12/10, Joydeep Bhattacharjee
      • Write machine learning algorithms in Rust
      • Use Rust libraries for different tasks in machine learning
      • Create concise Rust packages for your machine learning applications
      • Implement NLP and computer vision in Rust
      • Deploy your code in the cloud and on bare metal servers
      • source code: Apress/practical-machine-learning-w-rust
  • DATA ANALYSIS WITH RUST NOTEBOOKS, 2021/9/3, Shahin Rostami
      • Plotting with Plotters and Plotly
      • Operations with ndarray
      • Descriptive Statistics
      • Interactive Diagram
      • Visualisation of Co-occurring Types
      • download source code and dataset
      • full text: https://datacrayon.com/posts/programming/rust-notebooks/preface/

Video tutorials

  • The /r/playrust Classifier: Real World Rust Data Science, RustConf 2016, 2016/10/05, Suchin Gururangan & Colin O'Brien
  • Building AI Units in Rust, FOSSASIA 2018, 2018/3/25, Vigneshwer Dhinakaran
  • Python vs Rust for Simulation, EuroPython 2019, 2019/7/10, Alisa Dammer
  • Machine Learning is changing - is Rust the right tool for the job?, RustLab 2019, 2019/10/31, Luca Palmieri
  • Using TensorFlow in Embedded Rust, 2020/09/29, Ferrous Systems GmbH, Richard Meadows
  • Writing the Fastest GBDT Library in Rust, 2021/09/16, RustConf 2021, Isabella Tromba

Podcasts

  • DATA SCIENCE AT HOME
      • Rust and machine learning #1 (Ep. 107)
      • Rust and machine learning #2 with Luca Palmieri (Ep. 108)
      • Rust and machine learning #3 with Alec Mocatta (Ep. 109)
      • Rust and machine learning #4: practical tools (Ep. 110)
      • Machine Learning in Rust: Amadeus with Alec Mocatta (Ep. 127)
      • Rust and deep learning with Daniel McKenna (Ep. 135)
      • Is Rust flexible enough for a flexible data model? (Ep. 137)
      • Pandas vs Rust (Ep. 144)
      • Apache Arrow, Ballista and Big Data in Rust with Andy Grove (Ep. 145)
      • Polars: the fastest dataframe crate in Rust (Ep. 146)
      • Apache Arrow, Ballista and Big Data in Rust with Andy Grove RB (Ep. 160)

SOURCES