5 Minutes of Data Science - week 36
Highlights from September 05 to September 11
Second week of the new format. A quick recap: GitHub's trending repositories feature a lot of diffusion-related content. It is also exciting to see DeepMind exploring language models built around confirmed veracity, that is, models that only output content that is true.
Enjoy and see you next week.
- My journey from DeepMind intern to mentor, by DeepMind
- In conversation with AI: building better language models, by DeepMind
- Learning to Walk in the Wild from Terrain Semantics, by Google AI
- A Multi-Axis Approach for Vision Transformer and MLP Models, by Google AI
- Digitizing Smell: Using Molecular Maps to Understand Odor, by Google AI
- Master’s student uses SURE opportunity to explore impact of machine learning, by Amazon Science
- Automatically optimizing execution of dynamic tensor operations, by Amazon Science
- Pinch-grasping robot handles items with precision, by Amazon Science
- A quick guide to Amazon’s 40-plus papers at Interspeech 2022, by Amazon Science
- Interspeech 2022, by Apple Machine Learning
- Zero-Cost Proxies: How to find the best neural network without training (Ep. 201), by Data Science At Home
- Fairness in e-Commerce Search, by Data Skeptic
- Ryan Fedasiuk - Can the U.S. and China collaborate on AI safety?, by Towards Data Science
- Licensing & automating creativity, by Practical AI
- Understanding Collective Insect Communication with ML, w/ Orit Peleg - #590, by The TWIML AI Podcast
- Big brain time, at r/Data Science (💬69)
- Happy meme Monday, at r/Data Science (💬32)
- Here are the questions I was asked for my entry level DS job!, at r/Data Science (💬209)
- [P] Simple fastai based face restoration project, GitHub link in comments., at r/Machine Learning (💬34)
- [R] SIMPLERECON — 3D Reconstruction without 3D Convolutions — 73ms per frame !, at r/Machine Learning (💬24)
- [P] pytorch’s Newest nvFuser, on Stable Diffusion to make your favorite diffusion model sample 2.5 times faster (compared to full precision) and 1.5 times faster (compared to half-precision), at r/Machine Learning (💬13)
- Fellow statisticians, how do you develop your reading comprehension in statistics? what are your learning strategies?, at r/Ask Statistics (💬17)
- What is the best way to explain the difference between Standard Deviation and Mean Absolute Deviation?, at r/Ask Statistics (💬19)
- Modeling for causal inference vs prediction, at r/Ask Statistics (💬29)
- AI Turns my Drawings into Pure Art || Stable Diffusion Drawing App, at r/Latest in ML (💬0)
- General Video Recognition with AI (How AI Understands Videos), at r/Latest in ML (💬1)
GitHub Jupyter Notebook trends
GitHub Python trends
- AUTOMATIC1111/stable-diffusion-webui: Stable Diffusion web UI
- sd-webui/stable-diffusion-webui: Stable Diffusion web UI
- xinntao/Real-ESRGAN: Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
- hpcaitech/ColossalAI: Colossal-AI: A Unified Deep Learning System for Big Model Era
- vinta/awesome-python: A curated list of awesome Python frameworks, libraries, software and resources
- python-poetry/poetry: Python dependency management and packaging made easy.
- TencentARC/GFPGAN: GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
- karpathy/minGPT: A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
- impira/docquery: An easy way to extract information from documents
- microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
- kakaobrain/coyo-dataset: COYO-700M: Large-scale Image-Text Pair Dataset
- alibaba/EasyCV: An all-in-one toolkit for computer vision
- tiangolo/fastapi: FastAPI framework, high performance, easy to learn, fast to code, ready for production
- eloialonso/iris: Transformers are Sample Efficient World Models
- geohot/tinygrad: You like pytorch? You like micrograd? You love tinygrad!
- WZMIAOMIAO/deep-learning-for-image-processing: deep learning for image processing including classification and object-detection etc.
- PaddlePaddle/PaddleOCR: Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
- apache/airflow: Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
See you next week!