AI Topics
by 吳俊逸 2022-11-16 00:33:35
Topics covered: diffusion models, advanced language models, machine programming, graph neural networks (GNNs), self-supervision, advancements in reinforcement learning (RL), multimodality (modeling with text, image, and audio data), and compression / Tiny ML.


Short Description​


Diffusion models​

Diffusion models are a recent class of state-of-the-art generative models with applications in computer vision, natural language generation (NLG), and other areas. They attracted wide attention after OpenAI, Nvidia, and Google managed to train large-scale models. Example systems built on diffusion models include GLIDE, DALL-E 2, Imagen, and the fully open-source Stable Diffusion. They inspire creativity and push the boundaries of machine learning.



Advanced language models​

The introduction of transfer learning and pre-trained language models in natural language processing (NLP) pushed the limits of language understanding and generation. Notable large pre-trained language models such as BERT and GPT demonstrate impressive "human-like" NLP capabilities. These methods can be used for any task that requires language understanding or generation.



Machine Programming​

Machine programming, software that creates its own software, is at an inflection point and may redefine many industries. It is a fusion of machine learning, formal methods, programming languages, compilers, and computer systems. A notable recent release is GitHub Copilot, which effectively helps people write code more easily and improve it. Progress in machine programming is tied to recent advances in language models and can be applied to diverse hardware and software creation tasks.



Graph Neural Networks (GNNs)

GNNs are a class of deep learning (DL) methods designed to perform inference on data described by graphs. They are well suited to data whose entities have complex relationships and interdependencies. Application examples include text classification, machine translation, few-shot image classification, modeling of real-world physical or natural systems, combinatorial optimization, and more.
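
To make "inference on data described by graphs" more concrete, here is a minimal sketch of one message-passing round, the basic building block that many GNN layers share. The toy graph, the feature values, and the function name mean_aggregate are illustrative assumptions; real GNN layers also apply learned weights and a nonlinearity, which this sketch leaves out.

```python
def mean_aggregate(features, neighbours):
    """One round of mean-aggregation message passing (no learned weights).

    Each node's new feature vector is the element-wise mean of its own
    vector and the vectors of its neighbours.
    """
    updated = {}
    for node, own in features.items():
        # Collect the node's own vector plus one vector per neighbour.
        group = [own] + [features[n] for n in neighbours.get(node, [])]
        updated[node] = [sum(vals) / len(group) for vals in zip(*group)]
    return updated


# Hypothetical toy graph: a - b - c (undirected path).
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

result = mean_aggregate(features, neighbours)
print(result)
```

After one round each node's vector already mixes in information from its direct neighbours; stacking several rounds lets information travel further across the graph.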





Self-supervision

Self-supervised learning (SSL) is an evolving machine learning technique poised to address the challenges posed by over-dependence on labeled data. The cost of high-quality annotated data is a major bottleneck in the overall training process, and annotating all data is rarely practical. Self-supervised learning obtains supervisory signals from the data itself, often by leveraging the data's underlying structure. It can help in various applications, including computer vision, speech recognition, and NLP.
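
As a rough illustration of how supervisory signals can come from the data itself, the sketch below turns one unlabeled token sequence into (input, target) pairs by hiding one token at a time, in the spirit of masked-token pretraining. The function name masked_examples and the "[MASK]" placeholder are assumptions chosen for this example, not part of any specific library.

```python
def masked_examples(tokens, mask="[MASK]"):
    """Build (masked_input, target) pairs from one unlabeled sequence.

    No human annotation is needed: the hidden token is the label.
    """
    examples = []
    for i, target in enumerate(tokens):
        masked = tokens[:i] + [mask] + tokens[i + 1:]
        examples.append((masked, target))
    return examples


pairs = masked_examples(["graphs", "describe", "relationships"])
for inputs, target in pairs:
    print(inputs, "->", target)
```

A model trained to fill in the hidden token has to learn something about the structure of the sequence, which is the kind of signal the paragraph above refers to.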



Advancements in Reinforcement Learning (RL)​

RL is an area of machine learning concerned with how intelligent agents should act in an environment to maximize cumulative reward. The last few years have brought numerous advances that overcome earlier shortcomings, such as the amount of human-specified definition required and the difficulty of working at larger scale, along with new unsupervised and self-supervised setups. These advances have also helped bring RL to newer domains such as hardware design.
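
The quantity an RL agent tries to maximize can be sketched in a few lines. The example below computes a discounted cumulative reward for one episode; the discount factor gamma is a common convention assumed here rather than something stated above, and the reward values are made up for illustration.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where a reward k steps in the future is scaled by gamma**k."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: rewards received at steps 0, 1, and 2.
episode_return = discounted_return([1.0, 0.0, 2.0], gamma=0.5)
print(episode_return)  # 1.0 + 0.5 * 0.0 + 0.25 * 2.0 = 1.5
```

With gamma set to 1.0 this reduces to the plain sum of rewards; smaller values make the agent favor rewards that arrive sooner.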



Multimodalities (Modeling with text, image and audio data)​

Multimodal AI is a new AI paradigm in which various data types (e.g., image, text, speech, numerical data) are combined with multiple intelligence processing algorithms to achieve higher performance. Multimodal AI often outperforms unimodal AI on real-world problems. It has been extensively researched and applied to challenges such as visual question answering (Visual QA) and visual commonsense reasoning.



Compression / Tiny ML

Tiny machine learning (Tiny ML) is broadly defined as a fast-growing field of machine learning technologies and applications, including hardware, algorithms, and software, capable of performing on-device ML on sensor data at extremely low power. This enables AI on edge devices, reduces energy consumption, and makes the technology more affordable and easier to integrate into everyday use.
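
One widely used compression technique for fitting models onto low-power devices is weight quantization. The sketch below shows a simple symmetric 8-bit scheme with a single shared scale; it is an illustrative assumption about how compression might be done, not a description of any particular Tiny ML toolchain, and the weight values are invented.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumes a non-empty list of weights.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q)
print(restored)
```

Each weight is stored in one byte instead of four (for 32-bit floats), at the cost of a rounding error of at most half the scale per weight.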
