Relevant Papers:
- Extracting Latent Steering Vectors from Pretrained Language Models (Subramani et al., 2022)
- Steering Language Models With Activation Engineering (Turner et al., 2024)
- Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories (Wang et al., 2024)
- Improving Instruction-Following in Language Models through Activation Steering (Stolfo et al., 2024)

Recent advancements in natural language processing (NLP) have revealed new ways to control large language models (LLMs) without costly fine-tuning or retraining. Among these methods, steering LLMs through their latent activations has emerged as a powerful approach. Starting with the latent steering vectors introduced by Subramani et al. (2022) and followed by Activation Addition (ActAdd) from Turner et al. (2024), the field has expanded with Adaptive Activation Steering (ACT) and Instruction-Following Steering (IFS), which refine and extend the ideas of activation engineering. This article delves into these advancements, highlighting their mechanics, strengths, and applications.
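The core mechanic behind activation steering is simple: record hidden activations for a pair of contrastive prompts at one layer, take their difference as a steering vector, and add it back into the residual stream during generation. Below is a minimal PyTorch sketch using a Hugging Face GPT-2 model; the layer index, coefficient, and prompt pair are illustrative assumptions of mine, not the exact setup used in any of the papers above.

```python
# Minimal sketch of ActAdd-style activation steering with GPT-2.
# Model choice, LAYER, COEFF, and the prompt pair are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

LAYER = 6    # which transformer block to steer (assumption)
COEFF = 4.0  # steering strength (assumption)

def layer_activations(text):
    """Return the output of block LAYER for a prompt."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so index LAYER + 1
    # is the output of block LAYER.
    return out.hidden_states[LAYER + 1]  # (1, seq_len, hidden)

# Steering vector: difference between activations of two contrastive prompts,
# averaged over positions so it can be added at any sequence length.
pos = layer_activations("Love")
neg = layer_activations("Hate")
steer = pos.mean(dim=1) - neg.mean(dim=1)  # (1, hidden)

def add_steering(module, inputs, output):
    # Forward hook: add the steering vector to the block's hidden states.
    hidden = output[0]
    return (hidden + COEFF * steer,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(add_steering)
ids = tok("I think dogs are", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=20)[0]))
handle.remove()
```

The same hook mechanism is what ACT and IFS build on; they mainly differ in how the steering vector is computed and how strongly it is applied per input.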
Today I happened to discuss the future development of AI with some friends. Regarding the next few years, the core view was that people might be "overly optimistic" about AI and that its current capabilities are overestimated. Without "killer" application scenarios, could the AI boom subside in two to three years, eventually bursting the "huge bubble"?
Recently, our school celebrated the 60th anniversary of Computer Science & AI. To mark the occasion, the organizers invited Fernando Pereira to deliver a lecture on the connection between form and meaning in language. This subject has captivated the minds of linguists, computer scientists, and cognitive researchers for many years.
I went to Dubrovnik, Croatia, for the EACL 2023 conference. It was my first time there. I made new friends and had an amazing time watching the most beautiful sunset in King's Landing.
Related reading:
- The Beginner's Guide to Contrastive Learning
- SimCSE: Simple Contrastive Learning of Sentence Embeddings
- A Simple Framework for Contrastive Learning of Visual Representations
- Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

Background
Contrastive learning aims to learn effective representations by pulling semantically close neighbors together and pushing non-neighbors apart. Initially, contrastive learning was applied to computer vision tasks. As shown in the figure below, we expect the model to learn the commonalities between two images that share the same label and the differences between a pair of images with different labels.
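In practice, "pulling together" and "pushing apart" are usually implemented with an InfoNCE / NT-Xent-style objective over a batch of embedding pairs. The PyTorch sketch below is an illustrative rendering of that recipe under my own assumptions (batch layout, temperature, toy data), not the exact SimCSE or SimCLR implementation.

```python
# Minimal sketch of a contrastive (InfoNCE / NT-Xent-style) loss in PyTorch.
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.05):
    """z1[i] and z2[i] are embeddings of two views of the same example.
    Each positive pair is pulled together; all other in-batch pairs
    are treated as negatives and pushed apart."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature   # cosine similarities, shape (N, N)
    labels = torch.arange(z1.size(0))  # the diagonal holds the positives
    return F.cross_entropy(logits, labels)

# Toy usage: 8 examples, 128-dim embeddings from any encoder.
loss = contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```

The only method-specific part is where the two views come from: image augmentations in SimCLR, two dropout passes over the same sentence in unsupervised SimCSE.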
Brief introduction to the famous Transformer paper: Attention Is All You Need
Paper: Attention Is All You Need
Github link of Transformer
1. Background
- encoder-decoder
- self-attention
- feed-forward network
- positional encoding
Dataset: WMT 2014 English-to-French translation task
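At the heart of the architecture is scaled dot-product attention, which the encoder-decoder attention and self-attention layers are built on. Below is a minimal PyTorch sketch; the tensor shapes and toy inputs are illustrative, and it omits the multi-head projections described in the paper.

```python
# Minimal sketch of scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, seq_len, d_k). Returns the attended values."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # attention distribution
    return weights @ v

# Toy usage: batch of 2, sequence length 5, key dimension 64.
out = scaled_dot_product_attention(torch.randn(2, 5, 64),
                                   torch.randn(2, 5, 64),
                                   torch.randn(2, 5, 64))
print(out.shape)  # torch.Size([2, 5, 64])
```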
A step-by-step guide to creating a simple real-time speech recogniser
An interesting discovery about using i++ and ++i in Java loops
Def: judging test suite thoroughness based on the structure of the program itself
- Also known as white-box testing
- Distinguished from functional testing (black-box testing): structural testing still tests product functionality against its specification; only the measure of thoroughness has changed
Why we need structural testing:
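One classic motivation is that a specification-based test suite can pass on every case it runs while leaving parts of the code unexercised; a structural measure such as branch coverage exposes the gap. The small Python illustration below is a hypothetical example of mine, not taken from the original notes.

```python
# A functionally "green" test suite that still misses a branch.

def absolute_difference(a, b):
    if a >= b:
        return a - b
    else:
        return b - a

def test_suite():
    # Both tests take the a >= b branch, so the else branch is never executed:
    # every assertion passes, yet branch coverage is incomplete.
    assert absolute_difference(5, 3) == 2
    assert absolute_difference(4, 4) == 0

test_suite()
```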