~hackernoon | Bookmarks (1961)
-
Comparing Costs, Usability and Results Diversity of Mutation Testing Techniques
This study compares mutation testing methods, showing LLMs generate more diverse mutations but at higher cost...
-
Experiment Design and Metrics for Mutation Testing with LLMs
This study defines cost, usability, and behavior metrics to evaluate LLM-generated Java mutations and describes experimental...
-
Using LLMs to Mutate Java Code
This study evaluates how various LLMs generate Java code mutations through carefully designed prompts, comparing open...
-
We Designed a Study to See If AI Can Imitate Real Software Bugs
The study evaluates LLMs like GPT-4 and CodeLlama for mutation testing using real Java bugs. It...
-
Mutation Testing with GPT and CodeLlama
Traditional mutation testing is evolving—this section reviews how LLMs like GPT-4 and CodeLlama generate high-utility, cost-effective,...
-
Study Finds AI Code Mutations Help Developers Catch Bugs Faster
This study shows that GPT-4 and other LLMs can generate diverse, fault-revealing code mutations in Java,...
-
How Coral Protocol Is Building the Internet of Agents
Coral Protocol co-founders dive into agent composability, interoperability, and the Internet of Agents. A must-read for...
-
What Happens When AI Starts Paying for Its Own GPU?
AI is evolving from assistant to economic agent; those who adapt early and leverage this shift...
-
Behind the Scenes: The Prompts and Tricks That Made Many-Shot ICL Work
Appendix details prompts, selection robustness tests, GPT4V-Turbo comparisons, and medical QA extensions validating many-shot ICL methodology.
-
Scientists Just Found a Way to Skip AI Training Entirely. Here's How
Many-shot ICL enables quick model adaptation without fine-tuning, improving accessibility. Future work: other tasks, open models,...
-
How Many Examples Does AI Really Need? New Research Reveals Surprising Scaling Laws
Gemini 1.5 Pro shows log-linear gains up to ~1K examples (+38% accuracy). Batching reduces costs 45x...
-
The Science Behind Many-Shot Learning: Testing AI Across 10 Different Vision Domains
Evaluates GPT-4o vs Gemini 1.5 Pro on 10 vision datasets with many-shot ICL, using stratified sampling...
-
Why Thousands of Examples Beat Dozens Every Time
Many-shot multimodal ICL with thousands of examples improves LMM performance. Gemini 1.5 Pro shows log-linear gains;...
-
AI Transforms 800K+ Grocery Transactions into Smart Insights
InteraSSort demo using Ta-Feng grocery dataset with MNL model and GPT-3.5-turbo, showing conversational optimization with constraint...
-
Meet Leobit: HackerNoon Company of the Week
This week, HackerNoon features Leobit—a .NET, AI, and web application development provider for technology companies and...
-
What Conway, Ants, and Apache Kafka Can Teach Us About AI System Design
This article explores how principles like emergence, decomposition, and multi-agent systems (MAS) can transform AI from...
-
The AI Framework That Makes Optimization as Easy as Chatting
InteraSSort framework: prompt design → decomposition → tool execution for interactive assortment optimization with multi-turn conversation...
-
Standing on AI Giants: How InteraSSort Builds on Marketing and Tool Integration Research
Reviews AI in marketing (chatbots, personalization) and LLM tool integration frameworks (TaskMatrix, HuggingGPT, Optiguide) for assortment...
-
Chat Your Way to Better Shelves: InteraSSort Revolutionizes Retail Assortment Planning
InteraSSort combines LLMs with optimization tools for interactive assortment planning, enabling store planners to optimize via...
-
In the Age of AI, Going Analogue Can Give You the Edge
Yes, writing with AI-assisted prompts is easier, but so is avoiding the gym. Sometimes you need to...
-
Tech Stacks to Consider Based on Your Project Scope
Choosing the right tech stack is a pivotal decision that sets the trajectory of your product's...
-
The Case for a Decentralized Cloud: How Vendor Lock-in Broke Cloud Storage
The failures of centralized cloud storage providers demonstrate the inherent risks of entrusting data to a...
-
How To Introduce a New API Quickly Using Micronaut
Knowing when to pivot can be vital to staying ahead of the competition. See how Cursor...
-
10 Red Flags to Help You Spot a Deepfake Scam
Defending against deepfake scams doesn’t necessarily mean being paranoid. Rather, it requires recognizing patterns, training staff...