Daniel Melcer

daniel@melcer.dev

dmelcer9

Google Scholar

Computer Science PhD Candidate at Northeastern University

View Resume

Projects & Publications

Textual Gradients are a Flawed Metaphor for Automatic Prompt Optimization

Daniel Melcer, Qi Chen, Wen-Hao Chiang, Shweta Garg, Pranav Garg, Christian Bock

(Abstract) A well-engineered prompt can increase the performance of large language models; automatic prompt optimization techniques aim to increase performance without requiring human effort to tune the prompts. One leading class of prompt optimization techniques introduces the analogy of textual gradients. We investigate the behavior of these textual gradient methods through a series of experiments and case studies. While such methods often result in a performance improvement, our experiments suggest that the gradient analogy does not accurately explain their behavior. Our insights may inform the selection of prompt optimization strategies, and development of new approaches.

Approximately Aligned Decoding

Daniel Melcer, Sujan Gonugondla, Pramuditha Perera, Haifeng Qian, Wen-Hao Chiang, Yanjun Wang, Nihal Jain, Pranav Garg, Xiaofei Ma, Anoop Deoras - NeurIPS 2025

(Abstract) It is common to reject undesired outputs of Large Language Models (LLMs); however, current methods to do so require an excessive amount of computation to re-sample after a rejection, or distort the distribution of outputs by constraining the output to highly improbable tokens. We present a method, Approximately Aligned Decoding (AprAD), to balance the distortion of the output distribution with computational efficiency, inspired by algorithms from the speculative decoding literature. AprAD allows for the generation of long sequences of text with difficult-to-satisfy constraints, while amplifying low probability outputs much less compared to existing methods. We show through a series of experiments that the task-specific performance of AprAD is comparable to methods that do not distort the output distribution, while being much more computationally efficient.

Learned Shields for Multi-Agent Reinforcement Learning

Daniel Melcer, Christopher Amato, Stavros Tripakis - ALA@AAMAS 2025

(Abstract) Shielding is an effective method for ensuring safety in multi-agent domains; however, its applicability has previously been limited to environments for which an approximate discrete model and safety specification are known in advance. We present a method for learning shields in cooperative fully-observable multi-agent environments where neither a model nor safety specification are provided, using architectural constraints to realize several important properties of a shield. We show through a series of experiments that our learned shielding method is effective at significantly reducing safety violations, while largely maintaining the ability of an underlying reinforcement learning agent to optimize for reward.

Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars

Daniel Melcer, Nathan Fulton, Sanjay Krishna Gouda, Haifeng Qian

(Abstract) Large Language Models are powerful tools for program synthesis and advanced auto-completion, but come with no guarantee that their output code is syntactically correct. This paper contributes an incremental parser that allows early rejection of syntactically incorrect code, as well as efficient detection of complete programs for fill-in-the-middle (FItM) tasks. We develop Earley-style parsers that operate over left and right quotients of arbitrary context-free grammars, and we extend our incremental parsing and quotient operations to several context-sensitive features present in the grammars of many common programming languages. The result of these contributions is an efficient, general, and well-grounded method for left and right quotient parsing. To validate our theoretical contributions – and the practical effectiveness of certain design decisions – we evaluate our method on the particularly difficult case of FItM completion for Python 3. Our results demonstrate that constrained generation can significantly reduce the incidence of syntax errors in recommended code.

Shield Decentralization for Safe Multi-Agent Reinforcement Learning

Daniel Melcer, Christopher Amato, Stavros Tripakis
Fully Observable - NeurIPS 2022 Oral (2.1% acceptance rate)
Partially Observable - RLC 2024

ProofViz: An Interactive Visual Proof Explorer

Daniel Melcer, Stephen Chang- TFP 2021

(Abstract) We introduce ProofViz, an extension to the Cur proof assistant ecosystem that enables interactive visualization and exploration of in-progress proofs. The tool displays a representation of the underlying proof tree, information about each node in the tree, and the values of partially-completed proof terms. Users can interact with the proof by executing tactics, changing the focus to any hole, and undoing or redoing their previous interactions. We anticipate that ProofViz will be useful both as a tool to introduce tactic proofs to learners of dependent type systems, and as a mechanism for advanced users looking for insight into the effects of their custom tactics.

Verification-Guided Tree Search

Alvaro Velasquez, Daniel Melcer- AAMAS 2020 (Extended Abstract)

(Abstract) Monte-Carlo Tree Search (MCTS) algorithms have been adopted by the artificial intelligence planning community for decades due to their ability to reason over long time horizons while providing guarantees on the convergence of the solution policy. In recent years, such algorithms have been modernized through the integration of Convolutional Neural Networks (CNNs) as part of the state evaluation process. However, both traditional and modern MCTS algorithms suffer from poor performance when the underlying reward signal of the environment is sparse. In this paper, we propose a verification-guided tree search solution which incorporates a reward shaping function within modern MCTS implementations. This function leverages the mathematical representation of the underlying objective by leveraging techniques from formal verification.

ShellShare

HackMIT 2019 - Best Command Line Tool Runner Up

What if you didn’t have to install RDP software to view the screen of another computer? What if you could just use SSH or something to share your screen? What would that even look like? This. This is what it would look like.

As it turns out, your (linux) terminal provides enough functionality to allow you to view a live desktop feed, handle keypresses and mouse inputs, and automatically adjust the display when the window is resized.

Charity Cart

Brickhack 2019 - Most socially impactful project

Donate to your favorite charities without spending any more than you already do.

Find generic versions of the food you already buy, and donate the price difference to charities.

Don’t know who to donate to? The application automatically suggests charities based on the types of items in your cart.

Git Mess

BostonHacks 2018 - Best Use of Twilio API

Send and receive SMS messages using the tools you already know and love. Specifically git.

Type out a message and git push to send. Then git pull to check for any new messages.

Includes an ASCII art converter to view any image attachments that you may have been sent.

Invertible Programming Language

This project was made in collaboration with Josh Goldman for Building Extensible Systems (“Hack your own language”).

This is a Racket-based programming language that is capable of analyzing simple functions and automatically creating the inverse of that function such that applying the inverse of a one-argument function to the result of the “normal” function results in the original argument given.

> (define f->c
    (λ-auto-invertible (x)
    ((muln 5/9)
    ((subn 32) x))))
> (f->c 32)
0
> (f->c -40)
-40
> (define c->f (invert f->c))
> (c->f 0)
32
> (c->f -40)
-40

Ethereum Connect Four

Play Connect Four on the Ethereum blockchain! A smart contract autonomously manages the current participants of a game, ensures the validity of moves made by a user, and awards of the winner of a game. Currently deployed on Ropsten testnet.

Note Requires MetaMask extension to interact with the application. If you would like to go to the game anyways, click here.

Riemann Zeta Function GPU Fraction Search

The Riemann Zeta function is an accurate predictor of the number of primes less than a given value. The inputs of the Zeta function that cause the function to equal zero have special mathematical properties.

Under the mentorship of David Biersach, I wrote a program that searched for a pattern in the smallest 100 known inputs with this property using their continued fraction representations. I used the CUDA framework to accelerate the computation by performing calculations hundreds of times in parallel on a consumer graphics card.

This research was conducted as part of the 2016 High School Research Program for Brookhaven National Laboratory.

Image source: Linas Vepstas [CC BY-SA 3.0]