Frequently Asked Questions (FAQ)
This page contains the most frequently asked questions about the EAI Challenge.
It will be updated regularly with new questions and answers.
General Questions
What is the Embodied Agent Interface Challenge?
The Embodied Agent Interface Challenge is a competition at NeurIPS 2025 that aims to evaluate Large Language Models (LLMs) in embodied decision-making tasks. The challenge provides a standardized framework for benchmarking and encourages reproducible research in embodied AI.
Who can participate in the challenge?
The challenge is open to researchers, practitioners, and enthusiasts in the field of AI and robotics. We welcome participants from diverse backgrounds to contribute to the advancement of embodied AI.
Participation
How do I get started with the challenge?
To get started, please follow the instructions outlined in the Participate section. We recommend reviewing the provided resources and preparing your submission according to the guidelines.
Is there a submission fee for the challenge?
No, there is no submission fee for participating in the Embodied Agent Interface Challenge.
Do I have to be a part of a team to participate?
No, individuals can participate in the challenge independently. However, we encourage collaboration and teamwork to foster a diverse range of ideas and approaches.
Is in-person attendance at NeurIPS 2025 mandatory for award recipients or finalists?
The Competition Track is planned to be held in person at NeurIPS 2025, and in-person attendance is strongly encouraged, especially for finalists and award recipients. However, we understand that personal circumstances may prevent some participants from traveling. In such cases, we will do our best to arrange remote alternatives (e.g., Zoom presentation) so that you can still present your work and be recognized, even if you are not physically present at the conference.
Do I have to beat baseline performance provided by the organizers to win?
No, you are not required to beat the baseline performance to participate or submit your work. The baseline serves as a reference point for evaluating submissions, but innovation and improvement beyond the baseline are encouraged. Submissions will be ranked across all participants based on their performance.
Can I submit multiple separate models, or is only a single model submission permitted?
Yes. You may use any combination of models or methods. Once you are satisfied with the performance, you can choose which results to display on the leaderboard at https://eval.ai/web/challenges/challenge-page/2621/my-submission.
Evaluation
How will submissions be evaluated?
Submissions will be evaluated based on their performance in the defined tasks and metrics outlined in the challenge description. The evaluation process will be conducted in a controlled environment to ensure fairness and reproducibility.
Can I use external data or models for my submission?
Participants are encouraged to use any publicly available data or models to enhance their submissions. However, all submissions must adhere to the challenge guidelines and ethical considerations.
Can I use external datasets to enhance model performance, including synthesized datasets?
Yes, you are allowed to use external datasets to enhance model performance. Publicly available datasets released prior to the challenge, as well as synthesized datasets (such as those generated with ResActGraph for VirtualHome), are permitted. The only restriction is that any external dataset you use must be openly accessible to all participants and properly cited in your final technical report. Please ensure transparency by documenting every dataset you incorporate in your approach.
Are there limitations on the parameter size of language models?
No, there are no limitations on the parameter size of the language model. Evaluation will be conducted purely based on the framework metrics we have provided. You are free to use models of any size that best suit your approach.
Why do I receive zero scores on my submission?
A zero score usually indicates a problem with the submission itself rather than with model quality. Common causes include:
- Incorrect output format or structure
- Missing required output files
- Failure to meet the challenge guidelines
We recommend reviewing your submission carefully and ensuring that it aligns with the provided guidelines.
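Before uploading, a quick local sanity check can catch format problems that lead to zero scores. The sketch below is illustrative only: the file name `submission.json`, the list-of-entries layout, and the required keys are hypothetical placeholders, not the official submission format — substitute the actual structure from the challenge guidelines.

```python
import json
from pathlib import Path

# Hypothetical required keys; replace with the fields the challenge
# guidelines actually require before relying on this check.
REQUIRED_KEYS = {"task_name", "predictions"}


def validate_submission(path):
    """Return a list of problems found in a JSON submission file.

    An empty list means the file passed every check in this sketch.
    """
    p = Path(path)
    if not p.exists():
        return [f"missing file: {path}"]
    try:
        data = json.loads(p.read_text())
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(data, list):
        return ["top-level structure should be a list of entries"]
    problems = []
    for i, entry in enumerate(data):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {i} missing keys: {sorted(missing)}")
    return problems


# Example: report any problems before uploading.
for problem in validate_submission("submission.json"):
    print(problem)
```

Running a check like this locally is much faster than waiting for the evaluation server to return a zero score, and the returned messages point directly at the offending entry.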
Support
Who can I contact for questions or support?
If you have any questions or need assistance, please reach out to us at TianweiBao@u.northwestern.edu or post in our Slack. We are here to help!