Mitigating Cross-Modal Distraction and Ensuring Geometric Feasibility via Affordance-Guided, Self-Consistent MLLMs for Task Planning in Instruction-Following Manipulation
- Install Isaac Gym & Create Conda Environment
conda activate rlgpu
- Clone This Repo
git clone https://github.com/HCIS-Lab/Affordance-Guided-Self-Consistent-MLLM.git
- Install required packages
- Please check the official PyTorch website and install PyTorch according to your local device (OS, CUDA version).
- Run pip install -r requirements.txt to install the other packages.
cd Affordance-Guided-Self-Consistent-MLLM
pip install -r requirements.txt
- Set your OpenAI API key
export OPENAI_API_KEY=XXXXX
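Before launching experiments, it can help to confirm the key is actually visible to the Python process. The helper below is a minimal sketch of such a check; the function name is ours and not part of this repo:

```python
import os

def openai_key_is_set() -> bool:
    """Return True when OPENAI_API_KEY is exported and non-empty."""
    return bool(os.environ.get("OPENAI_API_KEY"))

if __name__ == "__main__":
    print("OPENAI_API_KEY set:", openai_key_is_set())
```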
- Run experiment of different pipelines and task types
chmod +x run_experiment.sh
./run_experiment.sh -n <PIPELINE_NAME> [-e <EXP_ID>] [-c <CONFIG_FILE>] [-l <LOG_ROOT>] [-t <MAX_TRIALS>]
# Run our method
./run_experiment.sh -n our
- Collect trajectories of skills or manually control the robot
python data_collection.py
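data_collection.py records skill trajectories from demonstrations or manual control. The sketch below shows the general shape of such a trajectory logger; the class and field names are hypothetical, and the repo's actual log format may differ:

```python
import json
import time
from dataclasses import dataclass, field

@dataclass
class TrajectoryLogger:
    """Accumulates per-step robot states for one skill demonstration."""
    skill: str
    steps: list = field(default_factory=list)

    def record(self, joint_positions, gripper_open: bool) -> None:
        # One entry per control step: timestamp, arm configuration, gripper state.
        self.steps.append({
            "t": time.time(),
            "joint_positions": list(joint_positions),
            "gripper_open": gripper_open,
        })

    def save(self, path: str) -> None:
        # Persist the whole demonstration as a single JSON file.
        with open(path, "w") as f:
            json.dump({"skill": self.skill, "steps": self.steps}, f)
```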
- Upload requirements.txt
- Upload experimental log
This work is sponsored by the National Science and Technology Council (NSTC) under grant 113-2813-C-A49-019-E.
Parts of this project page were adapted from the Nerfies page.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
@misc{shen2025mitigatingcrossmodaldistractionensuring,
title={Mitigating Cross-Modal Distraction and Ensuring Geometric Feasibility via Affordance-Guided, Self-Consistent MLLMs for Task Planning in Instruction-Following Manipulation},
author={Yu-Hong Shen and Chuan-Yu Wu and Yi-Ru Yang and Yen-Ling Tai and Yi-Ting Chen},
year={2025},
eprint={2503.13055},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2503.13055},
}

