Mitigating Cross-Modal Distraction and Ensuring Geometric Feasibility via Affordance-Guided, Self-Consistent MLLMs for Task Planning in Instruction-Following Manipulation
- Install Isaac Gym & Create Conda Environment
conda activate rlgpu
- Clone This Repo
git clone https://github.com/HCIS-Lab/Affordance-Guided-Self-Consistent-MLLM.git
- Install required packages
- Please check the official PyTorch website and install PyTorch according to your local device (OS, CUDA version).
- Run pip install -r requirements.txt to install the other packages.
cd Affordance-Guided-Self-Consistent-MLLM
pip install -r requirements.txt
- Set your OpenAI API key
export OPENAI_API_KEY=XXXXX
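Before launching experiments, it can help to confirm the key is actually visible to the Python process. The helper below is a minimal sketch of such a check; the function name is ours and not part of this repo:

```python
import os

def openai_key_is_set() -> bool:
    """Return True when OPENAI_API_KEY is exported and non-empty."""
    return bool(os.environ.get("OPENAI_API_KEY"))

if __name__ == "__main__":
    print("OPENAI_API_KEY set:", openai_key_is_set())
```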
- Run experiment of different pipelines and task types
chmod +x run_experiment.sh
./run_experiment.sh -n <PIPELINE_NAME> [-e <EXP_ID>] [-c <CONFIG_FILE>] [-l <LOG_ROOT>] [-t <MAX_TRIALS>]
# Run our method
./run_experiment.sh -n our
- Collect trajectories of skills or manually control the robot
python data_collection.py
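data_collection.py records skill trajectories from demonstrations or manual control. The sketch below shows the general shape of such a trajectory logger; the class and field names are hypothetical, and the repo's actual log format may differ:

```python
import json
import time
from dataclasses import dataclass, field

@dataclass
class TrajectoryLogger:
    """Accumulates per-step robot states for one skill demonstration."""
    skill: str
    steps: list = field(default_factory=list)

    def record(self, joint_positions, gripper_open: bool) -> None:
        # One entry per control step: timestamp, arm configuration, gripper state.
        self.steps.append({
            "t": time.time(),
            "joint_positions": list(joint_positions),
            "gripper_open": gripper_open,
        })

    def save(self, path: str) -> None:
        # Persist the whole demonstration as a single JSON file.
        with open(path, "w") as f:
            json.dump({"skill": self.skill, "steps": self.steps}, f)
```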
- Upload requirements.txt
- Upload experimental log
This work is sponsored by the National Science and Technology Council (NSTC) under grant 113-2813-C-A49-019-E.
Parts of this project page were adapted from the Nerfies page.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
@misc{shen2025mitigatingcrossmodaldistractionensuring,
title={Mitigating Cross-Modal Distraction and Ensuring Geometric Feasibility via Affordance-Guided, Self-Consistent MLLMs for Task Planning in Instruction-Following Manipulation},
author={Yu-Hong Shen and Chuan-Yu Wu and Yi-Ru Yang and Yen-Ling Tai and Yi-Ting Chen},
year={2025},
eprint={2503.13055},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2503.13055},
}

