- 0. Environment Setup
- 1. Download Model Weights
- 2. Download Datasets
- 3. Data Processing
- 4. Inference
- 5. License
We recommend using our Docker image for environment setup:
make build
make up
make into
make stop

Install our package inside Docker:
pip install -e .

cd AgentGrounder/weights
git clone https://huggingface.co/IDEA-Research/Rex-Omni # Rex-Omni
git clone https://huggingface.co/IDEA-Research/Rex-Omni-AWQ # Quantized Rex-Omni
git clone https://huggingface.co/facebook/sam3 # SAM3

Download the ScanRefer dataset from the official repo and place it at the following path:
data/ScanRefer/ScanRefer_filtered_val.json

Download the Nr3D dataset from the official repo and place it at the following path:
data/Nr3D/Nr3D.json
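
Before moving on, it can help to confirm that both annotation files ended up where the later steps look for them. The sketch below is not part of the repository: `missing_files` is a hypothetical helper, and it assumes it is run from the project root so that the relative `data/` paths above resolve.

```python
from pathlib import Path

# Paths taken from the instructions above, relative to the project root.
EXPECTED_FILES = [
    "data/ScanRefer/ScanRefer_filtered_val.json",
    "data/Nr3D/Nr3D.json",
]


def missing_files(root=".", expected=EXPECTED_FILES):
    """Return the expected files that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing dataset files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All dataset files found.")
```

An empty list means both files are in place; anything else names the files that still need to be downloaded or moved.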
Download the preprocessed Vil3dref data from vil3dref.
The expected structure should look like this:
referit3d/
├── annotations
│   ├── meta_data
│   │   ├── cat2glove42b.json
│   │   ├── scannetv2-labels.combined.tsv
│   │   └── scannetv2_raw_categories.json
│   └── ...
├── ...
└── scan_data
    ├── ...
    ├── instance_id_to_name
    └── pcd_with_global_alignment
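
A quick way to compare an extracted download against this tree is to test each named entry for existence. This is only a sketch under stated assumptions: `check_layout` is a hypothetical helper, the location of `referit3d/` is not specified here and is therefore passed in as an argument, and the `...` entries in the tree are not checked.

```python
from pathlib import Path

# Entries named explicitly in the tree above; the "..." entries are skipped.
EXPECTED_ENTRIES = [
    "annotations/meta_data/cat2glove42b.json",
    "annotations/meta_data/scannetv2-labels.combined.tsv",
    "annotations/meta_data/scannetv2_raw_categories.json",
    "scan_data/instance_id_to_name",
    "scan_data/pcd_with_global_alignment",
]


def check_layout(root):
    """Map each expected entry to True if it exists under `root`."""
    root = Path(root)
    return {rel: (root / rel).exists() for rel in EXPECTED_ENTRIES}


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py <path to referit3d>
    for rel, present in check_layout(sys.argv[1]).items():
        print(f"{'ok     ' if present else 'MISSING'} {rel}")
```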
Download the Mask3D predictions (mask3d pred) first.
- ScanRefer
python -m prepare_data.object_lookup_table_scanrefer
- Nr3D
python -m prepare_data.process_feat_3d
python -m prepare_data.object_lookup_table_nr3d

We use Ollama to deploy the VLM. Please install the Ollama server on your machine.
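
Since the following steps depend on a running server, a small reachability check can save a failed run. The sketch below is an illustration, not part of this repository: `ollama_is_up` is a hypothetical helper, and it assumes Ollama runs on the same machine at its default address, `http://localhost:11434`. Adjust `base_url` if your server listens elsewhere.

```python
import urllib.request


def ollama_is_up(base_url="http://localhost:11434", timeout=3.0):
    """Return True if an HTTP server answers at `base_url` with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections, and timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    if ollama_is_up():
        print("Ollama server is reachable.")
    else:
        print("Ollama server is not reachable; start it before continuing.")
```

The check only confirms that something answers at that address. It does not verify that the VLM you need has been pulled.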
- ScanRefer
python -m parse_query.generate_query_data_scanrefer
- Nr3D
python -m parse_query.generate_query_data_nr3d

Run inference with the config for the target dataset:
python -m inference.inference --config_path <nr3d_or_scanrefer_config_path>

Evaluate the results:
- ScanRefer
python -m eval.eval_scanrefer
- Nr3D
python -m eval.eval_nr3d

This work is released under the CC BY 4.0 license.