GIA Dataset

Dataset Description

The GIA dataset combines a wide range of individual datasets. It includes demonstrations by expert RL agents, image-caption pairs, textual data, and more. The GIA dataset is part of the GIA project, which aims to build a multimodal generalist agent.

Usage

>>> from datasets import load_dataset
>>> dataset = load_dataset("gia-project/gia-dataset", "metaworld-assembly")
>>> first_episode = dataset["train"][0]
>>> first_episode.keys()
dict_keys(['continuous_observations', 'continuous_actions', 'rewards'])
>>> len(first_episode["rewards"])
500
>>> first_episode["continuous_actions"][0]
[6.459120273590088, 2.2422609329223633, -5.914587020874023, -19.799840927124023]
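
As a small follow-up to the snippet above, the per-step rewards of an episode can be summed to obtain its return. The sketch below uses a hypothetical toy episode in place of `dataset["train"][0]` so that it runs without downloading anything; the helper name `episode_return` is ours and is not part of the dataset or the `datasets` library.

```python
from typing import Mapping, Sequence


def episode_return(episode: Mapping[str, Sequence[float]]) -> float:
    """Sum the per-step rewards of one episode."""
    return float(sum(episode["rewards"]))


# Hypothetical stand-in for dataset["train"][0]; real episodes come from load_dataset.
toy_episode = {
    "continuous_observations": [[0.0, 1.0], [0.5, 1.5], [1.0, 2.0]],
    "continuous_actions": [[0.1], [0.2], [0.3]],
    "rewards": [1.0, 0.5, 2.0],
}

print(episode_return(toy_episode))  # 3.5
```

The same function should work unchanged on an episode loaded as shown above, since it only reads the `rewards` field.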

Dataset Structure

Data Instances

The following table presents a comparative analysis of scores across various domains and tasks. The scores highlight the performance difference between a random agent and the episodes recorded in our dataset.

Task	Random Agent Score	Dataset Episode Score
Atari
atari-alien	205.50 ± 111.97	16912.50 ± 7087.42
atari-amidar	2.38 ± 2.50	2164.71 ± 1229.47
atari-assault	262.50 ± 89.61	15699.12 ± 9572.12
atari-asterix	213.50 ± 110.87	3699.62 ± 2421.30
atari-asteroids	856.40 ± 434.32	177011.05 ± 35334.20
atari-atlantis	17764.00 ± 6662.43	320679.59 ± 418247.37
atari-bankheist	13.40 ± 11.07	1322.43 ± 60.84
atari-battlezone	2170.00 ± 2121.58	295592.59 ± 161960.96
atari-beamrider	357.28 ± 143.97	29589.35 ± 16132.96
atari-berzerk	160.10 ± 118.87	57085.26 ± 13104.53
atari-bowling	23.81 ± 6.07	20.40 ± 7.29
atari-boxing	0.52 ± 4.37	97.97 ± 3.77
atari-breakout	1.24 ± 1.30	702.97 ± 203.62
atari-centipede	2150.06 ± 1113.28	11624.29 ± 4918.34
atari-choppercommand	875.00 ± 416.98	90990.62 ± 270876.93
atari-crazyclimber	7376.00 ± 2253.09	179296.94 ± 39862.06
atari-defender	3417.50 ± 1443.41	351958.33 ± 40466.82
atari-demonattack	165.55 ± 92.93	92195.25 ± 26174.79
atari-doubledunk	-18.54 ± 3.07	20.94 ± 3.65
atari-enduro	0.00 ± 0.00	2292.22 ± 147.54
atari-fishingderby	-93.90 ± 3.51	7.18 ± 25.06
atari-freeway	0.01 ± 0.10	33.88 ± 0.35
atari-frostbite	67.60 ± 37.61	13196.12 ± 4341.00
atari-gopher	319.40 ± 228.24	81676.15 ± 46329.48
atari-gravitar	188.50 ± 203.33	3986.57 ± 1729.05
atari-hero	475.25 ± 894.95	44677.35 ± 1754.42
atari-icehockey	-9.83 ± 3.24	25.17 ± 5.79
atari-jamesbond	28.50 ± 45.42	27786.89 ± 33819.20
atari-kangaroo	52.00 ± 108.15	574.05 ± 636.94
atari-krull	1754.00 ± 583.56	11439.83 ± 1218.34
atari-kungfumaster	390.00 ± 359.03	32392.81 ± 10006.55
atari-montezumarevenge	0.00 ± 0.00	393.53 ± 50.45
atari-mspacman	246.40 ± 121.22	6896.08 ± 2031.99
atari-namethisgame	2447.40 ± 888.97	22991.18 ± 2473.15
atari-phoenix	776.80 ± 635.86	424583.16 ± 97649.17
atari-pitfall	-259.75 ± 384.26	-1.45 ± 4.50
atari-pong	-20.22 ± 0.95	20.99 ± 0.18
atari-privateeye	41.65 ± 191.83	100.00 ± 0.00
atari-qbert	164.25 ± 151.79	42971.37 ± 85070.72
atari-riverraid	1474.40 ± 314.59	14800.94 ± 7924.56
atari-roadrunner	11.00 ± 42.18	77942.80 ± 6088.62
atari-robotank	1.87 ± 1.59	80.51 ± 13.28
atari-seaquest	73.20 ± 57.91	2597.34 ± 386.09
atari-skiing	-16299.52 ± 1850.70	-10738.06 ± 111.13
atari-solaris	2360.40 ± 1852.03	1353.68 ± 516.96
atari-spaceinvaders	137.20 ± 95.82	29425.29 ± 23623.89
atari-stargunner	652.00 ± 312.24	360588.57 ± 49207.71
atari-surround	-9.99 ± 0.10	9.39 ± 0.85
atari-tennis	-23.95 ± 0.22	11.11 ± 7.57
atari-timepilot	3396.00 ± 2128.85	69583.33 ± 29838.67
atari-tutankham	12.73 ± 17.40	291.16 ± 30.37
atari-upndown	358.90 ± 380.11	429418.33 ± 7187.43
atari-venture	0.00 ± 0.00	0.00 ± 0.00
atari-videopinball	23917.17 ± 19449.59	441507.92 ± 283264.62
atari-wizardofwor	620.00 ± 837.85	49333.33 ± 16157.08
atari-yarsrevenge	3503.91 ± 906.14	270262.86 ± 161815.96
atari-zaxxon	21.00 ± 102.27	73097.22 ± 14825.77
BabyAI
babyai-action-obj-door	0.37 ± 0.39	0.99 ± 0.01
babyai-blocked-unlock-pickup	0.00 ± 0.02	0.95 ± 0.01
babyai-boss-level	0.06 ± 0.21	0.94 ± 0.05
babyai-boss-level-no-unlock	0.06 ± 0.19	0.94 ± 0.05
babyai-find-obj-s5	0.08 ± 0.23	0.95 ± 0.04
babyai-go-to	0.13 ± 0.29	0.92 ± 0.07
babyai-go-to-door	0.45 ± 0.38	0.99 ± 0.00
babyai-go-to-imp-unlock	0.08 ± 0.23
babyai-go-to-local	0.16 ± 0.30	0.93 ± 0.04
babyai-go-to-obj	0.13 ± 0.27	0.93 ± 0.03
babyai-go-to-obj-door	0.53 ± 0.39	0.99 ± 0.01
babyai-go-to-red-ball	0.17 ± 0.30	0.93 ± 0.04
babyai-go-to-red-ball-grey	0.12 ± 0.27	0.92 ± 0.05
babyai-go-to-red-ball-no-dists	0.14 ± 0.28	0.93 ± 0.03
babyai-go-to-red-blue-ball	0.12 ± 0.27	0.92 ± 0.05
babyai-go-to-seq	0.08 ± 0.23	0.94 ± 0.05
babyai-key-corridor	0.00 ± 0.00	0.91 ± 0.01
babyai-key-in-box	0.01 ± 0.06
babyai-mini-boss-level	0.07 ± 0.21	0.89 ± 0.10
babyai-move-two-across-s8n9	0.00 ± 0.00	0.96 ± 0.01
babyai-one-room-s8	0.08 ± 0.21	0.92 ± 0.03
babyai-open	0.10 ± 0.24	0.95 ± 0.05
babyai-open-door	0.23 ± 0.34	0.99 ± 0.00
babyai-open-doors-order-n4	0.16 ± 0.30	0.99 ± 0.01
babyai-open-red-door	0.08 ± 0.21	0.92 ± 0.03
babyai-open-two-doors	0.08 ± 0.20	0.98 ± 0.00
babyai-pickup	0.08 ± 0.22	0.92 ± 0.07
babyai-pickup-above	0.02 ± 0.09	0.91 ± 0.07
babyai-pickup-dist	0.10 ± 0.24	0.86 ± 0.21
babyai-pickup-loc	0.08 ± 0.23	0.91 ± 0.04
babyai-synth	0.11 ± 0.26	0.93 ± 0.06
babyai-synth-loc	0.13 ± 0.29	0.94 ± 0.06
babyai-synth-seq	0.07 ± 0.20	0.95 ± 0.04
babyai-unblock-pickup	0.08 ± 0.22	0.91 ± 0.08
babyai-unlock	0.03 ± 0.15
babyai-unlock-local	0.01 ± 0.09	0.98 ± 0.01
babyai-unlock-pickup	0.00 ± 0.00	0.75 ± 0.04
babyai-unlock-to-unlock	0.00 ± 0.00
MetaWorld
metaworld-assembly	45.30 ± 4.13	245.99 ± 3.50
metaworld-basketball	2.81 ± 1.24	627.99 ± 1.98
metaworld-bin-picking	1.89 ± 0.45	425.58 ± 101.86
metaworld-box-close	76.39 ± 17.91	512.49 ± 107.81
metaworld-button-press	31.73 ± 5.20	643.10 ± 12.85
metaworld-button-press-topdown	28.97 ± 10.37	490.18 ± 27.21
metaworld-button-press-topdown-wall	29.04 ± 10.52	497.19 ± 31.37
metaworld-button-press-wall	8.98 ± 3.99	675.41 ± 15.04
metaworld-coffee-button	31.72 ± 6.36	731.08 ± 29.34
metaworld-coffee-pull	4.09 ± 0.38	259.86 ± 88.48
metaworld-coffee-push	4.17 ± 0.76	496.78 ± 118.20
metaworld-dial-turn	29.64 ± 16.67	793.56 ± 80.06
metaworld-disassemble	40.31 ± 7.53	42.83 ± 6.30
metaworld-door-close	5.30 ± 1.33	529.75 ± 27.24
metaworld-door-lock	112.35 ± 28.63	811.52 ± 34.07
metaworld-door-open	56.37 ± 11.23	581.94 ± 19.67
metaworld-door-unlock	94.17 ± 15.56	802.88 ± 17.05
metaworld-drawer-close	116.73 ± 253.11	867.92 ± 4.48
metaworld-drawer-open	126.85 ± 25.22	492.99 ± 2.52
metaworld-faucet-close	253.12 ± 22.94	753.92 ± 13.42
metaworld-faucet-open	244.10 ± 23.25	705.76 ± 7.15
metaworld-hammer	95.33 ± 9.02	693.17 ± 34.62
metaworld-hand-insert	2.75 ± 3.53	740.53 ± 36.69
metaworld-handle-press	80.41 ± 110.19	855.91 ± 72.75
metaworld-handle-press-side	57.00 ± 39.47	861.12 ± 20.01
metaworld-handle-pull	10.34 ± 13.54	669.35 ± 24.81
metaworld-handle-pull-side	2.13 ± 2.76	384.65 ± 102.89
metaworld-lever-pull	60.31 ± 15.77	612.04 ± 38.85
metaworld-peg-insert-side	1.71 ± 0.36	315.23 ± 140.07
metaworld-peg-unplug-side	4.75 ± 2.83	456.12 ± 81.65
metaworld-pick-out-of-hole	1.51 ± 0.24	219.61 ± 88.85
metaworld-pick-place	1.61 ± 0.99	419.10 ± 98.19
metaworld-pick-place-wall	0.00 ± 0.01	450.57 ± 64.10
metaworld-plate-slide	74.64 ± 13.84	527.01 ± 155.34
metaworld-plate-slide-back	33.47 ± 11.22	718.22 ± 87.41
metaworld-plate-slide-back-side	34.34 ± 11.53	729.61 ± 69.15
metaworld-plate-slide-side	22.61 ± 17.36	662.81 ± 102.81
metaworld-push	5.51 ± 2.43	750.57 ± 43.98
metaworld-push-back	1.21 ± 0.16	85.05 ± 107.12
metaworld-push-wall	6.13 ± 3.17	748.87 ± 10.62
metaworld-reach	149.67 ± 44.70	681.37 ± 133.68
metaworld-reach-wall	143.26 ± 36.56	746.12 ± 104.19
metaworld-shelf-place	0.00 ± 0.01	241.34 ± 24.60
metaworld-soccer	5.66 ± 4.61	375.15 ± 140.24
metaworld-stick-pull	2.64 ± 1.41	523.55 ± 18.94
metaworld-stick-push	2.81 ± 1.04	627.95 ± 10.20
metaworld-sweep	11.23 ± 7.28	494.85 ± 43.29
metaworld-sweep-into	12.55 ± 10.72	799.21 ± 19.07
metaworld-window-close	57.46 ± 7.11	591.30 ± 38.63
metaworld-window-open	43.36 ± 2.09	590.82 ± 57.08
MuJoCo
mujoco-ant	-59.95 ± 99.62	5846.42 ± 942.55
mujoco-doublependulum	57.46 ± 17.54	9338.69 ± 352.61
mujoco-halfcheetah	-284.97 ± 79.83	7437.77 ± 173.30
mujoco-hopper	18.38 ± 17.09	1858.73 ± 534.07
mujoco-humanoid	122.02 ± 35.28	6281.02 ± 1795.84
mujoco-pendulum	6.07 ± 3.47	475.40 ± 178.96
mujoco-pusher	-149.69 ± 7.41	-25.21 ± 6.66
mujoco-reacher	-43.00 ± 3.91	-5.68 ± 2.53
mujoco-standup	33135.75 ± 2481.89	273574.16 ± 85253.26
mujoco-swimmer	0.80 ± 10.71	92.18 ± 4.44
mujoco-walker	2.68 ± 6.06	4631.22 ± 1059.01
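
One possible way to use the two columns above is to rescale a raw score so that the random agent maps to 0 and the dataset episodes map to 1. This normalization is our suggestion, not something the dataset card prescribes; the function name `normalized_score` and the small `REFERENCE_SCORES` dictionary are illustrative, with mean values copied from the table.

```python
def normalized_score(score: float, random_score: float, dataset_score: float) -> float:
    """Rescale a raw score so that random = 0.0 and dataset-level = 1.0."""
    span = dataset_score - random_score
    if span == 0:
        raise ValueError("random and dataset scores are equal; normalization is undefined")
    return (score - random_score) / span


# (random agent mean, dataset episode mean), copied from the table above.
REFERENCE_SCORES = {
    "atari-pong": (-20.22, 20.99),
    "mujoco-ant": (-59.95, 5846.42),
    "atari-venture": (0.00, 0.00),
}

random_mean, dataset_mean = REFERENCE_SCORES["atari-pong"]
# A raw Pong score of 0.0 lands roughly halfway between random and dataset level.
print(normalized_score(0.0, random_mean, dataset_mean))
```

Tasks where both columns coincide, such as atari-venture, cannot be normalized this way, which is why the sketch raises an error instead of dividing by zero.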

Data Fields

text: a string feature
images: an image feature
image_observations: a Sequence(image) feature
text_observations: a Sequence(string) feature
discrete_observations: a Sequence(Sequence(int64)) feature
continuous_observations: a Sequence(Sequence(float32)) feature
continuous_actions: a Sequence(Sequence(float32)) feature
discrete_actions: a Sequence(int64) feature
rewards: a Sequence(float32) feature
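
Not every example carries every field; the usage example, for instance, shows only three keys for a MetaWorld episode. The following sketch looks up the documented type of each field present in an example. The mapping simply restates the list above, and `describe_example` is a hypothetical helper, not part of the dataset tooling.

```python
from typing import Any, Dict, Mapping

# Field names and types as listed above.
FIELD_TYPES: Dict[str, str] = {
    "text": "string",
    "images": "image",
    "image_observations": "Sequence(image)",
    "text_observations": "Sequence(string)",
    "discrete_observations": "Sequence(Sequence(int64))",
    "continuous_observations": "Sequence(Sequence(float32))",
    "continuous_actions": "Sequence(Sequence(float32))",
    "discrete_actions": "Sequence(int64)",
    "rewards": "Sequence(float32)",
}


def describe_example(example: Mapping[str, Any]) -> Dict[str, str]:
    """Map each documented field present in the example to its documented type."""
    return {name: FIELD_TYPES[name] for name in example if name in FIELD_TYPES}


# Toy example with the same keys as the MetaWorld episode in the usage section.
toy = {"continuous_observations": [[0.0]], "continuous_actions": [[0.0]], "rewards": [0.0]}
for field, kind in describe_example(toy).items():
    print(f"{field}: {kind}")
```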

Data Splits

train: `` examples
test: `` examples

Dataset Creation

This section describes how our dataset was created, detailing how the data for each domain and task were generated. The generation scripts are available in the GIA repository. For RL tasks, we trained one agent per task using Sample Factory, then used the trained agent to generate episodes.

Atari

We used the 57 ALE/Atari games as our environments, with the following configuration. We rendered the images in grayscale at a resolution of 84x84 pixels. The agent interacted with the environment every 4 frames. Sticky actions were not used, and the raw reward (no clipping) was reported. Episodes were stored as complete, i.e. with no termination on life loss.
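
For readers who want to reproduce a comparable setup, the settings above can be written down as keyword arguments. The names below follow Gymnasium's ALE environments and its `AtariPreprocessing` wrapper; using that wrapper is our assumption, since the episodes were generated with Sample Factory and the exact pipeline is not described here. Settings the text does not mention (for example the wrapper's no-op reset behavior) are left at their defaults.

```python
# Environment-level settings (assumed Gymnasium/ALE argument names).
ENV_KWARGS = {
    "frameskip": 1,                    # leave frame skipping to the wrapper below
    "repeat_action_probability": 0.0,  # sticky actions disabled
}

# Wrapper-level settings (assumed AtariPreprocessing argument names).
PREPROCESSING_KWARGS = {
    "frame_skip": 4,                 # the agent acts every 4 frames
    "screen_size": 84,               # 84x84 observations
    "grayscale_obs": True,           # grayscale rendering
    "terminal_on_life_loss": False,  # episodes are stored in full
}

# Rewards are reported raw, so no reward-clipping wrapper is added.
CLIP_REWARDS = False

# Possible usage, requires gymnasium with ALE support installed:
# import gymnasium as gym
# from gymnasium.wrappers import AtariPreprocessing
# env = gym.make("ALE/Pong-v5", **ENV_KWARGS)
# env = AtariPreprocessing(env, **PREPROCESSING_KWARGS)

for name, value in {**ENV_KWARGS, **PREPROCESSING_KWARGS}.items():
    print(f"{name} = {value}")
```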

BabyAI

We used BabyAI's implementation from Minigrid. We reused the bot agent provided with BabyAI's paper and adapted it to the new Minigrid API. Using the bot, we generated 100,000 episodes for each of the 39 tasks of Minigrid's BabyAI and stored the following for each step:

the mission: str
the concatenation of the symbolic observation flattened and the direction: Array of integers of size (147,)
the action: integer
the reward: float

Conceptual Captions

The Conceptual Captions dataset, offered by Google LLC, comprises pairs of image links and their corresponding captions. Each image has been downloaded and, when required, resized to ensure the maximum dimension does not exceed 352 pixels.
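
The resizing rule can be illustrated with a small size calculation. This is a minimal sketch that assumes the aspect ratio is preserved and that images already within the limit are left untouched; neither detail is stated above, and `fit_within` is a hypothetical helper name.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_dim: int = 352) -> Tuple[int, int]:
    """Return (width, height) scaled down so that max(width, height) <= max_dim."""
    largest = max(width, height)
    if largest <= max_dim:
        return width, height  # already small enough: no resize required
    scale = max_dim / largest
    # Round to whole pixels and never let a side collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(704, 352))  # (352, 176)
print(fit_within(300, 200))  # (300, 200)
```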

MetaWorld

We used the 50 tasks from MetaWorld v2. We constrained the episode to a duration of 100 timesteps, which is always sufficient to solve the task.

MuJoCo

We used the 11 environments of Gymnasium MuJoCo.

OK-VQA

We used the OK-VQA dataset released by Kenneth Marino, Mohammad Rastegari, Ali Farhadi, and Roozbeh Mottaghi. The data were formatted to match Hugging Face dataset requirements, and images were resized so that the largest dimension is at most 352 pixels.

OSCAR

We modified the "unshuffled_deduplicated_en" split of the OSCAR 2019 dataset, initially put together by Pedro J. Ortiz, Benoît Sagot, and Laurent Romary and licensed under CC BY 4.0. We cleaned and deduplicated the dataset using the methods and parameters used for the ROOTS dataset (Laurençon et al., 2023).

The dataset was split into 30 even shards, each cleaned and deduplicated independently before being concatenated again.
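
The shard-wise procedure can be sketched as follows. Exact-match deduplication is used here purely as a stand-in; the actual cleaning and deduplication follow the ROOTS methods, which are more involved. All function names are illustrative.

```python
from typing import Iterable, List


def split_even(items: List[str], num_shards: int) -> List[List[str]]:
    """Split items into num_shards contiguous shards whose sizes differ by at most one."""
    base, extra = divmod(len(items), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        size = base + (1 if i < extra else 0)
        shards.append(items[start:start + size])
        start += size
    return shards


def dedup_exact(shard: Iterable[str]) -> List[str]:
    """Drop exact duplicates within one shard, keeping first occurrences in order."""
    return list(dict.fromkeys(shard))


def process(items: List[str], num_shards: int = 30) -> List[str]:
    """Deduplicate each shard independently, then concatenate the results."""
    out: List[str] = []
    for shard in split_even(items, num_shards):
        out.extend(dedup_exact(shard))
    return out


docs = ["a", "a", "b", "c", "a", "c"]
print(process(docs, num_shards=2))  # ['a', 'b', 'c', 'a']
```

Because each shard is processed on its own, a document that appears in two different shards survives in both, as the second "a" does in the toy output.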

Considerations for Using the Data

Known Issues

Some BabyAI tasks are missing due to incompatibility with the training bot:
- babyai-key-in-box
- babyai-go-to-imp-unlock
- babyai-unlock-to-unlock
- babyai-unlock
For some Atari tasks, the episode is too long, causing an OverflowError when loading the dataset:
- atari-enduro
For some tasks, although the score can be higher than the random agent, we can't consider the task as solved:
- atari-bowling
- atari-privateeye
- atari-solaris
- atari-venture
- metaworld-bin-picking
- metaworld-disassemble
- metaworld-peg-insert-side
- metaworld-plate-slide
- metaworld-push-back
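
When iterating over many tasks, it may be convenient to skip the ones listed above. The sketch below restates the three lists as sets and filters a list of task names; the helper `usable_tasks` and its `skip_unsolved` flag are hypothetical, not part of the dataset tooling.

```python
from typing import Iterable, List

MISSING_TASKS = {
    "babyai-key-in-box",
    "babyai-go-to-imp-unlock",
    "babyai-unlock-to-unlock",
    "babyai-unlock",
}
OVERFLOW_TASKS = {"atari-enduro"}
UNSOLVED_TASKS = {
    "atari-bowling",
    "atari-privateeye",
    "atari-solaris",
    "atari-venture",
    "metaworld-bin-picking",
    "metaworld-disassemble",
    "metaworld-peg-insert-side",
    "metaworld-plate-slide",
    "metaworld-push-back",
}


def usable_tasks(task_names: Iterable[str], skip_unsolved: bool = False) -> List[str]:
    """Keep tasks that are present and loadable; optionally drop unsolved ones too."""
    skip = MISSING_TASKS | OVERFLOW_TASKS
    if skip_unsolved:
        skip = skip | UNSOLVED_TASKS
    return [name for name in task_names if name not in skip]


tasks = ["atari-pong", "atari-enduro", "atari-bowling", "babyai-unlock", "mujoco-ant"]
print(usable_tasks(tasks))                      # ['atari-pong', 'atari-bowling', 'mujoco-ant']
print(usable_tasks(tasks, skip_unsolved=True))  # ['atari-pong', 'mujoco-ant']
```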

Future Developments

We plan to expand the dataset to include the following additional domains:

DM Lab
Sokoban
Procgen
DM Control Suite (with and without pixels)

Additional Information

Licensing Information

This dataset is released under the Apache 2.0 license.

Citation Information

@misc{gallouedec2023giadataset,
  title={GIA Dataset: A Multi-Modal, Multi-Task Learning Resource},
  author={Gallouédec, Quentin and Beeching, Edward and Romac, Clément},
  year={2023},
  howpublished={\url{https://huggingface.co/datasets/gia-project/gia-dataset}},
  note={Part of the GIA Project}
}

Acknowledgment

We would like to extend our sincere gratitude to:

Shengyi Costa Huang for his invaluable assistance with the pretrained models used in this research