Installing the ImageNet Dataset
The ImageNet 2012 Classification Dataset (ILSVRC 2012-2017) can be downloaded at image-net.org after signing up and accepting the terms of access. It is therefore required that you download this dataset manually.
Existing Installation
The dataset structure is assumed to look as follows:
ImageNet
├── train
├── val
│ ├── n01440764
│ │ ├── ILSVRC2012_val_00000293.JPEG
│ │ ├── ILSVRC2012_val_00002138.JPEG
│ │ └── ...
│ ├── n01443537
│ └── ...
├── test
└── devkit
├── data
│ ├── meta.mat
│ └── ...
└── ...ImageNetDataset.jl can be used to load a single split and doesn't require all subdirectories train, test and val to be available.
ImageNetDataset.jl expects the ImageNet directory to live in ~/.julia/datadeps/. If this structure is not given, we offer two solutions:
Option 1: Specifying custom directories
When creating an ImageNet dataset, a custom root directory can be specified via the keyword argument dir:
dir = joinpath(homedir(), "Path", "To", "ImageNet")
dataset = ImageNet(; dir=dir)If the existing subdirectory names differ from train, val, test and devkit as well, they can be changed via the keyword arguments train_dir, val_dir, test_dir and devkit_dir.
Option 2: Using symbolic links
Instead of manually specifying directories, you can also create a symbolic link from your existing installation to ~/.julia/datadeps/ImageNet.
On Unix-like operating systems, this can be done using ln:
ln -s my/path/to/ImageNet ~/.julia/datadeps/ImageNetIn case the subdirectory structure is different as well, multiple symbolic links can be set to the following directories:
~/.julia/datadeps/ImageNet/train~/.julia/datadeps/ImageNet/val~/.julia/datadeps/ImageNet/test~/.julia/datadeps/ImageNet/devkit
New Installation
Download the following files from the ILSVRC2012 download page:
| Name | File name | Size | Note |
|---|---|---|---|
| Development kit (Task 1 & 2) | ILSVRC2012_devkit_t12.tar.gz | 2.5MB | Always required, contains metadata |
| Training images (Task 1 & 2) | ILSVRC2012_img_train.tar | 138GB | Only required for :train split |
| Validation images (all tasks) | ILSVRC2012_img_val.tar | 6.3GB | Only required for :val split |
| Test images (all tasks) | ILSVRC2012_img_test_v10102019.tar | 13GB | Only required for :test split |
You can use ImageNetDataset.jl without downloading all splits.
After downloading the data, move and extract the training and validation images to labeled subfolders by running the following shell scripts:
Extract metadata from the devkit:
mkdir -p ImageNet/devkit && tar -xvf ILSVRC2012_devkit_t12.tar.gz -C ImageNet/devkit --strip-components=1Extract the training data:
mkdir -p ImageNet/train && tar -xvf ILSVRC2012_img_train.tar -C ImageNet/train
# Unpack all 1000 compressed tar-files, one for each category:
cd ImageNet/train
find . -name "*.tar" | while read NAME ; do mkdir -p "\${NAME%.tar}"; tar -xvf "\${NAME}" -C "\${NAME%.tar}"; rm -f "\${NAME}"; doneExtract the validation data:
mkdir -p ImageNet/val && tar -xvf ILSVRC2012_img_val.tar -C ImageNet/valAnd run the following script adapted from soumith to create all class directories and move images into corresponding directories:
cd ImageNet/val
wget -qO- https://raw.githubusercontent.com/Julia-XAI/ImageNetDataset.jl/master/docs/src/valprep.sh | bash