ImageNetDataset

Documentation for ImageNetDataset.

API Reference

Dataset

ImageNetDataset.ImageNet — Type
ImageNet(; split=:train, dir=nothing, kwargs...)
ImageNet([split])

The ImageNet 2012 Classification Dataset (ILSVRC 2012-2017).

This is the most widely used subset of ImageNet. It spans 1000 object classes and contains 1,281,167 training images, 50,000 validation images, and 100,000 test images. By default, each image is returned as a 224 × 224 × 3 array in RGB color space. This can be changed by modifying the preprocessor transform.

Arguments

  • dir: you can pass a specific directory from which to load or into which to download the dataset; otherwise the default location is used.

  • train_dir, val_dir, test_dir, devkit_dir: optional subdirectory names of dir. Default to "train", "val", "test" and "devkit".

  • split: selects the data partition. Can take the values :train, :val, and :test. Defaults to :train.

  • transform: preprocessor applied to convert an image file to an array. Takes a file path as input and returns an array in WHC format. Defaults to CenterCropNormalize, which applies a center-cropping view and normalization using coefficients from PyTorch's vision models.
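For example, these arguments can be combined when constructing the dataset. The following sketch assumes the dataset files are already in place; the directory path shown is purely illustrative:

julia> using ImageNetDataset

julia> transform = CenterCropNormalize(; output_size=(196, 196));

julia> dataset = ImageNet(:val; dir="path/to/imagenet", transform=transform);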

Fields

  • split: Symbol indicating the selected data partition
  • transform: Preprocessing pipeline. Can be configured to select output dimensions and type.
  • paths: paths to ImageNet images
  • targets: An array storing the targets for supervised learning.
  • metadata: A dictionary containing additional information on the dataset.

Also refer to AbstractTransform, CenterCropNormalize.

Methods

  • dataset[i]: Return observation(s) i as a named tuple of features and targets.

  • dataset[:]: Return all observations as a named tuple of features and targets.

  • length(dataset): Number of observations.

  • convert2image(dataset, x): Convert features to RGB images.

Examples

julia> using ImageNetDataset

julia> dataset = ImageNet(:val);

julia> dataset[1:5].targets
5-element Vector{Int64}:
 1
 1
 1
 1
 1

julia> X, y = dataset[1:5];

julia> size(X)
(224, 224, 3, 5)

julia> X, y = dataset[2000];

julia> convert2image(dataset, X)

julia> dataset.metadata
Dict{String, Any} with 4 entries:
  "class_WNIDs"       => ["n01440764", "n01443537", "n01484850", "n01491361", "n01494475", …
  "class_description" => ["freshwater dace-like game fish of Europe and western Asia noted …
  "class_names"       => Vector{SubString{String}}[["tench", "Tinca tinca"], ["goldfish", "…
  "wnid_to_label"     => Dict("n07693725"=>932, "n03775546"=>660, "n01689811"=>45, "n021008…

julia> dataset.metadata["class_names"][y]
3-element Vector{SubString{String}}:
 "common iguana"
 "iguana"
 "Iguana iguana"

References

[1]: Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge

ImageNetDataset.convert2image — Function
convert2image(dataset::ImageNet, i)
convert2image(dataset::ImageNet, x)

Convert observation(s) i from the dataset to image(s). A numerical array x can also be converted directly.

Examples

julia> using ImageNetDataset, ImageInTerminal

julia> d = ImageNet();

julia> convert2image(d, 1:2)
# You should see 2 images in the terminal

julia> x = d[1].features;

julia> convert2image(ImageNet, x) # or convert2image(d, x)

Preprocessing

ImageNetDataset.AbstractTransform — Type
AbstractTransform

Abstract supertype of ImageNet preprocessing pipelines. Expected interface:

  • transform(method, image_path): load image and convert it to a WHC array
  • inverse_transform(method, array): convert WHC[N] array to image[s]
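As a sketch, a custom pipeline could implement this interface as follows. The ResizeTransform name and its internals are illustrative assumptions, not part of the package, and the sketch relies on Images.jl:

import ImageNetDataset: AbstractTransform, transform, inverse_transform
using Images

# Illustrative transform that resizes the whole image instead of cropping
struct ResizeTransform <: AbstractTransform
    output_size::Tuple{Int,Int}  # (width, height)
end

function transform(t::ResizeTransform, image_path::AbstractString)
    img = imresize(load(image_path), reverse(t.output_size)) # imresize takes (height, width)
    permutedims(Float32.(channelview(img)), (3, 2, 1))       # CHW -> WHC array
end

function inverse_transform(::ResizeTransform, x::AbstractArray{<:Real,3})
    colorview(RGB, permutedims(x, (3, 2, 1)))                # WHC array -> RGB image
end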
ImageNetDataset.CenterCropNormalize — Type
CenterCropNormalize([; output_size, open_size, mean, std])

Preprocessing pipeline that center-crops an input image to output_size and normalizes it using mean and std. Returns an array in WHC format (width, height, channels).

Applied using transform and inverse_transform.

Keyword arguments:

  • output_size: Output size (width, height) of the center-crop. Defaults to (224, 224).
  • open_size: Preferred size (width, height) at which to open the image using JpegTurbo. Defaults to (256, 256).
  • mean: Mean of the normalization over color channels. Defaults to (0.485f0, 0.456f0, 0.406f0).
  • std: Standard deviation of the normalization over color channels. Defaults to (0.229f0, 0.224f0, 0.225f0).
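For instance, a smaller output resolution can be requested when constructing the dataset. This sketch assumes the validation split is available locally:

julia> preprocess = CenterCropNormalize(; output_size=(128, 128));

julia> dataset = ImageNet(:val; transform=preprocess);

julia> size(dataset[1].features)
(128, 128, 3)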
ImageNetDataset.RandomCropNormalize — Type
RandomCropNormalize([; output_size, open_size, mean, std])

Preprocessing pipeline that crops an input image to output_size at a random position and normalizes it using mean and std. Returns an array in WHC format (width, height, channels).

Applied using transform and inverse_transform.

Keyword arguments:

  • output_size: Output size (width, height) of the random crop. Defaults to (224, 224).
  • open_size: Preferred size (width, height) at which to open the image using JpegTurbo. Defaults to (256, 256).
  • mean: Mean of the normalization over color channels. Defaults to (0.485f0, 0.456f0, 0.406f0).
  • std: Standard deviation of the normalization over color channels. Defaults to (0.229f0, 0.224f0, 0.225f0).

Preprocessing transforms can also be applied manually:
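For example, using the paths field of a loaded dataset (this sketch assumes the validation split is available locally):

julia> using ImageNetDataset

julia> dataset = ImageNet(:val);

julia> tfm = CenterCropNormalize();

julia> x = transform(tfm, dataset.paths[1]);   # image file -> WHC array

julia> img = inverse_transform(tfm, x);        # WHC array -> RGB image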
