ImageNetDataset

Documentation for ImageNetDataset.

API Reference

Dataset

ImageNetDataset.ImageNet — Type
ImageNet(; split=:train, dir=nothing, kwargs...)
ImageNet([split])

The ImageNet 2012 Classification Dataset (ILSVRC 2012-2017).

This is the most widely used subset of ImageNet. It spans 1000 object classes and contains 1,281,167 training images, 50,000 validation images, and 100,000 test images. By default, each image is returned as a 224 × 224 × 3 array in RGB color space. This can be changed by modifying the preprocessor transform.

Arguments

  • dir: you can pass a specific directory from which to load or into which to download the dataset; otherwise the default location is used.

  • train_dir, val_dir, test_dir, devkit_dir: optional subdirectory names of dir. Default to "train", "val", "test" and "devkit".

  • split: selects the data partition. Can take the values :train, :val, and :test. Defaults to :train.

  • transform: preprocessor applied to convert an image file to an array. Takes a file path as input and returns an array in WHC format. Defaults to CenterCropNormalize, which applies a center-cropping view and normalization using coefficients from PyTorch's vision models.
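For example, these arguments can be combined when constructing the dataset. The following sketch assumes the dataset files are already in place; the directory path shown is purely illustrative:

julia> using ImageNetDataset

julia> transform = CenterCropNormalize(; output_size=(196, 196));

julia> dataset = ImageNet(:val; dir="path/to/imagenet", transform=transform);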

Fields

  • split: Symbol indicating the selected data partition
  • transform: Preprocessing pipeline. Can be configured to select output dimensions and type.
  • paths: paths to ImageNet images
  • targets: An array storing the targets for supervised learning.
  • metadata: A dictionary containing additional information on the dataset.

Also refer to AbstractTransform, CenterCropNormalize.

Methods

  • dataset[i]: Return observation(s) i as a named tuple of features and targets.

  • dataset[:]: Return all observations as a named tuple of features and targets.

  • length(dataset): Number of observations.

  • convert2image(dataset, x): Convert features to RGB images.

Examples

julia> using ImageNetDataset

julia> dataset = ImageNet(:val);

julia> dataset[1:5].targets
5-element Vector{Int64}:
 1
 1
 1
 1
 1

julia> X, y = dataset[1:5];

julia> size(X)
(224, 224, 3, 5)

julia> X, y = dataset[2000];

julia> convert2image(dataset, X)

julia> dataset.metadata
Dict{String, Any} with 4 entries:
  "class_WNIDs"       => ["n01440764", "n01443537", "n01484850", "n01491361", "n01494475", …
  "class_description" => ["freshwater dace-like game fish of Europe and western Asia noted …
  "class_names"       => Vector{SubString{String}}[["tench", "Tinca tinca"], ["goldfish", "…
  "wnid_to_label"     => Dict("n07693725"=>932, "n03775546"=>660, "n01689811"=>45, "n021008…

julia> dataset.metadata["class_names"][y]
3-element Vector{SubString{String}}:
 "common iguana"
 "iguana"
 "Iguana iguana"

References

[1]: Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge

ImageNetDataset.convert2image — Function
convert2image(dataset::ImageNet, i)
convert2image(dataset::ImageNet, x)

Convert observation(s) i from the dataset to image(s). A numerical array x can also be converted directly.

Examples

julia> using ImageNetDataset, ImageInTerminal

julia> d = ImageNet();

julia> convert2image(d, 1:2)
# You should see 2 images in the terminal

julia> x = d[1].features;

julia> convert2image(ImageNet, x) # or convert2image(d, x)

Preprocessing

ImageNetDataset.AbstractTransform — Type
AbstractTransform

Abstract supertype of ImageNet preprocessing pipelines. Expected interface:

  • transform(method, image_path): load image and convert it to a WHC array
  • inverse_transform(method, array): convert WHC[N] array to image[s]
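As a sketch, a custom pipeline could implement this interface as follows. The ResizeTransform name and its internals are illustrative assumptions, not part of the package, and the sketch relies on Images.jl:

import ImageNetDataset: AbstractTransform, transform, inverse_transform
using Images

# Illustrative transform that resizes the whole image instead of cropping
struct ResizeTransform <: AbstractTransform
    output_size::Tuple{Int,Int}  # (width, height)
end

function transform(t::ResizeTransform, image_path::AbstractString)
    img = imresize(load(image_path), reverse(t.output_size)) # imresize takes (height, width)
    permutedims(Float32.(channelview(img)), (3, 2, 1))       # CHW -> WHC array
end

function inverse_transform(::ResizeTransform, x::AbstractArray{<:Real,3})
    colorview(RGB, permutedims(x, (3, 2, 1)))                # WHC array -> RGB image
end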
ImageNetDataset.CenterCropNormalize — Type
CenterCropNormalize([; output_size, open_size, mean, std])

Preprocessing pipeline that center-crops an input image to output_size and normalizes it using mean and std. Returns an array in WHC format (width, height, channels).

Applied using transform and inverse_transform.

Keyword arguments:

  • output_size: Output size (width, height) of the center-crop. Defaults to (224, 224).
  • open_size: Preferred size (width, height) at which to open the image using JpegTurbo. Defaults to (256, 256).
  • mean: Mean of the normalization over color channels. Defaults to (0.485f0, 0.456f0, 0.406f0).
  • std: Standard deviation of the normalization over color channels. Defaults to (0.229f0, 0.224f0, 0.225f0).
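For instance, a smaller output resolution can be requested when constructing the dataset. This sketch assumes the validation split is available locally:

julia> preprocess = CenterCropNormalize(; output_size=(128, 128));

julia> dataset = ImageNet(:val; transform=preprocess);

julia> size(dataset[1].features)
(128, 128, 3)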
ImageNetDataset.RandomCropNormalize — Type
RandomCropNormalize([; output_size, open_size, mean, std])

Preprocessing pipeline that crops an input image to output_size at a random position and normalizes it using mean and std. Returns an array in WHC format (width, height, channels).

Applied using transform and inverse_transform.

Keyword arguments:

  • output_size: Output size (width, height) of the random crop. Defaults to (224, 224).
  • open_size: Preferred size (width, height) at which to open the image using JpegTurbo. Defaults to (256, 256).
  • mean: Mean of the normalization over color channels. Defaults to (0.485f0, 0.456f0, 0.406f0).
  • std: Standard deviation of the normalization over color channels. Defaults to (0.229f0, 0.224f0, 0.225f0).

Preprocessing transforms can also be applied manually:
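For example, using the paths field of a loaded dataset (this sketch assumes the validation split is available locally):

julia> using ImageNetDataset

julia> dataset = ImageNet(:val);

julia> tfm = CenterCropNormalize();

julia> x = transform(tfm, dataset.paths[1]);   # image file -> WHC array

julia> img = inverse_transform(tfm, x);        # WHC array -> RGB image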
