ImageNetDataset
Documentation for ImageNetDataset.
API Reference
Dataset
ImageNetDataset.ImageNet — Type
ImageNet(; split=:train, dir=nothing, kwargs...)
ImageNet([split])

The ImageNet 2012 Classification Dataset (ILSVRC 2012-2017).
This is the most widely used subset of ImageNet. It spans 1000 object classes and contains 1,281,167 training images, 50,000 validation images and 100,000 test images. By default, each image is in 224 × 224 × 3 format in RGB color space. This can be changed by modifying the preprocessor transform.
Arguments
- dir: specify a directory from which to load or download the dataset; otherwise the default location is used.
- train_dir, val_dir, test_dir, devkit_dir: optional subdirectory names of dir. Default to "train", "val", "test" and "devkit".
- split: selects the data partition. Can take the values :train, :val and :test. Defaults to :train.
- transform: preprocessor applied to convert an image file to an array. Assumes a file path as input and an array in WHC format as output. Defaults to CenterCropNormalize, which applies a center-cropping view and normalization using coefficients from PyTorch's vision models.
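As a sketch of how these keyword arguments combine (the directory path below is a placeholder; the ImageNet files themselves must be obtained separately, since the dataset is not freely downloadable):

```julia
using ImageNetDataset

# Load the validation split from a custom directory, swapping the default
# center-crop preprocessor for a random-crop one:
dataset = ImageNet(:val;
    dir = "/path/to/imagenet",  # placeholder path
    transform = RandomCropNormalize(; output_size = (224, 224)),
)
```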
Fields
- split: Symbol indicating the selected data partition.
- transform: preprocessing pipeline. Can be configured to select output dimensions and type.
- paths: paths to the ImageNet images.
- targets: an array storing the targets for supervised learning.
- metadata: a dictionary containing additional information on the dataset.
Also refer to AbstractTransform, CenterCropNormalize.
Methods
- dataset[i]: return observation(s) i as a named tuple of features and targets.
- dataset[:]: return all observations as a named tuple of features and targets.
- length(dataset): number of observations.
- convert2image: converts features to RGB images.
Examples
julia> using ImageNetDataset
julia> dataset = ImageNet(:val);
julia> dataset[1:5].targets
5-element Vector{Int64}:
1
1
1
1
1
julia> X, y = dataset[1:5];
julia> size(X)
(224, 224, 3, 5)
julia> X, y = dataset[2000];
julia> convert2image(dataset, X)
julia> dataset.metadata
Dict{String, Any} with 4 entries:
"class_WNIDs" => ["n01440764", "n01443537", "n01484850", "n01491361", "n01494475", …
"class_description" => ["freshwater dace-like game fish of Europe and western Asia noted …
"class_names" => Vector{SubString{String}}[["tench", "Tinca tinca"], ["goldfish", "…
"wnid_to_label" => Dict("n07693725"=>932, "n03775546"=>660, "n01689811"=>45, "n021008…
julia> dataset.metadata["class_names"][y]
3-element Vector{SubString{String}}:
"common iguana"
"iguana"
 "Iguana iguana"

References
[1]: Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge
ImageNetDataset.convert2image — Function
convert2image(dataset::ImageNet, i)
convert2image(dataset::ImageNet, x)

Convert the observation(s) i from the dataset to image(s). It can also convert a numerical array x.
Examples
julia> using ImageNetDataset, ImageInTerminal
julia> d = ImageNet()
julia> convert2image(d, 1:2)
# You should see 2 images in the terminal
julia> x = d[1].features;
julia> convert2image(d, x)

Preprocessing
ImageNetDataset.AbstractTransform — Type
AbstractTransform

Abstract type of ImageNet preprocessing pipelines. Expected interface:
- transform(method, image_path): load an image and convert it to a WHC array.
- inverse_transform(method, array): convert a WHC[N] array to image[s].
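A minimal custom pipeline implementing this interface might look as follows. This is an illustrative sketch, not part of the package; it assumes the JuliaImages stack (Images.jl) for loading, resizing and channel manipulation:

```julia
using ImageNetDataset
using Images  # assumed dependency for this sketch

# A bare-bones transform that only resizes, without normalization.
struct PlainResize <: ImageNetDataset.AbstractTransform
    output_size::Tuple{Int,Int}
end

function ImageNetDataset.transform(tfm::PlainResize, image_path::AbstractString)
    img = imresize(load(image_path), tfm.output_size)
    # H×W image of RGB pixels -> C×H×W channel view -> W×H×C Float32 array
    return permutedims(Float32.(channelview(img)), (3, 2, 1))
end

function ImageNetDataset.inverse_transform(tfm::PlainResize, x::AbstractArray{<:Real,3})
    # W×H×C array -> C×H×W -> H×W image of RGB pixels
    return colorview(RGB, permutedims(x, (3, 2, 1)))
end
```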
ImageNetDataset.CenterCropNormalize — Type
CenterCropNormalize([; output_size, open_size, mean, std])

A preprocessing pipeline that center-crops an input image to output_size and normalizes it according to mean and std. Returns an array in WHC format (width, height, channels).
Applied using transform and inverse_transform.
Keyword arguments:
- output_size: output size (width, height) of the center-crop. Defaults to (224, 224).
- open_size: preferred size (width, height) to open the image at using JpegTurbo. Defaults to (256, 256).
- mean: mean of the normalization over color channels. Defaults to (0.485f0, 0.456f0, 0.406f0).
- std: standard deviation of the normalization over color channels. Defaults to (0.229f0, 0.224f0, 0.225f0).
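For instance, the default pipeline and a variant with custom settings can be constructed like this (the alternative coefficients are arbitrary illustration values, not recommendations):

```julia
using ImageNetDataset

# Default ImageNet-style preprocessing (224×224, PyTorch coefficients):
tfm = CenterCropNormalize()

# Larger crop with simple (-1, 1)-style normalization for illustration:
tfm_custom = CenterCropNormalize(;
    output_size = (256, 256),
    mean = (0.5f0, 0.5f0, 0.5f0),
    std  = (0.5f0, 0.5f0, 0.5f0),
)
```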
ImageNetDataset.RandomCropNormalize — Type
RandomCropNormalize([; output_size, open_size, mean, std])

A preprocessing pipeline that crops an input image to output_size at a random position and normalizes it according to mean and std. Returns an array in WHC format (width, height, channels).
Applied using transform and inverse_transform.
Keyword arguments:
- output_size: output size (width, height) of the crop. Defaults to (224, 224).
- open_size: preferred size (width, height) to open the image at using JpegTurbo. Defaults to (256, 256).
- mean: mean of the normalization over color channels. Defaults to (0.485f0, 0.456f0, 0.406f0).
- std: standard deviation of the normalization over color channels. Defaults to (0.229f0, 0.224f0, 0.225f0).
Preprocessing transforms can also be applied manually:
ImageNetDataset.transform — Function
transform(tfm, path)

Load an image from path and convert it to a WHC array using the preprocessing transformation tfm.
ImageNetDataset.inverse_transform — Function
inverse_transform(tfm, array)

Convert a WHC array to an image by applying the inverse of the preprocessing transformation tfm.
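A typical manual round-trip looks like this (the file path is a placeholder for any JPEG on disk):

```julia
using ImageNetDataset

tfm = CenterCropNormalize()

# "image.jpg" is a placeholder path:
x = transform(tfm, "image.jpg")    # 224×224×3 Float32 array in WHC format
img = inverse_transform(tfm, x)    # back to an RGB image for display
```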
Metadata
ImageNetDataset.class — Function
class(dataset, i)

Obtain the class name for the given target index i.
ImageNetDataset.description — Function
description(dataset, i)

Obtain the class description for the given target index i.
ImageNetDataset.wnid — Function
wnid(dataset, i)

Obtain the WordNet ID for the given target index i.
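Together, these accessors map a target index to human-readable metadata. A sketch of their use, assuming the dataset files are available locally (the example outputs below follow the metadata shown earlier for index 1):

```julia
using ImageNetDataset

dataset = ImageNet(:val);  # assumes ImageNet files are present locally

class(dataset, 1)        # e.g. ["tench", "Tinca tinca"]
wnid(dataset, 1)         # e.g. "n01440764"
description(dataset, 1)  # e.g. "freshwater dace-like game fish of Europe ..."
```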