ImageNetDataset
Documentation for ImageNetDataset.
API Reference
Dataset
ImageNetDataset.ImageNet
— Type

```julia
ImageNet(; split=:train, dir=nothing, kwargs...)
ImageNet([split])
```
The ImageNet 2012 Classification Dataset (ILSVRC 2012-2017).
This is the most widely used subset of ImageNet. It spans 1000 object classes and contains 1,281,167 training images, 50,000 validation images and 100,000 test images. By default, each image is a 224 × 224 × 3 array in RGB color space. This can be changed by modifying the preprocessor `transform`.
Arguments
- `dir`: directory to load or download the dataset from; otherwise the default directory is used.
- `train_dir`, `val_dir`, `test_dir`, `devkit_dir`: optional subdirectory names of `dir`. Default to `"train"`, `"val"`, `"test"` and `"devkit"`.
- `split`: selects the data partition. Can take the values `:train`, `:val` and `:test`. Defaults to `:train`.
- `transform`: preprocessor applied to convert an image file to an array. Takes a file path as input and returns an array in WHC format. Defaults to `CenterCropNormalize`, which applies a center-cropping view and normalization using coefficients from PyTorch's vision models.
Fields
- `split`: Symbol indicating the selected data partition.
- `transform`: preprocessing pipeline. Can be configured to select output dimensions and type.
- `paths`: paths to the ImageNet images.
- `targets`: an array storing the targets for supervised learning.
- `metadata`: a dictionary containing additional information on the dataset.

Also refer to `AbstractTransform` and `CenterCropNormalize`.
Methods
- `dataset[i]`: return observation(s) `i` as a named tuple of features and targets.
- `dataset[:]`: return all observations as a named tuple of features and targets.
- `length(dataset)`: number of observations.
- `convert2image`: converts features to `RGB` images.
Examples
```julia
julia> using ImageNetDataset

julia> dataset = ImageNet(:val);

julia> dataset[1:5].targets
5-element Vector{Int64}:
 1
 1
 1
 1
 1

julia> X, y = dataset[1:5];

julia> size(X)
(224, 224, 3, 5)

julia> X, y = dataset[2000];

julia> convert2image(dataset, X)

julia> dataset.metadata
Dict{String, Any} with 4 entries:
  "class_WNIDs"       => ["n01440764", "n01443537", "n01484850", "n01491361", "n01494475", …
  "class_description" => ["freshwater dace-like game fish of Europe and western Asia noted …
  "class_names"       => Vector{SubString{String}}[["tench", "Tinca tinca"], ["goldfish", "…
  "wnid_to_label"     => Dict("n07693725"=>932, "n03775546"=>660, "n01689811"=>45, "n021008…

julia> dataset.metadata["class_names"][y]
3-element Vector{SubString{String}}:
 "common iguana"
 "iguana"
 "Iguana iguana"
```
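Since indexing with a vector of indices returns a batched WHCN array, mini-batch iteration reduces to partitioning the index range. The sketch below shows only the index-partitioning pattern; the dataset lookup itself is kept in a comment, since it requires the downloaded data:

```julia
# Mini-batch indexing pattern for `dataset[idxs]`, demonstrated on plain
# index ranges; in real use, `length(dataset)` replaces the literal below.
batch_size = 32
n_obs = 100  # stand-in for length(dataset)
for idxs in Iterators.partition(1:n_obs, batch_size)
    # X, y = dataset[collect(idxs)]  # X is 224×224×3×B, y a Vector{Int}
end
```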
References
[1]: Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge
ImageNetDataset.convert2image
— Function

```julia
convert2image(dataset::ImageNet, i)
convert2image(dataset::ImageNet, x)
```

Convert the observation(s) `i` from `dataset` to image(s). It can also convert a numerical array `x`.
Examples
```julia
julia> using ImageNetDataset, ImageInTerminal

julia> d = ImageNet();

julia> convert2image(d, 1:2)
# You should see 2 images in the terminal

julia> x = d[1].features;

julia> convert2image(d, x)
```
Preprocessing
ImageNetDataset.AbstractTransform
— Type

```julia
AbstractTransform
```

Abstract type of ImageNet preprocessing pipelines. Expected interface:
- `transform(method, image_path)`: load an image and convert it to a WHC array.
- `inverse_transform(method, array)`: convert a WHC[N] array back to image(s).
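A custom pipeline only needs to implement these two methods. The following sketch uses stand-in definitions to illustrate the contract (the real `AbstractTransform`, `transform`, and `inverse_transform` live in ImageNetDataset; the no-op transform here is purely hypothetical):

```julia
# Stand-in definitions mirroring the interface; not the package's own code.
abstract type AbstractTransform end

struct NoOpTransform <: AbstractTransform end

# transform(method, image_path): here we simply pass the input through
# instead of loading an image file from disk.
transform(::NoOpTransform, x) = x

# inverse_transform(method, array): likewise a no-op.
inverse_transform(::NoOpTransform, x) = x
```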
ImageNetDataset.CenterCropNormalize
— Type

```julia
CenterCropNormalize([; output_size, open_size, mean, std])
```

Preprocessing pipeline that center-crops an input image to `output_size` and normalizes it according to `mean` and `std`. Returns an array in WHC format `(width, height, channels)`.

Applied using `transform` and `inverse_transform`.

Keyword arguments:
- `output_size`: output size `(width, height)` of the center crop. Defaults to `(224, 224)`.
- `open_size`: preferred size `(width, height)` to open the image at using JpegTurbo. Defaults to `(256, 256)`.
- `mean`: mean of the normalization over color channels. Defaults to `(0.485f0, 0.456f0, 0.406f0)`.
- `std`: standard deviation of the normalization over color channels. Defaults to `(0.229f0, 0.224f0, 0.225f0)`.
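The numerical core of this pipeline can be sketched with plain arrays. This is an assumption-level illustration of the WHC convention and the normalization coefficients, not the package's implementation (which operates on image files):

```julia
# Center-crop a W×H×3 array to `output_size`, then normalize each color
# channel with the PyTorch vision coefficients. Illustrative only.
function center_crop_normalize(img::AbstractArray{<:Real,3};
                               output_size=(224, 224),
                               mean=(0.485f0, 0.456f0, 0.406f0),
                               std=(0.229f0, 0.224f0, 0.225f0))
    w, h, _ = size(img)
    ow, oh = output_size
    x0 = (w - ow) ÷ 2 + 1            # top-left corner of the centered crop
    y0 = (h - oh) ÷ 2 + 1
    cropped = @view img[x0:x0+ow-1, y0:y0+oh-1, :]
    out = Array{Float32}(undef, ow, oh, 3)
    for c in 1:3                      # per-channel normalization
        out[:, :, c] .= (cropped[:, :, c] .- mean[c]) ./ std[c]
    end
    return out
end
```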
ImageNetDataset.RandomCropNormalize
— Type

```julia
RandomCropNormalize([; output_size, open_size, mean, std])
```

Preprocessing pipeline that crops an input image to `output_size` at a random position and normalizes it according to `mean` and `std`. Returns an array in WHC format `(width, height, channels)`.

Applied using `transform` and `inverse_transform`.

Keyword arguments:
- `output_size`: output size `(width, height)` of the random crop. Defaults to `(224, 224)`.
- `open_size`: preferred size `(width, height)` to open the image at using JpegTurbo. Defaults to `(256, 256)`.
- `mean`: mean of the normalization over color channels. Defaults to `(0.485f0, 0.456f0, 0.406f0)`.
- `std`: standard deviation of the normalization over color channels. Defaults to `(0.229f0, 0.224f0, 0.225f0)`.
Preprocessing transforms can also be applied manually:
ImageNetDataset.transform
— Function

```julia
transform(tfm, path)
```

Load an image from `path` and convert it to a WHC array using the preprocessing transformation `tfm`.
ImageNetDataset.inverse_transform
— Function

```julia
inverse_transform(tfm, array)
```

Convert a WHC array to an image by applying the inverse of the preprocessing transformation `tfm`.
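The inverse direction must undo the per-channel normalization before an array can be interpreted as RGB values again. A minimal sketch of that step (assumptions: the PyTorch coefficients from above, and clamping to the valid color range [0, 1]; this is not the package's own code):

```julia
# Undo per-channel normalization: x * std + mean, clamped to [0, 1].
function denormalize(x::AbstractArray{<:Real,3};
                     mean=(0.485f0, 0.456f0, 0.406f0),
                     std=(0.229f0, 0.224f0, 0.225f0))
    out = Array{Float32}(undef, size(x))
    for c in 1:3
        out[:, :, c] .= x[:, :, c] .* std[c] .+ mean[c]
    end
    return clamp.(out, 0f0, 1f0)
end
```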
Metadata
ImageNetDataset.class
— Function

```julia
class(dataset, i)
```

Obtain the class name for the given target index `i`.
ImageNetDataset.description
— Function

```julia
description(dataset, i)
```

Obtain the class description for the given target index `i`.
ImageNetDataset.wnid
— Function

```julia
wnid(dataset, i)
```

Obtain the WordNet ID for the given target index `i`.
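These accessors are thin lookups into the `metadata` dictionary shown earlier. The sketch below mocks a two-class metadata Dict to show the mapping in both directions; the entries and helper names (`wnid_of`, `label_of`) are illustrative, not part of the package:

```julia
# Mock metadata with the same keys as `dataset.metadata`.
metadata = Dict{String,Any}(
    "class_WNIDs"   => ["n01440764", "n01443537"],
    "class_names"   => [["tench", "Tinca tinca"], ["goldfish"]],
    "wnid_to_label" => Dict("n01440764" => 1, "n01443537" => 2),
)

# forward lookup: target index -> WordNet ID
wnid_of(i) = metadata["class_WNIDs"][i]

# reverse lookup: WordNet ID -> target index
label_of(w) = metadata["wnid_to_label"][w]
```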