
3) How to use (Python interface)


[Up to date with V-1.0.0.0 release]

This page provides basic instructions on how to use the CIANNA Python interface.
Please note that this does not represent the full capabilities of CIANNA.
For a full list and description of the CIANNA high-level functions, refer to the Interface API documentation page.

This guide follows the MNIST example script => mnist_train.py and details most of its instructions.
/!\ Copying only the code portions from this guide will not result in a working script.
Use the full mnist_train.py for a complete working example.

Import CIANNA

Depending on the installation, CIANNA can be imported directly with the following:

import CIANNA as cnn

or after a sys.path.insert() statement specifying the path to the build directory in CIANNA:

import sys, glob
sys.path.insert(0,glob.glob('../../src/build/lib.*/')[-1])
import CIANNA as cnn

 

Initializing CIANNA

The first step before calling any interface function is to call:

cnn.init(in_dim=i_ar([28,28]), in_nb_ch=1, out_dim=10,
		bias=0.1, b_size=16, comp_meth="C_CUDA",
		dynamic_load=1, mixed_precision="FP32C_FP32A")

This function creates the necessary data structures and sets the network parameters that constrain the entire dataset and layer structure. The input dimensions, the number of input channels, and the output dimension are mandatory. The other parameters are optional and have default values (bias=0.1, b_size=10, comp_meth="C_CUDA", ...), but they should be adjusted for optimal training and performance, considering the use case and available hardware. If no CUDA-compatible GPU is available, the value of comp_meth should be changed to "C_BLAS" (or "C_NAIV"). Detailed information about the other parameters is available on the API documentation Wiki page.
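
For instance, on a system without a CUDA-compatible GPU, the same call can simply switch the compute method (a sketch; all other values are kept from the example above):

# CPU-only initialization using the BLAS backend ("C_NAIV" would also work, typically slower)
cnn.init(in_dim=i_ar([28,28]), in_nb_ch=1, out_dim=10,
		bias=0.1, b_size=16, comp_meth="C_BLAS",
		dynamic_load=1, mixed_precision="FP32C_FP32A")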

 

Data format

The Python interface requires data to be handled as NumPy arrays. To be interpreted properly, they need to be explicitly defined as dtype="float32" or dtype="int", depending on the parameter (specified for each parameter in the API documentation). The example script uses simple conversion functions (f_ar, i_ar) to improve code readability. The dataset arrays must be flattened to a single line per object using the C data ordering, and channels must be dissociated (e.g., for a single RGB image, the input vector must contain all the R pixels flattened, then all the G pixels, and finally all the B pixels). For images, the dimensions should be ordered as [depth][height][width] with contiguous memory on the rightmost index. No bias is required as it is automatically added by the framework based on the configuration specified in the init_network function.
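
As an illustration, the f_ar and i_ar helpers are simple NumPy conversions matching the required dtypes, and the channel-dissociated flattening of an RGB image can be obtained with a transpose followed by a reshape (a sketch; rgb_image is a hypothetical height x width x 3 array):

import numpy as np

def i_ar(int_list):
	return np.array(int_list, dtype="int")

def f_ar(float_list):
	return np.array(float_list, dtype="float32")

# Flatten one H x W x 3 image into [all R pixels, all G pixels, all B pixels]
flat_input = np.transpose(rgb_image, (2,0,1)).reshape(-1).astype("float32")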

For example, we provide properly formatted files in a binary format corresponding to MNIST (each example image is 28x28 pixels), but ASCII-text data files would work as well:

import numpy as np
data = np.fromfile("mnist_dat/mnist_input.dat", dtype="float32")
data = np.reshape(data, (80000,28*28))
target = np.fromfile("mnist_dat/mnist_target.dat", dtype="float32")
target = np.reshape(target, (80000,10))

Data are then split into Training (60000), Validation (10000), and Test (10000) datasets.
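
A minimal sketch of that split, assuming the 80000 examples in the provided files are stored in train/valid/test order:

data_train   = data[:60000]
target_train = target[:60000]
data_valid   = data[60000:70000]
target_valid = target[60000:70000]
data_test    = data[70000:80000]
target_test  = target[70000:80000]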

From these data, properly formatted C tables can be created by the CIANNA framework using the following:

cnn.create_dataset("TRAIN", size=60000, input=data_train, target=target_train)
cnn.create_dataset("VALID", size=10000, input=data_valid, target=target_valid)
cnn.create_dataset("TEST" , size=10000, input=data_test, target=target_test)

After this point, the NumPy arrays can be deleted to save some RAM if they are no longer required.
Warning: The CIANNA framework must be initialized before any call to create_dataset.
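
For instance, once the three datasets above have been created (a simple illustration of the point above):

# CIANNA keeps its own formatted copy of the data, so the source arrays can be freed
del data, target
del data_train, target_train, data_valid, target_valid, data_test, target_test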

 

Network architecture

The appropriate input layer is automatically constructed from the dataset creation using the parameters provided in the init_network function. Successive layers can then be added sequentially. The following code example constructs an enhanced version of a LeNet-5, and the parameters should be self-explanatory:

cnn.conv(f_size=i_ar([5,5]), nb_filters=8 , padding=i_ar([2,2]), activation="RELU")
cnn.pool(p_size=i_ar([2,2]), p_type="MAX")
cnn.conv(f_size=i_ar([5,5]), nb_filters=16, padding=i_ar([2,2]), activation="RELU")
cnn.pool(p_size=i_ar([2,2]), p_type="MAX")
cnn.dense(nb_neurons=256, activation="RELU", drop_rate=0.5)
cnn.dense(nb_neurons=128, activation="RELU", drop_rate=0.2)
cnn.dense(nb_neurons=10, strict_size=1, activation="SMAX")

Warning: This network can already take quite a long time to train on light systems without a GPU. In such a case, it can be lightened by halving the number of convolutional filters and the number of neurons in each layer, while still achieving an accuracy above 99% in 20 epochs.
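
As a sketch, the lightened variant mentioned above simply halves the number of filters and neurons in each layer:

cnn.conv(f_size=i_ar([5,5]), nb_filters=4, padding=i_ar([2,2]), activation="RELU")
cnn.pool(p_size=i_ar([2,2]), p_type="MAX")
cnn.conv(f_size=i_ar([5,5]), nb_filters=8, padding=i_ar([2,2]), activation="RELU")
cnn.pool(p_size=i_ar([2,2]), p_type="MAX")
cnn.dense(nb_neurons=128, activation="RELU", drop_rate=0.5)
cnn.dense(nb_neurons=64, activation="RELU", drop_rate=0.2)
cnn.dense(nb_neurons=10, strict_size=1, activation="SMAX")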

When loading a saved network, this step should be skipped, as loading automatically reconstructs the saved network architecture:

load_step = 10
if(load_step > 0):
    cnn.load("net_save/net0_s%04d.dat"%(load_step), load_step, bin=1)
else:
    cnn.conv(...)
    [...]

After the network declaration or loading, the print_arch_tex function can be used to save a summary of the network architecture. The function produces a LaTeX table (saved in a .tex file) and automatically compiles it into a PDF file (using pdflatex).

 

Training the network

While it is possible to construct advanced training schemes with CIANNA, a simple call to the train function should be enough in most cases:

cnn.train(nb_iter=20, learning_rate=0.004, momentum=0.8, confmat=1, save_every=0)

The only mandatory parameters are the number of iterations over the current "TRAIN" dataset and the learning rate. Detailed information about the other parameters is available on the API documentation Wiki page.
For this classification example, the momentum=0.8 parameter is very useful (default value of 0.0). The confmat parameter requests that the confusion matrix (used for classification only), computed on the validation dataset, be displayed at each control step (by default, after each iteration over the training dataset). Finally, the save_every parameter defines how frequently (in iterations) the network weights are saved to the ./net_save directory.
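
For example, the same call with the weights additionally saved every 5 iterations (values are illustrative):

cnn.train(nb_iter=20, learning_rate=0.004, momentum=0.8, confmat=1, save_every=5)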

 

Performing prediction

Once the network is trained, the forward function can be called to perform prediction on the current "TEST" dataset.

cnn.forward()

Usually, this function does not require any parameters, as it uses the dataset constructed as the "TEST" dataset. To perform prediction on unlabeled data, the data must be stored in the "TEST" dataset object and still be associated with a target NumPy array of the appropriate shape (which can simply be filled with zeros). In this context, the displayed "test set error" is irrelevant and does not affect the network inference. The forward function saves its predictions in the fwd_res/ directory.
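
A minimal sketch of this unlabeled-data case, where new_data and nb_new are hypothetical (an array of flattened inputs and its number of examples):

# Dummy target: only the shape matters, so it can be filled with zeros
dummy_target = np.zeros((nb_new,10), dtype="float32")
# Replace the "TEST" dataset (the delete call is only needed if one already exists)
cnn.delete_dataset("TEST", silent=1)
cnn.create_dataset("TEST", size=nb_new, input=new_data, target=dummy_target)
cnn.forward()
# Predictions are written to the fwd_res/ directory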

 

Dynamic data loading

Dynamic data loading serves different purposes:

  • Handling very large datasets that cannot be loaded at once into the host memory
  • Performing on-the-fly data augmentation (without saving all the resulting examples)
  • Concurrently training the network on a batch of data while loading/augmenting the next one

This behavior can be achieved using the CIANNA interface through batch dataset buffers that can then "swap" roles with the working batch dataset (no memory movement, just pointer updates). An easy way to do so is to define an augmentation/loading function that generates the data and updates the buffer dataset:

def data_augm():
    data_batch, targ_batch = create_augm_batch(data_train, target_train, 20000)
    cnn.delete_dataset("TRAIN_buf", silent=1)
    cnn.create_dataset("TRAIN_buf", 20000, data_batch, targ_batch, silent=1)
    return

In practice, this function can perform a wide variety of actions (such as data loading, data modification, or a call to an advanced library) to construct new data for training. A concurrent Python thread can then be executed while the network trains on the previous batch of data:

from threading import Thread

for k in range(0,40):
    t = Thread(target=data_augm)
    t.start()

    cnn.train(nb_iter=1, learning_rate=0.004, momentum=0.8, control_interv=5, confmat=1, shuffle_every=0, save_every=0)
    #No shuffle is needed when using dynamic loading that includes a random selection

    t.join()
    cnn.swap_data_buffers("TRAIN")

Once training on the current batch and generation of the next one are both finished, the batch datasets can be swapped. This process can be looped to achieve fully dynamic data loading. Depending on which process is the slowest, it can be beneficial to train the network several times on the working batch dataset (but not excessively, to avoid local overfitting).
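
For instance, if data generation is the bottleneck, the loop above can be adapted to train for a few iterations per generated batch (a sketch; the value of nb_iter is illustrative):

for k in range(0,40):
    t = Thread(target=data_augm)
    t.start()
    # Several passes over the current working batch while the next one is generated
    cnn.train(nb_iter=3, learning_rate=0.004, momentum=0.8, control_interv=5, confmat=1, shuffle_every=0, save_every=0)
    t.join()
    cnn.swap_data_buffers("TRAIN")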

In the future, we plan to add home-brewed augmentation routines in CIANNA that could facilitate this concurrent training and augmentation principle by utilizing either the host CPU or a secondary GPU (for large images).

 

Full example code

For the basic script => mnist_train.py

For the dynamic loading version => mnist_train_dyn_load.py

 
