DLEB
Documentation
1. Model & layer
1.1. Model architecture in DLEB
  • DLEB is designed to build models consisting mainly of three parts: (i) input layers, (ii) hidden layers, and (iii) output layers.
  • The model must start with the input layer and end with the output layer.
1.2. Input layer in DLEB
  • The input layer converts input data in matrix format into a tensor.
  • A tensor is a multi-dimensional array used in all kinds of operations in deep learning models. For more details about tensors, see https://www.tensorflow.org/guide/tensor.
  • A model can contain multiple input layers. If the multiple input layers are not merged by one of the merging layers, such as the concatenate layer, the input layers will be used sequentially to train or test the model.
1.3. Output layer in DLEB
  • The output layer in DLEB is defined as the layer that serves as the placeholder for model results and calculates losses for those results based on its training parameters, including the optimizer and loss function.
  • Generally, an output layer produces the result for given inputs. However, the output layer in DLEB does not perform any operations to calculate the result. This means that the layer matching the formal definition of an output layer is the hidden layer just before the output layer in DLEB.
  • The training range parameter of the output layer specifies which layers are trained by the losses calculated in the output layer.
  • If the training range parameter is not set for the output layer, the output layer remains only a placeholder for the model result.
  • As with input layers, a model can contain multiple output layers. By setting different training ranges for different output layers in a model, those layers can be trained differently. This is useful for building complicated models, such as a GAN.
1.4. Setting training parameter of output layer
  • Users can set parameters for model training in the "Parameter Settings" panel of the output layer.
  • In the "Training layer" option, select the input layers and hidden layers (or groups of hidden layers) that will be trained using the loss value of the current output layer.
    • If the current model contains only the output layer, or no layer is set in the "Training layer" option, the output layer will function only as the model result placeholder, and the parameters for training will be hidden.
    • When a set of layers is set in the "Training layer" option, the parameters for training will be shown.
    • The selected input layers and hidden layers have to be connected to each other.
  • Select the type of optimizer for training hidden layers.
    • Optimizers are algorithms used for training the selected layers to minimize the loss value.
  • Select an input layer to use for calculating the model result.
  • Select the label data to compare with the model result.
  • Select a loss function to calculate the loss by comparing the model result and the label data.
  • If users want to calculate the loss using multiple input and label data together, they can add another set of "Model loss" parameters by clicking the plus icon. The final loss value is the sum of the loss values calculated with each set of "Model loss" parameters.
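The summing of several "Model loss" entries can be illustrated with a minimal sketch in plain Python. The function names and the example values are hypothetical and are not taken from the code that DLEB generates; the sketch only shows that the final loss is the sum of the individual losses.

```python
import math

def mean_squared_error(predictions, labels):
    """Average of the squared differences between predictions and labels."""
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)

def binary_cross_entropy(predictions, labels, eps=1e-7):
    """Average binary cross-entropy; predictions are clipped away from 0 and 1."""
    total = 0.0
    for p, y in zip(predictions, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

def total_loss(model_losses):
    """Sum the loss of every (loss_function, predictions, labels) entry."""
    return sum(loss_fn(pred, lab) for loss_fn, pred, lab in model_losses)

# Two hypothetical "Model loss" entries, each with its own result and labels.
model_losses = [
    (mean_squared_error, [0.5, 1.5], [1.0, 1.0]),
    (binary_cross_entropy, [0.9, 0.2], [1, 0]),
]
print(total_loss(model_losses))
```

Each entry contributes independently, so adding or removing a "Model loss" set only changes the number of terms in the sum.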
1.5. Hidden layers in DLEB
  • Hidden layers in DLEB are the intermediate layers between the input and output layers, where all computations are conducted and the results are produced for given inputs.
  • All types of layers except the input and output layers can be used as hidden layers.
  • All hidden layers in the main model have to be connected to each other.
  • The first and last hidden layers have to be connected to the input and output layers, respectively.
  • Multiple layers designed to work together for the same function can be grouped and visualized as a layer group.
1.6. Layer types available in DLEB
  • Layers consisting of multiple nodes are basic building blocks of deep learning models.
  • Basically, input data fed into a certain layer is combined using weights, transformed, and then transferred to the next layer.
  • DLEB supports various types of layers including convolution and recurrent layers.
  • Supported layer types
  • More description about layers in this link. * More description about 1D, 2D, and 3D convolution layer in this link.
    Type | Name | Description
    Core layer | Dense layer | Basic fully connected layer implementing the operation output = (input * weight) + bias.
    Core layer | Flatten layer | Layer converting input data in matrix form into a single array.
    Core layer | Reshape layer | Layer changing the shape of input data.
    Convolution layer | 1D convolution layer | Convolution layer with one-dimensional input/output data and a kernel moving in one direction.
    Convolution layer | 2D convolution layer | Convolution layer with two-dimensional input/output data and a kernel moving in two directions.
    Convolution layer | 3D convolution layer | Convolution layer with three-dimensional input/output data and a kernel moving in three directions.
    Convolution layer | 1D pooling layer | Layer for down-sampling input data by taking the average or maximum value over a one-dimensional window.
    Convolution layer | 2D pooling layer | Layer for down-sampling input data by taking the average or maximum value over a two-dimensional window.
    Convolution layer | 3D pooling layer | Layer for down-sampling input data by taking the average or maximum value over a three-dimensional window.
    Convolution layer | 1D deconvolution layer | Layer working in the opposite direction of a convolution layer in terms of data transformation, for one-dimensional input/output data.
    Convolution layer | 2D deconvolution layer | Layer working in the opposite direction of a convolution layer in terms of data transformation, for two-dimensional input/output data.
    Convolution layer | 3D deconvolution layer | Layer working in the opposite direction of a convolution layer in terms of data transformation, for three-dimensional input/output data.
    Recurrent layer | Simple RNN layer | Fully connected recurrent neural network layer.
    Recurrent layer | GRU layer | Gated recurrent unit layer.
    Recurrent layer | LSTM layer | Long short-term memory layer.
    Merging layer | Concatenate layer | Layer for concatenating the list of input data. The layer must be connected with at least two layers.
    Merging layer | Average layer | Layer for averaging the list of input data. The layer must be connected with at least two layers.
    Merging layer | Maximum layer | Layer for computing the maximum value among the list of input data. The layer must be connected with at least two layers.
    Merging layer | Minimum layer | Layer for computing the minimum value among the list of input data. The layer must be connected with at least two layers.
    Merging layer | Add layer | Layer for adding the list of input data. The layer must be connected with at least two layers.
    Merging layer | Subtract layer | Layer for subtracting the list of input data. The layer must be connected with at least two layers.
    Merging layer | Multiply layer | Layer for multiplying the list of input data. The layer must be connected with at least two layers.
    Normalization / Regularization | Dropout | Randomly selects some of the input data and leaves them unused to prevent overfitting.
    Normalization / Regularization | Batch normalization | Normalization of output data using the mean and the standard deviation of the input data in a mini-batch.
    Layer group | Encoder | A group of layers used for learning data representation in an autoencoder.
    Layer group | Decoder | A group of layers used for reconstructing data based on data features in an autoencoder.
    Layer group | 1D Convolutional encoder | An encoder layer group with 1D convolutional layers.
    Layer group | 1D Convolutional decoder | A decoder layer group with 1D convolutional layers.
    Layer group | 2D Convolutional encoder | An encoder layer group with 2D convolutional layers.
    Layer group | 2D Convolutional decoder | A decoder layer group with 2D convolutional layers.
    Layer group | 3D Convolutional encoder | An encoder layer group with 3D convolutional layers.
    Layer group | 3D Convolutional decoder | A decoder layer group with 3D convolutional layers.
    Layer group | Generator | A group of layers for generating data similar to the input data in a GAN model.
    Layer group | Discriminator | A group of layers for discriminating real data from fake data in a GAN model.
    Advanced layer | Noise layer | Layer for introducing additive zero-centered Gaussian noise.
    Advanced layer | Sampling layer | Layer for adding random noise to input data.
    Advanced layer | Bottleneck layer | Layer for generating encoded data in an autoencoder.
    Pretrained layer | Xception | Model for instantiating the Xception architecture. The default input size for this model is 299x299.
    Pretrained layer | VGG | Model for instantiating the VGG16 architecture. The default input size for this model is 224x224.
    Pretrained layer | ResNet | Model for instantiating the ResNet50 architecture. The default input size for this model is 224x224.
    Pretrained layer | Inception | Model for instantiating the Inception V3 architecture. The default input size for this model is 299x299.
    Pretrained layer | MobileNet | Model for instantiating the MobileNetV3Large architecture. The default input size for this model is 224x224.
    Pretrained layer | DenseNet | Model for instantiating the DenseNet121 architecture. The default input size for this model is 224x224.
    Pretrained layer | NASNet | Model for instantiating the NASNet model in the ImageNet mode. The default input size for this model is 331x331.
    Custom layer | Custom layer | Layer that can be customized by users.
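To make two of the table entries concrete, the following sketch implements the dense layer operation, output = (input * weight) + bias, and a 1D pooling layer in plain Python. It is an illustration only: the function names are hypothetical, and the pooling sketch assumes non-overlapping windows with an incomplete trailing window dropped, which may differ from the settings chosen in DLEB.

```python
def dense(inputs, weights, bias):
    """Fully connected layer: output[j] = sum_i inputs[i] * weights[i][j] + bias[j]."""
    return [
        sum(x * weights[i][j] for i, x in enumerate(inputs)) + bias[j]
        for j in range(len(bias))
    ]

def pool_1d(values, window, mode="max"):
    """Down-sample a 1D sequence with non-overlapping windows.

    Assumption: the stride equals the window size and an incomplete
    trailing window is dropped.
    """
    if mode == "max":
        reducer = max
    elif mode == "average":
        reducer = lambda w: sum(w) / len(w)
    else:
        raise ValueError(f"unknown pooling mode: {mode}")
    return [
        reducer(values[start:start + window])
        for start in range(0, len(values) - window + 1, window)
    ]

# Two input features mapped to three output nodes.
x = [1.0, 2.0]
w = [[1.0, 0.0, -1.0],
     [0.5, 1.0, 2.0]]
b = [0.0, 1.0, 0.5]
print(dense(x, w, b))                               # [2.0, 3.0, 3.5]
print(pool_1d([1, 3, 2, 8, 5, 4], 2, "max"))        # [3, 8, 5]
print(pool_1d([1, 3, 2, 8, 5, 4], 2, "average"))    # [2.0, 5.0, 4.5]
```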
1.7. Custom layer
  • If users want to make their own layers, they can use customized layers by following the steps below.
  1. Add the "custom" layer while building the model in the web interface.
    • The output shape of the customized layer should be defined in the web interface so that the integrity of model structures and parameter values can be checked before importing the Python code.
  2. After the Python code for the model containing the “custom” layer is imported, define the “custom” layer operation in the “custom_layer.py” file.
    • The “custom_layer.py” file is downloaded together with the Python code and is located at the following path.
    • (Directory containing Python code) model/src/custom_layer.py
    • The shape of the return value must be equal to the output shape of the "custom" layer set while building the model.
    • An example of a “custom” layer operation (“_example_sampling”):
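    • import tensorflow as tf
      from tensorflow.keras import layers

      class _example_sampling(layers.Layer):
          # Draws a random sample from a Gaussian given its mean and log-variance
          def call(self, inputs):
              z_mean, z_log_var = inputs
              batch = tf.shape(z_mean)[0]  # number of samples in the batch
              dim = tf.shape(z_mean)[1]    # size of each sampled vector
              # Standard normal noise with the same shape as z_mean
              epsilon = tf.keras.backend.random_normal(shape=(batch, dim))
              # exp(0.5 * log-variance) is the standard deviation
              return z_mean + tf.exp(0.5 * z_log_var) * epsilon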
1.8. Activation function
  • An activation function can be used to transform the weighted combination of input data to make the output data of a specific node in a layer.
  • The choice of the activation function has a large impact on the capability and performance of deep learning models, and different activation functions should be used in different parts of the models.
  • See 2.5 Handling activation function for setting the activation function for each layer.
  • Activation functions available in DLEB
  • Function | Description
    Sigmoid | An S-shaped activation function producing numbers between 0 and 1.
    Softmax | A function producing a vector of values that sum to 1.
    Tanh | The hyperbolic tangent, an S-shaped activation function producing numbers between -1 and 1. Recurrent networks commonly use the Tanh activation function.
    ReLU | A piecewise linear function that returns the input if it is positive and zero otherwise. It is the most common activation function in deep learning models.
    Elu | A function that returns the input if it is positive and a negative value otherwise. The negative values are calculated using an exponential function.
  • You can find more information at the following links and papers.
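The activation functions listed above can be written down in a few lines of plain Python. This is a sketch of the standard mathematical definitions, not the code DLEB generates; the `alpha` parameter of `elu` and its default of 1.0 are assumptions used for illustration.

```python
import math

def sigmoid(x):
    """S-shaped function with outputs between 0 and 1 (numerically stable form)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def softmax(values):
    """Turn a list of scores into positive values that sum to 1."""
    shift = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def tanh(x):
    """S-shaped function with outputs between -1 and 1."""
    return math.tanh(x)

def relu(x):
    """Return the input if it is positive and zero otherwise."""
    return max(0.0, x)

def elu(x, alpha=1.0):
    """Return the input if it is positive, else alpha * (exp(x) - 1), which is negative."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu), ("elu", elu)]:
    print(name, [round(fn(v), 3) for v in (-2.0, 0.0, 2.0)])
print("softmax", [round(p, 3) for p in softmax([1.0, 2.0, 3.0])])
```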
2. Designing and building your deep learning model
2.1. Adding and removing the layers of a deep learning model
  • Users can add layers by clicking the name of the layer from the list in the “Layer” panel.
  • Users can remove layers by selecting the layers they want to delete and then either clicking the “Trash can” icon on the toolbar or right-clicking on the selected layers and clicking “Delete”.
  • A group of layers (e.g., encoder, generator) or a pre-trained model (e.g., ResNet, Inception) can also be added in the same manner as other layers.
  • All models must start with the input layer and end with the output layer. Multiple input and output layers are also allowed.
2.2. Setting the parameters of layers in a deep learning model
  • Users can set the parameters of layers in the “Parameter Settings” panel. The panel appears when the user clicks the layer whose parameters are to be set.
  • Parameters in the “Parameter Settings” panel are divided into “Mandatory Parameters” and “Optional Parameters”. All parameters except those related to the dimensions of the users' input data are set to default values recommended by DLEB. Users can change the parameter values as they prefer.
  • “Mandatory Parameters” must be set before writing the final Python code for a deep learning model.
  • Users can get statistics for a deep learning model, including the number of layers and parameters, in the “Model Statistics” panel. This information is useful for estimating the size of a model and the time needed to train it.
  • Trainable parameters are updated during training. In contrast, non-trainable parameters are not updated during training; they can be introduced by the “Batch normalization” layers and the “Pre-trained model” layers.
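As a rough illustration of how trainable and non-trainable parameter counts arise, the sketch below counts the parameters of a small hypothetical model. It assumes the usual Keras conventions: a dense layer has one weight per input-output pair plus one bias per unit, and a batch normalization layer has two trained values (scale and offset) and two tracked but untrained values (moving mean and moving variance) per feature. The numbers shown in the “Model Statistics” panel may be computed differently.

```python
def dense_params(n_inputs, n_units, use_bias=True):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_inputs * n_units + (n_units if use_bias else 0)

def batch_norm_params(n_features):
    """Return (trainable, non_trainable) parameter counts for batch normalization.

    Scale and offset are trained; the moving mean and moving variance are
    updated from batch statistics rather than by the optimizer.
    """
    return 2 * n_features, 2 * n_features

# Hypothetical model: 100 inputs -> Dense(64) -> Batch normalization -> Dense(10)
bn_trainable, bn_non_trainable = batch_norm_params(64)
trainable = dense_params(100, 64) + bn_trainable + dense_params(64, 10)
non_trainable = bn_non_trainable

print("trainable:", trainable)          # 6464 + 128 + 650 = 7242
print("non-trainable:", non_trainable)  # 128
```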
2.3. Handling layers in a deep learning model
  • Users can select a single layer with a mouse click or multiple layers by holding down the “Shift” key and dragging. The selected layer(s) will be highlighted with a bold border.
  • The selected layers can be moved together by dragging one of them.
2.4. Adding and removing links connecting layers in a deep learning model
  • Users can add links connecting layers by clicking and dragging the white circle on a layer.
    • Links from hidden layers to the input layer cannot be added.
  • A link is automatically created when a new layer is added while existing layers in the model are selected. The automatically added links connect all selected layers to the newly added one.
    • When the selected layer is a hidden layer but the new layer is an input layer, the link will not be added.
  • Users can remove links by selecting them and then either clicking the “Trash can” icon on the toolbar or right-clicking on the selected links and clicking “Delete”.
2.5. Handling activation function
  • Users can set the activation function for each layer by clicking the links from the layer and selecting a function.
    • Users cannot set the activation function for layers that are not hidden layers (input and output layers).
  • If users do not want to use an activation function for a layer, they can select the "None" option for the links that come from the layer. The activation function marks will then disappear from the links.
2.6. Grouping and ungrouping layers in a deep learning model
  • Users can make a group of layers by selecting multiple layers and clicking the “Grouping” button on the toolbar or right-clicking on the selected layers and clicking “Grouping”.
  • Layers in a group can be ungrouped by clicking the “Ungrouping” button on the toolbar or right-clicking on the selected layer groups and clicking “Ungrouping”.
2.7. Using template models provided by DLEB
  • Users can use one of the template models by simply clicking the listed models in the “Template” panel.
  • Users can also edit the structure of the template model and the parameters of the layers.
2.8. Importing to Python code
  • Users can build only one deep learning model at a time.
  • Users can create the Python code for the deep learning model designed in DLEB by clicking the “Writing Code” button.
  • When users set the parameters for model training and click “Importing code” on the popup window, the deep learning model will be imported to the Python source code.
  • When the Python source code is successfully created, the popup window will change as shown in the figure below. A compressed file containing the source code and an IPYNB file will be downloaded when users click "Download code" on the popup window.

3. Using a deep learning model recommended by DLEB
3.1. Choosing the purpose of a deep learning model
  • Users can choose the purpose of a deep learning model from four different tasks: “Feature extraction”, “Data generation”, “Classification”, and “Regression”. The description of each task is as follows.
  • “Feature extraction” is the task of reducing the dimension of a dataset by summarizing original features into new informative and non-redundant features. The extracted features can be used as input data for clustering. This task does not require label data for model training.
  • “Data generation” is the task of generating synthetic data that mimics an input dataset. A generative deep learning model learns patterns in input data and generates new synthetic data based on the learned patterns. This task does not require label data for model training.
  • “Classification” is the task of organizing data into predefined categories (labels) based on the features of input data. Label data is necessary for this task.
  • “Regression” is the task of investigating the relationship between input features and outputs (labels). Label data is necessary for this task.
3.2. Selecting the type of data for training a deep learning model
  • Users can select the type of input data for training a deep learning model.
  • The Python code generated by DLEB contains functions for preprocessing input data that are specific to the selected input data type.
3.3. Selecting the structure of a deep learning model
  • Based on the purpose and the input data type users select, DLEB recommends several deep learning models.
  • Users choose a final deep learning model structure, which is then shown on the “Model Design” page. Users can edit the structure of the chosen model and the layer parameters on this page.

4. Executing the Python code generated by DLEB
4.1. Preparing the requirements for using the Python code generated by DLEB
4.2. Preparing the input data for a deep learning model
  • Sequence data (FASTA format)
    • (Case 1) The length of all sequences in the FASTA file should be equal if users want to use the sequences without any additional preprocessing steps, such as splitting sequences.
    • # Sequence data
      >SEQ1
      ACGTTTGCCGGGTGGGGTTCGAAAC
      >SEQ2
      GCGGTTTGCGCTCTCTCTCTAAATT
      >SEQ3
      ATGACTCTAGTCTCTCTAGTCTAGT

      # Label data
      SEQ1   1
      SEQ2   0
      SEQ3   1
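Before training, it can be useful to confirm that a FASTA file meets the equal-length requirement of Case 1. The following sketch parses FASTA text and a label table and checks the sequence lengths. It is a hypothetical helper written for illustration, not part of the code generated by DLEB, and it assumes that labels are given as whitespace-separated "name label" lines as in the example above.

```python
def parse_fasta(text):
    """Return {name: sequence} from FASTA-formatted text."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records

def parse_labels(text):
    """Return {name: label} from whitespace-separated 'name label' lines."""
    labels = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            labels[fields[0]] = fields[1]
    return labels

def check_equal_length(records):
    """Return the common sequence length, or raise ValueError if lengths differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return lengths.pop()

fasta_text = """>SEQ1
ACGTTTGCCGGGTGGGGTTCGAAAC
>SEQ2
GCGGTTTGCGCTCTCTCTCTAAATT
>SEQ3
ATGACTCTAGTCTCTCTAGTCTAGT
"""
label_text = "SEQ1 1\nSEQ2 0\nSEQ3 1\n"

records = parse_fasta(fasta_text)
labels = parse_labels(label_text)
print("common length:", check_equal_length(records))
print("labels:", labels)
```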

    • (Case 2) Raw sequences can also be used as an input sequence file if a BED file defining the region of interest (ROI) is given together. Each region in the BED file should be the same in length. The 5th column in the BED file will be used as labels for each region.
    • # Raw sequence data
      >SEQ1
      TTTGCTGTTCCTGCATGTAGTTTAAACGAGATTGCCAGCACCGGGTATCA
      TTCACCATTTTTCTTTTCGTTAACTTGCCGTCAGCCTTTTCTTTGACCTC
      ...
      CCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTT
      CATCTGCAGGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCA

      # An example of label data in BED format defining the region of interest (ROI)
      SEQ1   10000   10010   .   0   +
      SEQ1   10020   10030   .   1   +
      SEQ1   10030   10040   .   1   +

      # Processed sequences split by DLEB code
      >SEQ1:10000-10010
      ACGTCCCGTA
      >SEQ1:10020-10030
      ACGAAATGTT
      >SEQ1:10030-10040
      TTGACTCTAT
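The way a BED file selects regions and labels from raw sequences can be sketched as follows. This is an illustration under stated assumptions rather than the DLEB implementation: the function names are hypothetical, and coordinates are interpreted as 0-based and half-open, which is the standard BED convention and is consistent with the 10-base regions in the example above.

```python
def parse_bed(text):
    """Return a list of (chrom, start, end, label); the 5th column is the label."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if len(fields) < 5:
            raise ValueError(f"expected at least 5 columns: {line!r}")
        regions.append((fields[0], int(fields[1]), int(fields[2]), fields[4]))
    return regions

def extract_regions(sequences, regions):
    """Slice every region out of its sequence (0-based, half-open coordinates)."""
    extracted = []
    for chrom, start, end, label in regions:
        seq = sequences[chrom]
        if not 0 <= start < end <= len(seq):
            raise ValueError(f"region {chrom}:{start}-{end} is outside the sequence")
        extracted.append((f"{chrom}:{start}-{end}", seq[start:end], label))
    return extracted

# A short toy sequence stands in for a real chromosome.
sequences = {"SEQ1": "AAAACCCCGGGGTTTT"}
bed_text = "SEQ1 0 4 . 0 +\nSEQ1 8 12 . 1 +\n"

for name, seq, label in extract_regions(sequences, parse_bed(bed_text)):
    print(f">{name}\t{seq}\tlabel={label}")
```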

    • (Case 3) If the regions in the BED file containing the ROI differ in length, users should provide the BED file together with a bin size and a step size for dividing sequences into bins of the same size.
    • # An example of raw sequence data
      >SEQ1
      TTTGCTGTTCCTGCATGTAGTTTAAACGAGATTGCCAGCACCGGGTATCA
      TTCACCATTTTTCTTTTCGTTAACTTGCCGTCAGCCTTTTCTTTGACCTC
      ...
      CCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTT
      CATCTGCAGGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCA

      # An example of label data in BED format defining the region of interest (ROI)
      SEQ1   10000   15000   .   0   +
      SEQ1   25500   28500   .   1   +
      SEQ1   30500   31000   .   1   +

      # An example of processed sequences split by DLEB code
      # Bin size: 2,500 / Step size: 2,500
      >SEQ1:10000-12500
      ACTGGTG ... ACCCTTGGG
      >SEQ1:12500-15000
      ACTTGCTT ... ATTGGATCA
      >SEQ1:25500-28000
      TGGCTAG ... ATGATACTTA
  • Alignment data (BAM format)
    • (Case 1) The alignment information in the BAM file can be preprocessed if a BED file defining the ROI is given together. Each region in the BED file should be the same in length if users want to use the regions without any binning or processing steps. The 5th column in the BED file will be used as labels for each region.
    • # An example of alignment data
      READ1 16 SEQ1 15944 255 36M * 0 0 TTATCACAATGTCATCCGCAGCTAATTTTGAGCCCA ==>:;6;?>>;?8?937?;8;1A?@@>@@A?;8>?= XA:i:0 MD:Z:36 NM:i:0
      READ2 16 SEQ1 16076 255 36M * 0 0 AGGTTTCAATAACATCTTTGTCCTCTATTACAACGG AA@@@@ABBABAAB@BBBBBABBABABABACBBCCB XA:i:0 MD:Z:36 NM:i:0
      READ3 16 SEQ1 16542 255 36M * 0 0 GGCACCACTCACGATAACCTGGGCACCGGTGTTCCT 55;5=:4;;=5?=>;=6>A;=;:>4?>;=?=A??B@ XA:i:0 MD:Z:36 NM:i:0

      # An example of label data in BED format defining ROI
      SEQ1 10000 15000 . 1 *
      SEQ1 25500 31000 . 0 *
      SEQ1 30500 38000 . 1 *

    • (Case 2) If the regions in the BED file differ in length, users should provide the BED file together with a bin size and a step size for dividing sequences into bins of the same size.
  • Signal data (bigWig format)
    • (Case 1) As with the alignment data, the signal information in the bigWig file can be preprocessed if a BED file defining the ROI is given at the same time. Each region in the BED file should be the same in length if users want to use the regions without any binning or processing steps. The 5th column in the BED file will be used as labels for each region.
    • (Case 2) If the regions in the BED file differ in length, users should provide the BED file together with a bin size and a step size for dividing sequences into bins of the same size.
  • Image data (JPEG or PNG format)
    • All images in JPEG or PNG format should be the same size.
    • If the images differ in size, they can be resized to a fixed width and height.
    • # An example of label data
      Img1.jpg   1
      Img2.jpg   0
      Img3.jpg   1
      Img4.jpg   0

  • Text data (TXT, CSV or TSV format)
    • Users can also use matrix-formatted data that are already preprocessed.
    • # An example of text data
                Feature1   Feature2   Feature3   Feature4   Feature5
      Sample1   0.5        0.2        0.3        0.1        0.5
      Sample2   0.3        0.5        0.1        0.5        0.3
      Sample3   0.4        0.3        0.1        0.4        0.2

      # An example of label data
      Sample1   1
      Sample2   0
      Sample3   1
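Matrix-formatted text data and its label file can be read and matched by sample name with the Python standard library. The sketch below assumes tab-separated files with a header row whose first cell is empty; the function names are hypothetical and the loading code generated by DLEB may work differently.

```python
import csv
import io

def read_matrix(text, delimiter="\t"):
    """Return (feature_names, {sample_name: [float values]})."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    features = rows[0][1:]          # the first header cell is the empty corner cell
    samples = {}
    for row in rows[1:]:
        values = [float(v) for v in row[1:]]
        if len(values) != len(features):
            raise ValueError(f"{row[0]}: expected {len(features)} values, got {len(values)}")
        samples[row[0]] = values
    return features, samples

def read_labels(text, delimiter="\t"):
    """Return {sample_name: label} from two-column label text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return {row[0]: row[1] for row in reader if len(row) >= 2}

def align(samples, labels):
    """Return (X, y) in the sample order of the matrix; every sample needs a label."""
    missing = [name for name in samples if name not in labels]
    if missing:
        raise KeyError(f"samples without a label: {missing}")
    names = list(samples)
    return [samples[n] for n in names], [labels[n] for n in names]

matrix_text = "\tFeature1\tFeature2\nSample1\t0.5\t0.2\nSample2\t0.3\t0.5\n"
label_text = "Sample1\t1\nSample2\t0\n"

features, samples = read_matrix(matrix_text)
X, y = align(samples, read_labels(label_text))
print(features, X, y)
```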

4.3. Setting config file
  • Users then set a config file for model input and output data.
  • When the Python code for the designed model is imported, a config file in JSON format is also provided. The config file is automatically formatted with information about the model structure.
  • Users can add file paths for input and output data. If there are multiple input or output layers in the model, users should add the file path for each layer.
  • {
        "inputs": {
            "_input_1": {
                "data_type": "txt",
                "input_filepath": "$DIR_PATH/$FILEPATH1"
            },
            "_input_2": {
                "data_type": "txt",
                "input_filepath": "$DIR_PATH/$FILEPATH2"
            }
        },
        "outputs": {
            "_output_1": {
                "label_type": "bed",
                "label_filepath": "$DIR_PATH/$FILEPATH3"
            },
            "_output_2": {
                "label_type": "txt",
                "label_filepath": "$DIR_PATH/$FILEPATH4"
            }
        }
    }
  • If input data should be preprocessed using a BED file, bin size, or step size, these can also be set in the config file.
    • If each region in the BED file is the same length,
    • {
          "inputs": {
              "_input_1": {
                  "data_type": "seq",
                  "input_filepath": "$DIR_PATH/$FILEPATH1",
                  "roi_filepath": "$DIR_PATH/$ROI_FILEPATH1",
                  "binsize": 0,
                  "stepsize": 0
              }
          }
      }
    • If the regions in the BED file differ in length,
    • {
          "inputs": {
              "_input_1": {
                  "data_type": "seq",
                  "input_filepath": "$DIR_PATH/$FILEPATH1",
                  "roi_filepath": "$DIR_PATH/$ROI_FILEPATH1",
                  "binsize": 100000,
                  "stepsize": 50000
              }
          }
      }
  • If the input data are images that need to be resized to a fixed width and height, these values can also be set in the config file.
  • {
        "inputs": {
            "_input_1": {
                "data_type": "img",
                "input_filepath": "$DIR_PATH/$FILEPATH1",
                "width": 0,
                "height": 0
            }
        }
    }
  • Users can provide label data either as a BED file containing the ROI or as a TXT file. If the labels are strings and need to be encoded as numeric values, the “encoding” option can be set to “true”.
  • An example config file:
  • {
        "inputs": {
            "_input_1": {
                "data_type": "seq",
                "input_filepath": "$DIR_PATH/$FILEPATH1",
                "roi_filepath": "$DIR_PATH/$ROI_FILEPATH1",
                "binsize": 0,
                "stepsize": 0
            },
            "_input_2": {
                "data_type": "txt",
                "input_filepath": "$DIR_PATH/$FILEPATH2"
            }
        },
        "outputs": {
            "_output_1": {
                "label_type": "txt / bed",
                "label_filepath": "$DIR_PATH/$FILEPATH3",
                "encoding": true
            }
        }
    }
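Because the config file is plain JSON, it can be checked with a few lines of Python before a long training run is started. The sketch below is a hypothetical pre-flight check, not part of DLEB: the key names follow the examples in this section, the file paths are placeholders, and the rule that "binsize" and "stepsize" must be set together is an assumption. Note that Python's json module rejects trailing commas, so a comma after the last entry of an object makes the whole file unreadable.

```python
import json

def validate_config(config):
    """Return a list of problems found in a DLEB-style config dictionary."""
    problems = []
    inputs = config.get("inputs", {})
    if not inputs:
        problems.append("'inputs' is missing or empty")
    for name, settings in inputs.items():
        for key in ("data_type", "input_filepath"):
            if not settings.get(key):
                problems.append(f"inputs.{name}: '{key}' is not set")
        # Assumption: binning needs both sizes, so one without the other is flagged.
        binsize = settings.get("binsize", 0)
        stepsize = settings.get("stepsize", 0)
        if (binsize > 0) != (stepsize > 0):
            problems.append(f"inputs.{name}: set both 'binsize' and 'stepsize', or neither")
    for name, settings in config.get("outputs", {}).items():
        for key in ("label_type", "label_filepath"):
            if not settings.get(key):
                problems.append(f"outputs.{name}: '{key}' is not set")
    return problems

# Placeholder paths; replace them with real file locations.
config_text = """
{
    "inputs": {
        "_input_1": {
            "data_type": "seq",
            "input_filepath": "data/sequences.fa",
            "roi_filepath": "data/regions.bed",
            "binsize": 0,
            "stepsize": 0
        }
    },
    "outputs": {
        "_output_1": {
            "label_type": "bed",
            "label_filepath": "data/regions.bed",
            "encoding": true
        }
    }
}
"""
config = json.loads(config_text)
print("problems:", validate_config(config))

# A trailing comma is not valid JSON and is reported as a decoding error.
try:
    json.loads('{"inputs": {"_input_1": {"data_type": "txt",}}}')
except json.JSONDecodeError as err:
    print("invalid JSON:", err)
```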
4.4. Running the Python code using users' machine
  • Command
    • Running the Python code
    • $ model.py --config_file [path of config_file] --outdir [path of out_directory]
    • If TensorBoard is available, users can track the loss and accuracy of models in TensorBoard using the following command:
    • $ tensorboard --logdir $OUTDIR_PATH/log_dir
  • Options
    • --config: The file path for the config file (config.json).
    • --outdir: The directory path for outputs of a deep learning model. If the “--outdir” option is not used, the outputs will be created in the current directory.
    • --print_lyrs: A comma-separated list of the names of hidden layers whose output should be printed. One or more layer names can be passed with this option.
    • --model_name: The file name for the finally saved model.
    • --model_format: The format for saving an entire model to disk (SavedModel or h5).
4.5. Running the Python code using Google Colaboratory
  • Users can use Google Colab for training and testing their deep learning models constructed in DLEB.
    • Google Colab is an online browser-based platform that provides free computing resources, including GPUs, for deep learning applications.
  • How to run Python code
    1. Upload the following directory and files required for training deep learning models to the user's Google Drive.
      • Directory downloaded from DLEB
      • The directory includes the Python source code generated by DLEB and an IPYNB file for executing the source code in Google Colab.
      • Files containing input data, including training data, testing data, and label data.
    2. Open the IPYNB file in the uploaded directory with Google Colaboratory.
    3. In the ‘Runtime’ menu, click ‘Change runtime type’ and select ‘GPU’ as the hardware accelerator for training a deep learning model.
    4. Mount Google Drive by running the first cell by clicking the play button.
    5. Run the second cell by clicking the play button to install Conda and set up the Conda environment.
    6. Open and edit the config.json file in Google Drive.
      • To open the config.json file, double-click the "config.json" tab in the sidebar.
      • File paths for input and label data in your Google Drive can be easily obtained by clicking the "Copy path" button as shown below.
      • For more details about setting the config.json file, see section 4.3. Setting config file.
    7. Run the third cell by clicking the play button to train and test the deep learning model.
      • The default output directory path in Google Colab is "/content/", which can be changed by using the "--outdir" option. For more details about options for running the Python code, see section 4.4. Running the Python code using users' machine.
4.6. Guidelines for hyperparameter tuning
  • A hyperparameter is a parameter used to control the learning process. Selecting proper hyperparameters helps improve the performance of deep learning models. Here are some guidelines for hyperparameter tuning.
  • Number of hidden layers
    • For many problems, starting with just one or two hidden layers will work just fine.
    • For more complex problems, researchers can gradually increase the number of hidden layers, until overfitting is observed.
    • Very complex tasks, such as large image classification, typically require networks with dozens of layers, and they need a huge amount of training data.
    • Reusing parts of a pretrained network that performs a similar task can be a good alternative for such tasks.
  • Number of nodes in a hidden layer
    • The number of nodes in the input and output layers is determined by the type of input and output data.
    • For hidden layers, it is common practice to size them to form a pyramid by gradually increasing or decreasing the number of nodes from one hidden layer to the next.
    • However, simply using the same number of nodes in all hidden layers performs just as well in many cases. Researchers can try increasing the number of nodes gradually until overfitting is observed.
  • Learning rate, batch size and other hyperparameters
    • In general, an optimal learning rate is about half of the maximum learning rate, that is, the learning rate above which the training algorithm diverges. A simple approach for tuning the learning rate is to start with a large value, then decrease this value and try again, and repeat until the training algorithm stops diverging.
    • Choosing a good optimizer for training is also quite important. Detailed information about optimizers is in the ‘Optimization of deep learning models’ section.
    • The batch size can also have a significant impact on the performance of deep learning models and the training time. A small batch size keeps each training iteration short, while a large batch size gives a more precise estimate of the gradients at the expense of training time. If ‘Batch normalization’ is used, the batch size should not be too small.
    • Selecting an appropriate activation function is also important. In general, the ReLU activation function is a good default for all hidden layers. For the output layer, the choice depends on the researchers’ task.
    • You can find more information at the following links and papers.
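The learning-rate heuristic described in this section can be sketched in a few lines of Python. This is only an illustration and not part of DLEB: the toy loss f(w) = w², the helper names `diverges` and `find_max_stable_lr`, the divergence threshold, and the halving factor are all assumptions made for the example.

```python
def diverges(lr, steps=50, threshold=1e6):
    """Run plain gradient descent on the toy loss f(w) = w**2.

    Returns True if |w| exceeds the (assumed) divergence threshold.
    """
    w = 1.0
    for _ in range(steps):
        w -= lr * 2.0 * w  # the gradient of w**2 is 2w
        if abs(w) > threshold:
            return True
    return False


def find_max_stable_lr(start_lr=10.0, shrink=0.5, max_tries=30):
    """Start with a large learning rate and shrink it until training stops diverging."""
    lr = start_lr
    for _ in range(max_tries):
        if not diverges(lr):
            return lr
        lr *= shrink
    raise RuntimeError("no stable learning rate found")


max_lr = find_max_stable_lr()
suggested_lr = max_lr / 2.0  # rule of thumb: about half of the maximum
print(max_lr, suggested_lr)  # prints 0.625 0.3125
```

Because the search shrinks the learning rate in coarse steps, the value it returns only approximates the true maximum (1.0 for this toy loss). With a real model, `diverges` would be replaced by a short training run that checks whether the loss blows up.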
4.7. Optimization of deep learning models
  • The parameters of a deep learning model are obtained by minimizing the difference between the values predicted by the model and the true values in the training dataset. This process is called optimization. The optimizers available in DLEB are described below.
  • Optimizer | Description
    Stochastic gradient descent | The stochastic gradient descent (SGD) algorithm (momentum optimization) is a variant of the gradient descent (GD) algorithm, which finds a local minimum of a loss function by moving in the direction opposite to the gradient at each iteration. Instead of using the whole training dataset as GD does, SGD randomly selects a small portion of the training dataset and uses it to calculate the gradient. Compared with GD, SGD converges in less time and requires less memory.
    AdaGrad | The learning rate is an important parameter that controls the amount of movement during the optimization steps. Too small a learning rate increases the time to converge, whereas too large a value makes the model converge too quickly to a suboptimal solution. The AdaGrad algorithm automatically adjusts the learning rate as training goes on, so researchers need to do less manual tuning of the learning rate. However, because it accumulates the squared gradients from all previous iterations, its learning rate keeps decaying and training can slow down too early.
    Adadelta | An extension of AdaGrad that removes the decaying learning rate problem of AdaGrad.
    RMSprop | The RMSprop algorithm fixes the problem of AdaGrad, which slows down a bit too fast and may never converge to the global optimum, by accumulating only the gradients from the most recent iterations. RMSprop was the preferred optimization algorithm of many researchers until the Adam optimizer was developed.
    Adam | Adam (adaptive moment estimation) combines the ideas of momentum optimization in SGD and RMSprop. Since Adam is an adaptive learning rate algorithm, it requires less tuning of the learning rate.
    Adamax | Adamax is an extension of the Adam algorithm. Adamax scales down the parameter updates based on the infinity norm. This can make Adamax more stable than Adam.
    Nesterov Adam | The Nesterov Adam (Nadam) optimizer is the Adam optimizer plus the Nesterov trick. It often converges slightly faster than Adam.
  • You can find more information at the following links and paper.
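To make the differences between some of these optimizers concrete, the following sketch writes out textbook forms of the update rules for SGD with momentum, AdaGrad, RMSprop, and Adam. It is not DLEB's or TensorFlow's implementation: the one-parameter toy loss f(w) = (w - 3)², the function names, and all hyperparameter values are assumptions chosen for illustration, and the exact gradient stands in for the mini-batch gradient that SGD would use.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


def sgd_momentum(w, steps, lr=0.1, beta=0.9):
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity - lr * grad(w)  # keep part of the previous step
        w += velocity
    return w


def adagrad(w, steps, lr=0.5, eps=1e-8):
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s += g * g  # accumulate ALL past squared gradients
        w -= lr * g / (math.sqrt(s) + eps)
    return w


def rmsprop(w, steps, lr=0.05, rho=0.9, eps=1e-8):
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s = rho * s + (1.0 - rho) * g * g  # exponentially decaying average
        w -= lr * g / (math.sqrt(s) + eps)
    return w


def adam(w, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g      # momentum-like first moment
        v = beta2 * v + (1.0 - beta2) * g * g  # RMSprop-like second moment
        m_hat = m / (1.0 - beta1 ** t)         # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


for optimizer in (sgd_momentum, adagrad, rmsprop, adam):
    print(optimizer.__name__, optimizer(0.0, 500))
```

Each optimizer starts from w = 0 and should end close to the minimum at w = 3. All four rules use only first-order gradients; they differ in how past gradients are accumulated to scale or smooth each step.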
4.8. Overfitting and underfitting
  • Overfitting occurs when the error measured using a test dataset begins to increase while the error measured using a training dataset still decreases.
  • Although it is often possible to achieve high accuracy on a training dataset, our final goal should be to develop models that generalize well on a test dataset.
  • The opposite of overfitting is underfitting. It occurs when the model is not able to obtain a sufficiently low error on a training dataset.
  • This means the model is not flexible enough to learn relevant patterns in the training dataset.
  • You can find more information at the following links and book.
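The two definitions above can be turned into simple checks on recorded error curves. The sketch below is a hypothetical helper, not a DLEB feature: the function names, the `patience` parameter, and the `target_err` threshold are assumptions introduced for the example.

```python
def first_overfitting_epoch(train_err, test_err, patience=2):
    """Return the first epoch index at which the test error starts to rise
    while the training error keeps falling, for `patience` epochs in a row.

    Returns None if no such run of epochs is found.
    """
    run = 0
    for i in range(1, len(train_err)):
        if test_err[i] > test_err[i - 1] and train_err[i] < train_err[i - 1]:
            run += 1
            if run == patience:
                return i - patience + 1
        else:
            run = 0
    return None


def is_underfitting(train_err, target_err):
    """Underfitting: the training error never gets below the acceptable level."""
    return min(train_err) > target_err


train = [0.90, 0.60, 0.40, 0.30, 0.20, 0.15]
test = [1.00, 0.70, 0.50, 0.55, 0.60, 0.70]
print(first_overfitting_epoch(train, test))  # prints 3
print(is_underfitting(train, target_err=0.3))  # prints False
```

In this made-up example the test error is lowest at epoch 2 and rises from epoch 3 onward while the training error continues to fall, so epoch 3 is reported as the onset of overfitting.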
4.9. Outputs
  • Preprocessed biological input data
    • File path:
    • $ OUTDIR_PATH/input_preproc_data/[layer_name]_[train|val|test].npy
    • The input data is preprocessed into the format available for training and testing a deep learning model. The preprocessed data is provided as one of the output files.
    • All preprocessed biological data are saved in the NumPy array format.
    • Biological data | Preprocessed data
      Sequence data (FASTA) | One-hot encoded array
      Alignment data (BAM) | Read coverage array
      Signal data (bigWig) | Signal coverage array
      Image data (JPEG, PNG) | Decoded array

  • The structure of a deep learning model
    • File path:
    • $ OUTDIR_PATH/[model_name](.h5)
    • The structure (architecture) of a model, a set of weights, and optimizer information are saved in the TensorFlow SavedModel format and the Keras H5 format.
    • This saved model can be reused with new test datasets.
  • Outputs of the intermediate hidden layers
    • File path:
    • $ OUTDIR_PATH/layer_output/[layer_name].output.txt
    • The outputs of the intermediate layers selected by users with the “--print_lyrs” option are saved as a text file.
  • The log directory for using TensorBoard
    • Directory path:
    • $ OUTDIR_PATH/log_dir/
    • Users can use TensorBoard to track model loss and accuracy and to visualize the model graph. If users set the “Use Tensorboard” option on the web page, the log files for TensorBoard are saved into the output directory.
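As an illustration of the "one-hot encoded array" listed for sequence data under the preprocessed biological input data, the following sketch encodes a short DNA sequence. It is not DLEB's preprocessing code: the channel order A, C, G, T, the all-zero row for any other character (such as N), and the function name `one_hot_encode` are assumptions. DLEB saves a NumPy array, whereas this sketch returns nested lists so that it needs no external packages.

```python
ALPHABET = "ACGT"  # assumed channel order; DLEB's actual order may differ


def one_hot_encode(sequence):
    """Encode a DNA sequence as a list of rows, one row per base.

    Each row has one entry per letter in ALPHABET. Characters outside
    ALPHABET (for example N) are encoded as an all-zero row, which is
    an assumption made for this sketch.
    """
    rows = []
    for base in sequence.upper():
        row = [0] * len(ALPHABET)
        index = ALPHABET.find(base)
        if index >= 0:
            row[index] = 1
        rows.append(row)
    return rows


for base, row in zip("ACGTN", one_hot_encode("ACGTN")):
    print(base, row)
```

A sequence of length L is therefore represented as an L × 4 array.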