1. Model & layer
1.1. Model architecture in DLEB
1.2. Input layer in DLEB
1.3. Output layer in DLEB
1.4. Setting training parameters of the output layer
1.5. Hidden layers in DLEB
1.6. Layer types available in DLEB
1.7. Custom layer
1.8. Activation function
2. Designing and building your deep learning model
2.1. Adding and removing layers
2.2. Setting parameters of layers
2.3. Handling layers
2.4. Adding and removing links between layers
2.5. Handling activation functions
2.6. Grouping and ungrouping layers
2.7. Using template models
2.8. Exporting to Python code
3. Using a deep learning model recommended by DLEB
3.1. Choosing the purpose of a deep learning model
3.2. Selecting the type of data for training a deep learning model
3.3. Selecting the structure of a deep learning model
4. Executing the Python code generated by DLEB
4.1. Preparing the requirements for using the Python code
4.2. Preparing the input data for a deep learning model
4.3. Setting up the config file
4.4. Running the Python code on the user's machine
4.5. Running the Python code on Google Colaboratory
4.6. Guidelines for hyperparameter tuning
4.7. Optimization of deep learning models
4.8. Overfitting and underfitting
4.9. Outputs
Type | Name | Description
Core layer | Dense layer | Basic fully connected layer implementing the operation: output = (input * weight) + bias.
| Flatten layer | Layer converting matrix-form input data into a single array.
| Reshape layer | Layer changing the shape of input data.
Convolution layer | 1D convolution layer | Convolution layer with one-dimensional input/output data and a kernel moving in one direction.
| 2D convolution layer | Convolution layer with two-dimensional input/output data and a kernel moving in two directions.
| 3D convolution layer | Convolution layer with three-dimensional input/output data and a kernel moving in three directions. * More description about 1D, 2D, and 3D convolution layers in this link.
| 1D pooling layer | Layer for down-sampling input data by taking the average or maximum value over a one-dimensional window.
| 2D pooling layer | Layer for down-sampling input data by taking the average or maximum value over a two-dimensional window.
| 3D pooling layer | Layer for down-sampling input data by taking the average or maximum value over a three-dimensional window.
| 1D deconvolution layer | Layer performing the inverse data transformation of a convolution layer for one-dimensional input/output data.
| 2D deconvolution layer | Layer performing the inverse data transformation of a convolution layer for two-dimensional input/output data.
| 3D deconvolution layer | Layer performing the inverse data transformation of a convolution layer for three-dimensional input/output data.
Recurrent layer | Simple RNN layer | Fully connected recurrent neural network layer.
| GRU layer | Gated recurrent unit layer.
| LSTM layer | Long short-term memory layer.
Merging layer | Concatenate layer | Layer for concatenating the list of input data. * The layer must be connected with at least two layers.
| Average layer | Layer for averaging the list of input data. * The layer must be connected with at least two layers.
| Maximum layer | Layer for computing the element-wise maximum of the list of input data. * The layer must be connected with at least two layers.
| Minimum layer | Layer for computing the element-wise minimum of the list of input data. * The layer must be connected with at least two layers.
| Add layer | Layer for adding the list of input data. * The layer must be connected with at least two layers.
| Subtract layer | Layer for subtracting the list of input data. * The layer must be connected with at least two layers.
| Multiply layer | Layer for multiplying the list of input data. * The layer must be connected with at least two layers.
Normalization / Regularization | Dropout | Layer randomly dropping some of the input units during training to prevent overfitting.
| Batch normalization | Layer normalizing its output using the mean and standard deviation of the input data in a mini-batch.
Layer group | Encoder | A group of layers used for learning data representations in an autoencoder.
| Decoder | A group of layers used for reconstructing data from learned features in an autoencoder.
| 1D Convolutional encoder | An encoder layer group with 1D convolution layers.
| 1D Convolutional decoder | A decoder layer group with 1D convolution layers.
| 2D Convolutional encoder | An encoder layer group with 2D convolution layers.
| 2D Convolutional decoder | A decoder layer group with 2D convolution layers.
| 3D Convolutional encoder | An encoder layer group with 3D convolution layers.
| 3D Convolutional decoder | A decoder layer group with 3D convolution layers.
| Generator | A group of layers for generating data similar to the input data in a GAN model.
| Discriminator | A group of layers for discriminating real data from fake data in a GAN model.
Advanced layer | Noise layer | Layer for introducing additive zero-centered Gaussian noise.
| Sampling layer | Layer for adding random noise to input data.
| Bottleneck layer | Layer for generating encoded data in an autoencoder.
Pretrained layer | Xception | Model for instantiating the Xception architecture. The default input size for this model is 299x299. * More description about Xception in this link.
| VGG | Model for instantiating the VGG16 architecture. The default input size for this model is 224x224. * More description about VGG in this link.
| ResNet | Model for instantiating the ResNet50 architecture. The default input size for this model is 224x224. * More description about ResNet in this link.
| Inception | Model for instantiating the Inception V3 architecture. The default input size for this model is 299x299. * More description about Inception in this link.
| MobileNet | Model for instantiating the MobileNetV3Large architecture. The default input size for this model is 224x224. * More description about MobileNet in this link.
| DenseNet | Model for instantiating the DenseNet121 architecture. The default input size for this model is 224x224. * More description about DenseNet in this link.
| NASNet | Model for instantiating the NASNet model in the ImageNet mode. The default input size for this model is 331x331. * More description about NASNet in this link.
Custom layer | Custom layer | Layer that can be customized by users.
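The core layer operations in the table above can be sketched in plain NumPy. This is an illustration of the math only, not DLEB's generated code; the array shapes are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dense layer: output = (input * weight) + bias
x = rng.normal(size=(2, 4))            # batch of 2 samples, 4 features each
W = rng.normal(size=(4, 3))            # weight matrix mapping 4 inputs to 3 units
b = np.zeros(3)                        # one bias per unit
dense_out = x @ W + b                  # shape (2, 3)

# Flatten layer: matrix-form input becomes a single array per sample
img = rng.normal(size=(2, 5, 5))       # batch of 2 five-by-five "images"
flat = img.reshape(img.shape[0], -1)   # shape (2, 25)

print(dense_out.shape, flat.shape)
```

A Reshape layer is the same `reshape` call with an arbitrary target shape, as long as the total number of elements is preserved.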
Function | Description |
Sigmoid | A sigmoid function is an S-shaped activation function producing numbers between 0 and 1. |
Softmax | A softmax function produces a vector of values that sum to 1. |
Tanh | A Tanh (hyperbolic tangent) activation function is an S-shaped activation function producing numbers between -1 and 1. Recurrent networks commonly use the Tanh activation function. |
ReLU | A ReLU activation function outputs the input directly if it is positive and zero otherwise. It is the most common activation function in deep learning models. |
Elu | An Elu (exponential linear unit) activation function outputs the input directly if it is positive and a negative value otherwise. The negative values are calculated using an exponential function. |
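The activation functions above can be written out directly in NumPy. This is a minimal sketch; the `alpha` parameter for Elu is an assumed default, not a DLEB setting.

```python
import numpy as np

def sigmoid(x):
    # S-shaped curve, outputs in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # vector of non-negative values summing to 1 (shifted for stability)
    e = np.exp(x - np.max(x))
    return e / e.sum()

def relu(x):
    # identity for positive input, zero otherwise
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    # identity for positive input, exponential curve for negative input
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x), np.tanh(x), relu(x), elu(x), softmax(x))
```

Tanh is available directly as `np.tanh`, so no separate definition is needed.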
# Sequence data
>SEQ1
ACGTTTGCCGGGTGGGGTTCGAAAC
>SEQ2
GCGGTTTGCGCTCTCTCTCTAAATT
>SEQ3
ATGACTCTAGTCTCTCTAGTCTAGT
# Label data
SEQ1 1
SEQ2 0
SEQ3 1
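Sequence data in this form is typically one-hot encoded before training (each base becomes a length-4 binary vector, as in the preprocessing table at the end of this section). A minimal sketch, assuming a fixed A/C/G/T alphabet:

```python
# One-hot encode a DNA sequence over the alphabet A, C, G, T.
BASES = "ACGT"

def one_hot(seq):
    # each base maps to a length-4 binary vector, e.g. "A" -> [1, 0, 0, 0]
    return [[1 if base == b else 0 for b in BASES] for base in seq]

# labels keyed by sequence ID, mirroring the label file above
labels = {"SEQ1": 1, "SEQ2": 0, "SEQ3": 1}

encoded = one_hot("ACGTT")
print(encoded, labels["SEQ1"])
```

A sequence of length N thus becomes an N x 4 array, which is the shape expected by 1D convolution layers.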
# Raw sequence data
>SEQ1
TTTGCTGTTCCTGCATGTAGTTTAAACGAGATTGCCAGCACCGGGTATCA
TTCACCATTTTTCTTTTCGTTAACTTGCCGTCAGCCTTTTCTTTGACCTC
...
CCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTT
CATCTGCAGGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCA
# An example of label data in BED format defining the region of interest (ROI)
SEQ1 10000 10010 . 0 +
SEQ1 10020 10030 . 1 +
SEQ1 10030 10040 . 1 +
# Processed sequences split by DLEB code
>SEQ1:10000-10010
ACGTCCCGTA
>SEQ1:10020-10030
ACGAAATGTT
>SEQ1:10030-10040
TTGACTCTAT
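Extracting ROI subsequences from a raw sequence with BED coordinates can be sketched as follows. This is a toy example with a hypothetical sequence; it assumes standard BED conventions (0-based start, end-exclusive), not DLEB's actual implementation.

```python
# Split raw sequences into ROI subsequences using BED coordinates.
def split_by_roi(sequences, bed_lines):
    out = {}
    for line in bed_lines:
        chrom, start, end = line.split()[:3]   # first three BED columns
        start, end = int(start), int(end)
        # BED start is 0-based and end is exclusive, matching Python slicing
        out[f"{chrom}:{start}-{end}"] = sequences[chrom][start:end]
    return out

seqs = {"SEQ1": "ACGT" * 10}                   # toy 40 bp sequence
bed = ["SEQ1 0 10 . 0 +", "SEQ1 10 20 . 1 +"]
print(split_by_roi(seqs, bed))
```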
# An example of raw sequence data
>SEQ1
TTTGCTGTTCCTGCATGTAGTTTAAACGAGATTGCCAGCACCGGGTATCA
TTCACCATTTTTCTTTTCGTTAACTTGCCGTCAGCCTTTTCTTTGACCTC
...
CCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTT
CATCTGCAGGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCA
# An example of label data in BED format defining the region of interest (ROI)
SEQ1 10000 15000 . 0 +
SEQ1 25500 28500 . 1 +
SEQ1 30500 31000 . 1 +
# An example of processed sequences split by DLEB code
# Bin size: 2,500 / Step size: 2,500
>SEQ1:10000-12500
ACTGGTG ... ACCCTTGGG
>SEQ1:12500-15000
ACTTGCTT ... ATTGGATCA
>SEQ1:25500-28000
TGGCTAG ... ATGATACTTA
# An example of alignment data
READ1 16 SEQ1 15944 255 36M * 0 0 TTATCACAATGTCATCCGCAGCTAATTTTGAGCCCA ==>:;6;?>>;?8?937?;8;1A?@@>@@A?;8>?= XA:i:0 MD:Z:36 NM:i:0
READ2 16 SEQ1 16076 255 36M * 0 0 AGGTTTCAATAACATCTTTGTCCTCTATTACAACGG AA@@@@ABBABAAB@BBBBBABBABABABACBBCCB XA:i:0 MD:Z:36 NM:i:0
READ3 16 SEQ1 16542 255 36M * 0 0 GGCACCACTCACGATAACCTGGGCACCGGTGTTCCT 55;5=:4;;=5?=>;=6>A;=;:>4?>;=?=A??B@ XA:i:0 MD:Z:36 NM:i:0
# An example of label data in BED format defining ROI
SEQ1 10000 15000 . 1 *
SEQ1 25500 31000 . 0 *
SEQ1 30500 38000 . 1 *
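Alignment data like this is preprocessed into a read coverage array (see the preprocessing table at the end of this section). A minimal sketch with toy records, assuming every read aligns as one contiguous match (a CIGAR such as 36M), so CIGAR parsing is skipped:

```python
# Build a per-base read coverage array from SAM-like alignment records.
def coverage(records, region_start, region_end):
    cov = [0] * (region_end - region_start)
    for rec in records:
        fields = rec.split()
        pos, seq = int(fields[3]), fields[9]   # SAM: POS is 1-based; SEQ is field 10
        for i in range(pos - 1, pos - 1 + len(seq)):
            if region_start <= i < region_end:
                cov[i - region_start] += 1
    return cov

records = [
    "READ1 16 SEQ1 101 255 5M * 0 0 ACGTA IIIII",
    "READ2 16 SEQ1 103 255 5M * 0 0 GTACG IIIII",
]
print(coverage(records, 100, 110))
```

Real BAM input would be read with a library such as pysam and would need proper CIGAR handling; this sketch only shows the coverage idea.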
# An example of label data
Img1.jpg 1
Img2.jpg 0
Img3.jpg 1
Img4.jpg 0
# An example of text data
Feature1 Feature2 Feature3 Feature4 Feature5
Sample1 0.5 0.2 0.3 0.1 0.5
Sample2 0.3 0.5 0.1 0.5 0.3
Sample3 0.4 0.3 0.1 0.4 0.2
# An example of label data
Sample1 1
Sample2 0
Sample3 1
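Parsing a whitespace-delimited feature table and its label file into aligned structures can be sketched as follows (a minimal sketch with hypothetical sample IDs, not DLEB's loader):

```python
# Parse a feature table: first line is the header, remaining lines are
# "sample_id value value ..." rows.
def parse_features(lines):
    header = lines[0].split()
    data = {}
    for line in lines[1:]:
        fields = line.split()
        data[fields[0]] = [float(v) for v in fields[1:]]
    return header, data

feat_lines = [
    "Feature1 Feature2 Feature3",
    "Sample1 0.5 0.2 0.3",
    "Sample2 0.3 0.5 0.1",
]
labels = {"Sample1": 1, "Sample2": 0}

header, feats = parse_features(feat_lines)
print(feats["Sample1"], labels["Sample1"])
```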
Optimizer | Description |
Stochastic gradient descent | The stochastic gradient descent (SGD) algorithm (here with momentum optimization) is a variant of the gradient descent (GD) algorithm, which finds a local minimum of a loss function by moving in the opposite direction of the gradient at each iteration. Instead of using the whole training dataset as the GD algorithm does, the SGD algorithm randomly selects a small portion of the training dataset and uses it to calculate the gradient. As a result, the SGD algorithm converges in less time and requires less memory. |
AdaGrad | The learning rate is an important parameter that controls the amount of movement during the optimization steps. Too small a learning rate increases the time to converge, whereas too large a value makes the model converge too quickly to a sub-optimum. The AdaGrad algorithm automatically adjusts the learning rate as training goes on, so researchers do not have to tune the learning rate manually. However, because AdaGrad accumulates the squared gradients over all iterations, its effective learning rate keeps shrinking and can become too small before the optimum is reached. |
Adadelta | Adadelta is an extension of AdaGrad that addresses the decaying learning rate problem of AdaGrad. |
RMSprop | The RMSProp algorithm fixes the problem of AdaGrad, which slows down a bit too fast and ends up never converging to the global optimum, by accumulating only the gradients from the most recent iterations. RMSprop was the preferred optimization algorithm of many researchers until the Adam optimizer was developed. |
Adam | Adam (Adaptive moment estimation) combines the ideas of the momentum optimization in SGD and RMSProp. Since Adam is an adaptive learning rate algorithm, it requires less tuning of the learning rate. |
Adamax | Adamax is an extension to the Adam algorithm. Adamax scales down the parameter updates based on the infinity norm. This can make Adamax more stable than Adam. |
Nesterov Adam | The Nesterov Adam (Nadam) optimizer is simply the Adam optimizer plus the Nesterov trick. It often converges slightly faster than Adam. |
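The SGD-with-momentum update described above can be sketched on a one-parameter quadratic loss. This is illustrative only; the learning rate and momentum values are arbitrary choices, not DLEB defaults.

```python
# SGD with momentum on the loss L(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
def grad(w):
    return 2.0 * (w - 3.0)

w, velocity = 0.0, 0.0
lr, momentum = 0.1, 0.9

for _ in range(200):
    # momentum accumulates past gradients, smoothing the descent direction
    velocity = momentum * velocity - lr * grad(w)
    w += velocity

print(w)  # approaches the minimum at w = 3
```

Adaptive optimizers such as AdaGrad, RMSprop, and Adam replace the fixed `lr` here with a per-parameter rate derived from the history of squared gradients.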
Biological data | Preprocessed data |
Sequence data (FASTA) | One-hot encoded array |
Alignment data (BAM) | Read coverage array |
Signal data (bigWig) | Signal coverage array |
Image data (JPEG, PNG) | Decoded array |