torchdrug.models#
Knowledge Graph Reasoning Models#
TransE#
- class TransE(num_entity, num_relation, embedding_dim, max_score=12)[source]#
- TransE embedding proposed in Translating Embeddings for Modeling Multi-relational Data. - Parameters
- num_entity (int) – number of entities 
- num_relation (int) – number of relations 
- embedding_dim (int) – dimension of embeddings 
- max_score (float, optional) – maximal score for triplets 
 
 - forward(graph, h_index, t_index, r_index, all_loss=None, metric=None)[source]#
- Compute the score for each triplet. - Parameters
- graph (Graph) – fact graph 
- h_index (Tensor) – indexes of head entities 
- t_index (Tensor) – indexes of tail entities 
- r_index (Tensor) – indexes of relations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
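As a rough illustration of what "score for each triplet" means for a translation-based model, the sketch below scores a triplet as max_score minus the distance between h + r and t. It uses plain Python lists instead of tensors, and the choice of the L1 norm is an assumption for illustration, not a statement about the library's exact implementation. The function name transe_score and the toy embeddings are hypothetical.

```python
def transe_score(h, r, t, max_score=12.0):
    """Score one triplet as max_score minus the L1 distance between h + r and t.

    The L1 norm is an assumption made for this sketch.
    """
    distance = sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))
    return max_score - distance


# Toy 2-dimensional embeddings, made up for illustration.
head = [1.0, 0.0]
relation = [0.5, 1.0]
tail = [1.5, 1.0]

# h + r equals t exactly, so the score reaches max_score.
print(transe_score(head, relation, tail))
# A mismatched tail lies further from h + r and scores lower.
print(transe_score(head, relation, [0.0, 0.0]))
```

Under this form, max_score acts as an upper bound: a perfectly translated triplet scores max_score and every other triplet scores less.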
 
 
 
DistMult#
- class DistMult(num_entity, num_relation, embedding_dim, l3_regularization=0)[source]#
- DistMult embedding proposed in Embedding Entities and Relations for Learning and Inference in Knowledge Bases. - Parameters
- num_entity (int) – number of entities 
- num_relation (int) – number of relations 
- embedding_dim (int) – dimension of embeddings 
- l3_regularization (float, optional) – weight for l3 regularization 
 
 - forward(graph, h_index, t_index, r_index, all_loss=None, metric=None)[source]#
- Compute the score for each triplet. - Parameters
- graph (Graph) – fact graph 
- h_index (Tensor) – indexes of head entities 
- t_index (Tensor) – indexes of tail entities 
- r_index (Tensor) – indexes of relations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
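The DistMult score is commonly written as a trilinear product of the head, relation and tail embeddings. The sketch below shows that product on plain Python lists, together with one possible form of an l3 penalty (the weighted sum of cubed absolute values). Both function names are hypothetical, and how the library normalizes the penalty over a batch is not covered here.

```python
def distmult_score(h, r, t):
    """Trilinear product sum_k h_k * r_k * t_k for one triplet."""
    return sum(hk * rk * tk for hk, rk, tk in zip(h, r, t))


def l3_penalty(embeddings, weight):
    """Assumed form of an l3 penalty: weight * sum of |x|^3 over all values."""
    return weight * sum(abs(x) ** 3 for vector in embeddings for x in vector)


# Toy embeddings, made up for illustration.
head = [1.0, 2.0]
relation = [0.5, -1.0]
tail = [2.0, 1.0]

print(distmult_score(head, relation, tail))
# Swapping head and tail gives the same score: the product is symmetric.
print(distmult_score(tail, relation, head))
print(l3_penalty([head, relation, tail], weight=0.01))
```

The symmetry in the second call is why DistMult cannot distinguish a relation from its inverse.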
 
 
 
ComplEx#
- class ComplEx(num_entity, num_relation, embedding_dim, l3_regularization=0)[source]#
- ComplEx embedding proposed in Complex Embeddings for Simple Link Prediction. - Parameters
- num_entity (int) – number of entities 
- num_relation (int) – number of relations 
- embedding_dim (int) – dimension of embeddings 
- l3_regularization (float, optional) – weight for l3 regularization 
 
 - forward(graph, h_index, t_index, r_index, all_loss=None, metric=None)[source]#
- Compute the score for triplets. - Parameters
- graph (Graph) – fact graph 
- h_index (Tensor) – indexes of head entities 
- t_index (Tensor) – indexes of tail entities 
- r_index (Tensor) – indexes of relations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
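ComplEx extends the trilinear product to complex-valued embeddings and takes the real part of h * r * conj(t). The sketch below uses Python's built-in complex numbers to show that, unlike a purely real product, the score changes when head and tail are swapped. The function name complex_score and the one-dimensional embeddings are hypothetical, and the way the library packs real and imaginary parts into embedding_dim is not shown.

```python
def complex_score(h, r, t):
    """Real part of sum_k h_k * r_k * conj(t_k) for complex embeddings."""
    return sum((hk * rk * tk.conjugate()).real for hk, rk, tk in zip(h, r, t))


# Toy 1-dimensional complex embeddings, made up for illustration.
head = [1 + 1j]
relation = [0 + 1j]
tail = [1 - 1j]

print(complex_score(head, relation, tail))
# Swapping head and tail changes the score, so asymmetric relations
# can be represented.
print(complex_score(tail, relation, head))
```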
 
 
 
SimplE#
- class SimplE(num_entity, num_relation, embedding_dim, l3_regularization=0)[source]#
- SimplE embedding proposed in SimplE Embedding for Link Prediction in Knowledge Graphs. - Parameters
- num_entity (int) – number of entities 
- num_relation (int) – number of relations 
- embedding_dim (int) – dimension of embeddings 
- l3_regularization (float, optional) – weight for l3 regularization 
 
 - forward(graph, h_index, t_index, r_index, all_loss=None, metric=None)[source]#
- Compute the score for each triplet. - Parameters
- graph (Graph) – fact graph 
- h_index (Tensor) – indexes of head entities 
- t_index (Tensor) – indexes of tail entities 
- r_index (Tensor) – indexes of relations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
 
 
RotatE#
- class RotatE(num_entity, num_relation, embedding_dim, max_score=12)[source]#
- RotatE embedding proposed in RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space. - Parameters
- num_entity (int) – number of entities 
- num_relation (int) – number of relations 
- embedding_dim (int) – dimension of embeddings 
- max_score (float, optional) – maximal score for triplets 
 
 - forward(graph, h_index, t_index, r_index, all_loss=None, metric=None)[source]#
- Compute the score for each triplet. - Parameters
- graph (Graph) – fact graph 
- h_index (Tensor) – indexes of head entities 
- t_index (Tensor) – indexes of tail entities 
- r_index (Tensor) – indexes of relations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
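RotatE treats each relation as a rotation in the complex plane: every relation coordinate is a unit-modulus complex number e^{i * phase}, and a triplet scores well when rotating the head lands on the tail. The sketch below assumes the score is max_score minus the summed per-coordinate distance; that margin form and the name rotate_score are assumptions for illustration.

```python
import cmath
import math


def rotate_score(h, phase, t, max_score=12.0):
    """max_score minus the summed distance between the rotated head and the tail.

    Each relation coordinate is given by its phase and applied as e^{i * phase}.
    """
    distance = sum(
        abs(hk * cmath.exp(1j * pk) - tk) for hk, pk, tk in zip(h, phase, t)
    )
    return max_score - distance


# Rotating 1 + 0j by 90 degrees gives (approximately) 0 + 1j.
head = [1 + 0j]
phase = [math.pi / 2]
tail = [0 + 1j]

print(rotate_score(head, phase, tail))
# Rotating in the opposite direction lands on 0 - 1j, far from the tail.
print(rotate_score(head, [-math.pi / 2], tail))
```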
 
 
 
NeuralLP#
- class NeuralLogicProgramming(num_relation, hidden_dim, num_step, num_lstm_layer=1)[source]#
- Neural Logic Programming proposed in Differentiable Learning of Logical Rules for Knowledge Base Reasoning. - Parameters
- num_relation (int) – number of relations 
- hidden_dim (int) – dimension of hidden units in LSTM 
- num_step (int) – number of recurrent steps 
- num_lstm_layer (int, optional) – number of LSTM layers 
 
 - forward(graph, h_index, t_index, r_index, all_loss=None, metric=None)[source]#
- Compute the score for triplets. - Parameters
- graph (Graph) – fact graph 
- h_index (Tensor) – indexes of head entities 
- t_index (Tensor) – indexes of tail entities 
- r_index (Tensor) – indexes of relations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
 
 
- NeuralLP#
KBGAT#
- class KnowledgeBaseGraphAttentionNetwork(num_entity, num_relation, embedding_dim, hidden_dims, max_score=12, edge_input_dim=None, num_head=1, negative_slope=0.2, short_cut=False, batch_norm=False, activation='relu', concat_hidden=False, readout='sum')[source]#
- Knowledge Base Graph Attention Network proposed in Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs. - Parameters
- num_entity (int) – number of entities 
- num_relation (int) – number of relations 
- embedding_dim (int) – dimension of embeddings 
- hidden_dims (list of int) – hidden dimensions 
- max_score (float, optional) – maximal score for triplets 
- edge_input_dim (int, optional) – dimension of edge features 
- num_head (int, optional) – number of attention heads 
- negative_slope (float, optional) – negative slope of leaky relu activation 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- readout (str, optional) – readout function. Available functions are sum and mean.
 
 - forward(graph, h_index, t_index, r_index, all_loss=None, metric=None)[source]#
- Compute the score for triplets. - Parameters
- graph (Graph) – fact graph 
- h_index (Tensor) – indexes of head entities 
- t_index (Tensor) – indexes of tail entities 
- r_index (Tensor) – indexes of relations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
 
 
- KBGAT#
- alias of torchdrug.models.kbgat.KnowledgeBaseGraphAttentionNetwork
Graph Neural Networks#
ChebNet#
- class ChebyshevConvolutionalNetwork(input_dim, hidden_dims, edge_input_dim=None, k=1, short_cut=False, batch_norm=False, activation='relu', concat_hidden=False, readout='sum')[source]#
- Chebyshev convolutional network proposed in Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. - Parameters
- input_dim (int) – input dimension 
- hidden_dims (list of int) – hidden dimensions 
- edge_input_dim (int, optional) – dimension of edge features 
- k (int, optional) – number of Chebyshev polynomials 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- readout (str, optional) – readout function. Available functions are sum and mean.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
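The relationship between the two returned fields can be sketched without any tensor library: node representations of shape (|V|, d) are pooled per graph into graph representations of shape (n, d), using either a sum or a mean readout. The function readout and the node2graph list (which graph each node belongs to) are hypothetical names used only for this illustration.

```python
def readout(node_feature, node2graph, num_graph, mode="sum"):
    """Pool node features of shape (|V|, d) into graph features of shape (n, d)."""
    dim = len(node_feature[0])
    graph_feature = [[0.0] * dim for _ in range(num_graph)]
    count = [0] * num_graph
    for feature, graph_id in zip(node_feature, node2graph):
        count[graph_id] += 1
        for k, value in enumerate(feature):
            graph_feature[graph_id][k] += value
    if mode == "mean":
        # Divide each graph's sum by its number of nodes (guarding empty graphs).
        graph_feature = [
            [value / max(c, 1) for value in row]
            for row, c in zip(graph_feature, count)
        ]
    elif mode != "sum":
        raise ValueError("Unknown readout `%s`" % mode)
    return graph_feature


# Three nodes with d = 2; the first two belong to graph 0, the last to graph 1.
node_feature = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
node2graph = [0, 0, 1]

print(readout(node_feature, node2graph, num_graph=2, mode="sum"))
print(readout(node_feature, node2graph, num_graph=2, mode="mean"))
```

A sum readout keeps information about graph size, while a mean readout is invariant to it.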
 
 
- ChebNet#
- alias of torchdrug.models.chebnet.ChebyshevConvolutionalNetwork
GCN#
- class GraphConvolutionalNetwork(input_dim, hidden_dims, edge_input_dim=None, short_cut=False, batch_norm=False, activation='relu', concat_hidden=False, readout='sum')[source]#
- Graph Convolutional Network proposed in Semi-Supervised Classification with Graph Convolutional Networks. - Parameters
- input_dim (int) – input dimension 
- hidden_dims (list of int) – hidden dimensions 
- edge_input_dim (int, optional) – dimension of edge features 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- readout (str, optional) – readout function. Available functions are sum and mean.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
 
 
- GCN#
GAT#
- class GraphAttentionNetwork(input_dim, hidden_dims, edge_input_dim=None, num_head=1, negative_slope=0.2, short_cut=False, batch_norm=False, activation='relu', concat_hidden=False, readout='sum')[source]#
- Graph Attention Network proposed in Graph Attention Networks. - Parameters
- input_dim (int) – input dimension 
- hidden_dims (list of int) – hidden dimensions 
- edge_input_dim (int, optional) – dimension of edge features 
- num_head (int, optional) – number of attention heads 
- negative_slope (float, optional) – negative slope of leaky relu activation 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- readout (str, optional) – readout function. Available functions are sum and mean.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
 
 
- GAT#
GIN#
- class GraphIsomorphismNetwork(input_dim, hidden_dims, edge_input_dim=None, num_mlp_layer=2, eps=0, learn_eps=False, short_cut=False, batch_norm=False, activation='relu', concat_hidden=False, readout='sum')[source]#
- Graph Isomorphism Network proposed in How Powerful are Graph Neural Networks? - Parameters
- input_dim (int) – input dimension 
- hidden_dims (list of int) – hidden dimensions 
- edge_input_dim (int, optional) – dimension of edge features 
- num_mlp_layer (int, optional) – number of MLP layers 
- eps (int, optional) – initial epsilon 
- learn_eps (bool, optional) – learn epsilon or not 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- readout (str, optional) – readout function. Available functions are sum and mean.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
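The role of eps in a GIN layer can be sketched as follows: each node combines (1 + eps) times its own representation with the sum of its neighbors' representations, and the result is then passed through an MLP. The sketch below shows only the aggregation step on plain Python lists; the MLP is omitted, and the name gin_aggregate and the edge-list format are hypothetical.

```python
def gin_aggregate(node_feature, edge_list, eps=0.0):
    """Compute (1 + eps) * h_v + sum of h_u over incoming edges (u, v).

    The MLP that normally follows this aggregation is left out.
    """
    output = [[(1 + eps) * value for value in feature] for feature in node_feature]
    for source, target in edge_list:
        for k, value in enumerate(node_feature[source]):
            output[target][k] += value
    return output


# A path graph 0 - 1 - 2, with each undirected edge listed in both directions.
node_feature = [[1.0], [2.0], [3.0]]
edge_list = [(0, 1), (1, 0), (1, 2), (2, 1)]

print(gin_aggregate(node_feature, edge_list, eps=0.0))
print(gin_aggregate(node_feature, edge_list, eps=0.5))
```

With learn_eps enabled, eps would be a trainable parameter starting from the given initial value rather than a constant.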
 
 
- GIN#
MPNN#
- class MessagePassingNeuralNetwork(input_dim, hidden_dim, edge_input_dim, num_layer=1, num_gru_layer=1, num_mlp_layer=2, num_s2s_step=3, short_cut=False, batch_norm=False, activation='relu', concat_hidden=False)[source]#
- Message Passing Neural Network proposed in Neural Message Passing for Quantum Chemistry. - This implements the enn-s2s variant in the original paper. - Parameters
- input_dim (int) – input dimension 
- hidden_dim (int) – hidden dimension 
- edge_input_dim (int) – dimension of edge features 
- num_layer (int, optional) – number of hidden layers 
- num_gru_layer (int, optional) – number of GRU layers in each node update 
- num_mlp_layer (int, optional) – number of MLP layers in each message function 
- num_s2s_step (int, optional) – number of processing steps in set2set 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
 
 
- MPNN#
NFP#
- class NeuralFingerprint(input_dim, output_dim, hidden_dims, edge_input_dim=None, short_cut=False, batch_norm=False, activation='relu', concat_hidden=False, readout='sum')[source]#
- Neural Fingerprints from Convolutional Networks on Graphs for Learning Molecular Fingerprints. - Parameters
- input_dim (int) – input dimension 
- output_dim (int) – fingerprint dimension 
- hidden_dims (list of int) – hidden dimensions 
- edge_input_dim (int, optional) – dimension of edge features 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- readout (str, optional) – readout function. Available functions are sum and mean.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
 
 
- NFP#
RGCN#
- class RelationalGraphConvolutionalNetwork(input_dim, hidden_dims, num_relation, edge_input_dim=None, short_cut=False, batch_norm=False, activation='relu', concat_hidden=False, readout='sum')[source]#
- Relational Graph Convolutional Network proposed in Modeling Relational Data with Graph Convolutional Networks. - Parameters
- input_dim (int) – input dimension 
- hidden_dims (list of int) – hidden dimensions 
- num_relation (int) – number of relations 
- edge_input_dim (int, optional) – dimension of edge features 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- readout (str, optional) – readout function. Available functions are sum and mean.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). - Require the graph(s) to have the same number of relations as this module. - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
 
 
GearNet#
- class GeometryAwareRelationalGraphNeuralNetwork(input_dim, hidden_dims, num_relation, edge_input_dim=None, num_angle_bin=None, short_cut=False, batch_norm=False, activation='relu', concat_hidden=False, readout='sum')[source]#
- Geometry Aware Relational Graph Neural Network proposed in Protein Representation Learning by Geometric Structure Pretraining. - Parameters
- input_dim (int) – input dimension 
- hidden_dims (list of int) – hidden dimensions 
- num_relation (int) – number of relations 
- edge_input_dim (int, optional) – dimension of edge features 
- num_angle_bin (int, optional) – number of bins to discretize angles between edges. The discretized angles are used as relations in edge message passing. If not provided, edge message passing is disabled. 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- readout (str, optional) – readout function. Available functions are sum and mean.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
 
 
- GearNet#
- alias of torchdrug.models.gearnet.GeometryAwareRelationalGraphNeuralNetwork
SchNet#
- class SchNet(input_dim, hidden_dims, edge_input_dim=None, cutoff=5, num_gaussian=100, short_cut=True, batch_norm=False, activation='shifted_softplus', concat_hidden=False)[source]#
- SchNet from SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. - Parameters
- input_dim (int) – input dimension 
- hidden_dims (list of int) – hidden dimensions 
- edge_input_dim (int, optional) – dimension of edge features 
- cutoff (float, optional) – maximal scale for RBF kernels 
- num_gaussian (int, optional) – number of RBF kernels 
- short_cut (bool, optional) – use short cut or not 
- batch_norm (bool, optional) – apply batch normalization or not 
- activation (str or function, optional) – activation function 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). - Require the graph(s) to have node attribute node_position. - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
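The cutoff and num_gaussian parameters describe a radial basis function (RBF) expansion of interatomic distances, which is why node_position is required. The sketch below shows one common way to build such an expansion: num_gaussian Gaussian kernels with centers evenly spaced between 0 and cutoff. The center spacing, the width gamma and the name rbf_expand are assumptions for illustration and may differ from the library's implementation.

```python
import math


def rbf_expand(distance, cutoff=5.0, num_gaussian=100):
    """Expand a scalar distance into num_gaussian Gaussian RBF features.

    Centers are evenly spaced on [0, cutoff]; the width is tied to the
    spacing. Both choices are assumptions. Requires num_gaussian >= 2.
    """
    step = cutoff / (num_gaussian - 1)
    gamma = 0.5 / step ** 2
    return [
        math.exp(-gamma * (distance - k * step) ** 2) for k in range(num_gaussian)
    ]


# With 6 kernels on [0, 5], the centers are 0, 1, 2, 3, 4, 5.
features = rbf_expand(2.0, cutoff=5.0, num_gaussian=6)
print([round(value, 4) for value in features])
# The kernel centered at 2.0 responds most strongly.
print(features.index(max(features)))
```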
 
 
Protein Sequence Encoders#
ESM#
- class EvolutionaryScaleModeling(path, model='ESM-1b', readout='mean')[source]#
- The protein language model, Evolutionary Scale Modeling (ESM) proposed in Biological Structure and Function Emerge from Scaling Unsupervised Learning to 250 Million Protein Sequences. - Parameters
- path (str) – path to store ESM model weights 
- model (str, optional) – model name. Available model names are ESM-1b, ESM-1v and ESM-1b-regression.
- readout (str, optional) – readout function. Available functions are sum and mean.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the residue representations and the graph representation(s). - Parameters
- graph (Protein) – \(n\) protein(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- residue representations of shape \((|V_{res}|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with residue_feature and graph_feature fields
 
 
ProteinCNN#
- class ProteinConvolutionalNetwork(input_dim, hidden_dims, kernel_size=3, stride=1, padding=1, activation='relu', short_cut=False, concat_hidden=False, readout='max')[source]#
- Protein Shallow CNN proposed in Is Transfer Learning Necessary for Protein Landscape Prediction? - Parameters
- input_dim (int) – input dimension 
- hidden_dims (list of int) – hidden dimensions 
- kernel_size (int, optional) – size of convolutional kernel 
- stride (int, optional) – stride of convolution 
- padding (int, optional) – padding added to both sides of the input 
- activation (str or function, optional) – activation function 
- short_cut (bool, optional) – use short cut or not 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- readout (str, optional) – readout function. Available functions are sum, mean, max and attention.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the residue representations and the graph representation(s). - Parameters
- graph (Protein) – \(n\) protein(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- residue representations of shape \((|V_{res}|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with residue_feature and graph_feature fields
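The kernel_size, stride and padding parameters interact through the standard output-length formula for a 1D convolution without dilation. The sketch below evaluates that formula; with the defaults (3, 1, 1) the sequence length is preserved, so each residue keeps its own representation. The helper name conv_output_length is hypothetical, and how variable-length proteins are padded into a batch is not covered.

```python
def conv_output_length(length, kernel_size=3, stride=1, padding=1):
    """Output length of a 1D convolution without dilation."""
    return (length + 2 * padding - kernel_size) // stride + 1


# Defaults keep one output position per residue.
print(conv_output_length(100))
# A stride of 2 roughly halves the sequence length.
print(conv_output_length(100, stride=2))
# A wider kernel with unchanged padding shortens the sequence.
print(conv_output_length(100, kernel_size=5))
```

To keep the length unchanged with stride 1 and an odd kernel_size, padding would need to be (kernel_size - 1) // 2.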
 
 
ProteinResNet#
- class ProteinResNet(input_dim, hidden_dims, kernel_size=3, stride=1, padding=1, activation='gelu', short_cut=False, concat_hidden=False, layer_norm=False, dropout=0, readout='attention')[source]#
- Protein ResNet proposed in Evaluating Protein Transfer Learning with TAPE. - Parameters
- input_dim (int) – input dimension 
- hidden_dims (list of int) – hidden dimensions 
- kernel_size (int, optional) – size of convolutional kernel 
- stride (int, optional) – stride of convolution 
- padding (int, optional) – padding added to both sides of the input 
- activation (str or function, optional) – activation function 
- short_cut (bool, optional) – use short cut or not 
- concat_hidden (bool, optional) – concat hidden representations from all layers as output 
- layer_norm (bool, optional) – apply layer normalization or not 
- dropout (float, optional) – dropout ratio of input features 
- readout (str, optional) – readout function. Available functions are sum, mean and attention.
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the residue representations and the graph representation(s). - Parameters
- graph (Protein) – \(n\) protein(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- residue representations of shape \((|V_{res}|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with residue_feature and graph_feature fields
 
 
ProteinLSTM#
- class ProteinLSTM(input_dim, hidden_dim, num_layers, activation='tanh', layer_norm=False, dropout=0)[source]#
- Protein LSTM proposed in Evaluating Protein Transfer Learning with TAPE. - Parameters
- input_dim (int) – input dimension 
- hidden_dim (int) – hidden dimension 
- num_layers (int) – number of LSTM layers 
- activation (str or function, optional) – activation function 
- layer_norm (bool, optional) – apply layer normalization or not 
- dropout (float, optional) – dropout ratio of input features 
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the residue representations and the graph representation(s). - Parameters
- graph (Protein) – \(n\) protein(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- residue representations of shape \((|V_{res}|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with residue_feature and graph_feature fields
 
 
ProteinBERT#
- class ProteinBERT(input_dim, hidden_dim=768, num_layers=12, num_heads=12, intermediate_dim=3072, activation='gelu', hidden_dropout=0.1, attention_dropout=0.1, max_position=8192)[source]#
- Protein BERT proposed in Evaluating Protein Transfer Learning with TAPE. - Parameters
- input_dim (int) – input dimension 
- hidden_dim (int, optional) – hidden dimension 
- num_layers (int, optional) – number of Transformer blocks 
- num_heads (int, optional) – number of attention heads 
- intermediate_dim (int, optional) – intermediate hidden dimension of Transformer block 
- activation (str or function, optional) – activation function 
- hidden_dropout (float, optional) – dropout ratio of hidden features 
- attention_dropout (float, optional) – dropout ratio of attention maps 
- max_position (int, optional) – maximum number of positions 
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the residue representations and the graph representation(s). - Parameters
- graph (Protein) – \(n\) protein(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- residue representations of shape \((|V_{res}|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with residue_feature and graph_feature fields
 
 
Statistic Feature Engineering#
- class Statistic(type='DDE', hidden_dims=(512,))[source]#
- The statistic feature engineering for protein sequence proposed in Harnessing Computational Biology for Exact Linear B-cell Epitope Prediction. - Parameters
- type (str, optional) – statistic feature. Available feature is DDE.
- hidden_dims (list of int, optional) – hidden dimensions 
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the graph representation(s). - Parameters
- graph (Protein) – \(n\) protein(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- graph representations of shape \((n, d)\) 
- Return type
- dict with graph_feature field
 
 
Physicochemical Feature Engineering#
- class Physicochemical(path, type='moran', nlag=30, hidden_dims=(512,))[source]#
- The physicochemical feature engineering for protein sequence proposed in Prediction of Membrane Protein Types based on the Hydrophobic Index of Amino Acids. - Parameters
- path (str) – path to store feature file 
- type (str, optional) – physicochemical feature. Available features are moran, geary and nmbroto.
- nlag (int, optional) – maximum position interval to compute features 
- hidden_dims (list of int, optional) – hidden dimensions 
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the graph representation(s). - Parameters
- graph (Protein) – \(n\) protein(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- graph representations of shape \((n, d)\) 
- Return type
- dict with graph_feature field
 
 
Normalizing Flows#
GraphAutoregressiveFlow#
- class GraphAutoregressiveFlow(model, prior, use_edge=False, num_layer=6, num_mlp_layer=2, dequantization_noise=0.9)[source]#
- Graph autoregressive flow proposed in GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation. - Parameters
- model (nn.Module) – graph representation model 
- prior (nn.Module) – prior distribution 
- use_edge (bool, optional) – use edge or not 
- num_layer (int, optional) – number of conditional flow layers 
- num_mlp_layer (int, optional) – number of MLP layers in each conditional flow 
- dequantization_noise (float, optional) – scale of dequantization noise 
 
 - forward(graph, input, edge=None, all_loss=None, metric=None)[source]#
- Compute the log-likelihood for the input given the graph(s). - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – discrete data of shape \((n,)\) 
- edge (Tensor, optional) – edge list of shape \((n, 2)\). If specified, additionally condition on the edge for each input. 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
 
 - sample(graph, edge=None, all_loss=None, metric=None)[source]#
- Sample discrete data based on the given graph(s). - Parameters
- graph (Graph) – \(n\) graph(s) 
- edge (Tensor, optional) – edge list of shape \((n, 2)\). If specified, additionally condition on the edge for each input. 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
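Normalizing flows operate on continuous values, so discrete inputs such as atom or bond types are usually dequantized first. The sketch below illustrates one common scheme under stated assumptions: the discrete value is one-hot encoded and uniform noise scaled by dequantization_noise is added to every coordinate. Because the noise stays below 1, taking the argmax recovers the original value. The names dequantize and quantize are hypothetical, and the exact noise distribution used by the library is not confirmed here.

```python
import random


def dequantize(index, num_class, noise_scale=0.9, rng=random):
    """One-hot encode a discrete value and add uniform noise in [0, noise_scale)."""
    one_hot = [1.0 if k == index else 0.0 for k in range(num_class)]
    return [value + noise_scale * rng.random() for value in one_hot]


def quantize(x):
    """Map a continuous vector back to a discrete value by taking the argmax."""
    return max(range(len(x)), key=lambda k: x[k])


continuous = dequantize(3, num_class=5)
print(continuous)
# With noise_scale < 1 the hot coordinate stays the largest one.
print(quantize(continuous))
```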
 
 
 
- GraphAF#
Self-supervised Models#
InfoGraph#
- class InfoGraph(model, num_mlp_layer=2, activation='relu', loss_weight=1, separate_model=False)[source]#
- InfoGraph proposed in InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization. - Parameters
- model (nn.Module) – node & graph representation model 
- num_mlp_layer (int, optional) – number of MLP layers in mutual information estimators 
- activation (str or function, optional) – activation function 
- loss_weight (float, optional) – weight of both unsupervised & transfer losses 
- separate_model (bool, optional) – separate supervised and unsupervised encoders. If true, the unsupervised loss will be applied on a separate encoder, and a transfer loss is applied between the two encoders. 
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the node representations and the graph representation(s). Add the mutual information between graph and nodes to the loss. - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) 
- Return type
- dict with node_feature and graph_feature fields
 
 
MultiviewContrast#
- class MultiviewContrast(model, crop_funcs, noise_funcs, num_mlp_layer=2, activation='relu', tau=0.07)[source]#
- Multiview Contrast proposed in Protein Representation Learning by Geometric Structure Pretraining. - Parameters
- model (nn.Module) – node & graph representation model 
- crop_funcs (list of nn.Module) – list of cropping functions 
- noise_funcs (list of nn.Module) – list of noise functions 
- num_mlp_layer (int, optional) – number of MLP layers in mutual information estimators 
- activation (str or function, optional) – activation function 
- tau (float, optional) – temperature in InfoNCE loss 
 
 - forward(graph, input, all_loss=None, metric=None)[source]#
- Compute the graph representations of two augmented views. Each view is generated by randomly picking a cropping function and a noise function. Add the mutual information between two augmented views to the loss. - Parameters
- graph (Graph) – \(n\) graph(s) 
- input (Tensor) – input node representations 
- all_loss (Tensor, optional) – if specified, add loss to this tensor 
- metric (dict, optional) – if specified, output metrics to this dict 
 
- Returns
- node representations of shape \((|V|, d)\), graph representations of shape \((n, d)\) for two augmented views respectively 
- Return type
- dict with node_feature1, node_feature2, graph_feature1 and graph_feature2 fields
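The tau parameter is the temperature of an InfoNCE loss, which pulls the two views of the same graph together and pushes views of different graphs apart. The sketch below is a minimal one-directional InfoNCE on plain Python lists, using cosine similarity; the function name info_nce, the use of cosine similarity and the one-directional form are assumptions for illustration, not the library's exact formulation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def info_nce(view1, view2, tau=0.07):
    """Mean cross-entropy of matching view1[i] to view2[i] among all view2[j]."""
    num_sample = len(view1)
    loss = 0.0
    for i in range(num_sample):
        logits = [cosine(view1[i], view2[j]) / tau for j in range(num_sample)]
        # Numerically stable log-sum-exp over the candidates.
        largest = max(logits)
        log_norm = largest + math.log(sum(math.exp(x - largest) for x in logits))
        loss += log_norm - logits[i]
    return loss / num_sample


# Toy graph representations for two graphs, made up for illustration.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b_aligned = [[1.0, 0.0], [0.0, 1.0]]
view_b_shuffled = [[0.0, 1.0], [1.0, 0.0]]

print(info_nce(view_a, view_b_aligned))   # matching views: loss near zero
print(info_nce(view_a, view_b_shuffled))  # mismatched views: large loss
```

A smaller tau sharpens the softmax over candidates, so mismatched pairs are penalized more heavily.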