The goals for this week are to 1) train a causative relation classification model based on an LSTM, using the augmented Train Set, and 2) evaluate this classification model on the Test Set.
Train the classification model
Imagine you want to classify what kind of event is happening at every point in a movie. It is unclear how a traditional neural network could use its reasoning about previous events in the film to inform later ones. Recurrent neural networks (RNNs) address this issue: they are networks with loops in them, allowing information to persist [1].
Long Short-Term Memory (LSTM) is a special kind of RNN capable of learning long-term dependencies. It was first introduced by Hochreiter & Schmidhuber (1997) [2] and is explicitly designed to avoid the long-term dependency problem.
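As a minimal illustration of this persistence (a toy sketch with made-up shapes, not part of our pipeline), a Keras LSTM layer carries a hidden state across the time steps of a sequence:

import numpy as np
import tensorflow as tf

# Toy batch: 2 sequences, 5 time steps, 8 features per step (hypothetical shapes).
x = np.random.rand(2, 5, 8).astype("float32")

# The LSTM keeps an internal state from step to step, so information from early
# time steps can influence the vector it returns for the whole sequence.
summary_vector = tf.keras.layers.LSTM(16)(x)
print(summary_vector.shape)  # (2, 16): one summary vector per sequence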
Getting feature arrays of the input text
Before proceeding, we convert each entry into three feature arrays: the tokens to the left of the first element, the tokens between the two elements, and the tokens to the right of the second element.
from typing import Tuple

import numpy as np
import pandas as pd


def get_feature_arrays(df: pd.DataFrame) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Get np arrays of up to max_length tokens for each entry."""
    # Each column is expected to hold a list of tokens per row.
    bet = df.text_between
    left = df.ele1_left_tokens
    right = df.ele2_right_tokens

    def pad_or_truncate(l, max_length=40):
        # Truncate to max_length tokens; pad shorter lists with empty strings.
        return l[:max_length] + [""] * (max_length - len(l))

    left_tokens = np.array(list(map(pad_or_truncate, left)))
    bet_tokens = np.array(list(map(pad_or_truncate, bet)))
    right_tokens = np.array(list(map(pad_or_truncate, right)))
    return left_tokens, bet_tokens, right_tokens
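To make the expected input concrete, here is a hypothetical two-row DataFrame (the column names match the function above, but the rows are made up) and the shapes it produces:

import pandas as pd

# Hypothetical toy entries; each column holds a list of tokens per row.
toy_df = pd.DataFrame(
    {
        "ele1_left_tokens": [["the", "heavy"], ["a"]],
        "text_between": [["rain", "caused"], ["virus", "led", "to"]],
        "ele2_right_tokens": [["in", "the", "city"], ["worldwide"]],
    }
)

left_tokens, bet_tokens, right_tokens = get_feature_arrays(toy_df)
print(left_tokens.shape, bet_tokens.shape, right_tokens.shape)  # (2, 40) each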
Defining the LSTM model and its parameters
import tensorflow as tf
from tensorflow.keras.layers import (
    Bidirectional,
    Concatenate,
    Dense,
    Embedding,
    Input,
    LSTM,
)


def bilstm(
    tokens: tf.Tensor,
    rnn_state_size: int = 64,
    num_buckets: int = 40000,
    embed_dim: int = 36,
):
    # Hash each token string into one of `num_buckets` integer ids, embed the ids,
    # and encode the sequence with a bidirectional LSTM. Padding tokens ("") are
    # masked out using the token string lengths.
    ids = tf.strings.to_hash_bucket(tokens, num_buckets)
    embedded_input = Embedding(num_buckets, embed_dim)(ids)
    return Bidirectional(LSTM(rnn_state_size, activation=tf.nn.relu))(
        embedded_input, mask=tf.strings.length(tokens)
    )
def get_model(
    rnn_state_size: int = 64, num_buckets: int = 40000, embed_dim: int = 12
) -> tf.keras.Model:
    """
    Return LSTM model for predicting label probabilities.

    Args:
        rnn_state_size: LSTM state size.
        num_buckets: Number of buckets to hash strings to integers.
        embed_dim: Size of token embeddings.

    Returns:
        model: A compiled LSTM model.
    """
    # One string input per context window: left of the first element, between the
    # two elements, and right of the second element.
    left_ph = Input((None,), dtype="string")
    bet_ph = Input((None,), dtype="string")
    right_ph = Input((None,), dtype="string")
    left_embs = bilstm(left_ph, rnn_state_size, num_buckets, embed_dim)
    bet_embs = bilstm(bet_ph, rnn_state_size, num_buckets, embed_dim)
    right_embs = bilstm(right_ph, rnn_state_size, num_buckets, embed_dim)
    # Concatenate the three encodings and classify with a small feed-forward head.
    layer = Concatenate(1)([left_embs, bet_embs, right_embs])
    layer = Dense(64, activation=tf.nn.relu)(layer)
    layer = Dense(32, activation=tf.nn.relu)(layer)
    probabilities = Dense(2, activation=tf.nn.softmax)(layer)
    # Input order must match the order returned by get_feature_arrays:
    # (left_tokens, bet_tokens, right_tokens).
    model = tf.keras.Model(inputs=[left_ph, bet_ph, right_ph], outputs=probabilities)
    # tf.keras.optimizers.Adagrad replaces the TF1-only tf.train.AdagradOptimizer.
    model.compile(tf.keras.optimizers.Adagrad(0.1), "categorical_crossentropy")
    return model
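Note that the model never builds an explicit vocabulary: each token string is hashed directly into one of num_buckets embedding rows. A quick sanity check of that step and of the assembled network (toy tokens only):

import tensorflow as tf

# Same op as in bilstm(): strings are hashed to integer bucket ids.
print(tf.strings.to_hash_bucket(tf.constant(["rain", "caused", ""]), 40000))

# Build and inspect the network: three string inputs, three BiLSTM encoders,
# two dense layers, and a 2-way softmax output.
sanity_model = get_model()
sanity_model.summary()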
### Training our end classification model (LSTM)
X_train = get_feature_arrays(df_train)
model = get_model()
batch_size = 64
# probs_train_filtered holds the probabilistic (soft) labels for the augmented
# Train Set; get_n_epochs() returns the number of training epochs to run.
model.fit(X_train, probs_train_filtered, batch_size=batch_size, epochs=get_n_epochs())
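For context, probs_train_filtered refers to the probabilistic (soft) labels of the augmented Train Set. Below is a hedged sketch of how they would typically be produced, assuming (as the helper names suggest) that they come from a Snorkel LabelModel; L_train is the assumed label matrix output by our labeling functions:

from snorkel.labeling import filter_unlabeled_dataframe
from snorkel.labeling.model import LabelModel

# Assumed: L_train is the (num_examples x num_labeling_functions) label matrix.
label_model = LabelModel(cardinality=2, verbose=True)
label_model.fit(L_train, n_epochs=500, seed=123)

# Probabilistic (soft) labels for every training candidate.
probs_train = label_model.predict_proba(L_train)

# Drop candidates that no labeling function covered; the features fed to
# model.fit should be built from the filtered DataFrame so rows stay aligned
# with the soft labels.
df_train_filtered, probs_train_filtered = filter_unlabeled_dataframe(
    df_train, probs_train, L_train
)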
Evaluate the classification model
### Evaluating the trained model (LSTM) on the test set
X_test = get_feature_arrays(df_test)
probs_test = model.predict(X_test)
# Convert predicted probabilities into hard 0/1 predictions (argmax over classes).
preds_test = probs_to_preds(probs_test)
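The metrics reported below can be computed from these predictions against the gold labels of the Test Set; a sketch using scikit-learn, where Y_test is an assumed array of gold 0/1 labels aligned with df_test:

from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Assumed: Y_test holds the gold 0/1 labels for df_test.
print("Test accuracy:", accuracy_score(Y_test, preds_test))
print("Test precision:", precision_score(Y_test, preds_test))
print("Test recall:", recall_score(Y_test, preds_test))
print("Test F1:", f1_score(Y_test, preds_test))
# ROC-AUC is computed from the predicted probability of the positive class.
print("Test ROC-AUC:", roc_auc_score(Y_test, probs_test[:, 1]))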
The results show:
Test accuracy when trained with soft labels: 0.9967462832074092
Test precision when trained with soft labels: 0.0
Test recall when trained with soft labels: 0.0
Test F1 when trained with soft labels: 0.0
Test ROC-AUC when trained with soft labels: 0.5355835641424763
We found that the accuracy is very high, but precision, recall, and F1 are all zero. This pattern suggests the Test Set is heavily imbalanced toward negative examples, so the model can reach ~99.7% accuracy by predicting the negative class for nearly every candidate; the near-chance ROC-AUC points the same way. In the future, we plan to decrease the number of negative samples and to try a simpler classification model (e.g., logistic regression).
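As a rough sketch of that plan (not implemented yet; the sentence column and the 3:1 negative-to-positive ratio are assumptions for illustration), we would rebalance the training candidates using the hard version of the soft labels and fit a bag-of-words logistic regression baseline:

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Turn the soft labels into hard 0/1 labels, keep every positive candidate,
# and sample at most 3 negatives per positive to reduce the class imbalance.
hard_labels = probs_to_preds(probs_train_filtered)
pos_idx = np.where(hard_labels == 1)[0]
neg_idx = np.where(hard_labels == 0)[0]
rng = np.random.default_rng(0)
neg_keep = rng.choice(neg_idx, size=min(len(neg_idx), 3 * len(pos_idx)), replace=False)
keep = np.concatenate([pos_idx, neg_keep])

# Hypothetical "sentence" column holding the raw text of each candidate.
baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
baseline.fit(df_train.iloc[keep].sentence, hard_labels[keep])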