AI Tools from Intel

Problem with saving and loading a BigDL Module

clare_cn
Beginner

I am currently building a model with BigDL's Module API. However, I have run into a problem that seems to be caused by BatchNormalization.

I used the following code to construct the model:

def createMixModel(userInputDim: Int, itemInputDim: Int, shareDims: Array[Int], dcnCrossLayers: Int, hiddenDims: Array[Int]): Module[Float] = {
  val l2Regularizer: Regularizer[Float] = L2Regularizer(0.05)
  val historyInput = Input[Float](inputShape = Shape(3, 4))
  val userInput = Input[Float](inputShape = Shape(userInputDim))
  val itemInput = Input[Float](inputShape = Shape(itemInputDim))

  // DIN: attention-weighted pooling of the user's behavior history
  val expandedUserHistory = TimeDistributed[Float](Dense[Float](itemInputDim).asInstanceOf[KerasLayer[Activity, Tensor[Float], Float]]).inputs(historyInput)
  val expandedItem = RepeatVector(3).inputs(itemInput)
  val attentionScores = TimeDistributed[Float](Merge[Float](mode = "dot").asInstanceOf[KerasLayer[Activity, Tensor[Float], Float]]).inputs(expandedUserHistory, expandedItem)
  val expandedAttentionScores = Reshape(Array(3, itemInputDim)).inputs(TimeDistributed[Float](RepeatVector[Float](itemInputDim).asInstanceOf[KerasLayer[Activity, Tensor[Float], Float]]).inputs(attentionScores))
  val weightedUserHistory = Merge[Float](mode = "mul").inputs(expandedUserHistory, expandedAttentionScores)
  val dinOutput = GlobalAveragePooling1D[Float]().inputs(weightedUserHistory)

  // DCN: shared layers feeding a deep tower and a cross network
  var userItemInput = BatchNormalization[Float]().inputs(Merge[Float](mode = "concat").inputs(itemInput, userInput))
  for (dim <- shareDims) {
    userItemInput = Activation[Float]("relu").inputs(Dense[Float](dim).inputs(userItemInput))
  }

  var deepLayer = userItemInput
  for (dim <- hiddenDims) {
    deepLayer = Activation[Float]("relu").inputs(Dense[Float](dim).inputs(deepLayer))
  }

  var crossInput = userItemInput
  val x0 = userItemInput
  for (_ <- 1 to dcnCrossLayers) {
    val dotProduct = Merge[Float](mode = "mul").inputs(crossInput, x0)
    val linear = Dense[Float](shareDims.last, bias = false).inputs(dotProduct)
    val added = Merge[Float](mode = "sum").inputs(linear, crossInput)
    crossInput = added
  }

  val dcnOutput = Merge[Float](mode = "concat").inputs(deepLayer, crossInput)

  // ESMM: shared layers feeding three task towers (ctr / cvr / rti)
  var crInput = BatchNormalization[Float]().inputs(Merge[Float](mode = "concat").inputs(dinOutput, dcnOutput))
  for (dim <- shareDims) {
    crInput = Activation[Float]("relu").inputs(Dense[Float](outputDim = dim).inputs(crInput))
  }

  var ctrLayer = crInput
  for (dim <- hiddenDims) {
    ctrLayer = Activation[Float]("relu").inputs(Dense[Float](outputDim = dim, wRegularizer = l2Regularizer).inputs(ctrLayer))
  }
  val ctrOutput = Activation[Float]("sigmoid").inputs(Dense[Float](1).inputs(ctrLayer))

  var cvrLayer = crInput
  for (dim <- hiddenDims) {
    cvrLayer = Activation[Float]("relu").inputs(Dense[Float](outputDim = dim, wRegularizer = l2Regularizer).inputs(cvrLayer))
  }
  val cvrOutput = Activation[Float]("sigmoid").inputs(Dense[Float](1).inputs(cvrLayer))

  var rtiLayer = crInput
  for (dim <- hiddenDims) {
    rtiLayer = Activation[Float]("relu").inputs(Dense[Float](outputDim = dim, wRegularizer = l2Regularizer).inputs(rtiLayer))
  }
  val rtiOutput = Activation[Float]("sigmoid").inputs(Dense[Float](1).inputs(rtiLayer))

  val model = Model[Float](input = Array(historyInput, userInput, itemInput), output = Array(ctrOutput, cvrOutput, rtiOutput))
  model
}
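For context, the model and criterion were wired up roughly as follows before training; the dimensions, variable names, and criterion choice below are only an illustrative sketch, not my exact setup:

// Hypothetical feature dimensions; the real ones come from the feature pipeline.
val model = createMixModel(
  userInputDim = 64,
  itemInputDim = 32,
  shareDims = Array(128, 64),
  dcnCrossLayers = 3,
  hiddenDims = Array(64, 32))

// The model has three sigmoid outputs (ctr / cvr / rti), so one
// BCECriterion per output combined via ParallelCriterion is a natural fit.
val criterion = ParallelCriterion[Float]()
criterion.add(BCECriterion[Float](), 1.0)
criterion.add(BCECriterion[Float](), 1.0)
criterion.add(BCECriterion[Float](), 1.0)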


Then train the model:

val optimizer = Optimizer(
  model = model,
  dataset = trainData,
  criterion = criterion
).setOptimMethod(new Adam[Float]())
  .setEndWhen(Trigger.or(Trigger.maxEpoch(maxEpoch), Trigger.minLoss(minLoss)))
val trainedModel = optimizer.optimize()

If I use trainedModel directly for prediction, the results look normal.
But if I save the model and reload it, every prediction comes back as 1.0:

trainedModel.saveModel(path = s"hdfs://xxx", weightPath = null, overWrite = true)
val model = Module.loadModule[Float](s"hdfs://xxx")
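
To make the symptom concrete, here is a minimal sketch of the comparison (the testData RDD of Samples is assumed and not shown):

// Prediction with the in-memory trained model looks normal.
val directPredictions = trainedModel.predict(testData)
directPredictions.take(5).foreach(println)

// Prediction with the reloaded model collapses to 1.0 everywhere.
val reloadedPredictions = model.predict(testData)
reloadedPredictions.take(5).foreach(println)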


I suspect BatchNormalization is the cause: if I remove it, saving and loading work normally, but the prediction quality is noticeably worse.

May I ask what specific problem this is? Thank you very much!
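
Is there a recommended way to inspect the internal state of BatchNormalization after loading? For example, something like this sketch (assuming getExtraParameter exposes the BN runningMean/runningVar tensors of the container):

// Print the value range of every extra parameter; an Infinity here would
// point at a corrupted BatchNormalization running statistic.
model.getExtraParameter().zipWithIndex.foreach { case (tensor, i) =>
  println(s"extra parameter $i: min = ${tensor.min()}, max = ${tensor.max()}")
}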

clare_cn
Beginner

The problem has been resolved. dinOutput suffered a gradient explosion, which drove BatchNormalization's runningVar to Infinity.
Adding a BatchNormalization layer on the original dinOutput solves the problem.
What is still puzzling is why this problem does not surface when predicting with the trained model directly; that is what led me to believe it was a model saving and loading issue.
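
Concretely, the fix is a one-line change inside createMixModel: normalize dinOutput before it enters the shared ESMM layers (a sketch of the change):

// Normalize the DIN output so exploding activations cannot drive the
// downstream BatchNormalization's runningVar to Infinity.
val normalizedDinOutput = BatchNormalization[Float]().inputs(dinOutput)
var crInput = BatchNormalization[Float]().inputs(
  Merge[Float](mode = "concat").inputs(normalizedDinOutput, dcnOutput))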

Ying_H_Intel
Employee

Hi,

Thank you for reporting the issue here and for tracking down the root cause. BigDL is developed as open source, so please feel free to submit the issue at Issues · intel-analytics/ipex-llm; the BigDL developer team will follow up there.

Thanks

