NodePiece - Tokenizing Knowledge Graphs

A new blog post on our recent research idea: if nodes in a graph are “words”, can we design a fixed-size vocabulary of “sub-word” units and go beyond shallow node embeddings? We propose NodePiece, a compositional tokenization approach that dramatically reduces KG vocabulary size, and find that in some tasks you don’t even need trainable node embeddings! Furthermore, NodePiece is inductive by design and can encode unseen nodes using the same backbone vocabulary.
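To make the "sub-word" analogy concrete, here is a minimal toy sketch of the tokenization idea: each node is represented by its nearest anchor nodes (a small fixed vocabulary) plus its outgoing relation types. All graph data, anchor choices, and function names below are illustrative assumptions, not the paper's actual API.

```python
from collections import deque

# Toy KG as adjacency lists: node -> list of (relation, neighbor).
# Purely illustrative data, not from the paper.
GRAPH = {
    "paris":   [("capital_of", "france"), ("located_in", "europe")],
    "france":  [("capital_of", "paris"), ("part_of", "europe")],
    "europe":  [("located_in", "paris"), ("part_of", "france"), ("contains", "berlin")],
    "berlin":  [("contains", "europe"), ("capital_of", "germany")],
    "germany": [("capital_of", "berlin")],
}

# Small fixed "sub-word" vocabulary of anchor nodes.
ANCHORS = ["france", "europe"]

def bfs_distance(start, goal):
    """Hop distance between two nodes (None if unreachable)."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for _, nxt in GRAPH.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def tokenize(node, k=2):
    """Token sequence for a node: its k nearest anchors + its relation types."""
    dists = [(bfs_distance(node, a), a) for a in ANCHORS]
    nearest = [a for d, a in sorted(p for p in dists if p[0] is not None)][:k]
    relations = sorted({r for r, _ in GRAPH.get(node, [])})
    return nearest + relations

print(tokenize("paris"))    # anchors by distance, then relation types
print(tokenize("germany"))  # a node tokenized with the same anchor vocabulary
```

A downstream encoder would then pool the embeddings of these tokens into a node representation; since an unseen node can be tokenized against the same anchors, the scheme is inductive by construction.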