We developed two network visualizations that display possible links between different editions of Robinson Crusoe. An edge was created between two books if their Doc2Vec document embeddings had a cosine similarity greater than 0.9. Red nodes indicate books published in Great Britain, and blue nodes indicate books published in the US. Node size represents betweenness centrality. The static network images below lack node labels and directed edges, so view the interactive versions for those details by clicking the hyperlinks (dm & dbow) below.
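The edge rule and node sizing described above can be sketched in plain Python. This is a minimal illustration, not the code used for the visualizations: the edition names and two-dimensional vectors are made-up stand-ins for real Doc2Vec embeddings, the helper names are hypothetical, and the graph is treated as undirected and unweighted (edge direction is not modeled here).

```python
import math
from collections import deque
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_edges(embeddings, threshold=0.9):
    """Link every pair of documents whose similarity exceeds the threshold."""
    edges = []
    for a, b in combinations(sorted(embeddings), 2):
        if cosine_similarity(embeddings[a], embeddings[b]) > threshold:
            edges.append((a, b))
    return edges


def betweenness_centrality(nodes, edges):
    """Unnormalized betweenness (Brandes' algorithm), undirected and unweighted."""
    neighbors = {n: [] for n in nodes}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    score = {n: 0.0 for n in nodes}
    for source in nodes:
        order = []                      # nodes in order of BFS discovery
        preds = {n: [] for n in nodes}  # predecessors on shortest paths
        paths = {n: 0 for n in nodes}   # number of shortest paths from source
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):       # accumulate dependencies back to source
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {n: s / 2 for n, s in score.items()}


def unit(degrees):
    """Toy 2-D unit vector at the given angle (stand-in for an embedding)."""
    r = math.radians(degrees)
    return [math.cos(r), math.sin(r)]


# Hypothetical editions: neighbors 20 degrees apart are similar enough to link.
embeddings = {"ed_a": unit(0), "ed_b": unit(20), "ed_c": unit(40), "ed_d": unit(90)}
edges = build_edges(embeddings, threshold=0.9)
centrality = betweenness_centrality(list(embeddings), edges)
```

In this toy graph only `ed_b` lies on a shortest path between two other editions, so it would be drawn as the largest node, while `ed_d` has no edges at all because none of its similarities exceed the threshold.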
Doc2Vec has two different approaches to creating document embeddings:
Distributed Bag of Words (DBOW)
trains only document vectors, ignoring word order, and tends to be better at capturing the overall context or topic of a document.
Distributed Memory (DM)
trains document vectors jointly with word vectors, and takes word order into consideration by iterating through each word and looking at the words that surround it within a sliding “context window”.
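The difference between the two approaches shows up in the training examples each one builds from a document. The sketch below is a deliberate simplification that only generates those examples and does not train any vectors. The function names, the document tag, and the tokens are hypothetical, and real implementations add details such as sampling and negative sampling.

```python
def dm_examples(doc_tag, tokens, window=2):
    """DM: predict each word from the document tag plus its context window."""
    examples = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        examples.append((doc_tag, left + right, target))
    return examples


def dbow_examples(doc_tag, tokens):
    """DBOW: predict each word from the document tag alone (no context words)."""
    return [(doc_tag, target) for target in tokens]


# Hypothetical tokenized snippet from one edition.
tokens = ["i", "was", "born", "in", "york"]
dm = dm_examples("ed_1", tokens, window=1)
dbow = dbow_examples("ed_1", tokens)
```

Here each DM example carries the neighboring words, so the position of a word in the text influences training, whereas each DBOW example pairs the document tag with a single word. In gensim's `Doc2Vec`, the `dm` parameter selects between the two modes (`dm=1` for DM, `dm=0` for DBOW).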