rdf2vecgpu.embedders package
Submodules
rdf2vecgpu.embedders.word2vec module
- class OrderAwareCBOW(*args, **kwargs)
Bases: LightningModule
- forward(context_words, context_distances, center, negative, neg_distances=None)
- Parameters:
context_words – (batch_size, context_size) context word indices
context_distances – (batch_size, context_size) relative distances from center to each context word
center – (batch_size,) center word indices
negative – (batch_size, neg_samples) negative sample indices
neg_distances – (batch_size, neg_samples) distances for negative samples (not used in CBOW)
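The shapes above can be illustrated with a minimal sketch that derives the three positional inputs from a single walk of token indices. The helper name `build_order_aware_cbow_samples`, the use of signed offsets as "relative distances", and the choice to skip positions without a full window are all assumptions made for illustration; the package may compute these differently. Plain Python lists stand in for tensors.

```python
from typing import List, Tuple


def build_order_aware_cbow_samples(
    walk: List[int], window: int
) -> Tuple[List[List[int]], List[List[int]], List[int]]:
    """Hypothetical helper (not part of rdf2vecgpu): build CBOW inputs from one walk."""
    # Assumption: a distance is the signed offset of the context word from the center.
    offsets = [d for d in range(-window, window + 1) if d != 0]
    context_words: List[List[int]] = []
    context_distances: List[List[int]] = []
    center: List[int] = []
    # Only positions with a full window are used, so every row has
    # context_size == 2 * window entries and no padding is needed.
    for i in range(window, len(walk) - window):
        context_words.append([walk[i + d] for d in offsets])
        context_distances.append(list(offsets))
        center.append(walk[i])
    return context_words, context_distances, center


walk = [10, 11, 12, 13, 14]
ctx, dist, ctr = build_order_aware_cbow_samples(walk, window=1)
# ctx and dist both have shape (n_samples, context_size); ctr has shape (n_samples,)
```

Here each row of `ctx` lines up element by element with the same row of `dist`, which is the pairing that `context_words` and `context_distances` are documented to share.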
- class OrderAwareSkipgram(*args, **kwargs)
Bases: LightningModule
- forward(center, context, distances, negative, neg_distances=None)
- Parameters:
center – (batch_size,) center word indices
context – (batch_size,) context word indices
distances – (batch_size,) relative distances from center to context
negative – (batch_size, neg_samples) negative sample indices
neg_distances – (batch_size, neg_samples) distances for negative samples
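For skip-gram, each training sample is a single (center, context, distance) triple rather than a whole context row. The sketch below shows one plausible way to enumerate such triples from a walk. The helper name `build_order_aware_skipgram_triples` and the signed-offset definition of distance are assumptions for illustration, not the package's confirmed behaviour.

```python
from typing import List, Tuple


def build_order_aware_skipgram_triples(
    walk: List[int], window: int
) -> Tuple[List[int], List[int], List[int]]:
    """Hypothetical helper (not part of rdf2vecgpu): enumerate skip-gram triples."""
    center: List[int] = []
    context: List[int] = []
    distances: List[int] = []
    for i, c in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue  # a word is not its own context
            center.append(c)
            context.append(walk[j])
            # Assumption: distance is the signed offset from center to context.
            distances.append(j - i)
    return center, context, distances


centers, contexts, dists = build_order_aware_skipgram_triples([7, 8, 9], window=1)
# All three lists have the same length, matching the (batch_size,) shapes above.
```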
rdf2vecgpu.embedders.word2vec_loader module
- class CBOWDataModule(*args, **kwargs)
Bases: LightningDataModule
Data loading optimised for a GPU-resident CBOW table.
- Parameters:
context_tensor (torch.Tensor) – 2-D CUDA tensor of shape (n_samples, ctx_size), where each row is the flattened context words for one target.
center_tensor (torch.Tensor) – 1-D CUDA tensor of length n_samples, the target (centre) word indices.
batch_size (int) – Number of (context_vec, center) samples per optimisation step.
- class OrderAwareCBOWDataModule(*args, **kwargs)
Bases: LightningDataModule
Data loading optimised for a GPU-resident order-aware CBOW table. Collates batches of (context_words, context_distances, center_word).
- Parameters:
context_tensor (torch.Tensor)
context_distance_tensor (torch.Tensor)
center_tensor (torch.Tensor)
batch_size (int)
- class OrderAwareSkipGramDataModule(*args, **kwargs)
Bases: LightningDataModule
Data loading optimised for a GPU-resident order-aware skip-gram table. Collates batches of (center, context, distance).
- Parameters:
center_tensor (torch.Tensor)
context_tensor (torch.Tensor)
distance_tensor (torch.Tensor)
batch_size (int)
- class ParquetSkipGramDataModule(*args, **kwargs)
Bases: LightningDataModule
DataModule that reads skip-gram pairs from partitioned Parquet files.
Designed for multi-GPU DDP training: each rank reads its own subset of parquet files and loads them onto its local GPU.
- Parameters:
parquet_path (str) – Directory containing partitioned parquet files (written by dask).
batch_size (int) – Number of (centre, context) pairs per optimisation step.
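The per-rank reading described above can be sketched as a simple round-robin split of the file list. This is only one common way to shard files across DDP ranks; the helper `files_for_rank` and the file names are hypothetical, and the module's actual assignment strategy is not documented here.

```python
from typing import List


def files_for_rank(files: List[str], rank: int, world_size: int) -> List[str]:
    """Hypothetical helper (not part of rdf2vecgpu): pick the files one rank reads."""
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside [0, {world_size})")
    # Sorting first makes every rank see the same order, so the strided
    # slices below are disjoint and together cover every file.
    return sorted(files)[rank::world_size]


# Illustrative file names only.
parts = [f"part.{i}.parquet" for i in range(5)]
shards = [files_for_rank(parts, r, world_size=2) for r in range(2)]
```

With an uneven file count, as in this example, ranks receive different numbers of files, so a real implementation would also need to decide how to keep the number of optimisation steps consistent across ranks.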
- class SkipGramDataModule(*args, **kwargs)
Bases: LightningDataModule
Data loading optimised for a GPU-resident skip-gram table.
- Parameters:
center_tensor (torch.Tensor) – 1-D CUDA tensor of centre word indices; must have the same length as context_tensor.
context_tensor (torch.Tensor) – 1-D CUDA tensor of context word indices; must have the same length as center_tensor.
batch_size (int) – Number of (centre, context) pairs per optimisation step.
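Because the two tensors are paired by position, batching has to apply the same index order to both. The sketch below shows that idea with plain Python lists standing in for tensors. The function `iter_pair_batches` and its `seed` parameter are hypothetical and do not describe the module's internals.

```python
import random
from typing import Iterator, List, Tuple


def iter_pair_batches(
    center: List[int], context: List[int], batch_size: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Hypothetical helper (not part of rdf2vecgpu): yield aligned mini-batches."""
    if len(center) != len(context):
        raise ValueError("center and context must have the same length")
    # One shared permutation keeps each centre aligned with its context.
    order = list(range(len(center)))
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield [center[i] for i in idx], [context[i] for i in idx]


batches = list(iter_pair_batches([0, 1, 2, 3, 4], [10, 11, 12, 13, 14], batch_size=2))
# The last batch is smaller when the sample count is not a multiple of batch_size.
```

With tensors already on the GPU, the analogous approach would be to draw the permutation on the same device (for example with `torch.randperm`) and index both tensors with it, which avoids moving individual samples through the host.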